I just hacked into my app's flow to upload a "scan" of the isolated puzzle to my server instead of slicing it and sending the component images to CoreML.
Then I sat there and flipped through page after page of Sudoku puzzles, scanned each one from a few different angles, sliced them in bulk on the server, and voilà: data!
Sorry, I'm still confused. You took roughly 7000 pictures in two afternoons? What do you mean by "sliced them in bulk"? If you took them from different angles, how do you slice them in bulk?
By "slicing in bulk" I mean the server was the one that split each uploaded scan into 81 smaller images (one per cell), rather than the app doing the slicing and uploading 81 small images.
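For anyone wondering what the server-side slicing might look like, here is a minimal sketch, not the author's actual code. It assumes each upload is already a perspective-corrected image of just the puzzle, so the 9x9 grid can be cut at evenly spaced boundaries. The names `cell_boxes` and `slice_scan` are made up for illustration, and `slice_scan` assumes Pillow is installed.

```python
def cell_boxes(width, height, n=9):
    """Return (left, upper, right, lower) crop boxes for an n x n grid, row by row."""
    boxes = []
    for row in range(n):
        for col in range(n):
            # Integer boundaries, so the cells tile the image with no gaps or overlap.
            left = col * width // n
            upper = row * height // n
            right = (col + 1) * width // n
            lower = (row + 1) * height // n
            boxes.append((left, upper, right, lower))
    return boxes


def slice_scan(path):
    """Split one scan into n*n cell images (hypothetical helper, needs Pillow)."""
    from PIL import Image  # imported lazily so cell_boxes works without Pillow

    with Image.open(path) as img:
        return [img.crop(box) for box in cell_boxes(*img.size)]
```

Running `slice_scan` over a directory of uploads would then be the "bulk" part: one scan in, 81 cell images out.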
Taking them from different angles was done because the perspective correction adds distortions that I didn't want my model to be sensitive to.
7000 pictures at 5 seconds per picture is "only" 10 hours of work. Possibly per-picture time can be lower than that too. Seems quite doable over 2-4 afternoons.
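A quick back-of-the-envelope check of those numbers, as a sketch only. The 600,000 cell images, 81 cells per puzzle, and 5 seconds per picture are the figures quoted in this thread; the 5 seconds is a guess, not a measurement, and the function names are just illustrative.

```python
import math

CELLS_PER_PUZZLE = 81  # a 9x9 Sudoku grid


def scans_needed(total_cells):
    """Number of full-puzzle scans needed to yield at least total_cells cell images."""
    return math.ceil(total_cells / CELLS_PER_PUZZLE)


def scan_hours(scans, seconds_per_scan):
    """Total scanning time in hours at a fixed, assumed pace."""
    return scans * seconds_per_scan / 3600


print(scans_needed(600_000))            # scans for 600,000 cell images
print(round(scan_hours(7000, 5), 1))    # hours for 7000 scans at 5 s each
```

At that assumed pace, 7000 scans come to just under 10 hours, so the estimate holds up.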
Props for doing the project end-to-end, including the non-trivial (and typically skipped) part of collecting training data.
600,000?!? Even divided by 81 that's over 7000! How long did this take?