I had a similar experience, asked to design a sudoku solver.
Coming from an AI background, I coded up an algorithm for state-space search.
They were expecting me to approach it as a person would, reasoning about how you rule out certain numbers (the constraint propagation), which is a second-order consideration for a game the size of sudoku.
I felt pretty smug when Peter Norvig shortly thereafter put up a blog post that essentially mirrored my solution.
Nothing really clever though - he only applies the two most basic sudoku strategies:
If a square has only one possible value, eliminate it from the square's peers.
If a unit has only one possible place for a value, then put the value there.
And then he notes that those simple strategies combined with search yields very reasonable run times.
Actually, for 9x9 Sudoku doing any more advanced propagation is pointless, since it will slow down the time to find the solution (assuming a reasonably fast search process). If one wants to solve 9x9 Sudokus without search, full propagation combined with shaving (hypothetical reasoning: if I assigned this square this value, woould it lead to an inconsistency) suffices as far as is known currently (no proof, just experiments, http://www.4c.ucc.ie/~hsimonis/sudoku.pdf).
For 16x16, it is more of a toss-up if one should use more advanced propagation. The solve-times are still so low though, that it doesn't really matter (tens of milliseconds). For 25x25 it starts to get interesting. In my experience, full propagation on lines, rows, and regions is needed, but more than that slows it down. Without a good heuristic (some learning process, prefferably coupled with randomized restarts), the solve-time easily goes up into hours. Wih a good heuristic, minutes seems to be a reasonable time-span to hope for.
It might not be that clever, but it is able to considerably reduce the running time of the algorithm (even if the speedup in a 9x9 puzzle is not perceived).
In fact, if you ask people not from a CS background to explain the steps they would take to solve a puzzle, they probably wouldn't be able to think of a DFS, but they would state that they should rule out from each cell the numbers already present in its respective column, line and block. And for a Sudoku solver I don't think you really need more constraint propagation than that.
Anyway, I think you approached the problem the right away, I just think that you were so close to succeed in that test and it wasn't something that difficult to add to your solution.
EDIT: wrote the above before your edit. It's still appropriate though.
Coming from an AI background, I coded up an algorithm for state-space search.
They were expecting me to approach it as a person would, reasoning about how you rule out certain numbers (the constraint propagation), which is a second-order consideration for a game the size of sudoku.
I felt pretty smug when Peter Norvig shortly thereafter put up a blog post that essentially mirrored my solution.