This reminds me of an early ML study about detecting skin cancer from pictures with a high accuracy rate.
The problem was that they ended up building a ruler classifier, because most of the pictures with skin cancer also happened to have a ruler in them to measure the size.
Or the commercial model that identifies criminals from their photographs. Turns out people who frown are criminals. People who smile aren't. Or so you'd believe if you anchored your expectations on comparing mug shots to social media profile pictures.
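For anyone who wants to see the ruler case in miniature, here is a toy sketch. Every count in it is made up for illustration, and the feature names (`has_ruler`, `lesion_irregular`) and the one-feature "stump" learner are my own stand-ins, not anything from the actual study. The learner just picks whichever single feature best matches the label on the training photos.

```python
# Toy illustration of a shortcut feature. All counts are invented.
# Each sample is a tuple: (has_ruler, lesion_irregular, is_malignant).

FEATURES = ("has_ruler", "lesion_irregular")


def make_samples(counts):
    """Expand {(has_ruler, lesion_irregular, is_malignant): n} into a flat list."""
    return [key for key, n in counts.items() for _ in range(n)]


def accuracy(samples, feature_index):
    """Accuracy of the rule 'predict malignant iff this feature is 1'."""
    hits = sum(1 for s in samples if s[feature_index] == s[2])
    return hits / len(samples)


def fit_stump(samples):
    """Pick the single feature whose value best matches the training label."""
    return max(range(len(FEATURES)), key=lambda i: accuracy(samples, i))


# Training photos: rulers show up almost only next to malignant lesions.
train = make_samples({
    (1, 1, 1): 40, (1, 0, 1): 8, (0, 1, 1): 2,   # 50 malignant
    (0, 0, 0): 38, (0, 1, 0): 10, (1, 0, 0): 2,  # 50 benign
})

# Deployment photos: rulers are equally common in both classes.
test = make_samples({
    (1, 1, 1): 20, (0, 1, 1): 20, (1, 0, 1): 5, (0, 0, 1): 5,  # 50 malignant
    (1, 0, 0): 20, (0, 0, 0): 20, (1, 1, 0): 5, (0, 1, 0): 5,  # 50 benign
})

chosen = fit_stump(train)
print("chosen feature:", FEATURES[chosen])
print("train accuracy:", accuracy(train, chosen))
print("test accuracy: ", accuracy(test, chosen))
```

With these invented counts the stump picks `has_ruler`, which looks great on the training photos and drops to coin-flip accuracy once rulers stop tracking the diagnosis, while the lesion feature it ignored would have held up at about 80% on both sets.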