Curiously this seems like exactly the story of thing that code ML should be able to identify and flag. 'Hey, the comment says this, but the code is doing something totally different.'
However, it might be trained on code with erroneous comments, either because it was mangled by a merge or because it's outdated. The more times this happen, the more confused the AI model will be.
Which makes me to think that the AI models should be trained on the code evolution of commit chains and not just on isolated snippets of code. That way, the AI could analyze your own commits to detect when a comment becomes outdated.