Yeah, a century-old approach using frequency tables, where the next word is generated from just the previous two words, is highly competitive with contemporary neural nets. To produce coherent sentences, do you ever need to remember more than the last two words you said? I doubt it.
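
For anyone who hasn't seen one, the frequency-table approach described above is roughly a word-level trigram (second-order Markov) model. Here is a minimal Python sketch of the idea; the toy corpus and the names `build_table` and `generate` are made up for illustration and not taken from any particular system:

```python
import random
from collections import Counter, defaultdict


def build_table(words):
    """Map each pair of consecutive words to a Counter of the words that followed it."""
    table = defaultdict(Counter)
    for first, second, following in zip(words, words[1:], words[2:]):
        table[(first, second)][following] += 1
    return table


def generate(table, seed, length, rng=None):
    """Extend a two-word seed to at most `length` words, sampling by observed frequency."""
    if rng is None:
        rng = random.Random()
    out = list(seed)
    while len(out) < length:
        # The model's entire "memory" is the last two words.
        followers = table.get((out[-2], out[-1]))
        if not followers:
            break  # this pair never appeared in the corpus, so stop
        candidates, counts = zip(*followers.items())
        out.append(rng.choices(candidates, weights=counts)[0])
    return out


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
table = build_table(corpus)
print(" ".join(generate(table, ("the", "cat"), 8)))
```

Everything before the last two words is invisible to `generate`, which is exactly the limitation in question: the output is locally plausible, but nothing ties the end of a sentence to its beginning.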