
“There’s no mystery here”

Nobody’s claiming there’s ‘mystery’. Transformers are a well-known, publicly documented architecture. This thread is about a video explaining exactly how they work: they’re a highly parallelizable design that lends itself to scaling up backpropagation-based training.
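
To make the parallelism point concrete: in self-attention, every position’s output falls out of a few batched matrix multiplies, with no sequential recurrence to wait on. A minimal sketch in plain NumPy (toy shapes, not any particular library’s API):

    import numpy as np

    def self_attention(Q, K, V):
        # Q, K, V: (seq_len, d) arrays; all positions handled at once.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len) in one matmul
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V  # every output position computed in parallel

    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))  # 8 positions, 16-dim embeddings
    print(self_attention(x, x, x).shape)  # (8, 16), no per-token loop

An RNN has to step through the sequence one token at a time; here the whole sequence is one dense computation, which is exactly what makes backprop over huge batches cheap on GPUs.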

“No person with … formal training … would be taken in here”

All of a sudden you’re accusing someone of perpetrating a fraud, though I’m not sure who. “Ad companies”?

Are you seriously claiming that there hasn’t been a qualitative improvement in the results of language generation tasks from applying transformers in the large language model approach? Conditional word probabilities turn out to be a powerful thing to model!
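
As a crude illustration of what “modeling word probabilities” means at its simplest, here’s a toy bigram model built from raw counts (hypothetical mini-corpus; a transformer learns a vastly richer version of the same conditional next-token distribution):

    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate".split()

    # Count how often each word follows each other word (bigram counts).
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    # Normalize counts into conditional probabilities P(next | prev).
    def next_word_probs(prev):
        total = sum(counts[prev].values())
        return {w: c / total for w, c in counts[prev].items()}

    print(next_word_probs("the"))  # {'cat': 2/3, 'mat': 1/3}

The qualitative jump with transformers is that the conditioning context goes from one previous word to thousands of tokens, learned end to end.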

So it’s ALL just hype, none of the work being done in the field has produced any value, and everyone should… use ‘statistical dashboards’ (whatever those are)?


