
“There’s no mystery here”

Nobody’s claiming there’s ‘mystery’. Transformers are a well-known, publicly documented architecture. This thread is about a video explaining exactly how they work: they’re a highly parallelizable design that lends itself to scaling up backpropagation-based training.
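
To make the parallelism point concrete: in self-attention, every position’s output falls out of a few batched matrix multiplies, with no sequential recurrence to wait on. A minimal sketch in plain NumPy (toy shapes, not any particular library’s API):

    import numpy as np

    def self_attention(Q, K, V):
        # Q, K, V: (seq_len, d) arrays; all positions handled at once.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len) in one matmul
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V  # every output position computed in parallel

    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))  # 8 positions, 16-dim embeddings
    print(self_attention(x, x, x).shape)  # (8, 16), no per-token loop

An RNN has to step through the sequence one token at a time; here the whole sequence is one dense computation, which is exactly what makes backprop over huge batches cheap on GPUs.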

“No person with … formal training … would be taken in here”

All of a sudden you’re accusing someone of perpetrating a fraud, though I’m not sure who. “Ad companies”?

Are you seriously claiming that there hasn’t been a qualitative improvement in the results of language generation tasks from applying transformers in the large language model approach? Conditional word probabilities turn out to be a powerful thing to model!
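
As a crude illustration of what “modeling word probabilities” means at its simplest, here’s a toy bigram model built from raw counts (hypothetical mini-corpus; a transformer learns a vastly richer version of the same conditional next-token distribution):

    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate".split()

    # Count how often each word follows each other word (bigram counts).
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    # Normalize counts into conditional probabilities P(next | prev).
    def next_word_probs(prev):
        total = sum(counts[prev].values())
        return {w: c / total for w, c in counts[prev].items()}

    print(next_word_probs("the"))  # {'cat': 2/3, 'mat': 1/3}

The qualitative jump with transformers is that the conditioning context goes from one previous word to thousands of tokens, learned end to end.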

So it’s ALL just hype, none of the work being done in the field has produced any value, and everyone should… use ‘statistical dashboards’ (whatever those are)?


