
That's because the attention mechanism was designed for Seq2Seq models (i.e., translation in its most general form).

Any other use of it is a case of "I have a hammer, so everything looks like a nail."
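To make the Seq2Seq framing concrete, here is a minimal sketch (my own illustration, not from the comment) of scaled dot-product cross-attention as used in encoder-decoder translation models: each decoder state queries the encoder outputs and receives a weighted context vector. Function and variable names are hypothetical.

```python
import numpy as np

def cross_attention(queries, keys, values):
    """queries: (t, d) decoder states; keys/values: (s, d) encoder states."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)           # (t, s) alignment scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over source positions
    return weights @ values                          # (t, d) context vectors

rng = np.random.default_rng(0)
enc = rng.standard_normal((5, 8))   # 5 source tokens, hidden dim 8
dec = rng.standard_normal((3, 8))   # 3 target tokens so far
ctx = cross_attention(dec, enc, enc)
print(ctx.shape)  # (3, 8): one context vector per decoder position
```

The point of the original design: the softmax weights are an alignment between target and source positions, which is exactly what translation needs.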


