
Not sure if this is the type of answer you're looking for, but RWKV is not really recurrent in the same way classic RNNs are. This quasi-recurrence lets it and similar models use algorithms like parallel scan to get O(log N) sequential depth over a length-N sequence when parallelised, instead of N strictly sequential steps. But you pay for that in state-tracking ability.
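To make the parallel scan point concrete, here's a toy sketch. It is not RWKV's actual kernel. It assumes a scalar recurrence h_t = a_t * h_(t-1) + b_t that is linear in the state, and the names combine, sequential and parallel_scan are mine. Each step is an affine map, composing affine maps is associative, and that is what lets a scan finish in about log2(N) rounds:

```python
import math

def combine(left, right):
    # Compose two affine maps h -> a*h + b, applying `left` first, then `right`.
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def sequential(a, b, h0=0.0):
    # Reference: the plain step-by-step recurrence, N dependent steps.
    out, h = [], h0
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out

def parallel_scan(a, b, h0=0.0):
    # Hillis-Steele style inclusive scan over the affine maps.
    # Within one round every position is independent of the others,
    # so on parallel hardware a round costs one step; here it is simulated.
    elems = list(zip(a, b))
    n = len(elems)
    step, rounds = 1, 0
    while step < n:
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(n)
        ]
        step *= 2
        rounds += 1
    # elems[t] is now the composed map from h0 to h_t.
    return [A * h0 + B for A, B in elems], rounds

a = [0.5, 0.9, 0.1, 0.7, 0.3, 0.8, 0.6, 0.2]
b = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]
hs, rounds = parallel_scan(a, b, h0=1.0)
print(rounds, all(math.isclose(x, y) for x, y in zip(hs, sequential(a, b, h0=1.0))))
```

For 8 steps that's 3 rounds instead of 8 dependent updates. The trick only works because each step composes into another small (a, b) pair. Put a tanh around the update like a classic RNN does and the composition no longer collapses to a fixed-size summary.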

There's a cool talk here if you care to know the details: https://www.youtube.com/watch?v=4-VXe1yPDjk


