
Now we just need an autoregressive transformer <==> RNN isomorphism paper and we're golden


Plain RNNs are theoretically weaker than transformers with CoT (chain of thought): https://arxiv.org/abs/2402.18510
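
The gist of the separation, if it helps (a toy sketch of my own, not the paper's construction): an RNN has to compress the entire prefix into a fixed-size state, while a transformer generating a chain of thought can re-read every token it has emitted so far.

    # Toy sketch of the state-size asymmetry behind the result: an RNN
    # squeezes the whole prefix into a fixed-size hidden state, while a
    # transformer with CoT can attend over its entire growing context.

    def rnn_step(state, token):
        # state has a fixed size no matter how long the input is; after
        # n tokens, all information must fit into this one value
        return hash((state, token)) % (2**32)  # stand-in for f(state, token)

    def transformer_step(context, token):
        # the context grows with every step, so step n can attend to
        # all n previous tokens (including previous CoT tokens) directly
        return context + [token]

    tokens = list(range(10))
    state, context = 0, []
    for t in tokens:
        state = rnn_step(state, t)               # O(1) memory across the sequence
        context = transformer_step(context, t)   # O(n) memory across the sequence

    print(state)         # one 32-bit value summarizing the whole prefix
    print(len(context))  # full prefix still available: 10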


The paper shows that transformers are more expressive than RNNs, which is not surprising.

However, both are, in theory, Turing complete (under idealized assumptions such as unbounded precision or unbounded intermediate steps), so in that sense they are equally expressive.
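
For what it's worth, the classic Turing-completeness argument goes through a recurrent update rule simulating a Turing machine step, assuming the state/tape can grow without bound (which real fixed-precision RNNs don't have). A toy version of that reduction:

    # Toy illustration (mine, not from the thread or the paper): one
    # step function applied in a loop can simulate a Turing machine.
    # The machine below flips bits left to right and halts on a blank.

    # delta maps (state, read_symbol) -> (next_state, write_symbol, head_move)
    delta = {
        ("flip", "0"): ("flip", "1", +1),
        ("flip", "1"): ("flip", "0", +1),
        ("flip", "_"): ("halt", "_", 0),
    }

    def step(state, tape, head):
        # one recurrent update: the same function applied at every time step
        nxt, write, move = delta[(state, tape[head])]
        tape[head] = write
        return nxt, tape, head + move

    state, tape, head = "flip", list("0110_"), 0
    while state != "halt":
        state, tape, head = step(state, tape, head)
    print("".join(tape))  # -> 1001_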



