
You're right about reference [2], which can alter things by ~1 order of magnitude (words are usually ~3-10 tokens). Additionally, as others have pointed out, we don't live entirely in the text world, so we have the benefit of understanding objects from visual and proprioceptive inputs, which is huge. The paucity-of-data argument made well known by Noam Chomsky et al. is certainly worth discussing in academia; however, I am not as moved as I once was by arguments about the stark difference in input required by humans versus ML systems. In image processing, for example, sending 10k images in rapid succession with no proprioceptive inputs, time dependencies, or agent-driven exploration of the space puts these systems at an enormous disadvantage for learning certain phenomena (classes of objects or otherwise).
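For concreteness, here's a minimal back-of-envelope sketch of how that correction plays out. Only the ~3-10 tokens-per-word factor comes from this thread; the exposure rate and training budget below are illustrative assumptions, not real figures:

    # Back-of-envelope words-to-tokens correction (Python).
    # All numbers are illustrative assumptions; only the
    # tokens-per-word range is from the thread above.
    WORDS_HEARD_PER_DAY = 20_000                        # assumed child exposure
    YEARS = 10
    TOKENS_PER_WORD_LOW, TOKENS_PER_WORD_HIGH = 3, 10   # factor cited above
    LLM_TRAINING_TOKENS = 1e12                          # assumed ~1T-token run

    human_words = WORDS_HEARD_PER_DAY * 365 * YEARS
    lo = human_words * TOKENS_PER_WORD_LOW
    hi = human_words * TOKENS_PER_WORD_HIGH

    print(f"Human input: {human_words:.1e} words ~= {lo:.1e} to {hi:.1e} tokens")
    print(f"LLM/human ratio: {LLM_TRAINING_TOKENS/hi:.0f}x to {LLM_TRAINING_TOKENS/lo:.0f}x")

Under these assumptions, the tokens-per-word factor shifts the human-side estimate by roughly an order of magnitude, which is exactly why it matters for these comparisons.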

Of course there are differences between the systems, but I'm increasingly skeptical of the claim that newer ML systems can't learn as much as biological systems given the same input (obviously "the same input" is where a lot is hidden).



Thank you for the tokens-to-words factor! Much appreciated.

I definitely agree that multi-task models can learn more than any one specialized model, but I think it's an open question whether multi-task learning alone can fully close the digital-biological gap. Of course, I'd be very happy to be proven wrong by empirical evidence in my lifetime :)



