
Yeah, even the official FlashAttention repo is moving many implementations from CUTLASS to Triton, except for the main MHA forward/backward pass.
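For anyone who hasn't dug into what those kernels actually compute: the core of FlashAttention is tiling attention over key/value blocks with an "online softmax", so the full N×N score matrix never materializes. Here's a minimal NumPy sketch of that idea (single head, no masking or dropout; function names and the `block` parameter are my own, not from the FlashAttention codebase), checked against a naive reference:

```python
import numpy as np

def naive_attention(q, k, v):
    # Reference: full softmax(Q K^T / sqrt(d)) V, materializing all scores.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def flash_attention_sketch(q, k, v, block=4):
    # Tile over K/V blocks, keeping per-row running max (m), running
    # softmax denominator (l), and an unnormalized output accumulator.
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full(n, -np.inf)
    l = np.zeros(n)
    acc = np.zeros((n, d))
    for j in range(0, k.shape[0], block):
        kj, vj = k[j:j + block], v[j:j + block]
        s = q @ kj.T * scale                   # scores for this tile only
        m_new = np.maximum(m, s.max(axis=-1))  # updated running max
        p = np.exp(s - m_new[:, None])
        correction = np.exp(m - m_new)         # rescale old stats to new max
        l = l * correction + p.sum(axis=-1)
        acc = acc * correction[:, None] + p @ vj
        m = m_new
    return acc / l[:, None]                    # normalize at the end
```

The real kernels do the same bookkeeping per thread block in SRAM; the Triton versions express this loop almost as directly as the NumPy above, which is a big part of why they're easier to read than the CUTLASS ones.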


It was written in CUTLASS? No wonder Peter Kim found it valuable and worthwhile to de-obfuscate. Adopting a new programming language invented by OpenAI doesn't sound like a much better alternative, though. I'd be shocked if either of them could build code for AMD GPUs: it's easy to adapt CUDA code, but not when it's buried under tens of thousands of lines of framework. I like open source code to have clarity so I can optimize it for my own production environment myself. When people distribute code they've already productionized for themselves, it squeezes out all the alpha and informational value. Just because something's open source doesn't mean it's open. I think people mostly do it to lick the cookie without giving much away.


Triton has an AMD backend, although work is still ongoing.


You will also be able to use Triton to target Ryzen AI.



