Hacker News

So below 128GB is the sweet spot for local LLMs...


TBH, they are all rather useless at those sizes.

I used to run a lot of local models on my MBP (mainly STT, TTS, embedding, and diffusion models, plus small LLMs for utility purposes) but stopped. It saves time in the long run to run those models on the target architecture from the get-go, which in most cases is Nvidia/CUDA, rather than test and tweak on Metal, switch to CUDA for prod, and then hit weird and subtle differences and regressions. I don't think it makes much sense to develop anything (other than hobby projects for home use) on MLX at the moment.




