Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

What’s the process to deliver and test a quantized version of this model?

This model is 264GB, so can only be deployed in server settings.

Quantized mixtral at 24G is just small enough where it can be running on premium consumer hardware (ie 64GB RAM)



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: