Nobody is deploying 3+GB models to iOS beyond some enthusiast “because you can” ...

hustwindmaple1 · on Aug 11, 2023

If you have played any large mobile game, you would not be surprised to see apps download massive files on first launch.

brucethemoose2 · on Aug 11, 2023

A small download + an in-app weights download (and a space requirement warning) is probably sane, right?

refulgentis · on Aug 12, 2023

I agree, we're too far down a chain of hypotheticals motivated by "but ONNX must be bad compared to $MODELX.cpp?"

It wouldn't make sense to ship a 4-bit quantized model as a product either.

turnsout · on Aug 10, 2023

The size makes it tough for App Store deployment, but I could imagine using a local LLM on-device for an enterprise app.