There's actually an old paper titled "Optimal Brain Damage," where they don't try to find optimal quantizations but optimal sparse versions of a model, i.e. ones where some weights are set to zero.
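For anyone curious what that looks like in practice, here is a rough sketch of the saliency idea from that paper: each weight gets a score of 0.5 * h * w^2, where h is the matching diagonal entry of the loss Hessian, and the lowest-scoring weights get zeroed. The function name `obd_prune` and the toy numbers are made up for illustration, and the Hessian diagonal is just handed in rather than computed from a real network.

```python
import numpy as np

def obd_prune(weights, hessian_diag, n_prune):
    """Zero the n_prune weights with the smallest OBD-style saliency.

    Hypothetical helper: hessian_diag is assumed to be the diagonal of the
    loss Hessian with respect to the weights, supplied by the caller.
    """
    # Saliency: estimated loss increase from deleting each weight.
    saliency = 0.5 * hessian_diag * weights ** 2
    # Indices of the least important weights.
    prune_idx = np.argsort(saliency)[:n_prune]
    pruned = weights.copy()
    pruned[prune_idx] = 0.0
    return pruned, saliency

# Toy values, not taken from any real model.
w = np.array([0.9, -0.05, 0.4, 0.01])
h = np.array([1.0, 2.0, 0.01, 10.0])
pruned, saliency = obd_prune(w, h, n_prune=2)
print(pruned)
```

Note that 0.4 gets pruned before -0.05 here because its curvature is so low, which is the difference from plain magnitude pruning.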
That was really for them. You’re out there building neat stuff. Your talent might warrant looking into AdderNets and BitNets, which might get the cost down. There’s also some brain-inspired designs.
I don’t think many people have implemented such things. You might discover something new experimenting with them.
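If you want a starting point for experimenting, here is a minimal sketch of ternary weight quantization in the spirit of the later BitNet work: scale by the mean absolute weight, round, and clip to {-1, 0, 1}. Treat the details as my assumption about the scheme rather than a reference implementation; `ternary_quantize` and the sample weights are hypothetical.

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    """Map weights to {-1, 0, 1} plus one float scale (assumed absmean scheme)."""
    # Scale by the mean absolute weight; eps avoids division by zero.
    scale = np.mean(np.abs(w)) + eps
    # Round to the nearest integer, then clip into the ternary range.
    q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return q, scale

# Toy weights, not from a real model.
w = np.array([0.9, -0.05, 0.4, -1.2])
q, scale = ternary_quantize(w)
w_approx = q * scale  # dequantized approximation of w
print(q, scale)
```

With ternary weights, a matrix multiply reduces to additions and subtractions of activations, which is where the cost savings would come from.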