
I think that's exactly the point: so everyone can run it on their PCs with no GPU.


Or without a beefy GPU. I've got 8GB VRAM, which is great for Stable Diffusion but not useful for any of the language models released so far.

I think the 4-bit quantized 7B LLaMA would work on that, but the 7B is pretty fast even without a GPU.
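As a back-of-envelope check (assuming roughly 4 bits per weight plus ~20% overhead for quantization scales and activations; actual llama.cpp figures will differ a bit):

    # Rough memory estimate for quantized model weights.
    # Assumes ~4 bits per parameter plus a small overhead factor;
    # real numbers depend on the quantization format.
    def quantized_weight_gb(n_params_billion, bits_per_param=4, overhead=1.2):
        return n_params_billion * 1e9 * (bits_per_param / 8) * overhead / 1e9

    for name, n in [("7B", 7), ("13B", 13)]:
        print(f"{name}: ~{quantized_weight_gb(n):.1f} GB")
    # 7B:  ~4.2 GB -> fits comfortably in 8GB of VRAM
    # 13B: ~7.8 GB -> a tight squeeze on an 8GB card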


I'm installing it here. How's the 7B model going so far?


Haha, I just finished ordering 32GB of additional memory for my PC so I can run the 65B model, if that tells you anything. I'm upgrading from 32GB -> 64GB.

7B is fine, 13B is better. Both are fun toys and almost make sense most of the time, but even with a lot of parameter tuning they're often incoherent. You can tell that they've encoded fewer relationships between concepts than the higher-parameter models we've gotten used to; the experience is much closer to GPT-2 than GPT-3.

They're good enough to whet my appetite and give me a lot of ideas for what I want to do; they're just not quite good enough to make those applications reliably useful. Based on the reports I'm hearing here of just how much better the 65B model is than the 7B, I decided it was worth $80 for a few new sticks of RAM to be able to use the full model. Still way cheaper than buying a graphics card capable of handling it.
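For anyone wondering where the RAM numbers come from, here's a rough estimate of the 65B weight footprint at different precisions (illustrative only; the exact size depends on the quantization format):

    # Rough weight-memory estimate for the 65B checkpoint at a few
    # quantization levels; real figures vary by format.
    N_PARAMS = 65e9

    for bits in (16, 8, 4):
        gb = N_PARAMS * bits / 8 / 1e9
        print(f"{bits}-bit: ~{gb:.0f} GB")
    # 16-bit: ~130 GB (out of reach for desktop RAM)
    #  8-bit:  ~65 GB (marginal even with 64GB)
    #  4-bit:  ~32 GB (fits on a 64GB machine with room left for context)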


Heh, you just made me upgrade as well. After originally paying 130 € for 32 GB, it’s nice that I only had to pay 70 € to double it ;) Not sure if I want to run LLMs (or if my Ryzen 5 3600 is even powerful enough), but I’ve wanted some more RAM for a while.


If I were running in a server context, would the 50 GB of RAM be needed for each request, or can it be shared to serve multiple requests simultaneously?


I'm very late to this question, but I believe that amount is only required once; the context tensor, however, needs to be created per request. I haven't confirmed that, though.
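For a sense of scale, here's a rough sketch of the per-request part, assuming a standard transformer KV cache and the commonly cited LLaMA 65B dimensions (80 layers, hidden size 8192); treat those numbers as my assumption, not something I've verified against the released weights:

    # Rough per-request KV-cache size for a transformer decoder:
    # keys and values are stored for every layer and every token in
    # the context window, separately for each request.
    n_layers = 80        # assumed LLaMA 65B depth
    d_model = 8192       # assumed hidden size
    n_ctx = 2048         # tokens of context per request
    bytes_per_elem = 2   # fp16

    kv_cache_bytes = 2 * n_layers * n_ctx * d_model * bytes_per_elem  # K and V
    print(f"~{kv_cache_bytes / 1e9:.1f} GB per concurrent request")
    # ~5.4 GB per request

So the weights would be loaded once and shared, while each concurrent request adds a few extra GB of RAM for its own context.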


I'd assume that all the calculations used for 1 request would already eat up that amount of memory, but I could be wrong!


I'm still holding on to a small bit of hope that the GPU market will normalise this year. I don't think I'm the only one looking to get something highly capable for a fair price.


> I'm still holding on to a small bit of hope that the GPU market will normalise this year.

I suspect all the people hoping it will (b/c of Stable Diffusion, etc.) are exactly the reason it won’t.


Me too. But for third-world countries the prices are insane.


It's expensive for first-world countries too. Just look at the 4090: it's insane that it costs 2k EUR, literally double the fair price (which itself is already high).



