Hacker News

afro88 · 2026-05-07T17:41:12 1778175672

No offense, this is a crazy worthless contribution to the discussion.

Why?

dakolli · 2026-05-07T19:15:27 1778181327

Because everyone in these replies is in complete denial about the physical limits of memory and scaling in general. Y'all are literally living in an alternate reality where model capability increases with a decrease in size; it's simply not the case. There will be small focused models that perform well on very narrow tasks, yes, but you will not have "agents" capable of "building most things" running on consumer hardware until more capable (and affordable) consumer hardware exists.

bensyverson · 2026-05-07T19:26:41 1778182001

Ah, you haven't realized that consumer hardware gets more capable over time

adrian_b · 2026-05-07T21:05:45 1778187945

Not this year, when many vendors either offer lower memory capacities or demand higher prices for their devices.

bensyverson · 2026-05-07T23:08:57 1778195337

Correct, the progress is not perfectly linear. But do you believe technological progress has stalled forever? If so, I'd get out of tech and start selling bomb shelters.

dakolli · 2026-05-07T23:33:20 1778196800

Do you really think the trend of consumer hardware is heading towards more memory and better specs? Apple's most popular product this year is a laptop with 8GB of RAM.

The trend is heading in the opposite direction: fewer options for strong consumer hardware and a move towards cloud-based products. This is a memory issue more than anything. Nvidia is done selling their GDDR7 to gamers and people with AI girlfriends.

iuffxguy · 2026-05-08T02:42:09 1778208129

This is about more than just the hardware evolving over time; we are also seeing big improvements in quantization and efficiency.
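To put rough numbers on the quantization point, here is a minimal back-of-the-envelope sketch. It assumes the footprint is dominated by the weights (parameters times bits per weight) and ignores KV cache, activations, and runtime overhead; the function name and the 8B example size are illustrative, not taken from any particular model.

```python
def weight_footprint_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width and
    nothing else (KV cache, activations, runtime) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical 8B-parameter model at common bit widths.
    for bits in (16, 8, 4):
        size = weight_footprint_gib(8, bits)
        print(f"8B params at {bits}-bit: about {size:.1f} GiB of weights")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the footprint by 4x, which is the difference between not fitting and fitting on a 16GB machine. Whether quality holds up at the lower bit width is a separate question this sketch does not address.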

dakolli · 2026-05-08T03:18:28 1778210308

There are physical limits to how much you can compress data. I'm just saying, don't sit on your hands waiting for this to happen, because it's probably not going to for another decade or more. There's no use in waiting, just write the code your fkin self and stop being lazy.

bensyverson · 2026-05-08T02:38:42 1778207922

Just so that I have your position straight: you actually believe that over the long term, like 10, 20 years, that the amount of RAM in a laptop is going to go down?

It's not out of the realm of possibility, but I just want to make you aware that this would be a very surprising development in computing history.

fulafel · 2026-05-08T03:31:18 1778211078

This seems to be a different discussion than was going on up thread about:

> in the next few years a "good enough" model will run on entry-level hardware

wtallis · 2026-05-08T03:42:45 1778211765

Exactly. In the next few years, entry-level hardware will not be advancing beyond 16GB. And anything beyond 32GB will remain decidedly high-end.

And that's for laptops with unified memory. In the desktop space, 8GB discrete GPUs are going to be sticking around for a very long time.

bensyverson · 2026-05-08T23:20:03 1778282403

I guess we'll find out! I bet all the vendors who supply RAM are looking at the current shortages and thinking "well, it's a shame we could never manufacture more RAM than we currently do."

dakolli · 2026-05-08T03:15:15 1778210115

A future with less RAM is possible with more applications using computational storage on SSD/NVMe.

But that's not my main argument. My main argument is that it's delusional for OP to think it's reasonable to expect that soon we'll be able to run models on consumer hardware that will be able to build basically most things.

But I do think there will be many compromises made for consumer electronics. I don't think the powers that be are eager to give consumers all the best memory (that should be clear by now). There are 3 DDR5 DRAM manufacturers in the world that have to provide memory to all the world's militaries, governments, and datacenters/corporations. Consumers are last priority.

marci · 2026-05-08T17:25:34 1778261134

Did they modify their post? I can't see who claimed that consumer hardware will be able to build most things?

dakolli · 2026-05-08T18:23:27 1778264607

> If you looked at a graph of GPU power in consumer hardware and model capability per billion parameters over time, it seems inevitable that in the next few years a "good enough" model will run on entry-level hardware.

> Of course there will always be larger flagship models, but if you can count on decent on-device inference, it materially changes what you can build.

I'm making some assumptions about what they're saying, but it seems clear they have no idea what they're talking about and that they're betting their competency on this technology.

bensyverson · 2026-05-08T21:29:52 1778275792

If you're not paying attention to what's happening with small models, I suggest you take a closer look. Keeping parameter count constant, the quality of small models is rising fast. When you look at what you could do with Llama just 3 years ago vs Gemma 4 on the same 16GB hardware, the trend is clear.

Meanwhile, this year Apple bumped the base of their Mac lineup from 8GB to 16GB RAM, and the iPhone 17 Pro ships with 12GB. The Neo is at 8GB but is a brand new product tier which is not comparable to any past model.

zozbot234 · 2026-05-08T21:34:43 1778276083

Small models are gaining useful reasoning ability and that's a genuinely helpful development, but they'll be heavily limited in world knowledge for the foreseeable future. BTW, the base of the Mac lineup is now once again an 8GB device with a small and low-performance SSD. Many people will tell you that it's broadly comparable (though of course not identical!) to the original base model M1.

bensyverson · 2026-05-08T23:18:13 1778282293

For many tasks, including lots of agentic applications, world knowledge is not a "must-have."

To me the Neo is an exception, and doesn't represent the core Mac lineup, which is all at 16GB+ of RAM. If you're developing pro software that would rely on an on-device LLM, you probably wouldn't be targeting the Neo anyway.

zozbot234 · 2026-05-08T18:49:59 1778266199

Anything can technically "run" on almost any hardware, the meaningful question is what's the real-world performance. I for one have made a case in this thread that DeepSeek V4 is de facto optimal for wide batching, not single-request or single-agent inference - even on consumer hardware (which is unique among practical AI models). I might still be wrong of course, but if so I'd like to understand what's wrong with my assumptions.
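For anyone wanting to sanity-check the batching argument, here is a rough roofline-style sketch of why batched decoding can beat single-request decoding on the same hardware. It is a simplification under stated assumptions: a dense model whose weights are streamed from memory once per decode step and shared by the whole batch, with KV cache traffic and MoE expert routing ignored. All hardware numbers below are hypothetical and do not describe DeepSeek V4 or any real device.

```python
def decode_tokens_per_second(
    batch: int,
    weight_bytes: float,
    bandwidth_bytes_per_s: float,
    flops_per_token: float,
    peak_flops: float,
) -> float:
    """Aggregate decode throughput under a simple roofline model.

    Assumption: each decode step reads all weights once (shared across
    the batch) and does flops_per_token of compute per sequence; the
    step takes as long as the slower of the two.
    """
    memory_time = weight_bytes / bandwidth_bytes_per_s
    compute_time = batch * flops_per_token / peak_flops
    step_time = max(memory_time, compute_time)
    return batch / step_time


if __name__ == "__main__":
    # Hypothetical machine: 100 GB/s memory bandwidth, 10 TFLOPS compute,
    # 4 GB of quantized weights, about 16 GFLOPs per generated token.
    for batch in (1, 8, 32):
        tps = decode_tokens_per_second(batch, 4e9, 100e9, 16e9, 10e12)
        print(f"batch={batch}: about {tps:.0f} tokens/s in aggregate")
```

In this model throughput scales almost linearly with batch size while the step is memory-bound, then flattens once compute becomes the bottleneck. Per-request latency does not improve, which is why the "real-world performance" answer depends on whether you care about one agent or many.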