What does the economics of die area say about spending die on highly repeatable, simple memory structures vs. complex instruction/ALU logic? I would think there is a balance point between more cores, and faster cores with more cache and more lanes between them.
We're at 8 cores, a mixture of fast and slow, hot and cool, on almost all personal devices now. I routinely buy Intel racks with 16 cores per CPU socket, and a TB of memory. At one remove, chips seem to be fulfilling my mission, but I am not in GPU/TPU land. Although I note, FPGA/GPU/TPU seems to be where a lot of the mental energy goes these days.
VLSI is a black art. Quantum effects are close enough to magic that I feel confident saying few people today understand VLSI. I know I don't; I barely understand virtual memory. I ache for the days when a zero pointer was a valid address on a PDP-11.
I think it's possible. When 128 bits came along, I wondered if the very-long-instruction-word people were going to come back into the room. If we imagine mapping addressing onto the world, how about making some rules where the top bits address other nodes in the DC, and see how "shared memory" works between distinct racks?
Don't mistake technical viability for economic viability.