
You can't prove LLMs can build theories the way humans can, because we can effectively prove they can't. Most code bases do not fit in a context window. And any "theory" an LLM might build about a code base, analogously to what the recent reasoning models do, would itself have to carve a chunk out of that same context window, at what would have to be a fairly non-trivial expansion in tokens on top of the underlying code base, and there already aren't enough tokens for the code alone. There's no way the window is big enough to hold a theory of a code base.
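
To put rough numbers on the token budget (every figure here is an assumption for illustration, not a measurement of any particular model or code base):

    # Back-of-the-envelope: code base size vs. context window.
    # All numbers are assumptions, picked only to show the orders of magnitude.
    lines_of_code = 500_000                          # a mid-sized production code base
    tokens_per_line = 10                             # rough average for tokenized source
    code_tokens = lines_of_code * tokens_per_line    # 5,000,000 tokens

    context_window = 200_000                         # generous by current standards
    theory_overhead = 0.25                           # "theory" as a fraction of the code itself

    tokens_needed = int(code_tokens * (1 + theory_overhead))
    print(tokens_needed, tokens_needed / context_window)   # 6,250,000 tokens, ~31x over budget

Shrink any of those assumptions and the code plus any non-trivial theory of it still swamps the window.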

"Building a theory" is something I expect the next generation of AIs to do, something that has some sort of memory that isn't just a bigger and bigger context window. As I often observe, LLMs != AI. The fact that an LLM by its nature can't build a model of a program doesn't mean that some future AI can't.



This is correct. The model context is a form of short-term memory. It turns out LLMs have an incredible short-term memory, but at the same time, that is all they have.

What I personally find perplexing is that we are still stuck with a single context window. Everyone knows that a Turing machine with two tapes requires significantly fewer operations than a single-tape Turing machine that has to simulate multiple tapes (the classic result is that the single-tape simulation can incur a quadratic slowdown).

The reasoning tokens should be thrown into a separate context window that is not subject to the training loss; only the final answer should be.
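
A minimal sketch of the loss-masking half of that idea, assuming a standard PyTorch-style training setup (the toy tensors and the 8/4 split between scratchpad and answer tokens are made up for illustration):

    import torch
    import torch.nn.functional as F

    # Toy sequence that mixes "reasoning" tokens and "final answer" tokens.
    vocab_size = 100
    tokens = torch.randint(0, vocab_size, (1, 12))                   # full sequence
    is_reasoning = torch.tensor([[1]*8 + [0]*4], dtype=torch.bool)   # first 8 = scratchpad

    logits = torch.randn(1, 12, vocab_size, requires_grad=True)      # stand-in for model output

    # Standard trick: labels set to -100 are ignored by cross_entropy,
    # so the reasoning span contributes nothing to the training loss.
    labels = tokens.clone()
    labels[is_reasoning] = -100

    loss = F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1), ignore_index=-100)
    loss.backward()

The masking is the easy part; actually giving the scratchpad its own window, separate from the one the final answer lives in, is the architectural change.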


Or have at least two models, each with its own dedicated context.
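
Something like this sketch, where call_model() is a hypothetical stand-in for whatever inference API you actually use; only the reasoner ever sees the full scratchpad:

    # Two models, two contexts: a "reasoner" with a private scratchpad,
    # and a "responder" that only sees the question plus the distilled conclusion.
    def call_model(name: str, messages: list[dict]) -> str:
        raise NotImplementedError("plug in your inference backend of choice here")

    def answer(question: str) -> str:
        # Context #1: the reasoner's scratchpad, free to grow as large as needed.
        scratchpad = [{"role": "user", "content": f"Think step by step about: {question}"}]
        conclusion = call_model("reasoner", scratchpad)

        # Context #2: the responder never sees the scratchpad, only the conclusion.
        final_context = [
            {"role": "user", "content": question},
            {"role": "user", "content": f"Notes from the reasoner: {conclusion}"},
        ]
        return call_model("responder", final_context)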



