
The fundamental nature of the model is that it consumes tokens as input and produces token probabilities as output, but there's nothing inherently "predictive" about it -- that's just perspective hangover from the historical development of how LLMs were trained. It is, fundamentally, I think, a general-purpose thinking machine, operating over the inputs and outputs of tokens.

(With this perspective, I can feel my own brain subtly offering up a panoply of possible responses in a similar way. I can even turn up the temperature on my own brain, making it more likely to decide to say the less-obvious words in response, by having a drink or two.)
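The "temperature" analogy above maps directly onto how sampling from a model's token probabilities works. A minimal sketch (the logits and vocabulary indices here are made-up stand-ins, not from any real model):

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Sample a token index from raw logits scaled by temperature.

    Higher temperature flattens the distribution, making less-obvious
    tokens more likely (the "drink or two" effect); lower temperature
    sharpens it toward the single most probable token.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]
```

At very low temperature this collapses to greedy decoding (always the top token); at high temperature the choices spread out across the whole vocabulary.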

(Similarly, mimicry in humans is also a very good learning technique for getting started -- kids learning to speak are little parrots, artists just starting out often copy existing works, etc., before going on to develop their own style.)




Non sequitur: "perspective hangover" might be my favorite phrase I've ever read. So much of what we deal with is trying to correct the record on how we used to think about things. But the inertia that old ideas or modes have is monumental to overcome. If you just came up with that, kudos.

Ha, thanks!

We could argue about whether fine tuning is still about predicting a distribution or not, but really I feel like whether or not that word is accurate misses the point of why the description is useful.

I like the phrasing because it distinguishes it from other things the generative model might be doing including:

- Creating and then refining the whole response simultaneously, like diffusion models do.

- Having hidden state, where it first forms an "opinion" and then outputs it, e.g. seq2seq models, where previously output tokens are treated differently from input tokens at an architectural level.

- Having a hierarchical structure where you first decide what you're going to say, and then how you're going to say it, like wikipedia's hilarious description of how "sophisticated" natural language generation systems work (someone should really update this page): https://en.wikipedia.org/w/index.php?title=Natural_language_...
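What the list above contrasts against is plain autoregressive decoding: each sampled token is appended to the context and fed straight back in, with no hidden plan and no global refinement pass. A toy sketch of that loop shape (the bigram table is an invented stand-in for a real model, which would condition on the whole context):

```python
import random

# Toy "model": next-token probabilities conditioned only on the last token.
# A real LLM conditions on the entire context; this just shows the loop.
BIGRAMS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
    "sat": {"</s>": 1.0},
}

def generate(max_tokens=10):
    """Autoregressive decoding: sample one token at a time, append it
    to the sequence, and feed the sequence back in -- no separate
    "decide what to say, then how to say it" stage."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = BIGRAMS[tokens[-1]]
        nxt = random.choices(list(dist), weights=list(dist.values()))[0]
        if nxt == "</s>":
            break
        tokens.append(nxt)
    return tokens[1:]
```

The same loop structure holds whether the per-step distribution comes from a bigram table or a transformer; that is the sense in which the model "just" emits token probabilities.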


Welllll I'm not so sure that phrase is well-suited for your intended meaning, then. (Also, tangentially, I think one could argue that thinking models w/ the elided thought prelude satisfy "having hidden state where it first forms an opinion.")


