Note: At the point of writing this, the comments are largely skeptical. Reading ...

CjHuber · 2026-01-05T12:14:54 1767615294

My theory is that agents will improve a lot once they are trained on more recent data. I've had agents show context anxiety because they still think an average LLM context window is around 32k tokens. Building agents with agents, letting them do prompt engineering etc., is also still very unsatisfactory: they keep talking about GPT-3.5 or Gemini 1.5 and try to optimize the prompts for those old models, which of course was almost a totally different thing. So I'm thinking that if that's how they think of themselves as well, maybe it artificially limits their agentic behavior too, because they just don't know how much more capable they are than GPT-3.5.

blks · 2026-01-05T16:28:31 1767630511

Because the “strengths” of a model are based not on inherent characteristics but on varying user perceptions. It feels like model A is doing something better, the same way it feels like your productivity is high.

nkko · 2026-01-05T15:57:24 1767628644

Strong point. I’m considering tagging patterns better and adding fields like “model/toolchain-specific” and “last validated (month/year)”. Things change fast; for example, “Context anxiety” is likely less relevant now and should be reframed that way (or retired).