Someone actually mathed out infinite monkeys at infinite typewriters, and it turns out, it is a great example of how misleading probabilities are when dealing with infinity:
"Even if every proton in the observable universe (which is estimated at roughly 1080) were a monkey with a typewriter, typing from the Big Bang until the end of the universe (when protons might no longer exist), they would still need a far greater amount of time – more than three hundred and sixty thousand orders of magnitude longer – to have even a 1 in 10500 chance of success. To put it another way, for a one in a trillion chance of success, there would need to be 10^360,641 observable universes made of protonic monkeys."
Often, things that have probability 1 in theory given infinite time are, in practice, safe to treat as probability 0.
So no. LLMs are not brute force dummies. We are seeing increasingly emergent behavior in frontier models.
> It is unsurprising that an LLM performs better than random! That's the whole point. It does not imply emergence.
By definition, it is emergent behavior when it exhibits the ability to synthesize solutions to problems that it wasn't trained on. I.e. it can handle generalization.
Emergent behavior would imply that some other function was being reduced to token prediction. Behaving "better than random", i.e. not just brute forcing, would not qualify; token prediction is not brute forcing, and we expect it to do better because it's trained to do so.
If you want to demonstrate an emergent behavior you're going to need to show that.
I'm all for skeptical inquiry, but "burning all credibility" is an overreaction. We are definitely seeing very unexpected levels of performance in frontier models.
Third things can exist. In other words, you’re implying a false dichotomy between “human computation” and “computer computation” and implying that LLMs must be one or the other. A pithy gotcha comment, no doubt.
Edit: the implication comes from demanding that the OP’s definition be rigorous enough to cover all models of “computation”, and then concluding that, because it fails to do so, LLMs must be more like humans than computers.
I think you are vastly underestimating the emergent behaviours in frontier foundational models and should never say never.
Remember, the basis of these models is unsupervised training, which, at sufficient scale, gives them the ability to detect pattern anomalies out of context.
For example, LLMs have struggled with generalized abstract problem solving, such as "mystery blocks world" that classical AI planners dating back 20+ years or more are better at solving. Well, that's rapidly changing: https://arxiv.org/html/2511.09378v1
No idea how underestimated things are, but marketing terms like "frontier foundational models" don't help foster trust in a domain this hyper-hyped.
That is, even if there are cool things that LLMs now make more affordable, the level of bullshit marketing attached to them is also very high, which makes it far harder to build a noise filter.
there should be only 3 regular meetings in an agile engineering team
- weekly iteration planning (1-2 hours max)
- daily standup (15 mins max)
- weekly demo & retro (1-2 hours max)
literally everything else is work off the kanban board or backlog.
in my teams everyone was told to decline all meetings unless it explicitly led to the completion of a weekly planned story/task. this way all meetings for the team have a clear agenda and end in mind.
for mandatory external meetings & running interference with external parties, there are ways to insulate the majority of the team from that.
Is that three kinds of regular meetings? Because I count 8 meetings (and four kinds, as I don't think I've ever had demo and retro combined due to different groups of people being in both).
Not correcting, just clarifying for myself. I sure wish I had such a controlled environment with only 15% of time in coordination and where standup actually was 15 mins and not a segue into the everything meeting.
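A rough check of that figure, assuming a 40-hour week and the max durations listed above (illustrative numbers, not anyone's actual calendar):

HOURS_PER_WEEK = 40
planning = 2            # weekly iteration planning, upper bound
standups = 5 * 0.25     # five daily standups at 15 minutes each
demo_retro = 2          # weekly demo & retro, upper bound

meeting_hours = planning + standups + demo_retro            # 5.25 hours
print(f"{meeting_hours / HOURS_PER_WEEK:.0%} of the week")  # ~13%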
that's really what agile was supposed to be. at least in the places where I saw it was successful.
every week, something is delivered, and is demoable, with approved tests from the business. That thing represents the most important thing to the business relative to the risk prioritization from engineering & usability prioritization from design.
every week, priorities can adjust, etc. and the cycle continues. hitting the actual 'release date' becomes much more knowable when you see the tangible date-driven progress on a regular cadence.
Yes, but expanded to the full deadline instead of only the short iterations.
The business does not care about week long deadlines. They need something on May 23 so they can achieve _______.
My understanding of Scrum (not representative of all agile, I know) is that the velocity is supposed to be tracked and used for better predictions. In my experience this takes a very dedicated core of people who are intent on making it happen. In other words, usually it doesn't happen.
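The projection itself is simple arithmetic once velocity is actually tracked; a minimal sketch, assuming weekly iterations and made-up point counts:

import math
from datetime import date, timedelta

completed_per_sprint = [21, 18, 25, 19]   # story points closed in recent sprints (made-up numbers)
velocity = sum(completed_per_sprint) / len(completed_per_sprint)   # about 20.8 points per sprint

remaining_points = 130                    # hypothetical estimate of what's left for the release
sprints_left = math.ceil(remaining_points / velocity)              # 7 sprints at this pace

forecast = date.today() + sprints_left * timedelta(weeks=1)        # assuming one-week iterations
print(f"~{sprints_left} sprints left, projected finish {forecast}")

The arithmetic is the easy part; the dedicated core of people is what it takes to keep the point tracking consistent enough for the average to mean anything.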
But date-bound delivery is already our default mode of operation. We just don't like to admit it. We are going to deliver something on this date; we just don't know what, yet.
However the point of the weekly cadence is that the business does care about adjusting scope and priority towards hitting that deadline on May 23, so that they know what they're going to get on May 23 and have the power to adjust it.
Especially if the goal of what is delivered on that date is not clearly defined. It almost never is.
Most projects can be summed up as "give me $X, I'll come back in 6 months, and ask for more time and money", or "here you go"... "that's not what I wanted".
Knowing every week whether you're still getting what you wanted is a key risk mitigation on the way to a hard date.
Velocity is overblown as a metric. It's one metric among many that can signal a few things (e.g. quality problems because bug fixes are overtaking features) but isn't as much of a lever as some say.
Yep I agree. Iterations are still good, demos are still good, ever-evolving scope discussions are still good, regardless of the overarching methodology.
IMO (also 30 years in the biz), it's rarely the date; that's #2. It's the budget.
They'll forgive you if you're slightly late, they'll hate you forever if you ask for more money.
Agile works really well if you have a good product owner that has secured appropriate budget for the level of uncertainty in the endeavor & can make decisions and not be overridden by extrinsic forces. Everything else is negotiable.
To me, the _real_ thing that matters isn't quite date or budget, but something that somehow acts as an umbrella to both of them: the promise. When you promise to deliver something by a given day, or within a budget, it's very clear whether you met your promise or didn't. However, when it comes to functionalities, there is more of a grey area: you can start to argue that something _mostly_ works, that some bugs are always inherent, or that this functionality actually is not really needed because the problem can be fixed in an operational way, or that the requirements have changed, or that it was just a nice-to-have... but money/time don't have these grey areas.
That was literally the first thing I thought reading from the OP's comment down to your parent.
Then I thought: Sure but management made the devs promise these things. We don't do it of our own volition (exceptions prove the rule - some people are conditioned to do it of course).
That might be true in certain kinds of companies. I never worked in consulting, and I've always been so far down the totem pole that nobody has ever expected me to adhere to a monetary budget. I suppose I am extremely lucky, but at all places I've worked, I had no idea how much our software cost to build or how much revenue it brought in. If we needed a software license or development systems, or a specialized piece of hardware, we just requested it and it materialized. Often I didn't even know the per-customer unit price of the software. The only constraint that ever made it down through the huge tree of managers was the due date. Someone five managers above me was probably sweating the budget but to us low level developers, budget was never even a concept.
It varies on company culture and business model. Your situation sounds like R&D shops and how they often manage things.
R&D usually is budget constrained at the company or division level (% of revenue) and you can only ask for it once a year. Next year's budget time determines if you get more or less. Time constraints come indirectly (proof of progress for budget expansion or, more importantly, declining revenue from existing products).
But the only way management knows how to hold R&D accountable to ship is with dates as a forcing function, and those dates are often invented or organized around industry events (conferences, press events, etc).
There are other ways to manage progress, but dates are the most common lever. That can work but can be abused by bad management. I've usually preferred shops that say "it ships when it's ready", but they require special circumstances to maintain funding and measure progress. In general, if what you build is more important than when it ships, "it ships when it's ready" is better than hitting a date with a dud. So long as there's value for the budget and a way to measure it.
Hacker News discusses "deadlines" as one of many management strategies. How much depth is there really? Other industries use bonuses as a simple management strategy. The kinds of people writing blog posts like this do terribly boring work, which is the real problem.
I've been programming for 20 years now, and I think what many people get wrong about these estimates is that they give them too early. The truth is that for many projects the only truthful answer you can give to the question of how long it will take is: "That depends on many things, some of which I don't know, some of which we both don't know, and some of which potentially nobody knows." After that you should say: "In my experience it takes between x and y weeks, with a lot also depending on how responsive your side is."
Time estimates are always hard, not only in programming. And outside of programming, one of the main uncertainties is customers changing the plan or wanting adjustments. This is the side you can't really control, so it is best to get a feeling for the customer, their communication patterns and their expectations early on and factor that in. The other uncertainty is tough problems you encounter during the programming phase. How well you can deal with those depends a lot on how experienced your programmers are and how much they were involved in the initial process.
The truth is that the latter uncertainties make up a large part of the whole thing, and it has to be okay to tell a customer you can't give them an estimate before you know some more details.
That's never been true; at minimum, it's missing the most important variable: the number of people. 1 person working for a year vs. 12 people working for a month could cost the same and have dramatically different results. And it ignores so many other aspects of effective use of productive capital in software dev (toolchains, cloud, AI, etc.)