If you are worried about agents diverging from user intent, why not log user messages in a file, and make it a point to review this file against plans and executed work? In my own harness nothing the user types gets lost. It might be the most valuable piece of documentation in the project - the raw message log. I only keep the user side, which is pretty thin, but it's enough to figure out what happened. Logging messages to a file is just a matter of adding a user message submit hook, and it costs nothing until used.
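For anyone who wants to try this, here is a rough sketch of what such a hook could look like. It assumes the harness pipes a JSON payload to the hook on stdin; the field names `session_id` and `prompt`, the file name `user_messages.jsonl`, and both function names are placeholders I made up for illustration, so check what your harness actually sends before relying on it.

```python
import json
import sys
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical log location; any append-only file works.
LOG_PATH = Path("user_messages.jsonl")


def log_user_message(payload: dict, log_path: Path = LOG_PATH) -> dict:
    """Append one user message to a JSONL log and return the stored record."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        # "session_id" and "prompt" are assumed field names, not a documented API.
        "session": payload.get("session_id", "unknown"),
        "message": payload.get("prompt", ""),
    }
    # Append mode: earlier messages are never rewritten or lost.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_log(log_path: Path = LOG_PATH) -> list[dict]:
    """Load every logged message, oldest first, for review against plans."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Assumes the harness pipes the hook payload as JSON on stdin.
    if not sys.stdin.isatty():
        raw = sys.stdin.read()
        if raw.strip():
            log_user_message(json.loads(raw))
```

One JSON object per line keeps the file greppable and easy to hand back to an agent later. Only the user side is stored, as described above.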
Codex and Claude Code store all this too. Lately I've started having each agent regularly read each other's chat transcripts as well as their own, including even the very same session I'm in. (With big contexts they increasingly forget a few things that they re-learn by just looking at the verbatim transcript.)
I don't think it's worth writing my own harness or switching to Pi and writing a plugin, but I definitely need to create some skills to automate much of this.
It is not worth switching to Pi except as a hobbyist.
Something that is overlooked: the mainstream harnesses have a huge advantage in telemetry and datapoints to use to improve the harness. They have internal teams building the tooling. They have tight integration built-in with their own backends (e.g. optimizing for caching).
Are you tinkering? Or trying to build something useful? If you're trying to build something useful, use a tool.
In this era of software when you can build almost anything you can imagine, why spend that time building plugins for a harness?
Pi has optimizations as well, and development is quite active.
We are literally months into this new frontier. Mainstream harnesses are not far off from a minimal + extensible open alternative.
You don’t have to build your own plugins, as you can simply install an existing plugin that does what the mainstream harnesses do. Folks are already building the same functionality, but with more control for the user.
If you are a builder, like many reading this thread, pi is the way to go. Pi already gives you the tools to leverage LLMs to assist with building plugins, if that’s the way you want to go.
That's like arguing that you should spend your time tuning your IDE. How does that relate to end-user value created?
Yes, you built yourself a nice little utility.
Meanwhile, you wasted those tokens and time that could have been spent building actual, useful software instead of hobby tinkering your harness.
It's like thinking your sneaker tread design is going to make the difference between you and someone who just goes out there and runs every day. The person that just runs is going to win the race every time while you 3D print the perfect tread design optimized for your running style...and don't actually run.
If you want to produce better results at running, you just run and optimize the externalities (gear) later. Same here: you have a magical software production factory and the only thing you want to use it for is your hobby tweaking of your perfect harness instead of...just making useful software.
Why would taking the more open, minimalist, configurable and ultimately diligent route mean you won't be working on anything else?? Not to mention that pi has other advantages over Claude and Codex, read up on it. Also, improvements to the agent itself will pay more dividends the earlier they are applied. The tone of this message is waaaay off.
> Why would taking the more open, minimalist, configurable and ultimately diligent route mean you won't be working on anything else??
You're using the same finite pool of time and tokens. Why waste your time with the perfect gear instead of focusing on just getting really good at running? Just go run and when you've pushed the limits and the gear becomes the difference, then optimize the gear to get to the next level.
While you're busy trying to optimize your harness, others are just building and shipping with the magical software factory.
What are these "others" shipping, slopware? Agents are not a "magical software factory", they are a tool with a lot of limitations, but which can speed up development in a sustainable way, when used wisely. And that includes configuring it in a way that complements the other tools in our toolkit.
Everyone's waking up to this simple truth: vibe coding like there's no tomorrow accumulates conceptual and technical debt at an unsustainable rate. Then when the "magical factory" gets mired in its own mess, it's back to the drawing board. This is also what the makers of pi have discovered, if you listen to their talks about how pi came about. I don't believe there is any justification for the assumptions you make about their approach, nor do I see you presenting any. As it is, your take just feels peevish and unfair, to be honest.
A story to share: friend vibe coded absolute slop with Replit starting late 2024 (!!). Absolute trash code. Hacked multiple times because his login code exposed the full user list on the FE (!!!). Hacker found a way to exploit his account confirmation email because it was all front-end and sent an email to every customer telling them he was hacked. One time called me up in a panic asking why his web page was randomly refreshing (turns out, he was serving it in dev mode via Vite with HMR). It was mistake after mistake after mistake.
But he started to get customers. First a handful, then a dozen, then enough to get legal threats from other vendors, and this year, his first "enterprise" deal providing software in a space that was long dominated by a duopoly of legacy providers.
Guess what he did? Just rewrote it with the latest models and hired one engineer to ensure agents followed better practices. It's a legit business now built by a tiny team using a magical software factory to produce absolute trash code, but in shipping it, he found a market and customers willing to pay him for an alternative to the duopoly.
See, at the end of the day, it's cute that you have the perfectly tuned harness, but that also means whatever time you spent tuning your harness, reading up on Pi, spending tokens on your custom plugins -- all of that time and resources could have been used just building something useful.
People use Replit to build websites too, and some of them might scratch enough of a need to make money this way. So what? Is this what I should be mightily impressed with? That some random dude vibe coded some slopware which he was able to convince some random others to pay him for? I'm personally more interested and impressed by brilliant technical achievements, even if less monetizable, than some hustle or another in some industry niche which only ever attracted the interest of two legacy players. This is Hacker News, not Hustler News after all.
> Something that is overlooked: the mainstream harnesses have a huge advantage in telemetry and datapoints to use to improve the harness. They have internal teams building the tooling. They have tight integration built-in with their own backends (e.g. optimizing for caching).
> Are you tinkering? Or trying to build something useful? If you're trying to build something useful, use a tool.
Do I want to become completely dependent on the pricy pay-as-you-go tool? In the long run that will make me powerless.
You'll be dependent on it whether or not you use the main harnesses. You pay for the model. The frontier models will likely always be better than the open source ones.
> Are you tinkering? Or trying to build something useful? If you're trying to build something useful, use a tool.
I don't think that you really get what this new era of software is about, otherwise you would understand why the experienced are spending time tinkering on the so-called harness (like openclaw did).
> It is not worth switching to Pi except as a hobbyist.
Permit me to paraphrase slightly. "It is not worth switching to Linux except as a hobbyist. Something that is overlooked: the mainstream OSs have a huge advantage ....".
You are in good company. In 1999, Bill Gates confidently dismissed Linux as a threat, arguing it lacked the central control, features, and graphical interface needed to compete in the commercial market.
Back to the article, quoting:
> Pi might be built with Pi, but we’re quite far off today from where Bun and OpenClaw already are: fully detached, automated software engineering.
Please don't call it software engineering. I've been programming for 40 years, and most of that time had to put up with the derision from the other engineering disciplines: "If civil engineering built things like software engineers, the first woodpecker that came along would destroy civilisation". It hurt because it was true. It's still often true for things like web pages, but for the things I use like Linux and vim, it hasn't been true for a long, long while. We have finally mastered how to repeatedly build solid, reliable software.
Which is why I'm an Anthropic refugee. Opus is definitely the best for coding, but claude-cli + bun is the most unreliable piece of crap I've had the misfortune to come across in a while. Sadly I can't afford their API pricing, so either my principles or Opus had to give. I went to pi and an open-source model. The difference between the top open-source models and Opus is noticeable, but not drastic, unlike the difference between pi and claude-cli.
pi has proved to be solid, fast, transparent in its design, and customisable in the old Linux way ("do one thing, and do it well"). I pray that will never change.