So many messages about how Codex is better than Claude from one day to the next, while my experience is exactly the same. Is OpenAI botting the thread? I can't believe this is genuine content.
Not a bot; the voiced frustration is real here. I kind of depend on good LLMs now and wouldn't even mind if they had frozen the LLMs' capabilities around Dec 2025 forever, and I would happily continue to pay, even more. But when the very same workload that was fine for months suddenly isn't possible anymore with the very same LLM, out of nowhere, and gets increasingly worse, it's a huge disappointment. Having had Codex in parallel as a backup since forever, I started using it again with GPT 5.4, and it just rips, without the diva sensitivity or the overfitting to the latest prompt that Opus/Sonnet is doing. GPT just does the job, maybe thinks a bit long, but even over several rounds of chat compression in the same chat for days it stays well within the initial set of instructions and guardrails I spelled out, without me having to remind it every time. Just works, quietly, and gets there. Opus doesn't even get there anymore without me nearly spelling out the manual steps by hand, or what not to do.
It's a combination of factors. Anthropic implemented rate-limiting where the 5hr usage limit would be burned through faster at peak hours. I was personally bitten by this multiple times before one guy from Anthropic announced it publicly via Twitter. Terrible communication. It wasn't small either: ~15 minutes of work ended up burning the entire 5hr limit. That annoyed me enough to switch to Codex for the month at that point.
Now people are saying the model response quality went down. I can't vouch for that since I wasn't using Claude Code, but I don't think this many people saying the same thing is total noise.
Yeah, my personal anecdata is that Claude has just gotten better and better since January. I haven’t felt like even making the minor effort to compare with Codex’s current state. Just yesterday Claude Code made a major visible improvement in planning/executing — maybe it switched to 4.7 without me noticing? (Task: various internal Go services and Preact frontends.)
I'm an Opus stan but I'll also admit that 5.4 has gotten a lot better, especially at finding and fixing bugs. Codex doesn't seem to do as good a job at one shotting tasks from scratch.
I suppose if you are okay with a mediocre initial output that you spend more time getting into shape, Codex is comparable. I haven't exhaustively compared though.
Yes, GPT 5.4 is better at finding bugs in traditional code. This has been easy to verify since its release. It's also worse at everything else, in particular at using anything recent, or at not overengineering. Opus is much better at picking the right tool for the job in any non-debugging situation, which is what matters most as it has long-term consequences. It also isn't stuck in early 2024. "Docs MCPs" don't make up for knowledge in weights.
I agree. You're preaching to the choir. But I can also appreciate that there's plenty of tasks and use cases where being stuck in 2024 is still incredibly modern, and debugging is a much more valuable skill than picking the right tool for the job.
Sorry, no, not a bot. I get way better results out of Codex.
It's just ultimately subjective, and, it's like, your opinion, man. Calling people bots who disagree is probably not a good look.
I don't like OpenAI the company, but their model and coding tool are pretty damn good. And I was an early Claude Code booster and go back and forth constantly to try both.
Looks to me like a mob of humans, angry they've been deceived by ambiguous communications, product nerfing, surprisingly low usage limits, and an appallingly sycophantic, overconfident coding agent.
4.7 hasn't been out for an hour yet and we already have people shilling for Codex in the comments. I don't know how anyone could form a genuine disagreement in this period of time.
Nobody I've seen in the comments is basing it on 4.7 performance. They're basing it on how unpleasant March and early April were on the Claude Code coding plans with 4.6. Which, from my experience, they were.
I'm interested in seeing how 4.7 performs. But I'm also unwilling to pony up cash for a month to do so. And frankly dissatisfied with their customer service and with the actual TUI tool itself.
It's not team sports, my friend. You don't have to pick a side. These guys are taking a lot of money from us. Far more than I've ever spent on any other development tooling.
I have not seen any comment from the early tests of 4.7 claiming that it does not work better than the previous version.
However, there have been some valuable warnings about problems that have been hit in the first minutes after switching to 4.7.
For instance, the new guardrails can block work on projects where the previous version could be used without problems, and if you are not careful, the changed default settings can make you reach the subscription limits much faster than with the previous version.