Hacker News | afro88's comments

What good does hating the cogs do though? Make noise to the people who can change the machine.

Not that I'm entirely onboard with it, but often you don't have a channel to communicate with "the people who can change the machine", only the cogs in the machine.

When you hate the machine as a whole, the cogs are also in scope.

It gives you satisfaction. That's the whole value and it can be worth a lot to not hold bitterness long after the problem has passed. I agree with your parent. The cogs are part of the machine, they don't deserve any sympathy just because they chose to do bad things for money any more than a robber deserves sympathy because he's poor.

> The cogs are part of the machine, they don't deserve any sympathy just because they chose to do bad things for money

That's a bit of a stretch, saying that someone who enforces the rules around disability benefits for a job is doing bad things for money. Those same rules filter out a lot of scammers who, if not stopped, would mean less money going to the people who need it.

It's also a low-skill, low-pay job, probably worked by a large percentage of people who are close to the poverty line themselves and just trying to make ends meet to support a family.


Depends on your goal. If you want a better machine maybe hating the cogs doesn't help.

If your goal is to not have a machine at all for some particular thing, then no one being willing to work a job that does that thing might be an effective way of stopping the machine from doing it.

Although inconveniencing bureaucrats handling disability benefits is probably a poor starting point no matter what your opinion is.


It increases costs for the machine, and eventually it realizes that cogs are cheaper when they're not getting yelled at all day.

Sounds like me with listening to AI covers. After a couple of weeks I couldn't care less. But I was so stoked on it at the start.

I vibe coded a SaaS and it went nowhere because it wasn't a good enough idea to begin with. I consulted multiple varied models along the way for competitive analysis, pricing structure, etc.

AI doesn't solve for ideas and product market fit. But it did allow me to fail pretty fast before I sunk too much time into it. But also, I should have spoken to potential users earlier rather than vibe coding.


Why wouldn't you just make some AI-generated user personas to talk to? Whatever their opinion is, it's already been captured in the training data. You don't need to talk to users.

Because AI is way too sycophantic still. The real audio engineers I spoke to said they run into this problem maybe once every few months and even then only when working for particular clients. Too much friction to purchase one offs, not enough pain for a monthly subscription.

https://claude.ai/share/be61468f-9f38-4dc0-aae3-5d758bf0f200


Hot take.

Possibly true, depending on the service you use and the software space being evaluated.


Good idea, and an improvement, but you still have the fundamental issue: you don't really know what code has been written. You don't know that the refactors are right, in alignment with existing patterns, etc.


I guess to reach this point you have already decided you don't care what the code looks like.

Something I'm starting to struggle with is: now that agents can do longer and more complex tasks, how do you review all the code?

Last week I did about 4 weeks of work over 2 days: first with long-running agents working against plans and checklists, then smaller task cleanups, bugfixes and refactors. But all this code needs to be reviewed by me and members of my team. How do we do this properly? It's something like 20k lines of changes over 30-40 commits. There's no proper solution to this problem yet.

One solution is to start from scratch again, using this branch as a reference, to reimplement in smaller PRs. I'm not sure this would actually save time overall though.


It sounds like you know this but what happened is that you didn't do 4 weeks of work over 2 days, you got started on 4 weeks of work over 2 days, and now you have to finish all 4 weeks worth of work and that might take an indeterminate amount of time.

If you find a big problem in commit #20 of #40, you'll have to potentially redo the last 20 commits, which is a pain.

You seem to be gated on your review bandwidth and what you probably want to do is apply backpressure - stop generating new AI code if the code you previously generated hasn't gone through review yet, or limit yourself to say 3 PRs in review at any given time. Otherwise you're just wasting tokens on code that might get thrown out. After all, babysitting the agents is probably not 'free' for you either, even if it's easier than writing code by hand.
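That backpressure rule can be sketched as a tiny WIP-limit gate. Everything here is illustrative (the function name, the limit of 3, and where the open-PR count comes from are all assumptions, not any real tooling): in practice the count would come from your forge, e.g. the GitHub CLI or API.

```python
# Sketch of a review-bandwidth gate: don't kick off new agent work while
# too many of your PRs are still awaiting human review. The open-PR
# count would come from your forge's API in practice; here it's a plain
# argument so the rule itself stays self-contained and testable.

MAX_PRS_IN_REVIEW = 3  # illustrative WIP limit; tune to your team


def can_start_agent_task(open_prs_in_review: int,
                         limit: int = MAX_PRS_IN_REVIEW) -> bool:
    """Return True only when review bandwidth has caught up."""
    return open_prs_in_review < limit
```

The point of the sketch is just that the check happens *before* generating more code, so tokens aren't spent on branches that may never get reviewed.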

Of course if all this agent work is helping you identify problems and test out various designs, it's still valuable even if you end up not merging the code. But it sounds like that might not be the case?

Ideally you're still better off, you've reduced the amount of time being spent on the 'writing the PR' phase even if the 'reviewing the PR' phase is still slow.


If you haven't reviewed the code yet, how can you say it did 4 weeks of work in 2 days? You haven't verified the correctness, and besides reviewing the code is part of the work.


That's what I was getting at. With the review and potential rework time, we could be looking at more than the original 4-week estimate. So what's the point in using long-running unsupervised agents if it ends up taking longer than doing it in small chunks?


> Last week I did about 4 weeks of work over 2 days: first with long-running agents working against plans and checklists, then smaller task cleanups, bugfixes and refactors. But all this code needs to be reviewed by me and members of my team. How do we do this properly? It's something like 20k lines of changes over 30-40 commits. There's no proper solution to this problem yet.

Get an LLM to generate a list of things to check based on those plans (and pad that out yourself with anything important to you that the LLM didn't add), then have the agents check the codebase file by file for those things and report any mismatches to you. As well as some general checks like "find anything that looks incorrect/fragile/very messy/too inefficient". If any issues come up, ask the agents to fix them, then continue repeating this process until no more significant issues are reported. You can do the same for unit tests, asking the agents to make sure there are tests covering all the important things.
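The repeat-until-clean loop described above can be mechanized roughly as follows. This is a sketch only: `run_agent_check` stands in for however you actually invoke an agent on a file with one checklist item, and the checklist is whatever you and the LLM produced.

```python
# Sketch of the checklist review loop: run every checklist item over
# every file, collect mismatches, and repeat until a pass comes back
# clean or we give up. `run_agent_check` is a stand-in for a real
# agent invocation, so the control flow itself is runnable.

from typing import Callable, List


def review_until_clean(
    files: List[str],
    checklist: List[str],
    run_agent_check: Callable[[str, str], List[str]],
    max_rounds: int = 5,
) -> List[str]:
    """Return [] once a full pass finds no issues, otherwise the
    issues still outstanding after max_rounds passes."""
    issues: List[str] = []
    for _ in range(max_rounds):
        issues = [
            issue
            for f in files
            for item in checklist
            for issue in run_agent_check(f, item)
        ]
        if not issues:
            return []
        # In the real workflow you'd hand `issues` back to the agent
        # to fix here, then let the next pass re-verify.
    return issues
```

The `max_rounds` cap matters: without it, an agent that keeps "fixing" one issue by introducing another loops forever.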


The proper solution is to treat agent-generated code like assembly, i.e. don't review it. Agents are the compiler for your inputs (prompts, context, etc.). If you care about code quality, you should have people writing it with AI help, not the other way around.


> how do you review all the code?

Code review is a skill, as is reading code. You're going to quickly learn to master it.

> It's like 20k of line changes over 30-40 commits.

You run it in a debugger and step through every single line along your "happy paths". You're building a mental model of execution while you watch it work.
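For illustration only (this is not the commenter's workflow, just a stdlib-based approximation of it): Python's `trace` module prints every line as it executes, which gives a bulk version of "stepping through the happy path", and `pdb` gives the interactive version.

```python
# Non-interactive cousin of step-debugging: trace.Trace(trace=1)
# echoes each executed line to stdout while the function runs, so you
# can read the actual execution path of a happy-path call.
import trace


def happy_path() -> int:
    # Stand-in for the code path you'd actually step through.
    total = 0
    for n in (1, 2, 3):
        total += n
    return total


tracer = trace.Trace(trace=1, count=0)
result = tracer.runfunc(happy_path)  # prints each line as it runs
```

For real interactive stepping you'd drop a `breakpoint()` at the entry to the path instead and walk it with `pdb`'s `n`/`s` commands.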

> One solution is to start from scratch again, using this branch as a reference, to reimplement in smaller PRs. I'm not sure this would actually save time overall though.

Not going to be a time saver, but next time you want to take nibbles and bites, and then merge the branches in (with the history). The hard lesson here is around task decomposition, inline documentation (cross-referenced) and digestible chunks.

But if you get step debugging running and do the hard thing of reading through the code, you will come out the other end of the (painful) process stronger and better resourced for the future.


Oh I didn't mean literally how do I review code. I meant, if an agent can write a lot of code to achieve a large task that seemingly works (from manual testing), what's the point if we haven't really solved code review? There's still that bottleneck no matter how fast you can get working code down.


I've been thinking a lot about this!

Redoing the work as smaller PRs might help with readability, but then you get the opposite problem: it becomes hard to hold all the PRs in your head at once and keep track of the overall purpose of the change (at least for me).

IMO the real solution is figuring out which subset of changes actually needs human review and focusing attention there. And even then, not necessarily through diffs. For larger agent-generated changes, more useful review artifacts may be things like design decisions or risky areas that were changed.


> Something I'm starting to struggle with is when agents can now do longer and more complex tasks, how do you review all the code?

Same as before. Small PRs, accept that you won't ship a month of code in two days. Pair program with someone else so the review is just a formality.

The value of the review is _also_ for someone else to check if you have built the right thing, not just a thing the right way, which is exponentially harder as you add code.


You’re not alone. I went from being a mediocre security engineer to a full time reviewer of LLM code reviews last week. I just read reports and report on incomplete code all day. Sometimes things get humorously worse from review to review. I take breaks by typing out the PoCs the LLMs spell out for me…


I'm a security engineer too, and if it ever gets to the point where I only review LLM code, I'll refuse to do it for less than double my hourly rate.


So you have become a reviewer instead of a programmer? Is that so? Honest question. And if so, what is the advantage of looking at code for 12 hours instead of coding for 12?


Build features faster. Granted, this exposes the difference between people who like to finish projects and people who like to get paid a lot of money for typing on a keyboard.


Bullshit! Your project isn't finished as long as there are obvious major bugs that you can't fix because you don't understand the code.


Why does understanding computer science principles and software architecture, and instructing a person or an AI on how to fix things, require typing every line yourself?


Yeah, honestly that's what I am struggling with too, and I don't have a good solution. However, I do think we are going to see more of this, so it will be interesting to see how we handle it.

I think we will need some kind of automated verification so humans are only reviewing the "intent" of the change. I started building a Claude skill for this (https://github.com/opslane/verify).


It's a nice idea, but how do you know the agent is aligned with what it thinks the intent is?


Or more so: what happens at compact boundaries, where the agent completely forgets the intent?


Will you get rid of him? It sounds like he's wasting a lot of your time


Or... is apical_dendrite just circling the wagons, scared of AI taking his job?

/management thoughts


I wouldn't have picked this article as AI until I got an agent to do some writing for me and read a bunch of it to figure out if I can stand behind it. Now I see the tells everywhere "It's not this. It's that." is particularly common and I can't unsee it. (FWIW I rewrote most of the writing it generated, but it did help me figure out my structure and narrative)

The problem, I think, with AI-generated posts is that you feel like you can't trust the content once you know it's AI. It could be partly hallucinated or misrepresented.


Yeah, but "it's not X. It's Y" is a common idiom that LLMs picked up from people. That's the point i was making. And it's starting to feel like every post has at least one comment claiming that it was AI generated.


This is exactly right IMO. I have never worked for a company where the bottleneck was "we've run out of things to do". That said, plenty of companies run out of actual software engineering work when their product isn't competitive. But it usually isn't competitive because they haven't been able to move fast enough


I think it depends on:

A) How old the product is: Twitter during its first 5 years probably had more work to do compared to Twitter after 15 years. I suspect that is why they were able to get rid of so many developers.

B) The industry: many B2C / ecommerce businesses are straightforward and don't have an endless need for new features. This is different from deeper tech companies.


There’s a third one, and it’s non-tech companies or companies for whom software is not a core product. They only make in-house tooling, ERP extensions, etc. Similar to your Twitter example, once the ERP or whatever is “done” there’s not much more work to do outside of updating for tax & legal changes, or if the business launches new products, opens a new location, etc.

I’ve built several of such tools where I work. We don’t even have a dev team, it’s just IT Ops, and all of what I’ve built is effectively “done” software unless the business changes.

I suspect there’s a lot of that out there in the world.


Not moving fast enough, sure. But in what direction? The direction, and clarity about it, is the hardest part.


Nice setup, but GP said:

> how people who really try to learn with these tools work

This setup is potentially effective sure, but you're not learning in the sense that GP meant.

For GP: Personally I've reached the conclusion that it's better for my career to use agents effectively and operate at this new level of abstraction, with final code review by me and then my team as normal.


> This setup is potentially effective sure, but you're not learning in the sense that GP meant.

Then GP didn't mean anything useful. I've learned how to build those setups. I learn to build by orchestrating groups of agents, and I get to spend far more of my time focusing on architecture, rather than minutiae that are increasingly irrelevant.


Sometimes people are actually holding it wrong though

