Hacker News | spl757's comments

I agree that all of that is true. And exceedingly fucking stupid.

What if the hammer has a problem where the handle breaks off randomly? Same thing happens with AI. Sometimes it breaks, randomly, and without any way of predicting it.

I think you have missed the analogy.

Another way to look at it is that the operator of the hammer has an immediate feedback loop and will not continue with a broken hammer. AI as it stands rarely has that feedback on the consequences of its decisions, and lacks the ability to react appropriately.


The problem that I see is that no one and no company seems to be making that distinction.

Lots and lots of companies are making that distinction. But try writing a post here saying "our productivity is through the roof and our systems have never been more stable since we started using AI" and see what happens. As it always goes in this day and age, bubbles and echo chambers... so it's easier to just go about your day doing amazing shit at an amazing pace than to "argue" about the merits of a technology. Every post I see here suggesting positive results gets downvoted faster than anything else.

AI is an umbrella term. All AI models can hallucinate, and there has been no solution to this problem. Until that problem is resolved, it is, in my opinion, something that only an idiot would run in production. I read about a company whose whole codebase was wiped out because they gave an agent the access to do exactly that.

You're not really making a good point if you can't distinguish.

Fin.ai by Intercom is a full product powered by LLMs and makes $100M ARR.

Make your point and save the attacks and maybe you can be helped.


No, it's not. The problem is that all AI hallucinates. Therefore, it is guaranteed to be confidently wrong. Until the problem of hallucinations is solved, anyone using AI in a production environment is an idiot, which is, of course, my personal opinion. But it seems pretty cut and dried to me.

Your original post (and, I think, even this comment) was vague, in that AI can be used in a lot of different ways in 'production': to generate code, to manage deployments and scripts, or as part of a feature that uses inference.

For example, if you're writing code with AI, you can still review it just like you would if a colleague wrote it. You can write tests (or have the AI do so) to prevent some hallucinations, too.
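As a minimal illustration of that review step (the helper and test here are hypothetical examples, not anything from the thread): a human-written test pins the expected contract, so a regenerated function that hallucinates a different behavior fails loudly instead of shipping.

```python
from datetime import date

# Suppose the AI generated this small helper.
def parse_iso(s: str) -> date:
    year, month, day = (int(part) for part in s.split("-"))
    return date(year, month, day)

# A reviewer-written test pins the contract; if a later regeneration
# hallucinates a different date format, this fails in CI instead of
# reaching production.
def test_parse_iso():
    assert parse_iso("2024-05-09") == date(2024, 5, 9)

test_parse_iso()
```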


Yes, AIs that hallucinate can all be used in different ways. But they can still all hallucinate, so I fail to see how what you are saying mitigates the fundamental, as-yet-unsolved problem of AI hallucinations.

edit to say, what is the point, after all, of artificial intelligence if it's not used to make decisions? That's what it does. But ALL AI HALLUCINATES. Therefore, it's unreliable.


Tons of people, apparently, aren't enough. I guess I'm just tired of seeing post after post on HN about people complaining that their use of AI in production isn't reliable.

It makes me want to pull out the hair I used to have and scream into the wilderness and eat a Twinkie.


Why do we find the unreliability and resulting hallucinations acceptable for AI in production? Can you imagine if Postgres, Apache, Nginx, or hell, even the Linux kernel were allowed to be used in production if they occasionally went insane?

You can use the same logic for most humans yet they are in production since birth :)

Well, but agents today are pretty much like Fitzcarraldo...

I don't think that is an apt comparison.

No one gets a newborn to configure nginx

Because executives have been misinformed about its reliability, or lack thereof, and are stupid? Just a guess.

e: typo


A bust that the poor and middle-class will foot the bill for.

Why change a working process?

How did you solve the problem of hallucinations?

The duct-tape framing is fair, but the deeper issue is that the model has no persistent understanding of the system it's working in. Each generation starts from scratch, with no memory of prior context or architectural decisions. That's a harder problem than prompt engineering, but it's solvable at the infrastructure layer.

I've built a list of common gotchas into the generation prompts. Also, if the compilation fails, it falls back to Opus with the error message and the code, and can try again twice.
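That retry-with-fallback loop can be sketched roughly like this (a sketch only; `generate`, `compile_check`, and `fallback_generate` are hypothetical callables standing in for the primary model, a compiler invocation, and the Opus call respectively):

```python
def generate_with_fallback(prompt, generate, compile_check, fallback_generate,
                           max_retries=2):
    """Generate code; on a failed compile, hand the error message and the
    broken code to a stronger fallback model and retry (up to max_retries)."""
    code = generate(prompt)
    ok, error = compile_check(code)
    retries = 0
    while not ok and retries < max_retries:
        # Feed the compiler error plus the broken code back to the fallback.
        code = fallback_generate(prompt, code, error)
        ok, error = compile_check(code)
        retries += 1
    return code if ok else None

# Demo with stub callables standing in for the real model and compiler.
def primary(prompt):
    return "bad code"

def check(code):
    return (True, None) if code == "fixed code" else (False, "syntax error")

def fallback(prompt, code, error):
    return "fixed code"

result = generate_with_fallback("write a parser", primary, check, fallback)
```

In this stubbed demo the fallback repairs the code on the first retry, so `result` is the fixed code; if every retry still fails to compile, the function returns `None` so the caller can surface the failure instead of shipping broken output.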

That's not a solution to the root problem. That is duct tape.

This is built with the paradigm of "don't build for the model of today, build for the model of six months from now." It currently works, which still amazes me, but it will get much better!
