Hacker News | new | past | comments | ask | show | jobs | submit | nyeah's comments

Rage farming with no scientific interest. Sad to see this upvoted to front page.

There is an interesting question - how can we prove paternity or other DNA based questions with identical twins (full sequencing looking for mutations?) and if we can't, how do we handle legal responsibilities in this sort of case?

There's a lot of good material to discuss here.


No, there isn't, but I appreciate your amusing stupidity. This is a good example of the state of exception that most people with common sense intuitively understand.

Can you give a few penciled numbers?

You can rent an H100 GPU for $4/hour. [1]

At 85 tokens/second, that's roughly 300k tokens in that hour.

OpenAI charges about $6 for that many tokens.

Those are pessimistic assumptions.

[1] https://lambda.ai/instances
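Penciled out, those assumptions look like this (a minimal sketch using only the numbers above: $4/hour rental, 85 tokens/second, ~$6 of API revenue for that hour's tokens):

```python
# Back-of-the-envelope break-even math for a rented H100,
# using the (pessimistic) figures from the comment above.

rental_cost_per_hour = 4.00   # $/hour, on-demand H100 rental
tokens_per_second = 85        # assumed sustained throughput
revenue_per_hour = 6.00       # assumed API revenue for that hour's tokens

tokens_per_hour = tokens_per_second * 3600
margin_per_hour = revenue_per_hour - rental_cost_per_hour
breakeven_utilization = rental_cost_per_hour / revenue_per_hour

print(f"tokens/hour: {tokens_per_hour:,}")                 # 306,000
print(f"margin at full load: ${margin_per_hour:.2f}/hour")  # $2.00/hour
print(f"break-even utilization: {breakeven_utilization:.0%}")  # 67%, i.e. ~16h/day
```

The last line is where the "16 hours per day" figure downthread comes from: $4 of cost against $6 of potential revenue means you need roughly two-thirds utilization just to break even.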


Can you keep that GPU 100% saturated at least 16 hours per day every day of the week?

If not, you aren't breaking even.


Note this is also assuming you

(1) Rent your GPUs.

(2) Pay list price, no volume breaks.

(3) Get only 85 tokens/sec. Realistically, frontier models would attain 200+ tokens/second amortized.

Inference is extremely profitable at scale.


Assuming an 80GB H100 running an MoE model close to the size of the 80GB of VRAM, you're going to see around 10k tokens/second fully batched and saturated. An example here might be Mixtral 8x7B.

You're generating about 36 million tokens/hour. Mixtral 8x7B on OpenRouter costs $0.54/M input tokens and $0.54/M output tokens.

You're looking at potentially $38.88/hour of return on that H100 GPU. This is probably the best-case scenario.
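As a sketch, here is how the $38.88 figure falls out of those numbers (note the assumption, implicit in the comment, that each million tokens is counted at both the input and the output rate, $0.54 + $0.54):

```python
# Best-case revenue math for a saturated H100 serving Mixtral 8x7B,
# per the figures in the comment above.
# Assumption: each million tokens is billed at input + output rate combined.

tokens_per_second = 10_000                 # fully batched and saturated
tokens_per_hour = tokens_per_second * 3600  # 36,000,000
price_in = 0.54    # $ per million input tokens (OpenRouter, Mixtral 8x7B)
price_out = 0.54   # $ per million output tokens

revenue_per_hour = tokens_per_hour / 1e6 * (price_in + price_out)
print(f"{tokens_per_hour:,} tokens/hour -> ${revenue_per_hour:.2f}/hour")
```

If only generated tokens were billed (at the output rate alone), the same throughput would earn about half that, $19.44/hour, which is why this is a best case.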

In reality, inference providers will use multiple GPUs together to run bigger, smarter models for a higher price.


$3.99/hour at 8x instances, with a minimum two-week commitment. Good luck averaging 70% utilization during that time. Useful when you're running a training round and can properly gauge demand; not so great when you're offering an API.

Is it not a good penciled number? It helps set the directional tone that inference costs are being covered.

It says the numbers are theoretically possible. Requiring 66% utilization to break even, when 100% utilization will piss off customers by forcing a queue, means it's a balancing act.

“Technically correct. The best kind of correct.” So inference may technically be _capable_ of being profitable, but I have questions about it being profitable in _practice_.


The article is professional.

Soooo many comments here cite the "overreacting" point and then go on to prove it.

Another pack of evil volunteers trying to give us free stuff. Vote with your wallet. Stop paying your $0 per month until these crooks feel the pain.

One important thing is whether the tutoring is making better students, or just gaming the test.

And after graduation they can grind leetcode, and after that they can practice social cues to get in the management class. It's gamed tests all the way down.

For people who choose that career path. Still, somewhere somebody is doing some work.

The uggos I guess

Are those independent?

That's tricky. I think it depends on what kind of gaming and what kind of test.

IB may become important for US college admissions over time, but that's more aspirational so far.

True, I only listed it because, at least where I live, high schools often do one program or the other. If it's an IB school, you end up taking the APs on your own (i.e., there isn't a class focused on that content, though the IB curriculum should, in theory, end up covering the same material, at least for the major subjects).

That kind of thinking pops up very prominently in the article.


"Why can't faster typing help us understand the problem faster?"

Because typing is not the same as understanding.


The typing referred to here is not "the typing part of coding" (fingers touching the keyboard); it's the whole of coding (the LLM is not a typing aid, it's a coding aid).

And coding faster CAN help us understand the problem faster. Coding faster means iterating, refactoring, trying different designs - and seeing what does and doesn't work, faster.


I think they explain "why" very clearly. They say the problem is people who don't understand their own contributions.

