Hacker News

> GPT-4 is crazy slow to use with API

Only somebody clueless about just how powerful it is when used correctly would say something like this. Not to mention GPT-4 Turbo is not "crazy slow" in any sense of the word.



I mean, if your expected use case is "call an API and get the full text back in under 200ms so a user interface doesn't have to make the user wait", then yeah, GPT-4 is crazy slow. Personally I'd prefer something more async: let me send a message on some platform and get back to me when you have a good answer, instead of making me sit watching words load one by one like I'm on a 9600 baud modem.
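The async pattern described above can be sketched in a few lines: submit the prompt to a background worker and collect the answer whenever it's ready, instead of blocking the UI on the response. `call_llm` here is a hypothetical stub standing in for the slow API call, not a real client:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a slow LLM API call; a real SDK
    # call would go here instead.
    time.sleep(0.1)  # simulate multi-second generation latency
    return f"answer to: {prompt}"

executor = ThreadPoolExecutor(max_workers=4)

def submit_prompt(prompt: str):
    # Fire-and-forget: returns a Future immediately; the caller
    # (or a notification hook) collects the answer when it's done.
    return executor.submit(call_llm, prompt)

future = submit_prompt("summarize this thread")
# ... do other work, or notify the user later ...
print(future.result())  # blocks only at the point you actually need the text
```

With a real client you'd swap `call_llm` for the API call and attach a done-callback (`future.add_done_callback`) that pings the user on whatever platform they sent the message from.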

Also it's a text generation algo, not a mob boss. "how powerful it is" foh


People expect to wait a few seconds when calling LLMs. Just make it obvious to users. Our GPT-4 powered app has several thousand paying users, and "slowness" is very rarely a complaint.

"instead of making me sit watching words load one by one"

Huh? This is completely up to you on how you implement your application. Streaming mode isn't even on by default.
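To make the point concrete: token-by-token display is something the application chooses. In the OpenAI SDK that choice is the `stream=True` flag on the completion call; the sketch below illustrates the two UX modes with a stubbed token source rather than a real API call (the stub and its token list are illustrative):

```python
import time
from typing import Iterator

def fake_token_stream() -> Iterator[str]:
    # Stubbed token stream; a real call would be something like
    # client.chat.completions.create(..., stream=True) in the OpenAI SDK.
    for tok in ["GPT-4 ", "responses ", "arrive ", "token ", "by ", "token."]:
        time.sleep(0.01)  # simulate per-token latency
        yield tok

def buffered(stream: Iterator[str]) -> str:
    # Non-streaming UX: wait for the whole completion, show it once.
    return "".join(stream)

def streaming(stream: Iterator[str]) -> None:
    # Streaming UX: render each chunk the moment it arrives.
    for tok in stream:
        print(tok, end="", flush=True)
    print()

print(buffered(fake_token_stream()))
```

Whether users see the "9600 baud modem" effect or a single delayed answer is entirely a choice between these two loops.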


2 years of development and you call me clueless. Try getting a response of 4,000 tokens.


I dunno, I get a response back for 100k tokens regularly. What is the point you are trying to make?


With which model are you getting 100k-token responses? The models are limited and can't produce output that long (4k max). The point I am trying to make is written three times in my previous messages: GPT-4 is too slow to be useful via the API.


As expected, you don't know anything about its API limits. The maximum output is 4,096 tokens with any GPT-4 model. I am getting tired of HN users BS'ing at any given opportunity.


1. Your original wording, "getting a response _for_ n tokens", does not parse as "getting a response containing n tokens" to me.

2. Clearly, _you_ don't know the API, as you can get output up to the total context length of any of the GPT-4 32k models. I've received output up to 16k tokens from gpt-4-32k-0613.

3. I am currently violating my own principle of avoiding correcting stupid people on the Internet, which is a Sisyphean task. At least make the best of what I am communicating to you here.
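The arithmetic behind point 2 is simple: on a fixed-context model, the output budget is whatever context the prompt leaves over. A back-of-the-envelope sketch, with assumed numbers matching the gpt-4-32k example above:

```python
# Illustrative token-budget math for a 32k-context model
# (the prompt size here is an assumed example, not a real measurement).
CONTEXT_WINDOW = 32_768   # total context of gpt-4-32k
prompt_tokens = 16_000    # tokens consumed by the prompt

# The completion can use whatever the prompt left over, so a ~16k-token
# reply is possible on a 32k model even if a base model caps lower.
max_output_tokens = CONTEXT_WINDOW - prompt_tokens
print(max_output_tokens)  # 16768
```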


You might want to see a specialist about your behavioral issues. Also, gpt-4-32k is not open to the public.


I've had access for many months now.


Skill issue.


You bullsh*t, saying "I dunno, I get a response back for 100k tokens regularly" about a model that doesn't even exist, then you talk about a non-public 32k API. Stop lying. It's just the internet; you don't need to lie to people. Get a life.



