Hacker News

> GPT-4 is crazy slow to use with API

Only somebody clueless about just how powerful it is when used correctly would say something like this. Not to mention GPT-4 Turbo is not "crazy slow" in any sense of the word.



I mean, if your expected use case is "call an API and get the full text back in under 200ms so a user interface doesn't have to make the user wait", then yeah, GPT-4 is crazy slow. Personally I'd prefer something more async: let me send a message on some platform and get back to me when you have a good answer, instead of making me sit watching words load one by one like I'm on a 9600 baud modem.
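The async pattern described above can be sketched in a few lines: submit the prompt to a background worker and collect the answer whenever it's ready, instead of blocking the UI on the response. `call_llm` here is a hypothetical stub standing in for the slow API call, not a real client:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a slow LLM API call; a real SDK
    # call would go here instead.
    time.sleep(0.1)  # simulate multi-second generation latency
    return f"answer to: {prompt}"

executor = ThreadPoolExecutor(max_workers=4)

def submit_prompt(prompt: str):
    # Fire-and-forget: returns a Future immediately; the caller
    # (or a notification hook) collects the answer when it's done.
    return executor.submit(call_llm, prompt)

future = submit_prompt("summarize this thread")
# ... do other work, or notify the user later ...
print(future.result())  # blocks only at the point you actually need the text
```

With a real client you'd swap `call_llm` for the API call and attach a done-callback (`future.add_done_callback`) that pings the user on whatever platform they sent the message from.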

Also it's a text generation algo, not a mob boss. "how powerful it is" foh


People expect to wait a few seconds when calling LLMs. Just make it obvious to users. Our GPT-4 powered app has several thousand paying users, and "slowness" is very rarely a complaint.

"instead of making me sit watching words load one by one"

Huh? This is completely up to you on how you implement your application. Streaming mode isn't even on by default.
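To make the point concrete: token-by-token display is something the application chooses. In the OpenAI SDK that choice is the `stream=True` flag on the completion call; the sketch below illustrates the two UX modes with a stubbed token source rather than a real API call (the stub and its token list are illustrative):

```python
import time
from typing import Iterator

def fake_token_stream() -> Iterator[str]:
    # Stubbed token stream; a real call would be something like
    # client.chat.completions.create(..., stream=True) in the OpenAI SDK.
    for tok in ["GPT-4 ", "responses ", "arrive ", "token ", "by ", "token."]:
        time.sleep(0.01)  # simulate per-token latency
        yield tok

def buffered(stream: Iterator[str]) -> str:
    # Non-streaming UX: wait for the whole completion, show it once.
    return "".join(stream)

def streaming(stream: Iterator[str]) -> None:
    # Streaming UX: render each chunk the moment it arrives.
    for tok in stream:
        print(tok, end="", flush=True)
    print()

print(buffered(fake_token_stream()))
```

Whether users see the "9600 baud modem" effect or a single delayed answer is entirely a choice between these two loops.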


2 years of development and you call me clueless. Try getting a response of 4,000 tokens.


I dunno, I get a response back for 100k tokens regularly. What is the point you are trying to make?


With which model are you getting 100k-token responses? The models are limited and can't produce output that long (4k max). The point I am trying to make is written three times in my previous messages: GPT-4 is too slow to be useful via the API.


As expected, you don't know anything about its API limits. The maximum output is 4,096 tokens with any GPT-4 model. I am getting tired of HN users BS'ing at any given opportunity.


1. Your original wording, "getting a response _for_ n tokens", does not parse as "getting a response containing n tokens" to me.

2. Clearly, _you_ don't know the API, as you can get output up to the total context length of any of the GPT-4 32k models. I've received output up to 16k tokens from gpt-4-32k-0613.

3. I am currently violating my own principle of avoiding correcting stupid people on the Internet, which is a Sisyphean task. At least make the best of what I am communicating to you here.
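The arithmetic behind point 2 is simple: on a fixed-context model, the output budget is whatever context the prompt leaves over. A back-of-the-envelope sketch, with assumed numbers matching the gpt-4-32k example above:

```python
# Illustrative token-budget math for a 32k-context model
# (the prompt size here is an assumed example, not a real measurement).
CONTEXT_WINDOW = 32_768   # total context of gpt-4-32k
prompt_tokens = 16_000    # tokens consumed by the prompt

# The completion can use whatever the prompt left over, so a ~16k-token
# reply is possible on a 32k model even if a base model caps lower.
max_output_tokens = CONTEXT_WINDOW - prompt_tokens
print(max_output_tokens)  # 16768
```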


You might want to see a specialist about your behavioral issues. Also, gpt-4-32k is not open to the public.


I've had access for many months now.


Skill issue.


You bullsh*t, saying "I dunno, I get a response back for 100k tokens regularly" about a model that doesn't even exist, then you talk about a non-public 32k API. Stop lying. It's just the internet; you don't need to lie to people. Get a life.



