Hacker News

> Set temperature to 0, and the LLM will talk in circles and not really say anything interesting. Set it too high and it will output nonsense.

Sounds like some people I know, at both extremes.
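(For anyone unfamiliar with what temperature actually does: it divides the logits before the softmax, so low values concentrate probability on the top token and high values flatten the distribution toward uniform noise. A minimal sketch, not any particular model's implementation:)

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=None):
    """Sample an index from logits scaled by temperature.

    temperature -> 0 approaches greedy (argmax) decoding;
    a very high temperature approaches a uniform random pick.
    """
    rng = rng or random.Random(0)
    scaled = [l / max(temperature, 1e-8) for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]
print(sample_with_temperature(logits, 0.01))   # near-greedy: essentially always picks index 0
print(sample_with_temperature(logits, 100.0))  # near-uniform: any index is plausible
```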

> The whole design of LLMs doesn’t seem very well thought out. Things are done a certain way not because it makes sense but because it seems to produce “impressive” results.

They have been designed and trained to solve natural language processing tasks, and are already outperforming humans on many of those tasks. The transformer architecture is extremely well thought out, based on extensive R&D. The attention mechanism is a brilliant design. Can you explain exactly which part of the transformer architecture is poorly designed?



> They have been designed and trained to solve natural language processing tasks

They aren’t really designed to do anything, actually. LLMs are models of human language - it’s literally in the name: Large Language Model.

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

I’m sorry but I don’t trust something that uses a random number generator as part of its output generation.


> They aren’t really designed to do anything, actually. LLMs are models of human language - it’s literally in the name: Large Language Model.

No. And the article you linked to does not say that (because Wolfram is not an idiot).

Transformers are designed and trained specifically for solving NLP tasks.

> I’m sorry but I don’t trust something that uses a random number generator as part of its output generation.

The human brain also has stochastic behaviour.
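Worth noting, too, that the randomness is a deliberate, controllable sampling choice, not uncontrolled noise: seed the generator and the "random" output is fully reproducible, or use greedy argmax decoding and no RNG is involved at all. A minimal sketch (hypothetical `pick_next` helper, not a real library API):

```python
import random

def pick_next(probs, rng):
    # Stochastic decoding: draw the next token index from the distribution.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

probs = [0.7, 0.2, 0.1]

# Same seed -> identical sequence of "random" draws.
a = [pick_next(probs, random.Random(42)) for _ in range(5)]
b = [pick_next(probs, random.Random(42)) for _ in range(5)]
assert a == b

# Greedy decoding is purely deterministic: just take the argmax.
greedy = max(range(len(probs)), key=probs.__getitem__)
assert greedy == 0
```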




