> Set temperature to 0, and the LLM will talk in circles and not really say anything interesting. Set it too high and it will output nonsense.
Sounds like some people I know, at both extremes.
> The whole design of LLMs don’t seem very well thought out. Things are done a certain way not because it makes sense but because it seems to produce “impressive” results.
They have been designed and trained to solve natural language processing tasks, and are already outperforming humans on many of those tasks. The transformer architecture is extremely well thought out, based on extensive R&D. The attention mechanism is a brilliant design. Can you explain exactly which part of the transformer architecture is poorly designed?
Sounds like some people I know, at both extremes.
> The whole design of LLMs don’t seem very well thought out. Things are done a certain way not because it makes sense but because it seems to produce “impressive” results.
They have been designed and trained to solve natural language processing tasks, and are already outperforming humans on many of those tasks. The transformer architecture is extremely well thought out, based on extensive R&D. The attention mechanism is a brilliant design. Can you explain exactly which part of the transformer architecture is poorly designed?