
Thanks for building this. The mechanism is such an obvious idea that it's astounding the first-party platforms haven't done it yet. I'd be interested to see how it could be applied to tasks beyond JSON that require structured output.


> it's astounding that the first-party platforms haven't done this yet

I was under the impression LLM tech is currently in a breakneck arms race and that things are dramatically changing every few months. It could simply be a consequence of limited developer resources. It would be "astounding" if decade-old tech were missing such a fundamental feature, but for AI tech in arms-race mode it seems reasonable that quality-of-life features are still missing.


I think they meant that you'd expect simpler/more obvious ideas to be implemented first.


Thanks! We have extended the approach to grammar-based sampling. We describe the approach in the paper linked above. The following PR is relevant: https://github.com/normal-computing/outlines/pull/178
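To give a rough sense of what grammar-constrained decoding does, here is a toy sketch (not the library's actual API, and the grammar is reduced to a single fixed sentence for brevity): at every step, tokens that cannot extend the current prefix into a valid sentence are masked out before sampling.

    import math
    import random

    # Toy "vocabulary": each token is a short string.
    VOCAB = ['{', '}', '"name"', '"age"', ':', ',', '"Ada"', '42', ' ']
    TARGET = '{"name": "Ada", "age": 42}'  # stand-in for "any sentence of the grammar"

    def is_valid_prefix(text):
        # A real implementation walks a grammar/finite-state machine here;
        # this sketch only accepts prefixes of one fixed sentence.
        return TARGET.startswith(text)

    def constrained_sample(logits, prefix):
        # Mask tokens that would leave the grammar, renormalize, then sample.
        allowed = [i for i, tok in enumerate(VOCAB) if is_valid_prefix(prefix + tok)]
        weights = [math.exp(logits[i]) for i in allowed]
        return random.choices(allowed, weights=weights)[0]

    def generate(logits_fn):
        prefix = ""
        while prefix != TARGET:
            idx = constrained_sample(logits_fn(prefix), prefix)
            prefix += VOCAB[idx]
        return prefix

    # A uniform "model" is enough to exercise the loop: the guidance alone
    # produces valid output.
    print(generate(lambda prefix: [0.0] * len(VOCAB)))

In practice the grammar is compiled into an automaton so the set of allowed next tokens can be tracked incrementally instead of re-checking prefixes at every step.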


Could this same approach be applied during training? If the guidance does a lot of the syntactic heavy lifting, would that free up the model's weights for something else? Essentially, not bothering to reduce the error on things the guidance will stomp on anyway.
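One way to read the question in code (purely an illustrative sketch in PyTorch; `allowed_token_mask` is a hypothetical tensor derived from the guidance grammar): drop the loss terms at positions where the grammar leaves only one legal token, since the guidance would emit that token regardless of what the model learns.

    import torch
    import torch.nn.functional as F

    def guided_lm_loss(logits, targets, allowed_token_mask):
        # logits: [batch, seq, vocab]; targets: [batch, seq]
        # allowed_token_mask: hypothetical [batch, seq, vocab] bool tensor
        # marking which tokens the guidance would permit at each position.
        per_token = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
        # Where the grammar permits exactly one token, decoding-time guidance
        # emits it no matter what the model says, so skip that loss term.
        forced = allowed_token_mask.sum(dim=-1) == 1
        per_token = per_token.masked_fill(forced, 0.0)
        return per_token.sum() / (~forced).sum().clamp(min=1)

Whether skipping those terms actually frees up capacity for anything else is an empirical question, of course.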


Hi, the paper at https://arxiv.org/abs/2306.10763, "Guiding Language Models of Code with Global Context using Monitors", shows how to get language models to generate code without hallucinated dereferences.



