
There are also grave implications in training a model to assume the user is lying or deceiving it. I don’t want an LLM to second-guess my question so it can score higher on riddles; I want it to follow instructions.


The thing is that there is some overlap between trick questions and questions where the human is genuinely making a mistake themselves, and in those cases it would make sense for the model to step back and at least ask for clarification.



