
There are also grave implications in training a model to assume the user is lying or deceiving it. I don’t want an LLM to second-guess my question so it can score higher on riddles; I want it to follow instructions.


The thing is that there is some overlap between trick questions and questions where the human is genuinely making a mistake themselves, and in those cases it would make sense for the model to step back and at least ask for clarification.



