
It's an illusion. For anything it "knows", you can persuade it to claim exactly the opposite. It just happened to land on the correct answer first because it saw it more often in the training data. Despite appearing 100% confident about everything, it actually has 0% confidence in anything it says, although it insists on some things a bit longer than on others.
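The "appearing 100% confident" point can be made concrete: a model's apparent confidence is just a softmax over output logits, and a high probability says nothing about truth. A minimal sketch (the logit values below are made up for illustration, not from any real model):

```python
import math

# Hypothetical next-token logits for the prompt "The sky is ..."
logits = {"blue": 4.0, "green": 1.0, "red": 0.5}

# Softmax turns logits into a probability distribution.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

# "blue" dominates with >90% probability, yet nothing in this
# computation encodes whether the claim is actually true.
best = max(probs, key=probs.get)
```

The distribution can look extremely peaked (i.e. "confident") purely because one continuation was more frequent in training.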


> For anything it "knows", you can persuade it to claim exactly the opposite.

Which is actually a novel capability, and it arises because the network does reinforcement learning over its own context window. It's a strength, not a weakness. Humans can do the same thing. ("Assume that X...")

> It just happened to land on the correct answer first because it saw it more often in the training data.

Isn't that just a description of learning?

It's true that the network has no idea what is "true". But it's not like we do either; all we do is learn from correlations. We're just better at it.


Well, you're right that it may learn statistical associations, which we ourselves then generalize into a model when we probe it.

I think the key bit here is that we can influence its associations through the prompt, changing the model that is presented.

I'm way ahead of myself here, but the thought is interesting, and it will likely remain an open question for at least a few months or years.



