You also wouldn't see them giving their code to people learning to code to read and learn from, but that doesn't necessarily mean that learning from something violates copyright. It seems like a bit of a grey area, where the distinction is between learning from the code and using it directly.
> You also wouldn't see them giving their code to people learning to code to read and learn from,
Actually, we do see them do exactly that. Microsoft is happy to have a shared-source licence that gives away much of the Windows source code to universities.
The fact that they don't want to train their models on it says a lot.
If, as they claim, learning from existing materials does not devalue those materials in any way, they'd chuck the entirety of the Windows source code at the LLM for training purposes.