I would be very surprised if a compiler on a modern high-performance processor a...

I would be very surprised if a compiler on a modern high-performance processor actually did this optimization - as deep pipelines and speculative execution mean that the additional conditional branch per loop may hurt more than it helps.

That being said, on a simple architecture - namely a shallow pipeline - this sort of thing is an "obvious" optimization.

And w.r.t. "without optimizations" - there is no such thing at the C / C++ level. There are computer architectures where optimizations pretty much have to be done, for instance. (Many architectures with compile-time scheduling, for instance.)

For instance, if you have a dataflow machine it may end up running this in non-constant time. And I'll note that modern x86 processors are looking more and more like dataflow machines.