The latest GCC 11 optimizes the ifs nicely: https://godbolt.org/z/771foE...

Denvercoder9 · on May 6, 2021

Anyone know what the purpose of the mov edi, edi instruction there is?

Edited to add: I understand that it's a NOP, but why would the compiler emit one here?

Const-me · on May 6, 2021

That instruction clears the upper 4 bytes of the rdi register. Note that the next instruction uses rdi in the address.

The edi register is the lower 4 bytes of rdi. Instructions that write a 32-bit register zero out the upper 32 bits of the corresponding 64-bit register. This helps with performance because it eliminates data dependencies on the old values in those upper bytes.
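For illustration, here is a minimal C sketch of that zero-extension. The names zero_extend_low32 and lookup are made up for this example and are not taken from the linked code; the first function only models the effect of mov edi, edi on rdi in plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Models the effect of `mov edi, edi` on rdi: the low 32 bits are kept
   and the upper 32 bits become zero. Hypothetical helper, plain C. */
uint64_t zero_extend_low32(uint64_t rdi)
{
    return (uint64_t)(uint32_t)rdi;
}

/* Hypothetical lookup with a 32-bit index. Addressing on x86-64 uses
   64-bit registers, so the index has to be widened before it is used. */
int lookup(const int *table, uint32_t idx)
{
    return table[idx];
}

/* Small usage check: garbage in the upper half must not affect the index. */
void check_zero_extend(void)
{
    static const int table[3] = { 10, 20, 30 };
    uint64_t dirty = 0xDEADBEEF00000002ULL; /* upper 32 bits are junk */
    assert(zero_extend_low32(dirty) == 2);
    assert(lookup(table, (uint32_t)dirty) == 30);
}
```

If the caller left junk in the upper half of rdi, indexing with the full 64-bit register would read far outside the table, which is why the widening step cannot simply be dropped.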

gaul · on May 8, 2021

This is not a NOP; it explicitly clears the upper 32 bits of RDI, since the compiler does not know that they are zero in this situation. If you change cc from an int to size_t (unsigned long on x86-64), the compiler will generate:

        mov     eax, OFFSET FLAT:.LC0
        cmp     rdi, 258
        ja      .L1
        mov     rax, QWORD PTR CSWTCH.1[0+rdi*8]

Note that in some cases the compiler can do this automatically via lifetime analysis but not in this freestanding example.
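As a rough sketch of the two variants being compared: the table contents and the function names name_int and name_size below are assumptions for illustration, not the code behind the godbolt link. The only difference between the two functions is the width of the index type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-entry table standing in for the real one. */
static const char *const names[] = { "zero", "one", "two" };
#define NAME_COUNT (sizeof names / sizeof names[0])

/* 32-bit index: before the load, the compiler may emit an extra
   instruction to widen the index to 64 bits. */
const char *name_int(int cc)
{
    if (cc < 0 || (size_t)cc >= NAME_COUNT)
        return "unknown";
    return names[cc];
}

/* 64-bit index: cc already occupies the whole register, so no
   widening step is needed before the load. */
const char *name_size(size_t cc)
{
    if (cc >= NAME_COUNT)
        return "unknown";
    return names[cc];
}

/* Both variants return the same results for in-range and out-of-range input. */
void check_names(void)
{
    assert(strcmp(name_int(1), "one") == 0);
    assert(strcmp(name_size(1), "one") == 0);
    assert(strcmp(name_int(-1), "unknown") == 0);
    assert(strcmp(name_size(3), "unknown") == 0);
}
```

The exact instructions depend on the compiler and flags, so it is worth checking the output for both functions side by side.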

JoeAltmaier · on May 6, 2021

It does nothing, so it's a no-op of sorts. I wonder if it's a branch-delay tactic of some kind? Surely the algorithm would be unchanged if it were removed.

colejohnson66 · on May 6, 2021

It’s just a two-byte NOP. IIRC, it’s Intel’s recommended form for one of that length. Windows uses it for hot patching [0], but I can’t imagine that’s the reason here.

[0]: https://devblogs.microsoft.com/oldnewthing/20110921-00/?p=95...

qayxc · on May 6, 2021

clang does the same. I tested it back to clang 7.0.

So it seems clang has done this for a long time.