I find this the most plausible explanation by far: \* The highly professional ou...

anarazel · on April 7, 2024

> The performance regression is __not__ big. It's lucky Andres caught it at all. It's also not necessarily all that simple to remove it. It's not simply a bug in a loop or some such. If I was the xz team I'd have enough faith in all the work that was done to give it high odds that they'd get there before discovery. That they'd have time; months, even.

True, in the context of sshd logins it isn't that big, but ~500ms to get from _start() to main() isn't small either, compared to the normal cost of that phase of library startup. Their problem was that sshd fork+exec's itself to handle a connection, so they had to redo a lot of the work for each connection.

I suspect they started off with much smaller overhead and then it increased gradually with every feature they added, just like it happens with many software projects. Here's the number of symbols being looked up that a reversing effort has documented: https://github.com/smx-smx/xzre/blame/ff3ba18a39bad272ff628b... https://github.com/smx-smx/xzre/blob/ff3ba18a39bad272ff628bb...

Afaict all of this happens before there's any indication of the attacker's keys being presented - that's not visible to the fork+exec'd sshd until a lot later.

They needed to do some of the work before main(), to redirect RSA_public_decrypt(). That'd have been some measurable overhead, but not close to 500ms. The rest of the startup could have been deferred until after RSA_public_decrypt() was presented with something looking like their key as part of the ssh certificate.

account42 · on April 11, 2024

If I understand things correctly, the hooking of RSA_public_decrypt is done with an audit hook called for every symbol of newly loaded libraries. With this approach it doesn't matter how much is hooked, since all functions are always processed. It's also harder to hook functions later because the GOT/PLT will have been marked read-only. The exploit code also doesn't directly contain any of the strings (presumably for obfuscation reasons) and instead has a trie to map given strings to internal IDs, which also requires an approach like this where you look at all symbols and then decide what to do with each symbol.
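To make the trie idea concrete, here is a minimal sketch of a string-to-ID trie in Python. It is not the backdoor's actual data structure: the class names, the IDs, and the example symbol list are made up for illustration, and unlike the real thing this version builds the trie from plain strings for readability.

```python
class TrieNode:
    """One trie node: child edges keyed by character, plus an optional ID."""

    def __init__(self):
        self.children = {}
        self.ident = None  # set only where a known string ends


class SymbolTrie:
    """Maps a fixed set of strings to small integer IDs (hypothetical IDs)."""

    def __init__(self, names_to_ids):
        self.root = TrieNode()
        for name, ident in names_to_ids.items():
            node = self.root
            for ch in name:
                node = node.children.setdefault(ch, TrieNode())
            node.ident = ident

    def lookup(self, name):
        """Return the ID for an exact match, or None for anything else."""
        node = self.root
        for ch in name:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.ident


def classify_symbols(trie, symbols):
    """Walk every symbol once and keep only those the trie recognizes."""
    matches = {}
    for sym in symbols:
        ident = trie.lookup(sym)
        if ident is not None:
            matches[sym] = ident
    return matches


if __name__ == "__main__":
    # Illustrative only: the ID 1 and the symbol list are assumptions.
    trie = SymbolTrie({"RSA_public_decrypt": 1})
    print(classify_symbols(trie, ["printf", "RSA_public_decrypt", "malloc"]))
```

The point is that every symbol gets pushed through the lookup regardless of how many of them you actually care about, so the per-symbol cost is paid whether one function is hooked or ten.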

account42 · on April 11, 2024

> The payload of the 'hack' contains fairly easy ways for the xz hackers to update the payload. They actually used it to remove a real issue where their hackery causes issues with valgrind that might lead to discovering it, and they also used it to release 5.6.1 which rewrites significant chunks;

The valgrind fix in 5.6.1 overwrites the same test files used in 5.6.0 instead of using the injection code's extension hooks. This is done with what should have been a highly suspicious commit: https://github.com/tukaani-project/xz/commit/6e636819e8f0703... - this replaces "random" test files with other "random" test files. The stated reason is questionable to begin with, but not including the seed used, when the purported reason was to be able to re-create the files in the future, is highly suspicious. This should have raised red flags, but no one was watching. I'd say this is another part of the operation that was much more sloppy than it needed to be.
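For comparison, this is roughly what "re-creatable random test files" would look like if the seed were actually published. It's a sketch under my own assumptions (the function names, the seed value, and the use of Python's `random` module are all illustrative, not anything from the xz repository): anyone with the seed and size can regenerate the blob and compare digests.

```python
import hashlib
import random


def make_test_blob(seed: int, size: int) -> bytes:
    """Generate `size` pseudo-random bytes deterministically from `seed`."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return bytes(rng.getrandbits(8) for _ in range(size))


def blob_digest(blob: bytes) -> str:
    """SHA-256 hex digest, to compare a regenerated blob with a committed file."""
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    seed, size = 12345, 512  # hypothetical values that would go in the commit message
    first = make_test_blob(seed, size)
    second = make_test_blob(seed, size)
    print(blob_digest(first) == blob_digest(second))
```

A committed binary that can't be reproduced from a documented seed like this is just an opaque blob, whatever the commit message says.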

> almost entirely eliminates the value of all those 2 years of hard work.

Except control over xz-utils/liblzma would have still been very valuable even without the sshd exploit path, as its central use in the toolchain used to build Linux distributions would have allowed for many other attacks.

olejorgenb · on April 7, 2024

They could also have regrouped and found another way to do the exploit, given the relative ease of updating the payload (though it's probably a limited number of times you could change the test blobs without causing suspicion?). But I agree this explanation is plausible.

anarazel · on April 7, 2024

If lzma isn't loaded as part of sshd, the path from an lzma backdoor to sshd gets a hell of a lot more circuitous and/or easier to catch. You'd pretty much need to modify the sshd binary while compressing a package build, or do something like that to the compiler, to then modify sshd components while compiling.

account42 · on April 11, 2024

Perhaps, but sshd is also not the only potential exploit. E.g. the landlock commit is a hint that they were also planning an exploit via the xz-utils commands directly. Seems rash to burn over two years of gaining trust for a very central library and set of tools just because the initially chosen exploit path disappeared.