It appears that Gloox, a relatively low-level XMPP client library written in C, rolled much of its Unicode and XML parsing itself, which made such vulnerabilities more likely. There may be good reasons not to re-use existing modules and rely on external libraries, especially if you target constrained low-end embedded devices, but you should always be aware of the drawbacks. And the Zoom client typically does not run on those.
One of the harder things with XMPP is that the stream is not a well-formed XML document until the connection is closed; the root element stays open for the lifetime of the session. You need a SAX-style/event-based parser to handle it. That makes rolling your own understandable in some cases (e.g. dotnet's System.Xml couldn't do this prior to XLinq).
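To illustrate the point, here is a minimal sketch of event-based parsing of a never-closed XMPP stream, using Python's stdlib expat bindings as a stand-in for whatever parser a client actually embeds (the message stanza is made up, not a real transcript):

```python
# Event-based (SAX-style) parsing of an XMPP stream: the root
# <stream:stream> element is opened but never closed while the
# connection lives, yet events still fire for each stanza.
import xml.parsers.expat

stanzas = []

def start_element(name, attrs):
    stanzas.append(name)

p = xml.parsers.expat.ParserCreate()
p.StartElementHandler = start_element

# Feed the stream header, then a complete stanza. Passing False as
# the second argument tells expat more data is coming.
p.Parse("<stream:stream xmlns:stream='http://etherx.jabber.org/streams'>", False)
p.Parse("<message><body>hi</body></message>", False)

print(stanzas)  # ['stream:stream', 'message', 'body']
```

A DOM-style parser would refuse to return anything here, since the document never finishes; the event callbacks are what make the open-ended stream workable.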
That being said, as you indicated, Gloox is C-based, and mature SAX-style parsers like expat have long existed in C. There is no excuse.
Not only that, but before the TLS session starts you have to handle what is, strictly speaking, an invalid XML document: the STARTTLS mechanism starts encrypting right in the middle of the initial XML document.
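A sketch of what that means in practice: once the server's `<proceed/>` arrives, the bytes that follow are TLS records rather than XML, so the parser has to be thrown away and rebuilt after the handshake, because the encrypted stream restarts with a fresh stream header. (The namespaces are the real XMPP ones; the handshake itself is elided.)

```python
# At the STARTTLS boundary the XML parser must be discarded: the
# bytes after <proceed/> are TLS, and the post-TLS stream begins
# again with a brand-new <stream:stream> header.
import xml.parsers.expat

def make_parser(events):
    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = lambda name, attrs: events.append(name)
    return p

events = []
p = make_parser(events)
p.Parse("<stream:stream xmlns:stream='http://etherx.jabber.org/streams'>", False)
p.Parse("<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>", False)

if "proceed" in events:
    # ...perform the TLS handshake on the raw socket here...
    p = make_parser(events)  # fresh parser for the encrypted stream
    p.Parse("<stream:stream xmlns:stream='http://etherx.jabber.org/streams'>", False)

print(events)  # the stream header appears twice: once per side of STARTTLS
```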
Also, some constructs that are perfectly valid XML are not allowed in XMPP (comments, for example).
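Since the generic parser happily accepts those constructs, an XMPP implementation has to reject them itself. One way to do that with stdlib expat is to install a comment handler that refuses the input (the exception class here is just an illustrative name):

```python
# XMPP forbids XML comments in the stream even though they are legal
# XML, so a conforming implementation must actively reject them.
import xml.parsers.expat

class ProhibitedConstruct(Exception):
    pass

def reject_comment(data):
    raise ProhibitedConstruct("comments are not allowed in XMPP streams")

p = xml.parsers.expat.ParserCreate()
p.CommentHandler = reject_comment

caught = None
try:
    p.Parse("<stream:stream xmlns:stream='http://etherx.jabber.org/streams'>"
            "<!-- sneaky -->", False)
except ProhibitedConstruct as exc:
    caught = exc

print(caught)  # comments are not allowed in XMPP streams
```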
I think rolling your own XML parser for XMPP is a fairly reasonable thing to do. In the past at least, many, if not most, implementations had their own parser (often a fork of a proper XML parser). What is more surprising to me is why they would choose XMPP for their proprietary stuff. I don't think they want to interoperate or federate with anything?
(If I remember correctly, and if it hasn't changed since many years ago, when I last looked at that stuff.)
> One of the harder things with XMPP is that it is a badly-formed document up until the connection is closed. You need a SAX-style/event-based parser to handle it.
That is a common misconception, although I am not sure of its origin. I know plenty of XMPP implementations that use an XML pull parser.
Smack uses an XML pull parser and non-blocking I/O. It does so by first splitting the XMPP stream into top-level elements and only feeding complete elements to the pull parser.
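As a small demonstration that pull parsing copes with an open-ended stream (using Python's stdlib `XMLPullParser` rather than Smack's Java one):

```python
# A pull parser consuming an XMPP stream incrementally: you feed it
# whatever bytes have arrived and pull events for elements that are
# complete so far. The root element never needs to close.
from xml.etree.ElementTree import XMLPullParser

parser = XMLPullParser(events=("start", "end"))
parser.feed("<stream:stream xmlns:stream='http://etherx.jabber.org/streams'>")
parser.feed("<message><body>hi</body></message>")

# Only elements whose close tag has arrived produce an "end" event;
# the still-open stream root does not.
complete = [elem.tag for event, elem in parser.read_events() if event == "end"]
print(complete)  # ['body', 'message']
```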
I find that response a bit strange, since the whole reason the Zoom client has these particular vulnerabilities is that they didn't roll their own, and instead relied on layers of broken libraries.
It’s quite possible they’d have more bugs without doing that, but re-using existing modules could just as easily have been an even worse idea.
Using what everyone and their dog uses is just as prone to bugs, because bug-free software either doesn't exist or isn't very useful. But it has the benefit of many versatile eyeballs looking at it in many different contexts.
So if a bug is found and fixed in libxml2, which is used by almost everything, everyone else instantly benefits. Same with libicu, which is used, for example, by Node.js with its huge deployment footprint. Oh, and by every freakin' WebKit-based browser out there.
OTOH, they rolled their own, so all the bugs they hit are confined to Zoom, and are guaranteed to earn Zoom alone the bad press.
If they roll their own it also becomes less interesting to actively exploit.
Obviously this doesn’t really work for Zoom any more, since their footprint is too large, but it can stop drive-by attackers in other situations. Nobody is going to expend much effort figuring out Joe Schmuck’s homegrown solution when they could happily run a known exploit against an unpatched WordPress server.
I think the point is that Unicode and XML parsing are known to be security-critical components, and you should take care that they are handled only by well-tested code designed specifically for the purpose. You need to not roll your own, and also to ensure that any third-party components didn't roll their own.
I get your confusion. But keep in mind that it is not only about picking whatever library shows up as the first result of your Google search. My naive self thinks that a million-dollar company should do some research and evaluate different options when choosing an external codebase to build their flagship product on. There are dozens of XMPP libraries, and they picked the one that does not seem to delegate XML and Unicode handling to other libraries, which should raise a flag.
I think that's a false dichotomy; IMO the best default choice is to rely on the most well-tested library in any given category. That suggests to me that they should have used expat on the client side.
IMO we should use external libraries, and we should invest engineering time in the library rather than just taking it as-is. Not using a good third-party library means investing at least a few engineer-months to get the same result, and a lot more to do better than the third-party library. Instead, you can take the library and invest a few engineer-months improving the open-source code.
Why? If anything, the client's interpretation of XML-in-malformed-UTF-8 is the more reasonable one: skipping to the start of the next valid UTF-8 sequence. It's the server whose UTF-8 handling is really weird: it somehow special-cases multi-byte UTF-8 sequences but then doesn't handle invalid ones.
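For concreteness, here is a sketch of the two behaviours described above, using Python's codec machinery as a stand-in (this is not Zoom's actual code; `0xC3` starts a two-byte UTF-8 sequence, so following it with plain ASCII makes the input invalid):

```python
# An invalid UTF-8 sequence embedded in XML: 0xC3 is a two-byte
# lead byte, but '(' is not a valid continuation byte.
bad = b"<body>\xc3(</body>"

# Strict handling: refuse the input outright.
rejected = False
try:
    bad.decode("utf-8")
except UnicodeDecodeError:
    rejected = True
print("strict decoder rejected input:", rejected)

# Lenient handling, roughly what "skip to the next valid sequence
# start" amounts to: drop the offending byte and keep going.
print(bad.decode("utf-8", errors="ignore"))  # <body>(</body>
```

Neither behaviour is wrong in isolation; the trouble starts when two components on the same path (here, client and server) disagree about which one to apply.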
This is a very common issue across all of software engineering, I've found. But I really don't get why: if I were given the task of parsing Unicode or XML, I'd run and find a library as fast as possible, because that sounds terrible and tedious, and I'd rather do literally anything else!