The idea that LaTeX has a messy codebase is almost mind-boggling to me. I had assumed that TeX, practically the only program whose source is published as a book (see: TeX: The Program), would be clean. I guess the complaint is mostly about the generated code: the Pascal that TANGLE emits and the C that's machine-translated from it.
It's not so much core TeX that's a mess as the ecosystem; the pile of font-generation tools, path-management code, shell scripts, PS/PDF/DVI code, and special-purpose glue binaries gets pretty hairy.