I know what brudgers is talking about - I frequently encounter the same frustration when attempting to use Google to find actual information in a space where too many people are trying to make money. The links you see depend on what Google knows about you. I tried the search in an incognito tab, and the pdftohtml link, which would be the most useful result, was nowhere on the first page. When I add the word "linux" to the search, it appears. Not bad at all in this case, but I've been stymied in the past when no combination of keywords and search operators could locate good, technical information that I knew was out there, but instead returned page after page of spam and junk. I know Matt Cutts still claims that there is a firewall between search and ads, and there is no compromise on search quality for profit reasons, but does anyone still believe this?
This whole sub-thread hinges on the assumption that the "pdftohtml" result is objectively the correct one for the query "convert PDF to HTML", and that it's commercial corruption of Google that keeps that result off the front page. I think that is a false assumption. If a non-technical user needs to convert a PDF to HTML, the pdfonline.com result (the one with the friendly green button) is leaps and bounds better than the pdftohtml SourceForge one with the broken Windows link. Google doesn't have any kind of obligation to promote FOSS software at any cost; they have an obligation to answer the user's question - and in this case they do just that.
Yes, I too have had to sort through pages of SEO keyword spam to find what I'm looking for, but the chain from there to this poor example, to the number of extraneous elements on a website, to accusing Google of having forged a broken web is, well, weak.
Your point about the difficulty in establishing one or another result as objectively better is a good one - though somewhat weakened by then arguing that the online service is better.
However, it is not as if pdftohtml is obscure. It is a common component of a great number of GNU/Linux distributions and has been for many years. So setting aside the merits of the top result, the lower results are still problematic.
Unlike the top result, neither pdftohtml nor the high-ranking results are SaaS - they're downloads for offline use. What differentiates them is their value propositions. They are in opposition between Google and the lay person, and the results are caveat emptor solely for Google's benefit.
Elsewhere, I have used a better example, "weather". The top results are driven by advertising revenue. The best result? In most cases, the National Weather Service point data - advertising-free, updated regularly, and generated by PhD meteorologists, not pretty faces for TV.
In this case, pdftohtml shows the way in which Google drives us toward online rather than offline solutions. It drives us away from using our CPU cycles and toward consuming bandwidth and, absent that, toward tools that pay for advertising.
Perhaps you're right that the connection is becoming tenuous. I, for one, was not assuming that the pdftohtml result was "objectively the correct one"; just that it's obviously more relevant than some, or many, of the results that are ranking higher than it for the query. And it's so common for less relevant or spammy pages to rank higher than more relevant, higher-quality results that I no longer expect the best results to be near the top, nor even on the first page. There are probably many reasons for this, and one reasonable theory is that Google is attempting to increase their profit by driving people to pages that might earn them a commission. Or it might just be that their ranking algorithms, despite being still perhaps better than anyone else's, are fundamentally flawed, and attach too much importance to linkage and not enough to other signals of quality.