The Website Obesity Crisis (2015) (idlewords.com)
197 points by juliangoldsmith on April 11, 2017 | 83 comments


Ads are definitely a huge problem, but it seems they are at most an equal, perhaps even a smaller, contributor to bloat than the tons of off-the-shelf Javascript frameworks/libraries major websites throw in to perform analytics, or to implement stock features like user comments, login, social-media stuff, etc.

I mean, just looking at the network resources for a fresh download of www.cnn.com, I see:

livefyre_base.js, 167KB - apparently some kind of user-comment library

cdn.optimizely.com/js/131788053.js, 93.2KB - some library for user tracking/analytics or something

controltag.js, 72KB - something from a company called "Krux DMP" that seems to be yet more user analytics and tracking

sdk.js, 61.9KB - something from gigya.com, apparently some kind of login and user-preferences framework

jsmd.js, 87.5KB - another analytics framework

... and various others.

I see a "pubads_impl_113.js" from doubleclick.net, which is obviously for ads, but if cnn.com is typical, then it seems that off-the-shelf Javascript frameworks for user analytics and other stock features like logins/social-media crap, etc., are just as big a factor in contributing to bloat as display ads.


Optimizely is A/B testing. It's not strictly analytics (in that it's not a Google Analytics replacement), but it re-implements some of that functionality for audience targeting.

Krux is an ad tracking service.

jsmd.js appears to be the homebrew analytics system of one of the media partners.

The actual web analytics being run on CNN is Adobe Analytics, which is actually fairly compact. The real problem is that every ad also includes its own analytics framework. Each of them often includes multiple frameworks, each of which tracks something different. E.g. one for demographic targeting, one for tracking across multiple ad networks, one to see if the user scrolled far enough to see the ad, and one to see if the ad loaded before the user closed the page or navigated away. It's a goddamn mess.


> Ads are definitely a huge problem, but it seems they are at most an equal, perhaps even a smaller, contributor to bloat than the tons of off-the-shelf Javascript frameworks/libraries major websites throw in to perform analytics

If you look at how this works out when you reload such a page just once, you'll see that you're wrong. The frameworks are heavy, but they are cached and typically only contribute to parsing delays etc.; ads, on the other hand, load every time you visit the page, and often the JS involved isn't cached, minified, or gzipped.

Try a "repeat view" with webpagetest.org, for example.


Been running with uMatrix for a few weeks now and it's crazy how much gets loaded on every website that isn't necessary for the actual page's functioning. Websites finally feel snappy again - it's the best extension I've ever used.


Agree 100%.

Advertising is a zero-sum game. If an advertiser's budget is $5, they will spend $5, whether on as many dumb ads as they can buy or through very targeted means if those are available.

There's a tremendous amount of effort expended in the ad industry with very little real effect other than to direct ad dollars through a given platform and maybe provide a temporary lift to response rates.

But honestly, it's all garbage. The more prevalent advertising is, the more people ignore it. Especially targeted ads. Hell, the targeting algorithms aren't even all that good - even in 2017. I bought an electric razor the other day, and I have been getting ads for electric razors ever since. Why? I already bought one. I won't need another for at least a year. These algorithms are stupid and likely don't boost anything.

Anyway, once a new generation of ad buyers comes on board, expect the bubble to pop. The fundamentals are garbage.


Even if you hadn't bought one yet, would you ever choose a specific product based on an ad? I wouldn't. The best thing an ad can do for me is remind me I need something in a category; then I'll go around the internet comparing products and grab the one that looks best, especially with a product like an electric razor. Companies should be investing in making good products so that when someone remembers they need something, their product will have the most positive reviews and the most reasonable price.


Easier to just rebadge some crap from China and buy the positive reviews.


I hate advertising. However, I think a glib dismissal like this one underrates the power of ads a lot, and for me it's important to not underestimate my adversary.

Sure, when I research virtually anything for myself, down to soy sauce or toothbrushes, I will literally go and seek out reviews and do some legwork.

The real question is what happens when you are in a pinch. You're driving to your brother's house for dinner, and recall that you forgot to buy a toy for your nephew's birthday. Precisely there you'll be nabbed by a tendency to go for the "can't go wrong" choice, and the perception of the can't go wrong choice will have probably been created by advertising.

So, don't sell it short. I agree with virtually everything; it's a worthless arms race and we'd be better off with companies focusing on quality. Just don't fall into the trap of thinking everyone can just use willpower to fight it. It requires more, as legislation banning cigarette advertising has shown.


If it's a nephew, and they are old enough to not asphyxiate themselves, then the obvious answer is always Lego. Little boys fucking love Legos.


Lego is great. But don't forget that this is also an idea that is shaped by advertising: why is Lego not for girls? Why do girls need pink blocks with figures that wear make up? Why this genderification?


Lego is essentially a monopoly built on branding. Same as Coca-Cola. Advertising shapes slowly, one ad at a time


Girls don't need pink and makeup. It's simply what sells to them.


Advertising is not remotely zero-sum, as your own electric razor example demonstrates. Better targeting is win-win-win - the advertiser doesn't want to advertise something you have no interest in buying, you don't want to see an ad for something you have no interest in buying, and the site doesn't want to be associated with irrelevant ads either.


It is in relation to ad budgets.

Consumers only have so much attention span. It's been over-saturated for years; more adtech doesn't lead to better conversion -- it's just smoke and mirrors so that networks can justify rates without the return ("but wait, this new feature will boost your campaigns by 5%!")

Also consider that the #1 advertising category (automobiles) is about to head off a cliff in the next decade... there's a reason Wall Street thinks Tesla (whose 10-year plans are almost certainly to stop selling cars to consumers and operate a ride-sharing platform) is more valuable than Ford.


> It is in relation to ad budgets.

Ad budgets are not fixed universal constants. If advertising becomes more effective then companies can and should spend more on it (and vice versa if it becomes less effective).

> Consumers only have so much attention span. It's been over-saturated for years

The fact that the OP remembered their electric razor example suggests otherwise.


Ad budgets kinda are fixed. That's how large ad buyers operate -- a product has a certain marketing budget, and an expected lift for a campaign. If you get more than that lift, you get a bonus. Less, meh, it happens (more often than not actually).

I only remembered the razor ad because my immediate thought was "wow what a terrible ad" (and I work in this stuff). Can't tell you what brand of razors it was for.


I don't agree with your targeted ads assessment. My targeted ads are so right it is downright spooky. I swear I have only THOUGHT about buying some obscure item and then boom there it is right there in a sidebar ad.

Honestly I don't even mind, I'd rather look at ads for things I want, rather than things I have no interest in.


I first realized Javascript/Frontend/client developers were a danger to society about 10 years ago.

We were trying to figure out why some particular, quite globally popular web site did not work when transcoded in our Opera Mini transcoder.

The page itself looked perfectly static, no fancy effects or anything. Turns out the site had decided to re-implement the "click a link" feature entirely in javascript, down to the "create a click listener for the entire document, then create a click router based on X, Y, width, height coordinates" level. We patched it somehow, had a laugh about the stupidity, but the sour feeling of insanity stuck with me. It did not provide any feature beyond what a simple 'a' tag would have done.

It was just so absurd. We spent some time to try to figure out if there was a legitimate reason for it but we couldn't find any.
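
Roughly, the routing looked something like this (a reconstruction from memory, not their actual code):

    // Reconstruction of the anti-pattern, not the site's actual code.
    // Instead of plain <a href="..."> elements, every "link" was a hit box:
    var regions = [
      // hand-maintained coordinates for each clickable area
      { x: 20, y: 100, width: 200, height: 24, url: '/news' },
      { x: 20, y: 130, width: 200, height: 24, url: '/sports' }
    ];

    document.addEventListener('click', function (e) {
      for (var i = 0; i < regions.length; i++) {
        var r = regions[i];
        if (e.pageX >= r.x && e.pageX <= r.x + r.width &&
            e.pageY >= r.y && e.pageY <= r.y + r.height) {
          window.location.href = r.url; // everything a plain <a> tag already does
          return;
        }
      }
    });

All of which breaks middle-click, link previews, crawlers and transcoders like ours, for zero benefit.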


Sounds like a rehash of this: https://www.w3schools.com/tags/tag_map.asp in JS, perhaps designed by someone unable to construct an adequate search engine query to solve the problem they thought they had but having access to a manual for JS. Compound that with the religious zeal that invariably (well - quite often) follows initiation into the sect around a newly discovered programming language. The manual may not actually exist but consist of a few links to Stackoverflow.

"Never attribute to malice that ..." as Mr Halon might say. I've found out later that I've re-invented the wheel with a system more than once (mine obviously sported Fandango tyres) and laughed at several other single axis rotation optimised load bearing solution implementations with minimised frictional components and thrust load resistance.

Experience is a wonderful thing - yours and theirs (and mine).

Cheers Jon


What you've described is basically a heat map data collection script, or part of one.

The script will capture the (x,y) coordinates and cache a running log of where events occurred. It's proxying the events and then forwarding them, so it can get in between the event and the outcome, and record the event's details with greater precision.

Since JavaScript events are asynchronous, one part of the heatmap recorder may have been decoupled and not directly connected to the rest of its sibling scripts in a very obvious way.

But anyway, to get to the point, these heatmap apparatuses do spew useless data, user by user, and can hog system resources in non-obvious ways, but in aggregate, across thousands or millions of users, they develop a picture of how many users clicked directly on a button, or in proximity to one, and which button, out of a selection of buttons, was clicked in which order.
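
The client-side half is usually only a few lines. A minimal sketch (the endpoint name and batch size here are made up for illustration):

    // Minimal sketch of the capture side of a click heatmap.
    // The /collect endpoint and the batch size are made up for illustration.
    var clicks = [];

    document.addEventListener('click', function (e) {
      // Record where the click landed, relative to the document.
      clicks.push({ x: e.pageX, y: e.pageY, t: Date.now() });

      // Ship a batch to the collector now and then; the server later
      // aggregates these points across many users into the overlay.
      if (clicks.length >= 20) {
        navigator.sendBeacon('/collect', JSON.stringify(clicks));
        clicks = [];
      }
    }, true); // capture phase, so it sees the click before the page's handlers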

After deploying something like this, the raw (x,y) coordinate data is then collected and used to generate an image. The image shows the webpage as it appeared to the user, with a semi-transparent false-color overlay placed on top: the (x,y) coordinates color the pixels red/hot where there were many clicks and blue/cold where there were few, like a topographical map.

See the results of similar scripts here:

https://duckduckgo.com?q=!googleimages+web+page+click+heatma...

Similar heatmaps are generated in more controlled environments, with eye motion tracking tests, where the user consents to sit in a room, in front of a computer, with a camera attached, and use a redesigned website or app, while the system records the specifics of their behavior, capturing not only clicks, but where they were looking, and how their eyes moved toward the elements that received clicks.

Testing processes like these can be really expensive to perform.

This is usually done to gather evidence of an effective design layout, among competing designs in an A/B test.

If you only noticed part of the script that proxies the click events, it may be that someone left part of an A/B test harness behind, by mistake, or that part of the harness was left out by mistake. These sorts of scripts definitely take the collected click coordinates, and forward them to the server, otherwise it's not possible to assemble the individual events into a heatmap.


Hi,

thanks for the description. I'm pretty sure that what we witnessed at that time wasn't something like this though. We spent quite some time picking apart the javascript.


Well, my problem is you think that's JavaScript's fault.


I believe the blame was aimed at javascript developers, not the language.


This is a repost. I'm guessing due to the blowup of the electron post.

There is no right answer. A huge fat bloated website can be successful (see BuzzFeed for the general population), or a lightweight, text-based informational website can be successful (see Hacker News for you guys).

Same thing for the electron debate. A bloated web app ported to desktop could just be as successful as a native app and vice versa.

The problem is that it just sucks. The choices we have to make suck and everything sucks right now.

Development is now "pop." This is why websites are fat and web development is going to the desktop.


It was never about the code.

It was always about the products.


Agreed and up-voted. I think that is what I was trying to say in my rant. I guess as a developer though it's hard to come to terms with the fact that you are creating crap.

I went through the same thing when I was a musician so it stings doubly.


The media fetishized a "techno-genius" that never really existed. Design and development are two very different skills, and it's rare to find a person who is good at both.

But all things equal, a good product person can make do with shitty developers. Good developers can never overcome shitty product design. If you want to design apps, go to school for design.


I see what you think are shitty developers and raise you actually shitty developers.


Craigslist finds your argument unpersuasive.


Craigslist is great design. I'll take that over what "UX experts" come up with any day.


Craigslist seems overwhelming, like bad design, when you first encounter it, but use it for a while and being able to go exactly where you need to go with one click, knowing exactly where on the page it will be located without scrolling, and having your mouse ready to click it, is actually really nice.


Craigslist only seems like bad design if you've been programmed to use successful application of current design trends as a shortcut to figuring out whether a site is abandoned, under-funded, or incompetently-executed. Which we all have, and that signal often works so it's not useless. With some "badly designed" (several design trends out-of-date) sites that signal's incorrect, though.

For most sites I really miss the blocky header+sidebar-nav+footer-fallback-nav trend of 2006-2012. Fast, efficient, predictable. Current design trends are much worse to use for most purposes. But, of course, if I see a site still using the former design, I assume bad things about it until proven otherwise (at least until I see a recent date associated with something) and I'm not inclined to trust them with any money.


Even better is you can bookmark what you want, no navigating through an animated SPA.


"Never" isn't accurate though. If you step away from vapid content sites for a minute, to a place where real money and products are exchanged, there are strong cases for load time affecting the likelihood of purchase or cart abandonment, or whether they'll resubscribe to your SaaS.

It does largely depend on how much the user thinks they specifically need you, but if it's between your flabby website, taking seconds to load each page, and a competitor's fast website, you can bet you're losing money to them, even if they're slightly more expensive.

Slow websites frustrate users and look unprofessional.

I know this article is aimed squarely at content, but articles like this lead to people making silly comments (like the one I'm replying to) that make absolute statements about all web development, all the time.

Of course some schmoos can get away with some bloat, but that's very different from saying it's only the product that matters.


Everything you mention is something that is the concern of a product manager, not an engineering lead. It becomes the engineering lead's problem when the product manager tells them it is. A huge problem at a lot of companies is when engineering teams go "off script" and build solutions for problems that nobody has. Is it an accident that Google's "10% policy" has largely evaporated as pressures on efficiency and productivity have come in from Wall Street?

Also speed on the type of sites you mention is a solved problem. CMSes these days have an option to generate static content and push it to a CDN. Is it free? Nope. But it's a problem with a solution.


I don't think I could disagree more with your post.

Every developer you have should be capable of weighing up performance ramifications of their decisions, and be able to discuss and justify them. Not saying that every line needs explaining but come on... Not their concern? I'm all for getting stuff done but doing a crappy job the first time more than doubles the TTL.

Moreover, telling your developers to ignore this stuff means it becomes a lost skill. When you do need them to double-back and profile and bisect everything back to 0AD, they'll be much less capable than a performance-conscientious developer.

I don't disagree that over-optimisation is a thing, but a hands-off approach is far more toxic. Big ticket optimisation rewrites should not be something a single developer takes on without their lead signing off. Letting developers do that is a management issue.

And "the type of sites I mention[ed]"? I think you misread my post because I was very explicitly talking about e-commerce and webapps. Not CMSes and not usually something you can run entirely static from a CDN.

And just because it's static and/or on a CDN doesn't mean it's not bloated to hell and back; that's really the point TFA is making.


Wonderfully put, especially this sad tidbit:

I tried to capture a movie of myself scrolling through the Verge Apple watch review, but failed. The graphics card on my late-model Apple laptop could literally not cope with the load.


I have just run down that page in Chrome on an i7 with four cores/eight threads and gobs of RAM. All of the threads jumped to 65-90% constantly, according to the Plasma widget on my desktop, which is currently showing low single digits. If I had unplugged the charger I'd be draining the battery at the same rate as the watch timeline graphic, but in real time.


What the hell did you just send me to? If I hadn't already known this was real I would have thought I was on a parody website. A movie that advances frame-by-frame as you scroll? Seriously?

What's funny is it looked good and performed very well before I allowed the page to run scripts.


My strategy has been to disable all javascript by default, and whitelist specific sites. So far I'm pleased with the result, especially how fast most of the random HN/Reddit links load.


Every now and then (when I break X through a badly chosen compiler flag or whatever) I use Lynx. It takes me right back to WAIS and Gopher before this newfangled www thing appeared 8)

I used to telnet to a VAX, then to an X25 PAD, then to the outer reaches of JANET, then to NIST (I think). Then I got to fire up Gopher. sigh (not) the good old days. I can't even remember why - probably to download fractal porn or something equally bloody pointless. I spent three months staffing a helpdesk with no users in a rather well funded organisation in '91 ...


That sounds like my old msu connection...


I've started using uMatrix, it makes such a huge difference in performance and usability.


It's funny to think that Moore's law has meant that applications just get less efficient, but there is no such concept for internet speeds.

In reality, it seems that people have just settled on a response time that's acceptable and devs try to stay within that window in most cases.


> In reality, it seems that people have just settled on a response time that's acceptable and devs try to stay within that window in most cases.

More specifically, every time CPUs or browsers get better, publishers look at their metrics and go "Oh! PLT at 95th percentile is down! Great, now we can add more trackers while still hitting our perf target!" and so the gain is immediately gobbled up -- we're in a Red Queen Race and websites will never get faster. I cannot wait for the adtech ecosystem to collapse, what a godawful waste of resources this is.


I agree with your sentiment, and I think it's just one facet of a broader issue with how advertising/marketing is rampant. For example, you can pay to see a movie in a theater, and still get shown ads.


Some airlines do it, too. It makes me absolutely furious, but there have been many occasions when, just sitting in my seat TRYING TO SLEEP, the screens will all flip on an ad with audio simultaneously. It's insane that these aren't even the airline fluff pieces about themselves, but actual, literal advertisements.

I've heard that airplane fuel is becoming prohibitively expensive and consumer demand for flights is relatively elastic, so this (and all the other miserly ways they behave) is an attempt to remain solvent without drastically raising prices. I feel somewhat sympathetic to that argument (if it's actually true), but it seems like that should be a signal for technology to step in and innovate around the problem rather than attempting to blunder through with more advertisements.


> I've heard that airplane fuel is becoming prohibitively expensive

Why not verify what you've heard? Kerosene prices have actually fallen dramatically from just a few years ago.

https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=E...


Yea, current JetA prices are pretty low in the grand scheme of things. They're also not likely to go up significantly in the next 5-10 years


This is Jevons Paradox (https://en.wikipedia.org/wiki/Jevons_paradox) applied to the World Wide Web.


Reminds me of another article in which the author shows the average web page is larger than a Doom install image: https://mobiforge.com/research-analysis/the-web-is-doom

HN discussion: https://news.ycombinator.com/item?id=11548816


They will slim down after net neutrality is over. But they'll still be user-hostile.

Ads are about tracking. The fact that you might see the ad is what we call "lagniappe" where I come from.

If everyone disabled third-party cookies right now, networks would not be able to easily aggregate browsing profiles. And everything else --- the things you actually care about --- would still work.

Of course, then attention would just turn to the ISPs, who can now sell that information directly.


> They will slim down after net neutrality is over.

Net neutrality - in regards to any actual implementation - existed for about five minutes. The supposed benefits were to be long term. The short term impact was next to nothing.


They still use all sorts of dirty tricks like browser fingerprinting (which has a lot of ways to go about), IP, geolocation, timing / usual usage patterns...

Honestly, unless you're using VPN + Tor + no 3rd party cookies + no JS, I don't think you can hide. :(


This article is a great summary explanation for why I love the Hacker News interface.


Same. It's rare to find a site that I can use when I reach my data limit on T-Mobile that doesn't turn into a mess. If only I could just tell Safari to turn off all JS and image loading at will.


This is part of a larger phenomenon in IT where resources are plentiful so they are wasted. A small part of it is due to high resolution displays which cause an increase in the size of required bitmaps and some of it comes with 64 bit computing, but a lot of it is due to a general lack of care.

Perhaps the browser needs to implement a limit so a page just can't use an infinite amount of memory, CPU time and bandwidth. Or perhaps we should decide that 10 pieces of javascript is enough and any attempts to load more will just fail. The web designers will complain but they'll manage.


Worth noting this webpage is ~1MB itself. However it does have a lot of images - I'm not entirely sure why it has all of those images, but the size is not a surprise given that.


And that's sort of the point, in my mind: the page is about a megabyte, while something comparable on the modern sites he's talking about would be many times larger. As you say, it's a megabyte because there's a megabyte of content, not a megabyte of Javascript and trackers and "From Around The Web"

Back when I was on 28.8K dialup, I'd typically figure I could fetch a megabyte a minute. Because the site is just plain old HTML, even over a terrible connection you'll be able to start reading pretty much immediately, and the images will pop in as you go. It's a good user experience.


28.8kbps = 3.6KB/s = 216KB/minute

I know what you mean, but I remember waiting about 10 minutes for a megabyte on a 14.4k connection, and about 5 minutes on 56k. Because the phone wires were crap, the modem never synced faster than about 35k.


It has all those images because it's the text version of a presentation. Those are the slides the text is referring to.


Ah, thanks for clarifying. I missed that.

Yeah, I wasn't criticising the size, rather just making a meta-comment of the same form the author made himself in the article.


This seems like another example of bad stuff happening for game-theoretic reasons, so individual action won't help, the only solution is regulation. Ban tracking for a start. It's similar to banning outdoor billboard ads, which many places have done with good results.


How does the USA ban tracking on a website hosted in Germany?


There will always be edge cases, but law enforcement on the internet is surprisingly possible. Look up the EU cookie law and GDPR.


Without an ad blocker, there are few websites these days that don't get the cores fizzing and your laptop gradually cooking your thighs to medium rare.

The article had me giggling to the point of having to explain myself to wifey.


It's all smiles and sunshine, until you choose 13px as the font-size for this post.

It's right then, that it happens. You made a good point about lightweight, text-driven web, but didn't care to use a saner 16px for the font-size.

And so you instantly become a stereotype in my eyes. The old guy shaking his fists at modern hipster web.

See this for an example of a lightweight site with a good design and readability: http://www.zeldman.com/


The meaning of "13px" on a website is completely up to you. All CSS units are relative.


> lightweight site with a good design and readability: http://www.zeldman.com/

>1 MB is not "lightweight".


Yeah, but that page is ugly.


It looks great to me.


Yeah, well, that's just like your opinion, man.


Indeed.


"This is a screenshot from an article on NPR discussing the rising use of ad blockers. It is 12 megabytes in size..."

Hmmm. When I save this page from my browser it's 116k.

"... in a stock web browser."

Ah-ha. Perhaps the web browser I am using is not "stock".

If so, I can still read the article without a "stock" browser, so not sure what is behind this "stock" idea.

I am assuming reading the article is the goal?

Could it be the "stock" browser, whatever that is, is the root cause/enabler of the "web page obesity crisis"?

"Unfortunately complexity has become a bit of a bragging point."

Does that include "stock" web browsers?


FYI the browser I use does not download resources automatically. This way I just get the page with the content: the text of the article. That is all I want.

To read articles, which are just text, I access the web in VGA textmode with a text-only browser. I never see any ads.

There is no such thing as "load time" with this "UX". All pages load fast.

I do DNS lookups in bulk and store IPs before doing any HTTP requests, so I take time consumed by DNS out of the equation. I do not store IPs of ad servers, Facebook domains serving hidden beacon images and other nonsense. I never access these resources.
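
For anyone who wants to approximate that step, a rough sketch of the idea in Node (not what I actually run; the filter list is illustrative only):

    // Resolve a list of hostnames ahead of time, hosts-file style,
    // skipping ad/beacon domains. Sketch only, not my actual setup.
    const dns = require('dns').promises;
    const fs = require('fs');

    const skip = /doubleclick|facebook|analytics/i; // illustrative filter

    async function main() {
      const hosts = fs.readFileSync('hosts.txt', 'utf8').split('\n').filter(Boolean);
      for (const host of hosts) {
        if (skip.test(host)) continue; // never store ad/beacon servers
        try {
          const [addr] = await dns.resolve4(host);
          console.log(addr + '\t' + host);
        } catch (e) {
          // unresolvable, skip it
        }
      }
    }

    main();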

If I want to look at images or watch videos, I download them and use other programs running on other computers with graphics to view them. Those computers have no gateway to the internet.

For someone who likes to read, it all works very well. But if I was forced to use a "stock" browser of some organization's choosing (Google, MS, Apple, Mozilla, etc.), then perhaps I would have a very different "user experience" - heaps of ads. I suspect this is because those organizations place advertisers before users. If the ad sales business were to dry up it would be a disaster for some of those organizations.

Fortunately, I do not need those browsers to read the news.


Thanks for clarifying--I assumed that you were using a standard browser, and were confused why the size of the file the browser downloads when you use "Save As" was less than Maciej reported.

Is your browser publicly available? Did you create it yourself--and if so, what did you use?


You, sir, are the Ron Swanson of internet consumers and I salute you.


Mr Stallman, I had no idea you were a Hacker News user!


When you save this page it doesn't save every resource downloaded. The correct way to measure is to use the devtools and load the page with cache disabled (to mimic a first load), then check the total content transferred. I get 1.8MB with uBlock origin on, and 2.3MB without. I imagine that the author either got a particularly bad set of ads, or NPR wasn't compressing some large asset at the time.
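
If you don't want to eyeball the devtools totals, the Resource Timing API gives roughly the same number from the console. A sketch; note that cross-origin resources served without a Timing-Allow-Origin header report a transferSize of 0, so ads and trackers tend to be undercounted:

    // Paste into the devtools console after the page has finished loading.
    // transferSize is bytes over the network; cached responses report 0.
    const nav = performance.getEntriesByType('navigation')[0];
    const resources = performance.getEntriesByType('resource');

    const total = (nav ? nav.transferSize : 0) +
      resources.reduce((sum, r) => sum + r.transferSize, 0);

    console.log((total / 1024 / 1024).toFixed(2) + ' MB transferred');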


I suspect you might have missed the central thesis of the OP's article. He appears to be a Polish bloke with a command of English that is frankly rather good and runs a successful web business based around bookmark saving. His thesis is that many sites are a bit fat and slow.

The OP knows about irony. It is a bit like steely but with less carbon. If you are a web dev or dev ops then why not comment on the article rather than playing with toys.

Cheers Jon


Not sure I understand your comment--I was replying to gwu78's comment, interpreting their comment as a criticism of Maciej's (aka idlewords, the author of the linked talk) measurement of the size of the NPR page as not reflective of a realistic page load. Their subsequent comment makes it clear that they have a very non-standard browser and are criticizing the standard browsers, so I guess my comment is irrelevant.

> If you are a web dev or dev ops then why not comment on the article rather than playing with toys.

Not sure what you mean here, or whether I'm supposed to be offended. I don't think I play with toys--I host my own site, carefully keep each page under 100kB total, and am working on a framework to make it easier for others to do so.


Try opening the "116K webpage" with your internet connection turned off.


A "stock" web browser is just a newly installed browser before any add-ons are installed.



