Because now every asset you download from the web is a Google tracking resource.
Is it really unclear what’s going on? When you perform a GET request for these assets you are being monitored. These requests end up being part of the profile built for you which is used for advertisement targeting and content recommendation.
PS: You work for Google. Do you work on this project?
> When you perform a GET request for these assets you are being monitored.
You'll only ever retrieve Google AMP cache results from the Google search page, where they were already able to track if you made such a request, since the link you clicked has trackers in it.
So from that perspective, nothing changes.
> PS: You work for Google. Do you work on this project?
No, I work on mostly internal infrastructure. My interest in AMP is simply that I don't dislike the AMP "experience", it's fine. But more importantly, I legitimately don't get the HN hysteria around AMP. Returning to your concern, literally nothing changes with AMP vs non-AMP.
I don't get it. The most compelling concern I've heard is that it's annoying to have to couple parts of your infra to AMP-standard stuff. And I sort of understand that. But even that isn't different than previous SEO/ranking changes that required changes to the page.
Assume every website would use AMP. Like on mobile. How often do you get redirected to AMP already when you just click a random link? I sure do!
From then on, every asset is loaded via Google servers. Google now controls the entire internet. Google does this so it can serve its ads and track all users. It's as if I only used Google for my internet surfing.
I don't use Google because I strongly believe it is an evil company, but if websites use AMP, then I am forced to hand over my data to Google even though I don't want to.
Right now, I can block Google servers entirely. But if the entire web is served via AMP, I can't do it.
And that's the whole reason AMP exists. So everything I do (or at least as much as possible) goes through Google servers.
> You'll only ever retrieve Google AMP cache results from the Google search page, where they were already able to track if you made such a request, since the link you clicked has trackers in it.
> So from that perspective, nothing changes.
I am not affected; I don’t use Google search. The problem is for individuals who use Google search and now don’t have an option to avoid in-depth tracking. The difference between regular pixel trackers and multiple data points associated with every resource a site serves is immense. I work in ad-tech, not particularly on the identification side, but I started multiple projects on that end. From experience, a regular tracker can be fooled, but you cannot fool every resource request. One of the things I did to identify ad fraud bots was actually drive them to a site in which I controlled every resource. The resource request fingerprint for bots was easily distinguishable from that of real people. Moreover, some humans exhibited navigation patterns that were distinguishable from those of other humans. I remember I once caught a QA person doing a shoddy job of testing the front end because of it. That is the kind of power that Google is acquiring as more and more sites choose to use AMP. It is scary to think that a single entity has that power.
Nothing beyond the initial page load is served by the AMP cache. If you request additional dynamic resources or navigate to other pages, you'll go to the originating site.
In the example I saw, you could go through a mini site experience; you can “visit” the site without leaving the AMP. I don’t believe this one change, serving images from the AMP cache, is the issue. My concern is with the proliferation of AMPs. A lot of individuals will browse AMPs thinking their browsing is between them and the site publisher without realizing Google is in the middle.
Honestly, Google has an okay track record of respecting people’s data. In their AdExchange, they are one of the few that obfuscates the IP address. However it is still concerning that a single entity continues amassing all those browsing patterns from billions of individuals. It can be abused easily, with intent or not.
@DevKoala, do you have an example where you encountered the "mini-site" experience? I haven't seen it, but it could be a bug that would be worth fixing.
AMP also offers traffic after the load, so this is in no way comparable to a click tracker (which is also none of their business, btw). This will also offer in-band ad obfuscation eventually.
I’m curious, what do you value about the services you consume? I like transactions where I know what I am giving to the service provider. Do you really want to push away from this reality for the benefit of a few ms of load time when leaving a search page? That’s essentially what you’re arguing for.
The AMP team would also prefer that these URLs not be shared:
If you click the browser share icon, or trigger the browser native share intent, the origin URL will be shared, not the AMP Cache URL. Only if you explicitly copy the URL bar will the AMP Cache URL be shared.
The Signed Exchange spec that AMP has offered sites for a year now allows them to have their own URLs displayed in browsers that support it. In that case, the google.com URL will never be displayed and thus can't be accidentally shared.
All AMP documents on the AMP Cache contain `<link rel=canonical href={origin url}>` and Google recommends that social media prefers the canonical URL. This is useful outside of AMP as there are often multiple URL variants for any article. The sharer and sharee may not ideally get the same version. As an example, a mobile vs. desktop article.
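For anyone who wants to recover the origin URL from a cached document themselves, here is a minimal sketch of reading the `<link rel=canonical>` tag mentioned above. It uses only Python's standard `html.parser`; the names `CanonicalFinder` and `canonical_url` and the `example.com` URL are made up for illustration, and this is not how any particular social media site actually resolves links.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated token list and is case-insensitive.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def canonical_url(html: str) -> Optional[str]:
    """Return the canonical URL declared in the document, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical document; the URL is a placeholder, not a real article.
sample = (
    "<!doctype html><html amp><head>"
    '<link rel="canonical" href="https://example.com/article">'
    "</head><body></body></html>"
)
shared = canonical_url(sample)
```

A sharing tool could fall back to the URL it was given whenever `canonical_url` returns `None`.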
That's really quite useless. I rarely share links by clicking weird "share link" buttons. I usually have half a message already composed in mail/messages/slack, and I just want to cmd-L cmd-C in the browser and cmd-V in the message I'm writing.
Also, "the google.com URL will never be displayed" is a world with an internet I don't want to be a part of.
The workflow I described goes for mobile and tablets just as much as desktop; for the cases where a keyboard is not connected please mentally replace "cmd-l cmd-c" with "tap in address bar to select, tap copy".
Also, others might share an amp link from their mobile devices, which I then end up clicking in a desktop slack/mail/messages app, and there we go again with the amp virus even on desktops.
It's more a question of whether a specific document is using AMP; the site can be a mix, just like a site that uses jQuery on some pages, as an example.
An AMP page can be identified by examining only the first few bytes of the HTML. The `<html>` tag will contain either the `amp` or lightning-bolt emoji attribute, i.e. `<html amp>` or `<html ⚡>`.
Technically an AMP document must pass AMP Validation to be truly AMP, so there are documents that match the above condition which aren't valid AMP. There are multiple ways to validate. A starting place is https://validator.amp.dev/
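As a rough illustration of the marker check described above, here is a sketch using Python's standard `html.parser`. The names `has_amp_marker` and `AMP_MARKERS` are mine, not part of any AMP tooling, and stripping the emoji variation selector (U+FE0F) is an assumption about how the lightning bolt may be typed. As noted, a positive result only means the marker is present; it does not mean the document passes AMP Validation.

```python
from html.parser import HTMLParser

# Attribute names that mark a document as AMP: "amp" or the lightning bolt.
AMP_MARKERS = {"amp", "\u26a1"}


class _HtmlTagProbe(HTMLParser):
    """Inspects the attributes of the first <html> start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.is_amp = None  # stays None until an <html> tag is seen

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.is_amp is None:
            # Assumption: drop a trailing emoji variation selector, if any.
            names = {name.replace("\ufe0f", "") for name, _ in attrs}
            self.is_amp = bool(names & AMP_MARKERS)


def has_amp_marker(html_prefix: str) -> bool:
    """True if the first <html> tag carries the amp or lightning-bolt attribute."""
    probe = _HtmlTagProbe()
    probe.feed(html_prefix)  # a prefix of the document is enough
    return bool(probe.is_amp)
```

For example, `has_amp_marker('<!doctype html><html amp lang="en">')` is true, while a plain `<html lang="en">` is not.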
> Would you bat an eye at Google acquiring CloudFlare?
No more than I already bat an eye at CloudFlare.
(It's probably worth noting here that I do work at Google, so my risk profile is probably different from yours. For me personally, and speaking solely from a trust perspective, I'd probably prefer it if Google acquired CloudFlare, since I would get a net increase in transparency. But I can understand why that isn't a general position, and there are other reasons I don't think Google acquiring CloudFlare would be good.)
Thank you for taking the time to engage. Google is a scary beast at the end of the day, and I firmly believe it's an organism that should not remotely resemble what it is right now. Thinking about it, splitting it up has the potential to go a long way.
I share these fears to a lesser degree with Microsoft and of course Facebook. Apple seems to do a great job of safeguarding, but they could become sour if they don't remain careful. Stuff like Clearview crosses the line into directly-dangerous. CloudFlare is currently innocent in my eyes, but they've managed to centralize a lot more channels than I'd like to think about.
I don't think you realize how much of your online activity is already tracked by google / facebook / instagram.
Google's javascript is everywhere, including explicit tracking with analytics, and lots of CDN loads for endless lists of things (js libraries, fonts etc).
Their properties also track you: google search, youtube, email. They also make software you might use (chrome / android / google maps / google play store).
If you think something about signed exchanges lets google track you when they can't now... please examine these assumptions.
Folks who come up with these super complex schemes (google will use javascript loaded into AMP to take over and track you) ignore that google ALREADY tracks them.
And folks who say they don't use any google products (no android / google maps/ play services / chrome etc etc) are often either lying or don't understand how many third parties load google analytics into websites, or load recaptcha bot protection etc.
Just realize that AWS / GCP / Azure have already gobbled up vast swaths of website hosting in all forms and are growing along with some free CDN and DNS providers.
If google said, we want to track people, and brought android, chrome, dns resolvers, network infrastructure, google cloud compute, AI systems, google analytics (which these media sites voluntarily load), google play services etc to bear to target and track you - they probably could.
EVERY single person (including you) who claims they don't use google: if you dig down, they are often lying and do. And if you don't, some of the people you email or interact with do, so indirect profiles can be built.
AMP solved a need for a lot of users: the janky, slow, ad-filled websites that media sites in particular had become. So there is an actual end user reason people like AMP - it's a better user experience in many cases. This is where "AMP is ruining the web" gets hard to support. Most folks don't perceive that they are giving up a lot more in terms of privacy, and they are getting a lot.
> If google said, we want to track people, and brings android, chrome, dns resolvers, network infrastructure, google cloud compute, AI systems, google analytics which these media sites voluntarily, google play services etc to target and track you - they probably could.
Their data governance team wouldn’t allow it. You are basically describing a system they could only introduce with the permission of the government. I don’t care if the government is tracking me honestly, I can’t fight that. I just don’t want Google tracking me for the purpose of influencing my spending habits, emotional state, or perception of the world. That is my main beef with their advertising capabilities.
PS: You work for Google. Do you work on this project?