
> Even Fathom and Plausible analytics struggle with logging activity on adblocked browsers.

The simple solution is to respect the basic wishes of those who do not want to be tracked. This is a "struggle" only because website operators don't want to hear no.



I don't know how I feel about this overall. I think we took some rules from the physical world that we liked and discarded others, and we've ended up with a cognitively dissonant space.

For example, if you walked into my coffee shop, I would be able to lay eyes on you and count your visits for the week. I could also observe where you sit and how long you stay. If I were to better serve you with these data points, by reserving your table before you arrive with your order ready, you'd probably welcome my attention to detail. However, if I were to see that you pulled about x number of watts a month from my outlets, and then suddenly locked up the outlets for a fee, you'd rightfully wish to never be observed again.

So what I'm getting at is, the issues with tracking appear to be with the perverse assholes vs. the benevolent shopkeeps of the tracking.

To wrap up this thought: what's happening now though is a stalker is following us into every store, watching our every move. In physical space, we'd have this person arrested and assigned a restraining order with severe consequences. However, instead of holding those creeps accountable, we've punished the small businesses that just want to serve us.

--

I don't know how I feel about this or really what to do.


The coffee shop analogy falls apart after a few seconds because tracking in real life does not scale the same way that tracking in the digital space scales. If you wanted to track movements in a coffee shop as detailed as you can on websites or applications with heavy tracking, you would need to have a dozen people with clipboards strewn about the place, at which point it would feel justifiably dystopian. The only difference on a website is that the clipboard-bearing surveillers are not as readily apparent.


> you would need to have a dozen people with clipboards strewn about the place

Assuming you live in the US, next time you're in a grocery store, count how many cameras you can spot. Then consider: these cameras could possibly have facial recognition software; these cameras could possibly have object recognition software; these cameras could possibly have software that tracks eye movements to see where people are looking.

Then wonder: do they have cameras in the parking lot? Maybe those cameras can read license plates to know which vehicles are coming and going. Any time I see any sort of news about information that can be retrieved from a photo, I assume that it will be done by many cameras at >1 Hertz in a handful of years.


I'm in Germany, so even if those places have cameras, they need to post a privacy notice describing what they do and how long they retain data. Sure, most people will not read this, but it's out there. I will make a mental note to read these privacy notices more often from now.


I think that's the point. It's the level of detail of tracking online that's the problem. If a website just wants to know someone showed up, that's one thing. If a site wants to know that I specifically showed up, and dig in to find out who I specifically am, and what I'm into so they can target me... that's too much.

Current website tracking is like the coffee shop owner hiring a private investigator to dig into the personal lives of everyone who walks in the door so they can suggest the right coffee and custom cup without having to ask. They could not do that and just let someone pick their own cup... or give them a generic one. I'd like that better. If clipboards in coffee shops are dystopian, so is current web tracking, and we should feel the same about it.

I think Bear strikes a good balance. It lets authors know someone is reading, but it's not keeping profiles on users to target them with manipulative advertising or some kind of curated reading list.


I don't think it's very important that people "can" do this; the only thing that matters is whether they actually "are" doing it.


The coffee shop reserving my place and having my order ready before I arrive sounds nice - but is it not an unnecessary luxury, one that I would not miss had I never even thought of its possibility? I never asked for it, I was ready to stand in line for my order, and the tracking of my behavior resulted in a pleasant surprise, not a feature I was hoping for. If I really wanted my order to be ready when I arrive, then I would provide the information to you, not expect you to observe me to figure it out.

My point is that I don't get why the small businesses should have the right to track me to offer me better services that I never even asked for. Sure, it's nice, but it's not worth deregulating tracking and allowing all the evil corps to track me too.


Here's a better analogy using the coffee shop:

You walk into your favorite coffee shop, order your favorite coffee, every day. But because of privacy reasons the coffeeshop owner is unaware of anything. Doesn't even track inventory, just orders whatever whenever.

One day you walk in and now you can't get your favorite coffee... Because the owner decided to remove that item from the menu. You get mad, "Where's my favorite coffee?" the barista says "owner removed it from menu" and you get even more upset "Why? Don't you know I come in here every day and order the same thing?!"

Nope, because you don't want any amount of tracking whatsoever; knowing any type of habits from visitors is wrong!

But in this scenario you expect that the owner knowing you order that coffee every day would ensure it never leaves the menu, so you actually do like tracking.


As much as I agree with respecting folks' wishes to not be tracked, most of these cases are not about "tracking".

It's usually website hosts just wanting to know how many folks are passing through. If a visitor doesn't even want to contribute to incrementing a private visit counter by +1, then maybe don't bother visiting.


If it was just about a simple count the host could just `wc -l access.log`. Clearly website hosts are not satisfied with that, and so they ignore DO_NOT_TRACK and disrespectfully try to circumvent privacy extensions.


> If it was just about a simple count the host could just `wc -l access.log`

That doesn't really work because a huge amount of traffic is from 1) bots, 2) prefetches and other things that shouldn't be counted, 3) the same person loading the page 5 times, visiting every page on the site, etc. In short, these numbers will be wildly wrong (and in my experience "how wrong" can also differ quite a bit per site and over time, depending on factors that are not very obvious).
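To make the gap concrete, here is a minimal sketch of the kind of filtering a raw line count skips. It assumes the log uses the common "combined" format; the `BOT_HINTS` list and the `count_visits` helper are illustrative names, not anything from the blog post. Prefetches are not handled at all, because a standard access log does not record the request header that would identify them.

```python
import re

# Combined Log Format (assumed):
# ip - user [time] "METHOD path proto" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; real bot detection needs a much longer list.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "wget")


def count_visits(lines):
    """Return (raw_line_count, filtered_count) for an iterable of log lines."""
    lines = list(lines)
    seen = set()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # unparseable line
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # only successful page fetches
        agent = m["agent"].lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue  # self-identified bots and tools
        # Collapse repeat loads of the same page by the same client.
        seen.add((m["ip"], agent, m["path"]))
    return len(lines), len(seen)
```

The first number is what `wc -l` would report; the second is closer to "pages someone actually looked at", and it is still only as good as the bot list.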

What people want is a simple break-down of useful things like which entry pages are used, where people came from (as in: "did my post get on the frontpage of HN?")
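That kind of breakdown needs no per-visitor identity at all. A rough sketch, assuming each hit has already been reduced to a `(path, referer)` pair; the `referrer_breakdown` helper and the `own_host` parameter are hypothetical names used only for illustration:

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_breakdown(hits, own_host):
    """Tally traffic sources and entry pages from (path, referer) pairs."""
    sources = Counter()
    entry_pages = Counter()
    for path, referer in hits:
        # "-" is how access logs usually record a missing Referer header.
        host = urlparse(referer).hostname if referer not in ("", "-") else None
        if host == own_host:
            continue  # internal navigation, not an entry into the site
        sources[host or "(direct)"] += 1
        entry_pages[path] += 1
    return sources, entry_pages
```

The output is two aggregate tallies; nothing in it says who any individual visitor was.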

I don't see how anyone's privacy or anything is violated with that. You can object to that of course. You can also object to people wearing a red shirt or a baseball cap. At some point objections become unreasonable.


Is there a meaningful difference between recording "this IP address made a request on this date" and "this IP address made a request on this date after hovering their cursor over the page body"? How is your suggestion more acceptable than what the blog describes?


Going out of your way to specifically track people who indicate they don't want to be tracked is worse.


My point is that your `wc -l access.log` solution will also track people who send the Do Not Track header unless you specifically prevent it from doing so. In fact, you could implement the exact same system described in the blog post by replacing the Python code run on every request with an aggregation of the access log. So what is the pragmatic difference between the two?
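For illustration, "specifically preventing it" could look like the sketch below in either approach. It assumes each request is available as a dict of headers, which a default access log does not give you (the `DNT` header would have to be added to the log format first); `count_respecting_dnt` is a hypothetical helper name.

```python
def count_respecting_dnt(requests):
    """Count requests, skipping any that sent the header `DNT: 1`."""
    counted = 0
    for headers in requests:
        # Header names are case-insensitive, so normalize before the lookup.
        normalized = {name.lower(): value.strip() for name, value in headers.items()}
        if normalized.get("dnt") == "1":
            continue  # visitor asked not to be tracked
        counted += 1
    return counted
```

Whether this runs per request in application code or as a batch job over the log, the filter is the same.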


Even the GDPR makes this distinction. Access logs (with IP addresses) are fine if you use them for technical purposes (monitor for 404, 500 errors) but if you use access logs for marketing purposes you need users to opt-in, because IP addresses are considered PII by law. And if you don't log IPs then you can't track uniques. Tracking and logging are not the same thing.


May I remind you that your own suggested solution was using access logs for marketing purposes!


Google Cloud and AWS VPS and many hosting services collect and provide this info by default. Chances are most websites do this, including the one you are using now. HN does IP bans, meaning they must access visitor IPs.

Why aren't people starting their protest against the website they're currently using instead of at OP?


We all know that tracking is ubiquitous on the web. This blogpost however discusses technology that specifically helps with tracking people who don't want to be tracked. I responded that an alternative approach is to just not. That's not hypocritical.


Again, you don't answer the question: what's the difference between an image pixel or JavaScript logging that you visited the site vs nginx/apache logging that you visited the site?

Being upset that OP used an image or JavaScript instead of grepping `access.log` makes absolutely no sense. The same data is shown there.


It's rude to tell people how they feel and it's rude to assert a post makes "absolutely no sense" while at the same time demanding a response.

One difference is intent. When you build an analytics system you have an obligation to let people opt out. Access logs serve many legitimate purposes, and yes, they can also be used to track people, but that is not why access logs exist. This difference is also reflected in law. Using access logs for security purposes is always allowed but using that same data for marketing purposes may require an opt-in or disclosure.


> a simple count the host could just `wc -l access.log`

Tons of very simple hosts, like Github Pages, don't give access to detailed info like that (and that's totally fine).


I have, unfortunately, become cynical in my old age. Don't take this the wrong way, but...

<cynical_statement> The purpose of the web is to distribute ads. The "struggle" is with people who think we made this infrastructure so you could share recipes with your grand-mother. </cynical_statement>


No matter how bad the web gets, it can still get worse. Things can always get worse. That's why I'm not a cynic. Even when the fight is hopeless --and I don't believe it is-- delaying the inevitable is still worthwhile.


The infrastructure was put in place for people to freely share information. The ads only came once people started spending their time online and that's where the eyeballs were.

The interstate highway system in the US wasn't built with the intent of advertising to people, it was to move people and goods around (and maybe provide a means to move around the military on the ground when needed). Once there were a lot of eyes flying down the interstate, the billboard was used to advertise to those people.

The same thing happened with the newspaper, magazines, radio, TV, and YouTube. The technology comes first and the ads come with popularity and as a means to keep it low cost. We're seeing that now with Netflix as well. I'm actually a little surprised that books don't come with ads throughout them... maybe the longevity of the medium makes ads impractical.


Hm, I don't think that's the purpose of the web. It's just the most common use case.



