Rev="canonical" tags to cure short url problems (shiflett.org)
19 points by davedevelopment on April 12, 2009 | 13 comments


I think URL shortening is just a really bad idea in general. Certainly giving normal people the ability to track who clicks on their links is a good idea, but you don't need to obfuscate the URL to do that. You could create a nice tracking service using URLs like: http://digg.com/www.mysite.com/interesting-blog-post. The real cause of obfuscation is Twitter's character limit, right? What does Twitter gain by including anchor text in their 140 characters?
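To make the idea concrete, here is a minimal sketch of how such a tracking service might recover the destination from a URL of that shape. The function name is hypothetical, and it assumes the destination is always plain http, since the example URL does not encode a scheme.

```python
from urllib.parse import urlsplit


def target_from_tracking_url(tracking_url: str) -> str:
    """Recover the destination embedded in the path of a tracking URL."""
    parts = urlsplit(tracking_url)
    # Everything after the leading slash is the destination host and path.
    destination = parts.path.lstrip("/")
    if not destination:
        raise ValueError("no destination embedded in URL")
    # Assumption: destinations are plain http; the scheme is not encoded.
    url = "http://" + destination
    if parts.query:
        url += "?" + parts.query
    return url
```

A real service would log the click before redirecting; the point is only that the destination stays readable in the link itself.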


The ability to follow links embedded in SMS messages. The whole thing is really a problem with the SMS protocol.


As usual, a lot of people are jumping in and engineering a solution without understanding the problem.

- Twitter links are almost never clicked on outside of the first several hours. 301 redirects will cause search engines to use the canonical URL as well. This Link Rot Apocalypse is largely a bunch of nonsense.

- People use short URL services to get analytics on the data they're sharing, not just to shorten the URL. People like their click stats.
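For reference, the 301 behaviour mentioned in the first point can be sketched with Python's standard library. The mapping and the names SHORT_LINKS, redirect_for and ShortLinkHandler are made up for illustration; this is not how any particular shortener is implemented.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping from short paths to long URLs.
SHORT_LINKS = {"/abc": "http://www.example.com/2009/04/some-long-post-title"}


def redirect_for(path):
    """Return (status, location) for a short path; location is None on a miss."""
    target = SHORT_LINKS.get(path)
    if target is None:
        return 404, None
    # 301 marks the move as permanent, which is the signal that lets
    # crawlers credit the long URL rather than the short one.
    return 301, target


class ShortLinkHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = redirect_for(self.path)
        self.send_response(status)
        if location is not None:
            self.send_header("Location", location)
        self.end_headers()
```

Passing ShortLinkHandler to http.server.HTTPServer would serve it locally. A click counter could go inside redirect_for, which is where the analytics in the second point would come from.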


I agree on the Link Rot Apocalypse. I've seen services like URL Tea vanish, to be sure, but bigger services like TinyURL are rather more stable. In fact, I use TinyURL a lot to link to news stories, and many's the time I've followed a shortened, descriptive link back to news sites to see that the sacred "canonical" link now 404s.

People are overstating both the ephemerality of shortener sites and the longevity of original URIs.


"almost never clicked on outside of the first several hours"

Got data to back that up?

The click stats are really great for addictiveness.

I think rev="canonical" is not a great solution either.

BTW, I meant Link Rot Apocalypse facetiously.


In fact I do have data to back that up but I cannot share.


I may be a bit dim here, but if you are doing the URL redirection yourself because you don't want someone else hijacking your content, and long URLs are not usable, then why not just create short URLs to begin with?

You wouldn't need anything special, and people wouldn't need to shorten your URL.


It's not just for shortening URLs; it's also for deciding where your search engine scores go. To take an example from a Googler at a recent conference: say you have an online store that has a bunch of categories. Someone wants to buy a red bag, so they go through bags, then red, and pick item 5: /bags/red/5

Someone else goes to red things, then bags, then item 5: /red/bags/5

You end up with duplicate content issues and the possibility of having your search engine score split between the two pages.

With canonical you can say on both pages that /items/5 is the real URL, and that's the one that will most likely get the search score and be indexed.
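As a rough sketch of the store example, both browse paths could emit the same rel="canonical" tag in their page head. The CANONICAL_FOR table and the function name are hypothetical; a real store would derive the item URL from its own routing.

```python
from html import escape

# Hypothetical routes: both browse paths lead to the same item page.
CANONICAL_FOR = {
    "/bags/red/5": "/items/5",
    "/red/bags/5": "/items/5",
}


def canonical_link_tag(request_path: str) -> str:
    """Build the <link> tag to place in the <head> of the requested page."""
    # Paths without a known canonical form are treated as canonical themselves.
    canonical = CANONICAL_FOR.get(request_path, request_path)
    return '<link rel="canonical" href="%s">' % escape(canonical, quote=True)
```

Either path then renders the same tag pointing at /items/5, so the two pages declare a single preferred URL.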


All you have to do is set up a rule to put the screening terms in a certain order. We did it on Dawdle in like a day when we realized our in-stock item screening presented the issue: http://www.dawdle.com/search.php/

I was pretty convinced that canonical was a solution in search of a problem, but at least joshu's opening salvo, the DiggBar brouhaha, and the resultant conversation have shown one use for it.
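The fixed-order rule for screening terms mentioned above might look something like the sketch below. The FACET_RANK vocabulary and the function name are invented for illustration and are not Dawdle's actual rule.

```python
# Hypothetical facet vocabulary: a lower rank sorts first (category, then colour).
FACET_RANK = {"bags": 0, "shoes": 0, "red": 1, "blue": 1}


def normalize_path(path: str) -> str:
    """Rewrite a browse path so its known facets appear in one fixed order."""
    segments = [s for s in path.split("/") if s]
    facets = [s for s in segments if s in FACET_RANK]
    rest = [s for s in segments if s not in FACET_RANK]
    # Sort by rank, then alphabetically, so every ordering maps to one path.
    facets.sort(key=lambda s: (FACET_RANK[s], s))
    # Known facets come first; everything else (e.g. the item id) keeps its order.
    return "/" + "/".join(facets + rest)
```

A request whose path differs from its normalized form could then be redirected to the normalized one, so only one ordering is ever served.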


It leads me to wonder: if Google is as smart as most people think they are, why don't they make an attempt at working this out, or at least define a way to deal with it themselves?


They kind of do, but this gives site owners a way to tell Google what they want stored about their sites. You can't figure that out with an algorithm.


I'm all for creating a better solution for short URLs, but rev=canonical is not a good option.

The proposal mixes alternative urls with the idea of a canonical url. Besides being damn confusing (remembering the difference between rel and rev), it opens up security issues. Moreover, the 'rev' attribute is not included in HTML 5, making this proposal obsolete before it even starts.

I've written up a longer explanation of these problems on my blog for anyone interested:

http://cubiclemuses.com/cm/articles/2009/04/12/short-and-can...
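To illustrate the rel/rev distinction: rel="canonical" says the linked URL is the canonical form of the current page, while rev="canonical" says the current page is the canonical form of the linked URL. The sketch below collects both kinds of link from a page using Python's standard HTML parser. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collect hrefs of <link> tags carrying rel or rev "canonical"."""

    def __init__(self):
        super().__init__()
        self.rel_canonical = []  # URLs this page names as its canonical form
        self.rev_canonical = []  # URLs this page claims to be canonical for

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel and rev hold space-separated tokens, matched case-insensitively.
        if "canonical" in (attr.get("rel") or "").lower().split():
            self.rel_canonical.append(href)
        if "canonical" in (attr.get("rev") or "").lower().split():
            self.rev_canonical.append(href)


def find_canonical_links(html_text):
    """Return (rel_canonical_hrefs, rev_canonical_hrefs) for a page."""
    parser = CanonicalLinkParser()
    parser.feed(html_text)
    return parser.rel_canonical, parser.rev_canonical
```

Note that the rev list is only the page's own claim; nothing in the markup proves that the short URL really belongs to it.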


Ultimately, if you want to discourage shortener services, you need to make short, meaningful URIs.



