How did people hunting for the origin of this image discover the random niche website preserved by the Internet Archive that this image happened to come from?
Up until last month, the earliest known post/repost of the Backrooms image was an archived 4chan post from 2018, but the photo itself was believed to have been taken in 2012 or earlier, based on the filename. So people have been looking for earlier posts/reposts of the image for years in an effort to uncover its origin.
During the recent successful search, the searchers trawled 4chan archives for early-2010s posts with image metadata similar to that of the 2018 copy of the Backrooms image. These archives were missing the original image files and thumbnails, but still retained some image metadata that could be filtered on (dimensions, image file MD5s, etc.). One of the searchers came up with a list of posts which might originally have included the image file, based on image metadata and context. Another searcher plugged the image MD5 of one of these candidate posts (an April 2011 post recently added to an archive) into other archives, and hit on a March 2011 post whose thumbnail matched the Backrooms image. At this point they'd finally found an earlier copy of the image, after years of searching.
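For anyone curious what that kind of metadata filtering looks like, here is a minimal sketch. It is not the searchers' actual tooling. It assumes each archived post is a dict with `w`, `h`, `time` and `md5` fields (names borrowed from the 4chan API; real archive dumps may differ) and that the hash is stored as a base64-encoded MD5 of the file. The function names are made up for illustration.

```python
import base64
import hashlib


def b64_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest of a file's bytes (assumed storage format)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def candidates(posts, width, height, cutoff_ts):
    """Posts made before cutoff_ts whose attachment has the given dimensions.

    Posts without a timestamp are skipped, since they cannot be dated.
    """
    return [
        p for p in posts
        if p.get("w") == width
        and p.get("h") == height
        and p.get("time", cutoff_ts) < cutoff_ts
    ]


def cross_reference(md5, other_archives):
    """Find posts carrying the same attachment hash in other archives.

    other_archives maps a (hypothetical) archive name to its list of posts.
    """
    return [
        (name, p)
        for name, posts in other_archives.items()
        for p in posts
        if p.get("md5") == md5
    ]
```

The first function narrows the pool by dimensions and date; the second is the "plug the MD5 into other archives" step, which only works because the same file reposted elsewhere keeps the same hash.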
Soon after, one of the searchers plugged the filename of the March 2011 post into Twitter's search, and came up with a post from 2019 which included the physical address and a link to the image source (this Twitter user had already found the source before the search had really begun, but it had gone unremarked at the time). The website had been replaced with blogspam in the interim. A searcher plugged this domain into the Wayback Machine and found a page with the image and a full explanation (it was taken during the renovation of a commercial property in Wisconsin).
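The last step, checking a dead domain against the Wayback Machine, can also be scripted. A rough sketch using the Wayback Machine's public availability endpoint follows. The domain passed in is whatever you are looking up (the actual site is not named here), and the helper names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_url(site: str, timestamp: str) -> str:
    """Build the query URL; timestamp is YYYYMMDD (or longer), e.g. '20110301'."""
    return AVAILABILITY_API + "?" + urlencode({"url": site, "timestamp": timestamp})


def closest_snapshot(response: dict):
    """Return the URL of the closest archived snapshot, or None if there is none."""
    snap = response.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap["url"]
    return None


def lookup(site: str, timestamp: str):
    """Query the live API (needs network access)."""
    with urlopen(availability_url(site, timestamp)) as resp:
        return closest_snapshot(json.load(resp))
```

Asking for a timestamp from before the site was replaced with blogspam is the point: the API returns the capture closest to that date rather than the latest one.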
Should have posted that info in a sane location where people discuss these things instead of tweeting it into the void hoping that someone was looking.
Jokes aside, what’s wrong with having an interest in a (seemingly very) niche subject that doesn’t harm anyone and is a pretty cool investigation task?
Wait so, the Internet Archive was not involved at all in finding the original, but since the image exists in the archive, IA have written a blog post claiming to be crucial to its discovery? Seems like taking credit for something they didn't do to be honest. They didn't even mention the Tweet in the blog post which was essential to finding the image, which makes me think they want that part overlooked.
dang changed a Twitter link to a non-Twitter one in the past month or so because (paraphrasing) the Twitter one may not work, let’s not prefer Twitter, blah blah [NB: the original link worked fine].
Sorry I didn’t archive this but should be in the history.
That said, I know many here get an involuntary eye spasm when Elon is mentioned, but I'm not sure IA does, so I'm not sure I agree with the original accusation.
I try to not mention Xitter because I just want it to go away. It was terrible as a conversation format before, and it's completely unusable now. Oh yeah, the owner is a raging douchecanoe too. But mostly it's just broken.
That wasn’t the context of the question. We are well aware that there are randos out there who don’t like Elon. I was asking why the GP felt the need to share the assumption of the author’s reasons for not mentioning the source, despite having no evidence. You’ve managed to lower the bar even further in the conversation, however.
The original URL of the photo was actually found on Twitter, where it had been posted in 2011. Wayback Machine was used only for the final confirmation. It's curious that this is not mentioned in the article, but I suppose it ruins the narrative.
I read about the whole thing last week at 404media, via waxy blog, which is a much more comprehensive article: https://archive.is/sj846
This seems very unfair. There's absolutely no point, anywhere in that IA blog post, that says "We found it". Anywhere! They're just providing information on the history of the file, from their archives, and detail into why it's an amusing story.
They even say:
Naturally, as news of the Backrooms being “found” travels throughout the world, responses have wildly ranged. For some, this is a proof that “with enough eyeballs, all problems are shallow”.
I think it’s usually pretty customary to attribute the original source if you’re going to write an exposition. They go through enough work explaining exactly the location of the furniture store, you’d think they’d have the courtesy to link to the tweet which actually made the discovery.
As has been pointed out though, IA has a pretty tenuous relationship with the Musk/chan-adjacent parts of the internet, so it doesn’t surprise me they deliberately left those facts out.
From other comments in this thread, it looks like some guy on X solved it in 2018, yet others had no idea, and solved it independently too.
How many others "solved it"? Where is the official record of this? Who really was first? Did someone solve it before 2018? Did more than one party solve it, independently, recently?
Trying to untangle that and be sure isn't simple.
They covered this by saying many eyeballs solved it, and then went on to their key message, highlighting the importance of archives, and showing how their archive proves the solution is correct.
My point is that pointing an accusatory finger at them and saying they are trying to take credit is not fair.
I didn’t read the details of how they did it, but it would be cool if the Internet Archive exposed some kind of image hash / perceptual hash / similarity metric database, so that this task could have been a quick lookup in such a database.
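To make the idea concrete, here is a toy average-hash ("aHash") sketch, one of the simplest perceptual hashes. This is not something the Internet Archive exposes, as far as I know. It assumes the image has already been decoded to a grayscale grid (a list of rows of 0-255 values) at least 8x8 in size, so no imaging library is needed.

```python
def average_hash(pixels, size=8):
    """64-bit average hash of a grayscale image given as a list of rows.

    The image is reduced to a size x size grid of block averages, and each
    cell contributes one bit: 1 if brighter than the grid's mean, else 0.
    """
    h, w = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; small distances suggest similar images."""
    return bin(a ^ b).count("1")
```

A lookup service would store one such hash per archived image and return entries within a small Hamming distance of the query, which tolerates resizing and recompression in a way an exact MD5 match does not.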
It wasn't needed for this, but it would be great to have full-text (and binary!) search and reverse image search of the Internet Archive. Often you know something about what you are searching for but have no clue about the location.
Of course this is not going to happen with the current resources of the IA and if it did it would probably just result in them getting hammered with DMCA requests and other legal demands for content that the "owners" didn't even know was on the archive.