Some number I found online, while trying to multiply the 30 trillion human cells with the data storage of DNA per cell:
"one gram of dried DNA can store 455 exabytes of data"
Seems like a pretty sweet use case to me!
I definitely do lack storage, by the way. Say I want to download the Common Crawl data set, 380 TiB. For redundancy I'd need multiple copies of the data too. That's a lot of disks to have at home. "18TiB ought to be enough for everyone" really doesn't cut it.
> one gram of dried DNA can store 455 exabytes of data
Yes, and half a gram of hydrogen could produce ~500 megawatts of power in a fusion reactor. However, that theoretical value will remain irrelevant as long as we cannot build a practically useful fusion reactor. And even if we could build one, it would still have to compete with all other forms of power production on scalability, reliability, efficiency and cost.
A very high theoretical number that seems really impressive isn't a use case.
So, with that being said: how long does it take to write these 455 exabytes? How long does it take to read them? How error-prone are both processes? And how much does it cost to write and read them?
> "18TiB ought to be enough for everyone" really doesn't cut it.
Pretty sure I never said that.
Also pretty sure Common Crawl can be compressed. Even assuming only a 2:1 compression ratio, it fits comfortably on 11 LTO-9 tapes (18 TB native each). A quick Google search turned up tape prices of about $110-140 per LTO-9. Let's say ~$150 per tape; that means the whole thing fits on $1,650 worth of storage. About 5,000 bucks with 2 backups included. Double that for uncompressed storage.
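For anyone who wants to check the tape arithmetic, here is a rough sketch. It assumes 18 TB native capacity per LTO-9, the rounded-up $150 per tape from above, and whole tapes only. The function names are made up for illustration. Note that the count of 11 tapes comes from treating the 380 TiB as 380 decimal TB; with a strict TiB-to-TB conversion it comes out at 12, which doesn't change the ballpark.

```python
import math

TB = 10**12   # decimal terabyte, the unit tape capacities are quoted in
TIB = 2**40   # binary tebibyte


def tapes_needed(dataset_bytes, compression_ratio=2.0, tape_bytes=18 * TB):
    """Whole tapes needed for one copy of the data set (assumed 18 TB per tape)."""
    return math.ceil(dataset_bytes / compression_ratio / tape_bytes)


def total_cost_usd(dataset_bytes, copies=3, usd_per_tape=150, compression_ratio=2.0):
    """Media cost for `copies` full copies (1 primary + 2 backups by default)."""
    return copies * usd_per_tape * tapes_needed(dataset_bytes, compression_ratio)


if __name__ == "__main__":
    for label, size in (("380 TB", 380 * TB), ("380 TiB", 380 * TIB)):
        for ratio in (2.0, 1.0):
            n = tapes_needed(size, ratio)
            cost = total_cost_usd(size, compression_ratio=ratio)
            print(f"{label} at {ratio:g}:1 -> {n} tapes per copy, ${cost:,} for 3 copies")
```

Either way the total stays in the range of a few thousand dollars compressed and roughly ten thousand uncompressed.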
> These days, it costs $600 to sequence a complete genome which contains around 200 gigabytes of data or about $3 per gig. Today, magnetic tape technology offers the lowest purchase price of raw storage capacity at around two cents per gigabyte.
So just reading the 380 TiB back from uncompressed storage ONCE would cost ~1,140,000 dollars at $3 per gigabyte.
And that's just for reading. At a price differential that is measured in multiple orders of magnitude, a technology had better offer some REALLY good, REALLY tangible advantages to compete.
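As a quick sanity check on those figures, here is a small sketch using only the two quoted prices ($3 per GB to sequence, about $0.02 per GB for tape) and the 380 TiB rounded to 380,000 GB. The names are illustrative, and the comparison is rough: like the quote, it sets a DNA read cost against a tape purchase price.

```python
import math

SIZE_GB = 380_000            # 380 TiB treated as roughly 380,000 GB (assumption)
DNA_READ_USD_PER_GB = 3.00   # quoted: ~$600 per ~200 GB genome sequenced
TAPE_USD_PER_GB = 0.02       # quoted: ~two cents per GB purchase price


def cost_usd(size_gb, usd_per_gb):
    """Linear cost model: no volume discounts, no overhead."""
    return size_gb * usd_per_gb


dna_read_cost = cost_usd(SIZE_GB, DNA_READ_USD_PER_GB)
tape_media_cost = cost_usd(SIZE_GB, TAPE_USD_PER_GB)
ratio = DNA_READ_USD_PER_GB / TAPE_USD_PER_GB
orders_of_magnitude = math.log10(ratio)

if __name__ == "__main__":
    print(f"Read once from DNA: ${dna_read_cost:,.0f}")
    print(f"Buy tape capacity:  ${tape_media_cost:,.0f}")
    print(f"Ratio: ~{ratio:.0f}x (~{orders_of_magnitude:.1f} orders of magnitude)")
```

On these numbers a single read costs about 150 times what the tape capacity costs to buy, so a bit over two orders of magnitude.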
Of course I wouldn't want to store my data in there today. I wouldn't even trust that I'd get it back reliably, because DNA reading comes with a relatively high error rate for storage purposes (though error correction can mitigate that). But it would be cool if the technology progresses, and that goes for all storage technology: disks, magnetic tapes, and new alternatives. Whether DNA is viable in the end or not, I don't know. I do know that tech has always been progressing, that new alternatives are sometimes found, and that I do see a use for more storage.
But an argument about whether DNA will be a viable option in the future would have to state what the technical problem with DNA is, even given future tech.
That it's more expensive today, or that there's no need for more storage today, is not really an argument against it.
I do not intend to be arguing for snake oil or anything here though. If "DNA storage" is in the same category as "perpetual motion machines" and "cars that run on tap water" then count me out.
I don't even know how our comments ended up arguing against each other. The only thing I really didn't agree with in the original comment was "Because, storage isn't something we lack", because I do find it lacking, both at home and in the cloud.
"one gram of dried DNA can store 455 exabytes of data"
Seems like a pretty sweet use case to me!
I definitely do lack storage by the way. Say I want to download the common crawl data set, 380 TiB. And for redundancy I'd need multiple copies of the data too. That's a lot of disks for in the home. "18TiB ought to be enough for everyone" really doensn't cut it.