
Just as any "plain blob storage" eventually evolves a hierarchical filesystem (but with silly quirks!) on top of it.




AFAIK, at Google it was the other way around -- their main blob storage (BigTable) is built on top of GFS (a distributed filesystem).

You have that backwards. GFS was replaced ca. 2010 by Colossus, which largely functions as blob storage with append-only semantics for modification. BigTable is a KV store, and the row size limits (256MB) make it unsuitable for blob storage. GCS is built on top of Spanner (metadata, small files) and Colossus (bulk data storage).

But that's beside the point. When people say "RDBMS" or "filesystem" they mean the full suite of SQL queries and POSIX semantics, neither of which you get with KV stores like BigTable or distributed storage like Colossus.

The simplest example of POSIX semantics that are rapidly discarded is the "fast folder move" operation. This is difficult or even impossible to achieve when each key is the full path of the file, because every key under the folder has to be rewritten, and it is generally easier to implement with hierarchical directory entries. However, many applications are absolutely fine with the semantics of "write entire file, read file, delete file", which enables huge simplifications and optimizations!
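
To make the folder-move point concrete, here is a rough sketch in Python. Both classes (FlatStore, TreeStore) and all their method names are made up for illustration; they are toy in-memory models, not how Colossus, GCS, or S3 actually work. The idea is only to show that with full-path keys a move touches every key under the prefix, while with directory entries it relinks a single entry.

```python
# Toy in-memory models; every class and method name here is hypothetical.

class FlatStore:
    """Keys are full paths, roughly like an object store."""

    def __init__(self):
        self.objects = {}  # full path -> bytes

    def put(self, path, data):
        self.objects[path] = data

    def move_prefix(self, src, dst):
        """Rewrite every key under src; return how many keys were rewritten."""
        moved = [k for k in self.objects if k.startswith(src)]
        for k in moved:
            self.objects[dst + k[len(src):]] = self.objects.pop(k)
        return len(moved)


class TreeStore:
    """Directories are nested dicts mapping name -> child entry."""

    def __init__(self):
        self.root = {}

    def _walk(self, parts, create=False):
        node = self.root
        for p in parts:
            node = node.setdefault(p, {}) if create else node[p]
        return node

    def put(self, path, data):
        *dirs, name = path.strip("/").split("/")
        self._walk(dirs, create=True)[name] = data

    def get(self, path):
        *dirs, name = path.strip("/").split("/")
        return self._walk(dirs)[name]

    def move_dir(self, src, dst):
        """Relink one directory entry; return how many entries were touched."""
        *sdirs, sname = src.strip("/").split("/")
        *ddirs, dname = dst.strip("/").split("/")
        subtree = self._walk(sdirs).pop(sname)
        self._walk(ddirs, create=True)[dname] = subtree
        return 1


flat, tree = FlatStore(), TreeStore()
for i in range(1000):
    flat.put(f"/a/b/file{i}", b"x")
    tree.put(f"/a/b/file{i}", b"x")

print(flat.move_prefix("/a/b/", "/a/c/"))  # work grows with the number of files
print(tree.move_dir("/a/b", "/a/c"))       # one entry relinked, regardless of size
```

In a real distributed store the flat version is worse than this suggests, since the per-key rewrites would presumably not be atomic as a group, so readers could observe a half-moved folder.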


Thank you. Yes, my knowledge was very outdated, from way before Spanner.

Spanner behind GCS actually explains how public Google Cloud always had strongly consistent object listing, while S3 only implemented it around 2020. I always suspected that there must be some very hard piece to implement that AWS didn't have until 2020. It makes sense now that that piece was Spanner.



