This isn't a fork of PostgreSQL, it's a completely bespoke database solution. Its only tie to PostgreSQL is that we've chosen to appear as a PostgreSQL server to clients. If a user didn't use any versioning features, then the goal is that they should be unable to tell that they're not on an actual PostgreSQL server.
The versioning features are an important distinction though. Dolt (production ready, MySQL protocol) and DoltgreSQL (pre-alpha, PostgreSQL protocol) are built specifically to address the lack of versioning support in databases, and gaining these versioning features is as easy as swapping out the database you are using for Dolt or DoltgreSQL (once it's finished). MySQL and PostgreSQL are written in C/C++, while Dolt and DoltgreSQL are written in Go, so there is no shared code. The storage format is implemented using prolly trees (https://docs.dolthub.com/architecture/storage-engine/prolly-...), which are based on Merkle trees (used by Git and Bitcoin), so there is no overlap with any existing database solutions.
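As a rough illustration of the Merkle-tree idea mentioned above, here is a generic sketch. It is not Dolt's prolly-tree code; the function names, the SHA-256 choice, and the row encoding are all assumptions made for the example. The point it shows is that every node's hash is derived from its children, so identical data always produces the same root hash and a single changed row produces a different one.

```python
import hashlib
from typing import List


def node_hash(data: bytes) -> bytes:
    """Hash one node's payload (SHA-256 is an arbitrary choice for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Compute a root hash over the leaves by hashing adjacent pairs upward."""
    if not leaves:
        return node_hash(b"")
    # Prefix bytes keep leaf hashes and interior hashes in separate domains.
    level = [node_hash(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        # An unpaired last node is hashed on its own at the next level.
        level = [
            node_hash(b"\x01" + b"".join(level[i:i + 2]))
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical row encodings, purely for illustration.
v1 = [b"id=1,name=alice", b"id=2,name=bob", b"id=3,name=carol"]
v2 = [b"id=1,name=alice", b"id=2,name=bobby", b"id=3,name=carol"]

print(merkle_root(v1) == merkle_root(list(v1)))  # True: same content, same root
print(merkle_root(v1) == merkle_root(v2))        # False: one row changed
```

Comparing two root hashes is enough to tell whether two versions of the data are identical, which is the property that makes this family of structures attractive for diffing and versioning.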
Hmm, when choosing a database solution for a new project, I'm not selecting a SQL dialect (like PostgreSQL's SQL), but stability and ecosystem. So having this as an extension to PostgreSQL, with the possibility of combining it with other extensions, would be great, but a reimplementation is a no-go here.
And I cannot use it for existing projects, because, again, extensions, and I surely don't want to find out how your implementation differs from mainstream Postgres...
Knowing that extensions are very important to you is great feedback for us. As we hear more about what users' requirements are, it helps us better plan for the future.
Regarding any differences from mainstream Postgres, you can look to how we've handled Dolt, which is production-ready. It targets MySQL, just as DoltgreSQL targets PostgreSQL, and it recently achieved 99.99% correctness according to a set of roughly 6 million tests (https://www.dolthub.com/blog/2023-10-11-four-9s-correctness/). This test is not a definitive stance that we are exactly 99.99% the same as MySQL, but it's a good general guide to how we approach our compatibility, and how serious we are in that regard.
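For a sense of scale, the figures above can be turned into a back-of-the-envelope count. This assumes exactly 6,000,000 tests and exactly 99.99% passing, both of which are approximations of the numbers quoted in this comment.

```python
# Assumed round figures; the comment only says "roughly 6 million" and "99.99%".
total_tests = 6_000_000
failing_per_ten_thousand = 1  # 100% - 99.99% = 0.01% = 1 in 10,000

# Integer arithmetic avoids floating-point rounding noise.
failing_tests = total_tests * failing_per_ten_thousand // 10_000
print(failing_tests)  # 600
```

So under those assumptions the remaining gap is on the order of a few hundred test cases, not a few.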
Sadly though, the full versioning capabilities would not work as an extension to Postgres. We looked into it before we settled on our current approach. I talk a bit about it in the blog post as well. To truly allow versioning in the same capacity that Git does for source code, it required us to either fork Postgres and spend years reimplementing all of the work that we've done in Dolt just to get to where we are today, or choose the path that gets something out quickly, and allows us to have the very conversation that we're having right now.
That was actually the very reason for deciding to host an announcement that we're working on it. Many people have said that they'd like Dolt but for Postgres, however they've not said whether they need the Postgres binary specifically, the ecosystem, the syntax/wire protocol, etc. This announcement gives us the opportunity to receive that feedback.
Where are the 0.01% differences? When I'm trying to commit transactions? During select? Just not supporting some esoteric stored function syntax?
And most importantly, how does Dolt compare under heavy load, at the limits of server memory or bandwidth or CPU or disk throughput? You can assume an SSD and a multicore processor for purposes of answering.
Thank you very much. You are competing in a field where trust is extremely difficult to acquire, and for a lead dev, choosing Dolt[greSQL] could be career-ending. Nobody ever got fired for choosing the incumbent, as variations on the saying go.
The sqllogictests are better used as a judgement of how accurate our results are to MySQL's results given a very large range of statements (around 6 million tests). They are not, however, indicative of the entire feature set. That's to say, there are a few things that are not yet supported in Dolt, however they are the more esoteric features. Things like transactions, etc. are what I would consider "core" SQL, and we are very keen on making sure those work exactly as you'd expect.
There are two key points regarding performance that I'd like to mention: Dolt uses more disk than you might expect, and we also use more memory than MySQL. The disk usage comes from optimizing for speed at the expense of disk space, which accumulates via temporary storage. We've decided that is a fair trade-off considering disk is very cheap (and we're working on making this tunable so users can decide on speed vs disk efficiency). Memory usage is a bit more complex, and I'm not the employee to comment too much about it, but both of these issues are being worked on to reduce their impact. With that in mind, our performance is comparable to MySQL as long as the machine limits are not being reached, however I'd expect us to be a bit slower once those limits are reached, simply due to the extra complexity that we're managing.
Lastly, trust is something that can only be built over time. In 10 years, I'm sure that there will be no doubt of our stability, and at that time I can see Dolt and DoltgreSQL becoming the de facto databases used for relational storage. Of course, I may be a bit biased :)
Dolt was built with MySQL in mind, and we're creating DoltgreSQL with PostgreSQL in mind. We've gotten interest in a "Dolt for Postgres", and so we're finally starting development on that exact thing.
Is there any compatibility between the underlying storage systems of Dolt and DoltgreSQL? Yes, the interface is slightly different, but is the storage (partially) compatible? Why aren't the two SQL dialects different interfaces to the same underlying storage?
The reason that I am asking is because I have a hard time trusting a newcomer to the database competition. And for purposes of discussion, DoltgreSQL will be a newcomer for the first ten years of its existence. Sharing the underlying storage model with Dolt (still a newcomer, sorry) would greatly increase my confidence in it.
You can think of Dolt as having 4 layers to it. From the top-most interface to the bottom:
Vitess (Handles Wire Protocol)
go-mysql-server (GMS, Implementation of MySQL featureset)
Dolt GMS interface
Storage Engine
DoltgreSQL, right now, shares the storage engine, interface, and a portion of GMS. This is primarily to get something up and running so that we're able to further iterate on a "working" product (i.e. something we can set up testing for). I envision that GMS will split into three, with a MySQL & PostgreSQL core interacting with a base SQL core. Dolt's GMS interface would then adapt to the base SQL core interface. The storage engine is more-or-less set in stone.
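To make the envisioned split a little more concrete, here is a toy sketch of two dialect front ends sitting on one base SQL core and one storage engine. Every class and method name here is hypothetical and invented for illustration; none of it reflects the actual GMS or Dolt interfaces. The only dialect detail used is identifier quoting (backticks in MySQL, double quotes in PostgreSQL).

```python
from typing import Dict, List, Tuple


class StorageEngine:
    """Hypothetical stand-in for the shared storage layer (in-memory only)."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[Tuple]] = {}

    def insert(self, table: str, row: Tuple) -> None:
        self.tables.setdefault(table, []).append(row)

    def scan(self, table: str) -> List[Tuple]:
        return list(self.tables.get(table, []))


class BaseSQLCore:
    """Hypothetical dialect-neutral core that both front ends delegate to."""

    def __init__(self, storage: StorageEngine) -> None:
        self.storage = storage

    def insert_row(self, table: str, row: Tuple) -> None:
        self.storage.insert(table, row)

    def select_all(self, table: str) -> List[Tuple]:
        return self.storage.scan(table)


class MySQLFrontEnd:
    """Dialect-specific behaviour lives here; data handling goes to the base core."""

    def __init__(self, base: BaseSQLCore) -> None:
        self.base = base

    def quote_identifier(self, name: str) -> str:
        return "`" + name + "`"


class PostgreSQLFrontEnd:
    def __init__(self, base: BaseSQLCore) -> None:
        self.base = base

    def quote_identifier(self, name: str) -> str:
        return '"' + name + '"'


storage = StorageEngine()
base = BaseSQLCore(storage)
mysql, postgres = MySQLFrontEnd(base), PostgreSQLFrontEnd(base)

# A row written through one front end is visible through the other,
# because both share the same base core and storage engine.
mysql.base.insert_row("users", (1, "alice"))
print(postgres.base.select_all("users"))   # [(1, 'alice')]
print(mysql.quote_identifier("users"))     # `users`
print(postgres.quote_identifier("users"))  # "users"
```

The design choice being illustrated is simply that dialect differences stay in thin front ends, while anything touching data goes through the one shared path.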