It's not exactly what the post is about, but transactional DDL is a feature I don't actually know how to live without. The idea that schema changes could fail somewhere in the middle is terrifying, and I'm quite happy Postgres just solves this.
Can you recommend any documentation on doing transactional DDL in production?
I would expect major caveats with such a feature, like long-lived table locks taking down the whole application for an hour when you don't know exactly what you're doing. For example, concurrent index creation is harmless on its own, but combined with another fast-but-locking schema change you could get a long-lived lock, taking down the application until the index creation finishes (I had a similar incident in a different database).
I would suggest taking a look at strong migrations[1]. It's a Rails project, but the readme does a great job explaining what it checks for and what safe alternatives to use instead. I still link to their explanations in PRs for non-Rails projects. The examples unfortunately use the Rails migration DSL instead of plain SQL. If anyone knows of a SQL-first reference, I'd love to hear about it.
> Can you recommend any documentation on doing transactional DDL in production
MS SQL fellow here, rather than postgres, but the concepts should be the same. No docs I can immediately point you at, but some feeling for how I [don't] do things:
I still wouldn't perform most schema updates in live production, I'd wait for a maintenance window.
The key benefit from my PoV is that the changes are transactional: either they all work, or they roll back and you are back where you started instead of at some mysterious mid-way point. Adding indexes is fine, as long as you have spare IO capacity for the process to chew through if performed on a large table, and many index alterations can be done online too. Adding/modifying procs, views, and similar objects can be similarly undisruptive if properly wrapped in transactions (so active sessions never see a half-updated set of parts), but I'd not make actual table changes “properly live” in production.
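A minimal sketch of that pattern in Postgres-flavored SQL (table and view names here are hypothetical):

```sql
-- All-or-nothing schema change: if any statement fails,
-- the whole batch rolls back and nothing is half-applied.
BEGIN;

ALTER TABLE orders ADD COLUMN shipped_at timestamptz;

CREATE VIEW recent_orders AS
    SELECT id, customer_id, shipped_at
    FROM orders
    WHERE created_at > now() - interval '30 days';

COMMIT;  -- only now do other sessions see the new column and view
```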
> I would expect major caveats in such a feature, like long lived table locks
In DBs with transactional DDL, some table operations are practically lock-free or just happen so fast that they might as well be, such as (usually) adding a NULLable column or making an existing one NULLable. Some other operations are, despite being long-winded, sometimes possible to perform online: the new structure is added, then a final sync is done for changes made while that happened, then it's switched over to, and the old parts are cleaned up now the new ones are in use. But I'd never risk it unless I really, really had to make the change ASAP and absolutely couldn't arrange a maintenance break in good time.
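In Postgres terms, for example (a hedged sketch with hypothetical table/column names; exact behavior varies by version, and since Postgres 11 even ADD COLUMN with a constant DEFAULT avoids a table rewrite):

```sql
-- Effectively instant: brief exclusive lock, no table rewrite.
ALTER TABLE accounts ADD COLUMN note text;
ALTER TABLE accounts ALTER COLUMN created_by DROP NOT NULL;

-- Rewrites every row while holding an exclusive lock; avoid on a
-- busy table outside a maintenance window.
ALTER TABLE accounts ALTER COLUMN balance TYPE numeric(12,2);
```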
> For example concurrent index creation is harmless on its own, but combined with another fast but locking schema change
There should only ever be your process making schema changes; the application and other users shouldn't be making any at all, so in theory you don't need to worry about two changes competing for locks or causing lock escalation like that. Several concurrent online index creations/modifications are fine, as long as you can afford the extra IO load that will impose, particularly if the objects being touched are on different storage (or otherwise have separate IO quotas) so the index changes won't compete with each other for IO. But not table changes and similar: keep those to one at a time, while nothing else is happening.
In Node projects, in general I don't recommend using knex.js for data access, but their migrations facility is great.
By default it runs all of your DDL within a transaction, but in some cases where you can't run in a transaction (like adding a value to an enum type) it makes it easy to disable it: https://knexjs.org/#Migrations-API-transactions
Having worked with Oracle it is indeed terrifying. You can make it safe-er but it’s a lot clunkier and you have to think it through. With Postgres it mostly just works.
> transactional DDL is a feature I don't actually know how to live without
Same with MS SQL Server. Make several schema changes, and be assured that if things go base-over-apex part way through you are going to be returned to the last good state.
I thought an infrastructure engineer would be more interested in the provisioning process of setting up and configuring authentication, backups, clusters, monitoring, replication and such. The features listed in the article cater more to a database administrator.
That's a good call-out - I suppose infra eng can mean different things. This was a bit more operationally focused, I agree. I was thinking of talking about HA and replication, but also wanted something relatively short; perhaps another post :). Thanks for reading.
Great article! For anybody interested in this topic, I've been working on a schema migration tool which automates zero-downtime migrations using many of the techniques mentioned: https://github.com/fabianlindfors/reshape. It also uses some other incredible Postgres features, like updatable views and schemas.
Promising and innovative tool indeed. I wonder why the big PG users and players are not investing in this (potentially) game-changing tool (AWS, Zalando, GitLab, Yugabyte, EDB, to name a few). Great work!!
Aside from the replication issue, Postgres is just a better SQL database for actually managing your data than MySQL. It has proper typing, proper validations, standard SQL and its documentation is second to none for any software product (FOSS or COTS) that I've used over the last 20 years.
Replication and failover are actually hard problems to solve and the logical replication now gives Postgres the flexibility needed to avoid the WAL-shipping (and the Oracle et al equivalents).
In particular, RDBMS with ACID require 2-phase commits to have true replication, which then requires a transaction controller, which then needs to be HA and have failover. Lots of people cut corners on this and only discover their loss of data when a failover event happens.
PostGIS is truly amazing and I'm not sure any other RDBMS has a similarly extended GIS capability.
I recently discovered that Snowflake has geospatial support, which is handy. I'm not massively familiar with PostGIS but at a glance, I can accomplish a bunch of spatial queries in Snowflake (whereas I figured I'd need to spin up Postgres to make this project work).
To be fair to Snowflake, they make a great product, even if their pricing model is bananas and they want to be Oracle when they grow up.
I assume that it's a reference to synchronous replication, where transactions have to be successfully propagated to standby servers before they're considered committed by the primary. Postgres offers low-level features (PREPARE TRANSACTION and COMMIT PREPARED) that would be needed to implement it, but not a complete out-of-the-box solution. The Postgres documentation on HA https://www.postgresql.org/docs/current/high-availability.ht... is also excellent and does a very nice job of clarifying potential solutions in this general area.
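A minimal sketch of those two-phase commit primitives (table and transaction names are hypothetical; this requires max_prepared_transactions > 0):

```sql
-- Phase 1: persist the transaction so it survives a crash,
-- but don't make its effects visible yet.
BEGIN;
INSERT INTO payments (order_id, amount) VALUES (42, 19.99);
PREPARE TRANSACTION 'txn_payment_42';

-- Phase 2: once every participant has prepared, finalize.
COMMIT PREPARED 'txn_payment_42';
-- (or ROLLBACK PREPARED 'txn_payment_42' to abort)
```

An external transaction coordinator is still needed to drive both phases across nodes; Postgres only provides the building blocks.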
Indeed in ~10 years of administering postgres dbs from 0 - 10TB the only issues that have approached "nightmare" status were from punting vacuum jobs on large/fast growing tables until it got to be too late.
> When an infrastructure engineer tells you they prefer MySQL. It’s probably because for them it’s simpler/easier to operate it. Backups, replication, failover, upgrades. Accomplishing what is most important to them.
> When product engineers prefer postgres, it’s usually because of something like postgis, jsonb/hstore. Some feature they can use in the app to build something faster.
> I hope this helps in explaining why you often see the infra orgs of many large/high scale companies choose mysql.
Isn't that missing a very important reason, which is that
(a) MySQL was better known in the period 2000-2010, which is when many of the large/high-scale companies started to create their infrastructure, and
Very good points. I'd add that PostgreSQL didn't even get built-in streaming replication until late 2010 so if you were evaluating anything for a high-availability setup before 2011, you'd be looking at MySQL. As you note, database choices tend to be sticky.
PostgreSQL's replication is still a bit of a pain, and MySQL has things like Vitess, which offers great tools for scaling out a database (including sharding, replication, etc).
PostgreSQL has improved a lot from the 2000-2010 time period, but MySQL still has some good stuff going for it.
I would add to this list how Postgres makes solving problems in side projects fun. Write a little pl/pgsql rather than building a trigger or service worker system. Play with all the postgis stuff when you're messing around with maps. I've never felt frustrated when toying around and having Postgres playing support.
PL/pgSQL is great to cut down on network round-trips for relatively localized changes, though it can be surprising for folks used to keeping all the work in the app layer. DBs can also become CPU bound bottlenecks.
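For instance, a hedged sketch of the kind of logic you can push into the database instead of a worker process (table and function names are hypothetical):

```sql
-- Keep a denormalized counter up to date without an app-side worker.
CREATE FUNCTION bump_comment_count() RETURNS trigger AS $$
BEGIN
    UPDATE posts SET comment_count = comment_count + 1
    WHERE id = NEW.post_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER comments_count_trg
    AFTER INSERT ON comments
    FOR EACH ROW EXECUTE FUNCTION bump_comment_count();
```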
I have yet to play with PostGIS specifically, but I have read many good things. Thinking of un-archiving some old side projects and experimenting with PostGIS!
PostGIS is amazing. The types of queries you can write continually blow my mind. One warning is that ultra fancy queries can be pretty slow if you're going beyond a million rows. Some tips:
• Materialized views with indexing are an easy way to solve many speed issues where you'd like to use a fancy query quickly.
• The GEOGRAPHY data type is great for data integrity, but often slower for queries. I've made sure our primary data is stored as GEOGRAPHY, then added expression indices [1] casting to GEOMETRY. Spatial joins and filters can be done on the cast column where faster queries are needed.
• Source data is often very high resolution. If you don't need it, simplifying high accuracy data (5m or whatever) to something much lower resolution (500m, 1km, or whatever) in a derived view or table using PostGIS' simplification functions can greatly improve spatial predicate performance.
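A sketch combining those tips (table/column names are hypothetical; the ST_Simplify tolerance is in the data's units, here degrees):

```sql
-- Primary data kept as GEOGRAPHY for integrity...
CREATE TABLE regions (
    id   bigserial PRIMARY KEY,
    name text NOT NULL,
    geog geography(MultiPolygon, 4326)
);

-- ...with an expression index on the GEOMETRY cast for fast queries.
CREATE INDEX regions_geom_idx ON regions USING gist ((geog::geometry));

-- A simplified, pre-computed copy for cheap spatial predicates.
CREATE MATERIALIZED VIEW regions_simple AS
    SELECT id, name, ST_Simplify(geog::geometry, 0.01) AS geom
    FROM regions;
CREATE INDEX regions_simple_geom_idx ON regions_simple USING gist (geom);
```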
If you get your GIS data into PostGIS, you should try QGIS to visualize it. It also helped me fix some bugs and find some unexpected stuff, for example that in PostGIS (and GIS databases in general?) the order of coordinates is different from what you'd expect from using a GPS in everyday life.
Now... which order was the correct one? I have forgotten after not doing GIS stuff for some years now.
The coordinate order is (X, Y) just like in math (and XYZ if you want to use a vertical coordinate) which differs from our colloquial (latitude, longitude). This is easier to remember if you remember PostGIS supports many non-lat/lon non-WGS84 coordinate systems.
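For example, a point read off a GPS as "59.33 N, 18.07 E" is constructed longitude-first:

```sql
-- ST_MakePoint takes (x, y), i.e. (longitude, latitude):
SELECT ST_SetSRID(ST_MakePoint(18.07, 59.33), 4326);
```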
From my perspective, the other side of the coin is that major version upgrades are a major big deal. You can't upgrade in place. You can't even upgrade with downtime (afaik). You've got to create a new database and migrate everything to it.
The Postgres documentation (which is generally excellent) describes workflows for upgrading across major releases, including what can be fairly described as an "in-place" upgrade, via pg_upgrade and the "--link" option.
Upgrades without downtime can be achieved by using logical replication to create a standby instance running the new major release, then switching the new instance to primary.
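A sketch of that logical-replication approach (connection details and names are hypothetical; sequences, DDL changes, and the final switchover still need separate handling):

```sql
-- On the old-version primary:
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new-version instance (schema pre-created, e.g. via pg_dump -s):
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-primary dbname=app user=replicator'
    PUBLICATION upgrade_pub;

-- Once replication lag is near zero, point the application
-- at the new instance and drop the subscription.
```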
Yeah, was just about to post this about pg_upgrade. In addition, there was a front page post on HN recently about Adyen upgrading their 50 terabyte postgres instance with minimal downtime: https://news.ycombinator.com/item?id=29923303
In contrast, Oracle's database upgrade agent (DBUA, graphical Java app) will perform an upgrade in place, usually taking less than 30 minutes.
The one place where I've found DBUA to fail is an upgrade of a 32-bit database to 64-bit (version 11.2.0.4 was the last offered for 32-bit). This type of upgrade requires the command-line dbupgrade utility, and it's a bit more complex.
Complete export/import is only required when switching architectures for Oracle. Postgres should try for this sometime.
I've also seen MS SQL Server quietly upgrade an imported database.
Is this zero-downtime? pg_upgrade with the link option is pretty fast, usually only taking a few minutes even with our >10TB database, but it does require taking the server down while it's running which is a major issue for many modern apps.
I'm loving the CockroachDB upgrade process for our cluster - you just do a zero-downtime rolling restart to the next major version, with the ability to roll back to the old version if you see issues, and then, when ready, allow the database to apply its migrations internally to upgrade the data structures, which is also zero-downtime.
I think that "Real Application Cluster" (RAC) databases, where multiple Oracle database instances are connected to the storage table spaces, are able to perform a rolling upgrade, but I have not researched it in depth. I am certain that quarterly patches can be deployed in this way.
> As next steps, we’re continuing to investigate the specific failure scenario, and have paused schema migrations until we know more on safeguarding against this issue.
We're in the process of adding MSSQL support to our product. We've been using Sybase SQLAnywhere but we'll likely end up moving as SAP, who bought Sybase, seems to be interested in moving the product in a different direction to what we need.
While MSSQL has some nice operational features over SQLAnywhere, it feels like going back to the stone age in terms of writing queries.
Most annoying are the silly limitations w.r.t. aliases; there's no regexp matching, no nice list() function, only the annoying string_agg()...
Especially the aliases will be super painful. Just going through a few non-trivial queries, I'm looking at turning them into 5+ levels of nesting because of that.
Alas, our customers want to manage their own MSSQL instances, so that's what we've got to cater for. If I had any say, we'd most definitely move to PostgreSQL.
I've spent the last 5 years with MongoDB and am currently trying to use Postgres for my pet project. Alas, it seems to me that Postgres is not really production-ready.
It needs an external connection manager (pgbouncer). Replication is very basic: one should do a backup/restore first and then start replication, praying in the meantime for the WAL not to run too far ahead. Until version 13 (released a year ago) it was either that, or a dedicated replication slot overflowing the disk and killing your master if any replica fell out of sync.
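The version-13 safeguard alluded to here is, as a hedged sketch:

```sql
-- Since Postgres 13: cap how much WAL a lagging replication slot may
-- retain; slots past the limit are invalidated instead of filling
-- the disk.
ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
SELECT pg_reload_conf();

-- Monitor slot health:
SELECT slot_name, wal_status, safe_wal_size
FROM pg_replication_slots;
```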
No automatic failover out of the box? That's not what I'm used to with Mongo. Come on, even Redis learned that trick (with Sentinel, yes, but it is part of the package).
Continuous WAL archiving and point-in-time recovery is a nice feature, but it needs a filesystem-level backup, not the one from pg_dump. What is pg_dump for, then?
Need time and experience to wrap my head around all this.
Your comment is borderline flamebait with that intro, making me wonder if you're just trolling...
Anyway, to start off: why would you want connection-manager middleware for a pet project? Do you honestly expect to need hundreds/thousands of simultaneous db connections?
The original replication does exactly what it was supposed to, but is not what developers often think they want, which is clearly the reason for your outrage. Clustering isn't easy, as MongoDB continues to prove to this day by making cluster creation easy but then catastrophically failing in production under load, as the countless post-mortems show. I really doubt you're going to need it anyway for a pet project...
Countless post-mortems, or the same post-mortem recounted over and over?
As an employee of MongoDB, we treat "catastrophic" failures of our software pretty seriously. If you have experienced a failure, we want to hear about it. Paying customer or no...
I haven't had any real contact to speak of with MongoDB in years, so my previous comment was uncalled for. I'm sorry for that strong sentence. Not to excuse that kind of claim, but I was indeed strongly triggered by SergeAx's comment.
I've heard that MongoDB has improved a lot over the years, so I really shouldn't have said that.
I really don't understand the hate on MongoDB. There are plenty of failures on PostgreSQL or any other database you care to name, especially if you go back far enough in time. Any installed database with less than 10 years of prod deployments is going to have problems sooner or later. I can't think of any significant exceptions.
You are right, I am being a bit emotional about this topic recently. I read that Postgres is the most popular database today and I can't believe it is so immature on critical points.
I see a recommendation to use pgbouncer in almost every tutorial and how-to, and feel it is really a vital point. As a newbie user I want to be on the safe side with that. Also, a heavily concurrent app architecture, even with connection pooling, may easily produce tens of connections under moderate load and jump to hundreds during peaks. I know from experience that API users tend not to think about the load they create on the other side, just hacking together series of wrapped calls.
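A minimal sketch of what those tutorials typically set up (the values here are illustrative, not recommendations):

```ini
; pgbouncer.ini -- many client connections, few server connections
[databases]
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction   ; share server connections between transactions
max_client_conn = 1000    ; clients pgbouncer will accept
default_pool_size = 20    ; actual Postgres connections per db/user pair
```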
Can you elaborate on "countless post-mortems", please? At my work we recently had a 12-hour downtime on one of our 20 Mongo databases, but it was caused by a mistake during recovery from a storage overflow. I will give credit to Postgres here, by the way: on a disk with 0 bytes free it still works in read-only mode.
> Alas, it seems to me that Postgres is not really production-ready.
It is production ready as long as you want to run it on a single box and scale vertically - which is a lot of projects out there. Agree on the other points.
For my pet projects I want more reliability than for work ones. At work we can afford dedicated SREs and PagerDuty and on-call shifts and stuff. As an individual developer I want things to just work and deal with problems when I am awake and not busy with more important tasks.
Having a single point of failure in the form of a single box database is the stuff nightmares are made of for me. Even for the least critical things I've deployed as an SRE, redundancy is a must. It costs so little upfront, and can be life saving later on.
Mongo solves it by not being ACID compliant, which is sort of a big deal.
This is like comparing apples to watermelon.
Essentially, mongo solves this the same way redis does, but you said redis doesn't count as a database.
Also, it is possible to horizontally scale Postgres. It's not for the faint of heart, but the easiest way is with read-only replicas, which is pretty easy (especially with RDS), and read-heavy loads are where Postgres really shines, as with most apps out there.
MongoDB is ACID compliant. Yes, I know someone will post the 18-month-old Jepsen report, but the fact is those issues were fixed almost immediately. The Jepsen test suite continues to be part of MongoDB's release testing.
I am strongly against treating Redis as yet another cache. For me it is a perfect database for non-critical short-lived data, that can be reproduced from more reliable databases in case of failure.