I can't imagine managing lots of such systems, but a handful sounds doable, which is all you need sometimes.
I'm currently speccing out a system for an internal application that processes huge amounts of data. So far the plan is to just use standard Postgres on Debian on a huge "pet" server with hundreds of GBs of RAM and redundant 4TB NVME drives and call it a day. It's sized for peak load so no need for any kind of scaling beyond that, and it's a single machine using well-tested software (and default configs whenever possible) so maintenance should be minimal (it'll also be isolated onto its own network and only ever accessed by trusted users so the need for timely updates is minimal too).
It's doable, believe me. Linux has always been reliable, and its reliability improvements didn't stop over the years, so keeping a lot of servers up to date and running smoothly is easier than it was 15 years ago.
We have a lot of pet and cattle servers. Cattle can be installed in batches of 150+ in 15-20 minutes or so, with zero intervention after the initial trigger.
Pets are rarely re-installed. We generally upgrade them across hardware generations, and they don't consume much time after the initial configuration, which is well documented.
I prefer to manage some of them via Salt, but that's more an exercise in understanding how these things work than a must.
In today's world, hardware is less resilient than the Linux installation running on top of it, so if you're going to build such a monolith, spec it to be as redundant and hot-swappable as you can. Otherwise Murphy may find creative ways to break your hardware.
They should only be years behind given very unlucky timing, ie, upstream releases a new version right after the stable feature freeze.
Whatever the latest version is at the time of Debian's feature freeze, that will be the version for the life of that Debian release. That's basically the point of Debian—the world will never change out from under you.
> They should only be years behind given very unlucky timing, ie, upstream releases a new version right after the stable feature freeze.
Literally the first package I looked up is shipping a January 2020 version in bullseye despite freezes not starting until 2021. And yes, there were additional stable releases in 2020.
Stable is released every 2 years, so at most 2 years behind, and yes, on purpose. Isn't that kinda the whole point of releases like Windows LTSC, Red Hat, etc.? That you actively do not want those updates, only security fixes?
Backporting security fixes is forking the software though.
There have been instances where upstream and Debian's frozen version drifted far enough apart that a security backport was done incorrectly and introduced a new CVE. Off the top of my head, this happened for Apache more than once.
I for one appreciate the BSD "OS and packages are separate" approach, so my software can be updated while my OS stays stable.
For Apache I never heard about that. The issue I heard about instead was that Debian organises/manages Apache quite differently; nothing about version drift.
This will be hosted by Hetzner, OVH or a suitable equivalent, so the “SLA” is based on assuming that they'll rectify any hardware failure within 2 days. In this case I'll gamble on backups, with the idea that in the worst-case scenario it takes us less than an hour to rebuild the machine on a different provider such as AWS.
The machine itself is only really required for a few days every quarter, with a few days' worth of leeway if we fail. Therefore I feel this is an acceptable risk.
That sounds like a fun project, but you definitely still want to automate its setup with something like Ansible, SaltStack, Puppet, etc.
Because someday you'll get a new pet with much more CPU power that you'll want to migrate to. Or maybe, rather than upgrading in place to a newer version, reconfiguring disks, etc., it's just easier to move to a new system. Or the system just plain dies, the DC burns down, etc., and you need DR to get it set up on a new machine quickly. Having all those configs, settings, applications you install, etc. defined in a tool like Ansible and checked into git is just about priceless, especially for pets or snowflakes.
I agree, learning Ansible (or equivalent) is on my todo list.
In the meantime, a document (and maybe a shell script) with commands explaining how to reinstall the machine from an environment provided by the hosting provider (a PXE-booted Debian) is enough, considering the machine is only critically required for a few days every quarter and needs only software that's already packaged by the distribution.
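As a sketch, the shell-script version of that document might look like the following. Everything here is an assumption for illustration: the Postgres major version, the tuning values, and the backup path would all need adjusting for the real machine.

```shell
#!/bin/sh
# Rebuild sketch for a single-purpose Postgres pet server.
# Assumes a fresh PXE-booted Debian with root access; the Postgres
# version (15), tuning numbers, and backup path are placeholders.
set -eu

# Install the only service the box actually needs, straight from Debian's repos.
apt-get update
apt-get install -y postgresql postgresql-contrib

# Keep configs as close to distro defaults as possible; Debian's default
# postgresql.conf has include_dir = 'conf.d', so overrides can live in one
# small file that this script owns (example values, tune for the real RAM).
cat > /etc/postgresql/15/main/conf.d/tuning.conf <<'EOF'
shared_buffers = '64GB'
effective_cache_size = '192GB'
EOF

systemctl restart postgresql

# Restore the latest backup (placeholder path; left commented on purpose).
# pg_restore -U postgres -d mydb /mnt/backup/latest.dump
```

The nice thing about this shape is that the script doubles as the documentation: it is short enough to read top to bottom before running, and it only diverges from defaults in one clearly marked file.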