We just spent several hours, as a team of several people, debugging this issue. Systemd doesn't work. We checked all disks, fstab, the recovery system, etc., and there was nothing clearly wrong.
Turns out something in proxmox (maybe a service?) doesn't understand daylight savings time (Dublin).
The only way to know was to google "proxmox systemd 100% cpu" and find above post.
Christ.
Edit: The fix, of course, is `ln -sf /usr/share/zoneinfo/Etc /etc/localtime`
Edit 2: Looks like that just unsets the timezone. It's too late for me to find the real fix, but you basically want to set the timezone to something else (like UTC).
Sorry this happened to you; these kinds of problems tend to be vague and difficult to patch up.
I've advocated for over a decade as a systems engineer not to set system time to anything other than UTC. Not all languages or applications compensate for DST very well, and depending on your application's relationship with time it's a non-trivial problem to solve. What is far more trivial is to ship logs to a log server and let the UI of the log server translate time for the viewer of the logs. Almost every time I have argued with executives, managers, and SWEs without SE experience and lost, something unexplainable and detrimental has happened around DST.
Edit:
For the uninitiated I'll try to paint a clearer picture. All system time in Linux is tracked in seconds past Jan 1, 1970. The problem occurs in the translation between UTC and local time, which is handled by a library/package/module in your application. If your application is time sensitive and not looking for a literal time traveling event then things can get real weird, real fast. If there is a bug in that library/package/module things will also get real weird, real fast.
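A minimal Python sketch of that split between the stored instant and its rendering. The timestamp and the fixed +01:00 offset are illustrative assumptions, not values taken from the incident above:

```python
from datetime import datetime, timedelta, timezone

# An arbitrary instant, expressed as seconds since 1970-01-01T00:00:00Z.
epoch_seconds = 1_679_792_400

# Hypothetical fixed offset standing in for a "local" zone (UTC+1, no DST rules).
local_zone = timezone(timedelta(hours=1), name="UTC+1")

# The conversion step is where a library gets involved.
as_utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
as_local = as_utc.astimezone(local_zone)

print(as_utc.isoformat())
print(as_local.isoformat())
# Both objects describe the same instant; only the rendering differs.
print(as_utc == as_local)
```

The stored number never changes; anything that goes wrong happens in the translation step, which is the part a real time zone library has to get right around DST transitions.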
> All system time in Linux is tracked in seconds past Jan 1, 1970.
> If your application is time sensitive and not looking for a literal time traveling event...
One of my least-favourite facts is that these statements are misleading. Since Linux uses UTC and not TAI, it no longer tracks an absolute measurement of seconds since the unix epoch as it takes “leap seconds” into account. As well, because of leap seconds you can absolutely have time-travelling events.
Unix, not linux. I don’t even think they can track TAI, but I definitely have never seen one which does, they all define unix time as 86400*days since epoch + seconds since midnight.
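The formula in that last sentence can be sketched in a few lines of Python. The helper name `unix_time` is made up for this illustration; the point is only that every day counts as exactly 86400 seconds, so an inserted leap second has no timestamp of its own:

```python
import calendar
from datetime import date

def unix_time(year, month, day, hour, minute, second):
    """POSIX-style 'seconds since the epoch': every day counts as 86400 s."""
    days_since_epoch = (date(year, month, day) - date(1970, 1, 1)).days
    seconds_since_midnight = hour * 3600 + minute * 60 + second
    return 86400 * days_since_epoch + seconds_since_midnight

# A leap second was inserted at the end of 2016-12-31 (23:59:60 UTC),
# yet the timestamps on either side of it differ by just one second.
before = unix_time(2016, 12, 31, 23, 59, 59)
after = unix_time(2017, 1, 1, 0, 0, 0)
print(after - before)

# The standard library computes the same value.
print(before == calendar.timegm((2016, 12, 31, 23, 59, 59)))
```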
You and I can agree on that in principle but there's a lot of reasons an application may read from system time. If the application reads the wrong system time source, then they get unexpected time travel. Even things like the RTC can experience time travel under certain conditions. My point was, using anything other than UTC is just unnecessary complexity and often enough ends in complicated problems.
If your app interacts with humans or machines in other time zones, tracking in a different time zone can be necessary. Though often people don’t want to specify time zones with their complexities but instead specify times at locations. But I agree, generally your service doesn’t need time zones. The exception is necessary though when time zones get ambiguous or change (e.g. daylight savings going away)
Are you unsetting your timezone? I hesitate to sound authoritative about this because I really don't know and have not found the correct man page yet, but usually you symlink /etc/localtime to the tzfile you want as your local time. What happens when you symlink it to a directory, in your case /usr/share/zoneinfo/Etc?
My guess is that it effectively unsets it. In your case that probably does nothing (system time is UTC and Dublin is in UTC). What was it set to before?
edit: all I can find is localtime(5) and it only says that /etc/localtime is the local time zone file.
We share this planet with people that don't use UTC. That's just the way it is. If some computers are set to time zones other than UTC, then non-UTC time zones must be correctly handled, even "servers".
There are terminal servers, for example, which are basically workstations. There are servers configured by other people for local time, and we may need to configure our own servers into its network.
There is server software out there that uses the host timezone. Heck, huge vendors do this regularly. This happens in log files, in schedulers, in notification emails, and so on. People want to see their log timestamps in their local time, they want to schedule in local time, and they want their emails to show events in their local time.
Yes, optimally, all times should always be stored in UTC or UTC+offset formats, then displayed in local time only "at the end client". Generally, this is actually what happens. Windows, macOS, UNIX, Linux, and Android all store the system clock as UTC and all system APIs use this.
So in some sense, we are all already using UTC on our servers. The time zone setting doesn't change the system clock to something other than UTC, it just tells user-mode software how the human users expect to see time formatted for display.
The problem is that server software especially is terrible at this. Just... bad. The worst offenders are text-based log files, which almost always get a formatted timestamp with an unspecified time zone. Could be UTC, could be the server's local time, could be Mars time, who knows?
In summary: don't blame system administrators for correctly configuring server settings. Blame the lazy software developers who can't be bothered to use ISO timestamps that specify time zones unambiguously, irrespective of the time zone setting.
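As a rough illustration of what "ISO timestamps that specify time zones unambiguously" could look like in a logger, here is a small Python sketch. The `log_line` helper and the sample message are hypothetical:

```python
from datetime import datetime, timedelta, timezone

def log_line(message, when):
    """Format a log line with an ISO 8601 timestamp that carries its UTC offset."""
    if when.tzinfo is None:
        raise ValueError("refusing to log a naive timestamp")
    return f"{when.isoformat(timespec='seconds')} {message}"

# The same instant, as rendered by a hypothetical server in UTC and one at UTC-4.
instant = datetime(2023, 3, 26, 1, 9, 0, tzinfo=timezone.utc)
utc_line = log_line("sshd: failed password", instant)
local_line = log_line("sshd: failed password",
                      instant.astimezone(timezone(timedelta(hours=-4))))
print(utc_line)
print(local_line)
```

Either line can be converted back to the same instant without knowing how the server was configured, which is not true of a bare `Mar 26 01:09`.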
Next thing you'll tell me to stop using space characters in file names...
> There are terminal servers, for example, which are basically workstations.
For a terminal server it is not uncommon for different users to connect from places with different time zones. It would make sense to keep TZ in the user profile and keep UTC as the server's system setting.
Add the below as a .reg file and run it; it will tell Windows to treat the hardware clock as UTC. (Adding this in case anyone else gets annoyed with the clock mixup when dual-booting.)
This thread is a good proxy for why commercial computing is so toxic; people can't even agree on the time of day.
> We share this planet with people that don't use UTC. That's just the way it is. If some computers are set to time zones other than UTC, then non-UTC time zones must be correctly handled, even "servers".
This is just patronizing hand-wringing; well done you.
For global systems, a single time convention is the basis of co-ordination. So many people on this thread seem to have a "pets not cattle" viewpoint; it is quite concerning.
They are UTC - the actual time on the system is measured in seconds since midnight Jan 1 1970 (https://en.wikipedia.org/wiki/Unix_time). This time is then converted to local time per the OS timezone setting. A lot of apps (e.g. databases) will also have their own time zone conversions based on context (e.g. per database connection settings).
As for the practical use of UTC vs local time: I am a developer who happens to work with customers directly. It is difficult enough to get customers to provide timestamps in local time for the issues they report; I cannot imagine they would be able to convert to UTC.
> This time is then converted to local time per the OS timezone setting.
What exactly do you mean by this? As in, what part of your software stack is doing this. IIRC if you set your OS to be in UTC then anything at the OS level will only speak UTC. Databases and your own software will do what they are told to do but systemd etc. will all speak UTC only. Even running date in the shell should give you a UTC timestamp.
Your system time is totally different from what userspace cares about. Go ahead, run "TZ=UTC date" (or a different timezone if you're using UTC already).
I'm not even sure what you mean by the OS speaking a certain timezone. Basic "time()" will return seconds from the epoch UTC regardless of your system timezone. Things that care about timezones will use relevant conversions.
Of course userspace may convert from what time() returns by spec to any given timezone. But it only does so if you instruct it like you did with TZ= or if it is programmed to do so somehow. I have not delved into systemd’s code but I can’t imagine it arbitrarily converts anything away from UTC time if /etc/localtime is set to UTC.
TZ doesn't do anything magic. It just tells libc you want your local timezone to be something different than the default. It affects output formatting in a way that the app mostly doesn't need to deal with if it only works with one timezone. I'm not even sure what claim you're making now, but the original msg is just wrong. If an app uses libc, it uses TZ setting - regardless if it's systemd or date or something else.
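A small Python sketch of that behaviour, assuming a Unix-like system where `time.tzset()` is available. The helper `local_hour` and the zone strings are made up for the example, and the zones use plain POSIX TZ syntax so nothing depends on the installed tz database:

```python
import os
import time

def local_hour(epoch_seconds, tz_string):
    """Render the hour of `epoch_seconds` under a given TZ setting (Unix only)."""
    os.environ["TZ"] = tz_string
    time.tzset()  # ask libc to re-read TZ
    return time.localtime(epoch_seconds).tm_hour

instant = 1_679_792_400  # a fixed instant: 2023-03-26 01:00 UTC

hour_utc = local_hour(instant, "UTC0")
hour_plus3 = local_hour(instant, "XYZ-3")  # POSIX syntax: "XYZ-3" means UTC+3
print(hour_utc, hour_plus3)
```

The epoch value passed in is identical in both calls; only the formatting done through libc changes with `TZ`.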
> but I can’t imagine it arbitrarily converts anything away from UTC time if /etc/localtime is set to UTC.
It really depends on which part of systemd you mean. There's lots of code in there and quite a few bits deal with timezones explicitly. (Since it sets it for users to begin with)
You and I are saying the same thing in different ways, except the part about systemd. I didn’t realize it did anything with timezones because I keep thinking of it as just modern initd but I guess it goes beyond that. I didn’t realize that if you remove all mention of local timezones from /etc/ that a user could still break systemd by setting their own TZ.
I’ve seen junior sysadmins and developers repeating this for years also. Don’t know where it comes from.
Apart from this issue, I’ve never once seen an error due to the tz presentation settings on a server, and I have spent 20 years running many thousands of servers at a time. shrug.
Let them have to adjust in grafana/kibana if it helps them sleep I guess..
The OS tracks time in epoch but your application doesn't. Bugs like this happen because a library that does time conversion for your application has a bug or your application is time sensitive but doesn't know how to handle time travel.
My challenge has been getting the user to mention their timezone when providing a local time. I can usually guess within a couple of hours, and then find logs that confirm it.
Timezones are a disaster. Once upon a time it was difficult and time-consuming to make changes to them. Then the world began developing more and more "accurate" timekeeping devices. Atomic clocks were developed and began to be miniaturized. Then someone had the brilliant idea that incredibly accurate clocks could be used to determine location. GPS was born.
A few decades pass and we now all have highly accurate "clocks" that we carry around in our pockets that receive GPS signals and sync their idea of time to what the satellite thinks. Sounds great, right?
Except, humans don't actually use atomic clocks as the basis for their timekeeping. We look outside and expect the time on the clock to mostly match with the position of the sun in the sky.
Once leap seconds became entangled in them, the whole thing basically became a toxic pile of radioactive waste. Timezones are now being updated multiple times a year, since the internet means we can update them at will. The whole thing is now a giant mess of toxic radioactive waste that can never be correct for any extended period of time.
We, the software industry, did this to ourselves. We built this, we have to live with it, and it's never going away.
I'm not scared of it. It's unnecessary excessive complexity that we have foisted onto ourselves. It has caused countless bugs and will most likely cause countless more before I die. Show me the software that copes with timezone changes at runtime.
It really isn’t that hard. What makes you pull your hair out is dealing with 3rd party code that handles it incorrectly. Do everything internal in UTC or always attach a timezone to dates, your choice.
When you cross the boundary, either ingesting naive dates or displaying them, read the system time zone again and convert.
I mean it’s kinda annoying but “naive” times is the time equivalent of storing arrays without their length. I can’t fathom why we even allow it.
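A minimal sketch of "attach a timezone at the boundary", using only fixed offsets from the Python standard library. The `ingest` helper and the assumed UTC-4 offset are illustrative; a real system would take the zone from the reporter or the host configuration:

```python
from datetime import datetime, timedelta, timezone

def ingest(naive_local, utc_offset_hours):
    """Attach an explicit offset to a naive wall-clock time, then normalise to UTC."""
    if naive_local.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    aware = naive_local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc)

reported = datetime(2023, 3, 25, 21, 9)  # "it broke at 21:09", no zone given
stored = ingest(reported, -4)            # assumption: the reporter was at UTC-4
print(stored.isoformat())

# Ordering a naive value against an aware one is an error, not a silent guess.
try:
    reported < stored
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True
print(mixed_comparison_failed)
```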
Generally I agree (mine are in America/New_York), but I did find [0] and fix [1] a weird bug in Debian's cron that only surfaced when I moved. I was in America/Chicago, moved to America/New_York, and then changed TZ on the servers once they were set up. Afterwards, I noticed that a daily cronjob that was supposed to fire at 0845 was instead happening at 0945.
Worked at a very big social media company and we set them all to Pacific since that’s where we were located. The servers themselves were physically located in different time zones. There’s no hard and fast rule what TZ’s servers need to be in IMHO.
this position is getting a lot of flak, but I agree since we don't live in magic world where all software can handle time zones well.
I strongly prefer everything in UTC, with changing the time zone being an issue only for the very top layer (display, user interface, frontend, whatever) if absolutely needed. Libraries designed for display generally have much more robust timezone handling than random daemons and backend software.
I mean, clocks go out of sync. Every few (days? weeks?) I assume the NTP client has adjusted the actual time on your server, not just the time zone your server translates it to when you run “date”, which is what we are talking about here…
If it can handle that, it can handle a TZ with DST.
Clocks going backwards for a whole hour is massively more trouble than clocks compensating a few seconds of discrepancy.
Usually, being a few seconds ahead is compensated for by running the clock slower for some time. Slowing the clock to compensate for a whole hour breaks many kinds of assumptions. But actually jumping the clock back an hour breaks even more assumptions that look entirely sane.
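One such sane-looking assumption is that subtracting two clock readings yields a non-negative duration. A hedged Python sketch of the usual defence is to measure elapsed time with a monotonic clock, which is documented as unaffected by system clock updates. The `timed` helper is hypothetical:

```python
import time

def timed(func):
    """Return (result, elapsed_seconds) using a clock that never goes backwards."""
    start = time.monotonic()
    result = func()
    return result, time.monotonic() - start

result, elapsed = timed(lambda: sum(range(100_000)))
print(result, elapsed >= 0.0)
```

This only protects duration measurements; code that compares formatted local timestamps is still exposed to the hour that repeats.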
Using UTC when all of my servers are in one timezone actually decreases clarity, as I live in America/Toronto with my daily existence, and which is what my laptop and (mechanical) wristwatch† are set to. If I less(1) the logs I now have to do mental math as to "when" things happen because the hour number it says in the logs does not match my human reality.
If I look at something now (at Mar 25 21:09) the logs would say "Mar 26 01:09" which is not "now" in my mind.
† Though I've been toying with the idea of getting a GMT watch "for fun". However I'll probably go with a chronograph—more useful feature on a day-to-day basis. Not as many choices if I want to get both complications though.
I'm curious what your logs look like for 1am local time on Nov 6, 2022? I believe you set your clock back on that day, so 1am-2am would have happened twice.
Maybe your logs mention the timezone / UTC offset? Otherwise it would seem possible to get confused.
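To illustrate the repeated hour, here is a Python sketch for Unix-like systems. It assumes a POSIX TZ rule string that approximates Eastern time in 2022, chosen so the example does not depend on the installed tz database:

```python
import calendar
import os
import time

# Assumed POSIX rule: EST/EDT, DST from the 2nd Sunday of March
# to the 1st Sunday of November.
os.environ["TZ"] = "EST5EDT,M3.2.0,M11.1.0"
time.tzset()

first = calendar.timegm((2022, 11, 6, 5, 30, 0))   # 05:30 UTC
second = calendar.timegm((2022, 11, 6, 6, 30, 0))  # 06:30 UTC, one hour later

a = time.localtime(first)
b = time.localtime(second)

for stamp in (a, b):
    # Without a zone name or offset the two lines are indistinguishable.
    print(time.strftime("%b %d %H:%M", stamp), "|", stamp.tm_zone)
```

Both instants format to the same wall-clock time, so a log that omits the zone abbreviation or offset cannot say which of the two 01:30s an entry belongs to.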
I have no idea because (a) I didn't experience any problems that needed debugging at that moment so I've never looked, and (b) I wasn't awake looking at the logs for fun at the time.
But I do sometimes have to look at the logs on some random Tuesday, mid-morning, when someone says they can't log into our SSH bastion hosts (but it worked yesterday). I can see there are failed password attempts at 09:59, 10:03, and 10:04 (versus 13:59, 14:03, and 14:04 (i.e., UTC+4), which I would then have to do mental math to bring back to local time), and so it turns out they entered their password wrong a few times and fail2ban blocked their IP.
Wallclock time is a human construct created for human convenience. Having UTC when all my servers are in America/Toronto is not convenient for me.
We don’t watch logs for fun. If issues get reported in local time, using UTC logs simply shifts the burden onto the issue reporter (who may be clueless in this regard) and/or an investigator who now has to convert dates back and forth in their communication, and probably while viewing, because let’s be honest, barely anyone will scrupulously convert all dates into UTC with an additional tool when it’s easy enough to subtract time in your head.
We could use both dates in logs, but that would eat an unreasonable amount of columns and confuse naive greps. We could also use non-standard time formats, e.g. <localtime8601>[<dst switch warning>] <unixtime>, but standard log handling tools aren’t built for it.
Maybe you aren't on top of keeping your timezone database up to date, maybe you have to deal with dates spanning 2005, maybe Ontario will finally abolish DST. Using UTC simplifies time handling with very little downside.
From a Unix perspective (Solaris, Linux, IRIX), it was fine. And Ontario has provisionally passed getting rid of DST changes (sadly in favour of going to year-round DST), and IMHO it will be fine again if it is ever finalized:
Sure. And we all survived Y2K. Point being that setting your servers to UTC is one less thing to worry about. I'd flip the argument on its head and posit that there's no compelling reason to use anything other than UTC.
So long as you keep your TZ database up-to-date local time is likely to be not-so-problematic. Once you have that one server in a closet that didn't get updated, all bets are off.
> Point being that setting your servers to UTC is one less thing to worry about.
It's only one less thing to worry about if you worry about it in the first place. I do not.
> I'd flip the argument on its head and posit that there's no compelling reason to use anything other than UTC.
There is a compelling, or at least useful, reason for me; from another reply I did:
> Using UTC when all of my servers are in one timezone actually decreases clarity, as I live in America/Toronto with my daily existence, and which is what my laptop and (mechanical) wristwatch are set to. If I less(1) the logs I now have to do mental math as to "when" things happen because the hour number it says in the logs does not match my human reality.
> If I look at something now (at Mar 25 21:09) the logs would say "Mar 26 01:09" which is not "now" in my mind.
Servers (and every other Unix/Linux box) ALWAYS use UTC. The time zone is only specified for the purpose of DISPLAYING the time in some locale where the system resides. The only things that should care about the time zone are those that generate output for human consumption (such as log entries), and this is usually automated.
For a system to not boot in a particular time zone is a serious bug, but you should not blame the problem on the time zone, you should blame the problem on whatever is broken in some critical piece of code that probably should not be using local time.
Assuming servers should be UTC because timezones are hard, you could say servers should be airgapped because security is hard. Or that you should only type ASCII because Unicode is hard.
I mean, yes, you can do that, but don't pretend that a program not handling unicode is the user's fault for typing unicode (s/unicode/timezones, s/typing/using).
EDIT: I'm just saying that you can't blame the user for using the features that come with the system.
It’s all true but you will have a harder life if you give all your servers Unicode names and directories and set them to some obscure time zone that is 15 minutes off.
> […] set them to some obscure time zone that is 15 minutes off.
For anyone curious:
> On the flip side, there’s one border where traveling north or south puts you ahead or behind just 15 minutes. That would be the border between India (UTC +5.5) and Nepal (UTC +5.75).
(you appear to have triggered some downvote fairies by having a differing opinion, have an up to compensate a little! I disagree too, as documented below, but unless something truly offensive is said I'm of the opinion that downvoting is generally a bad response)
> you could say servers should be airgapped because security is hard
Many effectively do, and I wish more would do, as a default starting point. Maybe not physically air-gapped but firewall all incoming and outgoing then selectively allow packets through as specifically needed.
Translating to timezones: default to UTC and configure hosted software to display local timezones as needed. Only have the system in a different TZ if something won't work properly otherwise (including it can't be configured to display in something other than the system TZ) or is date sensitive and buggy in a way that it gets confused about yesterday/today/tomorrow when local midnight and UTC midnight differ.
Also if considering resources used by people in multiple timezones, you have to pick something and translate for others, you might as well pick UTC as some things assume it.
> Or that you should only type ASCII because Unicode is hard.
Again yes, by default, and always for things like hostnames, and adjust as needed or when you know Unicode is supported end-to-end (or that where it isn't, something just looks wrong and firefighting that is preferable to sticking to ASCII in the first place). Though it helps from my point of view that the most obvious omission from ASCII in my locale is our currency symbol.
I say the same for spaces in filenames too: while you perhaps shouldn't have to avoid them, in practice you are setting yourself up for potential trouble later if you don't.
> EDIT: I'm just saying that you can't blame the user for using the features that come with the system.
I agree completely.
But, for instance, while I won't victim blame when someone accidentally leaves doors unlocked and has things stolen, I do make an effort to be damn sure my doors and windows are secure and recommend others do too. There is a difference between victim blaming and recommending preventative defaults because you know there is shit code out there.
Don't you mean /usr/share/zoneinfo/Etc/UTC ? If I remember correctly, /usr/share/zoneinfo/Etc is a directory, and /etc/localtime should point to a zoneinfo file.
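A quick way to check for that mistake, sketched in Python. The helper name is made up; it relies only on the fact that compiled zoneinfo files begin with the four magic bytes `TZif`:

```python
import os

def looks_like_zone_file(path):
    """True if `path` resolves to a regular file starting with the TZif magic bytes."""
    target = os.path.realpath(path)
    if not os.path.isfile(target):
        return False  # e.g. a symlink to a directory such as .../zoneinfo/Etc
    with open(target, "rb") as handle:
        return handle.read(4) == b"TZif"

print(looks_like_zone_file("/etc/localtime"))
```

A symlink pointing at a directory, as in the original command, would make this return `False`.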
Almost all software bugs come from managing state. Which is why hoarding state management in PID 1 is a dumb idea. I said this when systemd was gaining traction and I'm still saying it now.
When Debian switched to systemd as the default, I began to seriously use OpenBSD, and I've never regretted the decision.
> Which is why hoarding state management in PID 1 is a dumb idea.
Because the Unix process API is unreliable and unsafe ( http://catern.com/process.html ), managing processes is difficult to impossible in a general way under Unix systems including Linux. Linux gained some features to help with this, but for the longest time, PID 1 was the only process that could manage its child processes in a way that wasn't fraught with race conditions -- meaning systemd or something like it was the only way to have safe, effective process management because it tracked all that state in PID 1.
Blame Unix being broken, not systemd, for the problems systemd solves.
Because the litany of shittily written bash scripts don’t hoard state, or what exactly are you saying?
The only thing I dislike about systemd is that it is written in C, but otherwise it is a single purpose solution to the surprisingly complex problem of managing services during boot and running. I fail to see how state is relevant here — if it were written in Haskell it would also have state, almost by definition it has to handle external state. No Monad would help you there, there is just correct and not correct software.
You do realize that there is a systemd project and a systemd program. Is the plasma5 shell bloated because under the KDE umbrella there are hundreds of other projects?
This is just another one of those reasons I just can't use Debian responsibly.
Backport hacks always come back to bite you in the end when your distro is running unsupported software, and it feels like I see something at least annually about how Debian's "ensure everything is old so all bugs are predictable" approach ends up just causing pain.
It's the same for every distro. Millions of 5+ year old Ubuntu LTS boxes everywhere. End of life this April. And predictably people will just double down, and keep using an unsupported thing, because upgrades are scary/hard/etc.
Since this only seems to happen in Ireland, could it be because Ireland technically has summer time as its standard time, and winter time as negative DST? Not sure how that's implemented in tzdata, but it could very well be that some applications can't handle that.
Looks like the negative DST has already caused problems in the past and applications that can't handle it (ICU & OpenJDK) have to build tzdata as per the rearguard section / ziguard.awk.
I have to say these time zone files are fascinating. Any place/time when a jurisdiction has changed the definition of time is in those files, along with a detailed history of why.
For example, California had an energy crisis soon after WWII and changed when daylight saving time started and ended. There are time zone rules that track this history exactly, including a note about how clocks lost 6 minutes a day because PG&E changed AC from 60 to 59.5 Hz:
It's 2023. Setting a non-UTC timezone shouldn't break your system (or applications). Bugs happen and this is one of them, but "set your system to UTC" isn't an appropriate response here.
Bugs should indeed be fixed, but using anything but UTC on a server is swimming against the tide. Bugs get fixed in old software over time, but new software is written every day without considering that time can jump around. And even old software is rewritten in new languages from time to time, with old knowledge and workarounds thrown out with the bathwater (old code). So I don't believe we will run out of bugs related to the DST switch anytime soon.
The DNS related issues seen (have they now been resolved?) where it takes over the DNS resolver functions and refuses to honor what you have put in e.g. /etc/resolv.conf ; a web search for "systemd breaks normal dns resolv.conf" will give you a lot of results ; unsure if this is now fully fixed.
Strange things with LXC based VMs moved/migrated between hosts, even if an offline move; I think in 1 case I just ended up creating a new VM from scratch; then rsync'ing everything over from the host node into the VM's directory.
If you want a static resolv.conf, but you're not using a completely static network config and have some kind of dynamic network management daemon, be that `dhclient`, or `NetworkManager`, or `systemd-resolved`, you've always needed to explicitly configure it to leave resolv.conf alone.
The DNS issue here seems like you just wanted to use DNS like you had before despite the OS changing. That isn't really a bug, just a change in how the OS operates. If you don't want that, you can disable that feature.
Which services are enabled by default is up to each distribution. In your case I'm guessing it might be systemd-resolved or systemd-resolvconf, if not NetworkManager. Just disable what you don't want.
In terms of problematic default services, I'd look at GNOME. systemd is just the medium.
Same here. I am adjacent to two vendor-supported Proxmox clusters (Motorola Emergency CallWorks 911 phone system) and there was no outage or issue last fall moving from EDT to EST, nor two weeks ago flipping back. We’re a 24x7 operation and would have heard about any outage (either planned, to address the issue before it occurred, or triage during an unplanned outage).
I ran into an issue where I shut down a system with Proxmox before the start of EDT and shipped it to a co-worker. When he powered it on after daylight saving time was in effect, the time was off by 1 hour for Proxmox and all LXC containers.
Fix was to re-symlink `/etc/localtime` to proper time zone.
I didn't want or care to dig into this deeper, but it's interesting to see that there are more issues around this.
Many places in the Southern Hemisphere have been on daylight savings time over the past few months, I'm wondering how come this bug did not surface earlier.
Just the usual reminder that systemd is crap and that it is terrible for us that it was forcibly shoved onto everyone in Debian and co.
The main issue was always this one, having side functions like dealing with timezones breaking your pid 1.
In such a case it is almost impossible to resolve the problem from the system itself; you have to mount the drive with another OS or reinstall when possible.
i hate systemd it's always getting hung up on shutdown. "a stop job is running for blah blah" yea no. when i'm ready to shut down, it needs to shut down. immediately.
When I shutdown I would like a shutdown sig to be sent to my database and then continue once it gets a successful response. No one wants that to just be killed because that would cause data loss.
When I shutdown I want my filesystem synced so things mid-write don't experience data loss.
The problem is badly written scripts that keep sending a "one more min" response. But you could override that if you actually cared instead of just whining.
Also, systemd comes with a timeout, after which it kills the app anyways, so I think you can surely configure that 90s timeout to be something like 5s or so, which then is much faster.
But as you wrote, If someone really cares, they would fix it.
PS: But what usually gets me about these 90s waits whenever I get them is that the message does not say which thing (unit, etc) is the issue. THAT is something worth criticising.
To my understanding systemd does a proper job here, as it is supposed to, highlighting issues with other subsystems, be it badly written scripts or a misordered shutdown sequence. Those parts were ignored by "traditional" init systems before.
We had a requirement from a Government agency to store the date/time in the system's local time (Ireland, in our case). All certs issued had to have the time it was issued, and they had to be local.
All systems in common use, use UTC for the underlying representation.
The concept of setting the time zone is already a high-level library thing, and in all these systems the standard time functions simply provide a time since some fixed epoch.
I.e., see C time(), gettimeofday, clock_gettime, GetSystemTime. They’re all “UTC” based.
If a piece of server software is going out of its way to deal with local time APIs and blows up because of it - well it either legitimately needs to deal in local time, or it’s so poorly designed that simply keeping the system locale set to UTC isn’t somehow a magic fix. That’s just an adhoc assumption.