Hacker News
You don't need to daemonize (ntlworld.com)
115 points by vezzy-fnord on June 28, 2015 | hide | past | favorite | 77 comments


We still run daemontools even in systemd systems. The OS packaged cruft runs in systemd. Everything important runs in daemontools. Still the best "services manager" out there.


Why? It's no longer necessary.

Also, daemontools suffers from the same problem as other non-systemd service managers; it can't manage processes that daemonize themselves for good reason (like nginx or Unicorn). systemd can handle it because it creates a unique cgroup for every service; if the service needs to be stopped, all processes in the cgroup are terminated.


There are plenty of successors to daemontools following the same principles, but more advanced: s6, nosh, perp, runit, etc.

That said, from what I know, systemd doesn't necessarily handle double-forking daemons any better. That is to say, if you set the service to Type=forking, it'll still need to use a PID file (with its inherent race conditions) or employ a PID-guessing heuristic, which can fail.
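As an illustration of what that looks like in practice (service name and paths are hypothetical), a Type=forking unit typically has to pin the main PID explicitly:

```
[Service]
Type=forking
# Hypothetical daemon. With Type=forking, systemd relies on PIDFile=
# to identify the main process after the parent exits; without it,
# systemd falls back to guessing, which is the heuristic in question.
ExecStart=/usr/sbin/mydaemon
PIDFile=/run/mydaemon.pid
```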

This is a universal problem that no hack can truly solve. The bottom line is if you're running under a service manager, you don't daemonize. You delegate to the service manager to daemonize for you. Any deviation from this will be finicky on most Unix-likes.

The purpose of cgroups here is instead process tracking, i.e. reliably killing all children. But for some services this is exactly what you don't want, and then there's nothing stopping you from allocating cgroups yourself if needed, or talking to a cgroup hierarchy management daemon like cgmanager. Ultimately, you just want some unit of isolation here, so you can use whatever your platform has, e.g. jails or contracts.


One little-known feature added in Linux 3.4 is the PR_SET_CHILD_SUBREAPER prctl(2) [0], which lets you designate a process as a "subreaper": orphaned descendant processes are reparented to it, and it reaps them just like init would.

[0] http://man7.org/linux/man-pages/man2/prctl.2.html


Yes, that was actually proposed by Poettering himself. DragonFly BSD has a similar mechanism with procctl(2).


Yeah, I did research a while ago to find out where that patch came from when there was a lot of noise about how systemd needed to be pid 1.


As a bit of an aside, cgmanager got kinda sidelined thanks to the cgroups kernel maintainer at one point insisting on there being only one cgroup management process. And as systemd did both init and cgroup management, it ended up being favored...


We run both nginx and unicorn through daemontools. Both have options to prevent daemonization.

Why run something terrible (systemd) when we already have something that works really well (daemontools)?


> Why run something terrible (systemd)

It is terrible only because you say it is. Adopting it has been a massive help for us, and accepting the help it offers makes custom software systems a lot easier to write robustly.

It's only "terrible" if you have an axe to grind about the "right" way and then define that as daemontools.


I use systemd to start runit, which is like daemontools. I consider daemontools/runit the right approach because they give me all the benefits I'd want out of systemd (not doing anything special to daemonize my process, supervision, log handling) while being much simpler and better tested.


> while being much simpler

The sum of init functionality that can deliver what systemd does (and those are good things to do) is more complex than a suite of tools for launching daemons in an unhelpful environment. Sure.

I submit that universal configuration-file tooling is fantastic in a day and age where we provision and configure machines with template-based tools like salt and ansible rather than synchronized command tools like puppet.


I'm not sure I understand what you're saying. Is it that it's nice to be able to copy over a systemd service template file with ansible vs running commands to start services?

If so, how is that different from copying runit service files instead? If not, can you elaborate / reword?


Sorry, that post was on mobile. Honestly, I do not relish posting on this terrible corner of the internet. But I'll respond, and it's better late than never.

There are 2 things:

1. Systemd's boot process does more (and shows better performance and, in some cases, better resilience). So while many people complain it is more complicated, and this is true, it is still very simple for what it is doing (which is much more than daemontools does).

2. Runit and monit and the like all eventually appeal to launcher shell scripts. You do not 'copy' these over with ansible in a real operational deployment; you 'generate' them with profile-specific variables (e.g., beta vs production, datacenter-specific values, etc.). Template generation of shell scripts is MUCH more subtly error-prone than generating configuration files with a simpler grammar and static correctness checks.

Systemd makes more functionality available as a configuration option as opposed to functionality available via shell scripting. I think that makes it much better.


> It's only "terrible" if you have an axe to grind about the "right" way and then define that as daemontools.

Conversely, it's only "terrible" if you have an axe to grind about the "right" way and then define that as systemd.


I have an axe to grind about the way it was introduced. Fedora and Red Hat massively mis-managed it. Previously, they introduced various things in tech preview releases, and it either stayed or went with subsequent releases.

systemd was thrust upon us by an axe-grinding developer.


So, stupid politics rather than technical merit.


What axe did they grind? I've got only good things to say about systemd so perhaps I can't understand the issue?


Sounds like your problem is with Red Hat, not systemd.


While I'm not a fan of systemd (the same thing can be achieved with upstart/supervisord, for example), there's just no point in running daemontools on top of the new inits.

daemontools provides stdout logging, restarts, and keeping things running. That's pretty much all. The exact same things are already provided by systemd/upstart/supervisord, with additional benefits: resource limits, namespaces, different behaviour on repeated failures (daemontools will just keep restarting your process as fast as possible, taking 100% CPU if it can and flooding the logs), and support for syslog/journal rather than just local files.

By putting daemontools on top of systemd, you're just adding a simple system on top of a complex one. You lose features without gaining anything in return.


You've got some of the daemontools details incorrect.

> resource limits

Daemontools includes the softlimit [1] helper.

> namespaces

It's not clear to me why this need be built in to a supervisor instead of being applied through generic helper programs such as unshare(1).

> daemontools will just keep restarting your process as fast as possible, taking 100% CPU if possible

supervise [2]: "It restarts ./run if ./run exits. It pauses for a second after starting ./run, so that it does not loop too quickly if ./run exits immediately." Simple and not configurable, sure, but should not hammer the CPU.

> support for syslog/journal rather than just local files

Daemontools svscan [3] "optionally starts a pair of supervise processes, one for a subdirectory s, one for s/log, with a pipe between them". The "s/log" script can do anything you wish, reading logs on stdin. If you use the included "multilog" program, then it's certainly geared toward writing local files, though it does include the ability to run an arbitrary post-processor during rotation (which might, for example, fire off a job to copy the logs to an aggregator). Or you can not run multilog and just send the logs to syslog or whatever.

What I do myself is save the logs locally using s6's multilog analogue, s6-log, and also pipe them to a local syslog that forwards them to an aggregator outside my control.

And, in spite of newer supervisors adding more and more features beyond what daemontools provides, daemontools is still an awesome system that was 14 years ahead of its time.

[1] http://cr.yp.to/daemontools/softlimit.html

[2] http://cr.yp.to/daemontools/supervise.html

[3] http://cr.yp.to/daemontools/svscan.html


Xe also has the logging problems entirely backwards. It is systemd where one has problems with log flooding. One example of a person suffering from this is http://unix.stackexchange.com/questions/208394/ . The daemontools convention, ironically, is to have multiple separate log streams, usually one per service, which cannot flood one another.

Restart behaviour sometimes is configurable, by the way. (-:


Re. resource limits, I meant more than just softlimit - cgroups CPU and network throttling.

Re. namespaces - sure, it doesn't have to be built in. But many people use it and it's convenient when it is.

I used daemontools many years ago and remember fast cycling being an issue, worked around with manual pauses in run scripts. If it was fixed later, I'm glad it works.

But my main point was that daemontools worked great when we had simpler inits. Running it with inits that couldn't restart services made sense. Running it with modern inits just doesn't give you anything interesting apart from another idle system process.


> We run both nginx and unicorn through daemontools. Both have options to prevent daemonization.

How are you able to do hot-reloads without daemonizing the master process? Or have you simply chosen to sacrifice this feature?


Hot reload? Like a config reload? Just send the master process a HUP.


That can't possibly work correctly with runsv, given the way in which Unicorn performs configuration reloads. Eventually the original process that runsv forked will die, and runsv will try to restart it and fail (because the original process launched a new generation that continues to listen to the socket). Then 'sv stop' will fail to work because the new generation of master and workers isn't under runsv's supervision anymore.


nginx doesn't die. It just reloads the config file and applies the changes. I can't remember what Unicorn does because we almost never change its config.

Also, we're using daemontools not runit.


> processes that daemonize themselves for good reason (like nginx or Unicorn)

Is there a good reason? I assume you're thinking of the ability to fork and exec a new master process, running, e.g. a new version of nginx, then handing control to that new master without dropping any connections, possibly even waiting around a bit in case the new master dies and the original master needs to take over again.

I suspect something like s6-fdholder-daemon [1] could be used to orchestrate a similar process, though I'm not enough of an expert to know. Instead of inheriting file descriptors through forking, an entirely separate, supervised nginx-newversion could be started, get the file descriptors from the fd holder, then coordinate with nginx-oldversion about who's going to accept new connections.

Certainly, depending on such a service is arguably more complex than just having the functionality built in. On the other hand, just as every service should not implement its own daemonization, one can argue that each service should not implement its own hot restarting. If the technique were more common, the argument not to duplicate such code would be stronger.

[1] http://www.skarnet.org/software/s6/s6-fdholder-daemon.html


Yes, that was exactly the idea behind s6-fdholder-daemon: set up a central server to keep fds open when you need to restart a process. The old process stores the fd into the fdholder, then dies; the supervisor starts the new process, that retrieves the fd from the fdholder, and starts serving.

And if you don't want to use a supervisor or a fdholder, you don't even need to coordinate, and you never need to fork: simply re-exec your executable with your serving socket in a conventional place (stdin is good). Daemons should be able to take a preopened listening socket and serve on it; hot-restarting is then a simple matter of one execve(). There's really no reason to make it more complicated.


> I suspect with something like s6-fdholder-daemon [1] could be used to orchestrate a similar process,

systemd has the ability to hold fds for you. You need to enable it for each service with http://www.freedesktop.org/software/systemd/man/systemd.serv... and the service needs to know how to actually do it.
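For reference, the unit side is a one-line opt-in (a sketch; the binary path is hypothetical). The service itself then pushes descriptors into the store with sd_pid_notify_with_fds(3) using FDSTORE=1:

```
[Service]
ExecStart=/usr/local/bin/myserver
# Allow up to 16 fds to survive service restarts in systemd's fd store.
FileDescriptorStoreMax=16
```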


> unique cgroup for every service

This is a nice feature of systemd. I've been using s6 quite a bit (of daemontools heritage), and often want such a feature, but part of the reason I'm using s6 is so that I can use the same run scripts in darwin and Linux, and only Linux has cgroups.


It says a lot about HN's general self-styled adherence to meritocracy that a stupid, baroque, and ill-thought out technical decision driven by political spite is top-ranked on a system discussion.


I've used the Python package supervisord with good success. Out of the box it manages stdout & stderr, rotates log files, restarts crashed processes, and has plugins for things like REST APIs to restart the process (for example, during an automated deploy). Another plugin can monitor memory usage & take some action when thresholds are reached. I find it very easy to use. You can commit your configuration file for supervisor into your app's repo, and symlink it into supervisor's conf.d directory. Other plugins add stuff like web-based dashboards. It's great for everything from managing a single process on a single machine to a large cluster of worker processes on multiple servers.


I like supervisord for these reasons too. It's really handy.


Seeing the FQDN with the trailing dot, I wonder how much software breaks when you use it. I bet the duplicate check here doesn't treat homepage.ntlworld.com. and homepage.ntlworld.com as equal.


Actually, none in my experience. It's there for a really good reason.

Say you want to visit www.hahaha.com but your network admin is a rotten bastard (and this does happen a lot in corporate networks). What happens is they set up a DNS record of www.hahaha.com.myevilcorporation.local and set your default domain suffix to myevilcorporation.local.

When you hit www.hahaha.com you don't go to www.hahaha.com; you go to some internal site. The browser tends not to give you any clues about this so you just carry on without knowing.

If you visit www.hahaha.com. (note the trailing dot) then you go to www.hahaha.com., as that dot is the root of the DNS hierarchy and myevilcorporation.local never ever resolves.

Of course this throws all sorts of warnings if you are using SSL, but the average user and site aren't. Hell, we did it years ago and deployed an internal CA cert to all our workstations, so it looks like a ton of our corporate sites are internet-based and SSL-covered but they are on internal networks only with private certs.

I suspect that this is less of an issue these days and browsers have protection against this but I wouldn't be 100% sure of it.

So in conclusion, it's really important.


Public service announcement: It's best to use "example.com" (or .net, .org, etc) for this sort of thing, because there's no real server there to accidentally send traffic to or alter the search rankings of.

It's not usually that big a deal, but every so often someone will paste the wrong thing or an automated process will misinterpret something. It's better if this happens to a nonexistent domain.


There is a real server at example.com - check out http://example.com/ - or, if you want to be secure, it's also on https :)

example.com is the standard FQDN to use for examples, though. http://example.com/ links to the documentation...


JdeBP is famous for his ungodly pedantry, so it'd be blasphemous for me not to honor RFC 1738 [1].

[1] http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/we...


He's also famous enough that whenever I see ntlworld.com [1] I immediately think, "Ah, must be JdBP - that'll be worth reading."

[1] It's actually slightly astonishing his hosting still works at that URL given that NTL was consumed by Virgin nearly a decade ago.


I must admit, I felt a pang of nostalgia seeing that domain name. Must have been nearly 15 years ago that we got our super-fast 64k connection with free ntlworld.com web hosting, and we could finally have always-on internet at home. Oh how things have changed..


Wrt hostnames, RFC 1738 just refers back to 1034/35, which came 7 years earlier. It'd be pretty sad if we couldn't get this right after 28+ years.


The problem with automatic daemonization is that there is no way to indicate startup failure: you just fire and forget the process in the background. If the process dies, there's no clear way of knowing whether it was due to an error in startup conditions (possibly as simple as a configuration error) or a runtime failure at a later time. In the first case, restarting makes no sense. In the latter case it might be a reasonable option.

With manual daemonization, the process can delay forking until it has ensured that its configuration is sound and it has the resources it needs. This way the parent process can indicate with its exit status whether the service was successfully launched.


If your daemon is of Type=forking then failure to initialize should be indicated by the initial process (the one launched by systemd) outputting an error message and then failing.

Only once the grandchild (not the child; you must double-fork in order to start up correctly) has indicated to the initial process that it is ready to provide the service should the initial process exit cleanly, thereby notifying systemd that startup was successful.

See daemon(7) for the full details.


If the program exits with a special status to indicate non-restartable failure, you could, in principle, recognize that in the process manager and handle it differently, by not restarting. Is there any way to do this in systemd, or in any other process manager?


Yes, systemd has "RestartPreventExitStatus" to configure a list of exit codes which prevent the automatic restart.
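A sketch of how that looks in a unit file (the daemon path and the choice of exit code are hypothetical; 78 is the conventional EX_CONFIG from sysexits.h):

```
[Service]
ExecStart=/usr/local/bin/mydaemon
Restart=on-failure
# Exit code 78 means "bad configuration": restarting won't help.
RestartPreventExitStatus=78
```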


Was there not a recent blog/article that mused about the likelihood of systemd spawning all manner of special case entries?

And now I ponder likening systemd to CISC...


Usually these daemon managers detect when the process is dying right after starting up and indicate that with a message like "[process] is respawning too fast". Besides, there are always logs.


But that's not a way to detect service failure. That's a way to detect potential service failure. The only unambiguous failure notification is the return code.


Yes, which systemd will also display, and you can configure it to handle it differently (e.g. don't auto-restart on certain exit codes).


For Rails apps I've been using god to manage unicorn and delayed_job/resque processes in recent years. I've been mildly skeptical of systemd---not really out of true knowledge but because keeping things simple matters to me, and that seems to be the gist of the criticisms. But if systemd would let me do away with god, that would be nice! I agree that pidfiles are a hack and often cause problems. It seems like I'm going to need to learn systemd soon anyway, so I'll be looking into whether it can do what I need for Rails apps.


I don't understand. If I write a server app that listens for client connections, why not daemonize that? Is systemd doing all of the listening then invoking the server app if a client connects?


Run in the foreground and systemd will do the right thing.

Your app is still the one listening for connections, systemd just calls wait() until you exit, then launches you again.

There's no need for you to do a fork() or setsid() or whatever, that was a relic of the "just run all these init files as scripts" nature of sysv init.
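A minimal sketch of such a unit (paths and flags hypothetical):

```
[Unit]
Description=Example foreground service

[Service]
# No forking: the process started here *is* the service. systemd
# supervises it directly and restarts it if it dies.
ExecStart=/usr/local/bin/myserver --no-daemon
Restart=on-failure

[Install]
WantedBy=multi-user.target
```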


Definitely. Also, when you're in non-systemd environments, runit or supervisord will happily manage your undaemons.


I just started using runit and really like how simple it is. I feel like now's a good time to plug Void Linux http://www.voidlinux.eu/ which I think is pretty cool; it's the only OS/distro I know of that uses runit as its init system.


How does that work with nginx or haproxy, which support zero-downtime restarts?


That's a really interesting question, actually. (This was posted a day ago so I'm not sure anyone will read this any more, but...)

If the job of a process "supervisor" is to launch you, wait(2), launch you again, repeat... who has the role of doing zero-downtime restarts?

I'll define a zero-downtime restart in this instance as: a new process must be launched while the old process is still running, so it can negotiate a handoff of responsibility (via migrating who owns the port and draining connections in the case of nginx or haproxy), and the old process only dies once the handoff is complete.

If you wanted this behavior with systemd/upstart/etc as a supervisor (where processes are foregrounded and monitored), the supervisor would require a special case for "restart" which would just start a new process and monitor that one instead, and not bother with killing the old one.

I have no idea if systemd can accommodate this without switching to a non-supervised process management mode (which is definitely possible). I don't have much familiarity with advanced systemd or upstart, although I have plenty of familiarity with mesos schedulers, which can act as a kind of datacenter-level version of systemd. In that world we do "rolling restarts": new instances are launched, we wait for them to pass health checks, and then we drain the old ones, while a load balancer routes incoming connections to the healthy instances. It's an interesting question what a single machine's process supervisor should do in this case.


If nginx and haproxy switch over to SO_REUSEPORT (and the SO_REUSEPORT semantics are fixed in Linux to work like FreeBSD's) it should no longer be an issue: its own worker supervisor need not fork itself, and it can simply manage the children.


> If I write a server app that listens for client connections, why not daemonize that?

Because writing a daemon is almost always more work than writing an application that runs in the foreground. And systemd is capable of taking applications that run in the foreground and running them as daemons instead.

Put another way - if something else is doing all of the hard work for you, why would you want to write a daemon yourself, except possibly as a learning exercise?


Stated more accurately, services running under a supervision suite yield their daemon instantiation to the underlying process manager, whose purpose is enforcing a consistent daemon control framework.

So has been the case since IBM AIX in 1992.


> why not daemonize that

Double-fork-detach was arguably only ever a hack to get around the limitations of the traditional Unix paradigm (using inetd to handle incoming connections and invoking a standard program to hand the connection off to).



I wrote a Java server that blocks spam based on my criteria and set it to run like this:

http://manpages.ubuntu.com/manpages/trusty/man8/update-rc.d....

I didn't do any daemonizing like we used to do in the C days, so I'm not sure the systemd approach mentioned in the article does it better (not against the idea; maybe it's needed on non-Ubuntu systems?).

I just needed to write a simple shell script with the basics and this:

...

  case "$1" in
      start)
          startup
      ;;
      stop)
          shutdown
      ;;
      restart)
          restartit
      ;;
  esac
...
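For comparison, a minimal systemd unit sketch for the same kind of job (names and paths hypothetical) replaces the whole case statement; start/stop/restart come for free from systemctl:

```
[Unit]
Description=Spam-blocking server

[Service]
ExecStart=/usr/bin/java -jar /opt/spamblock/server.jar
Restart=on-failure

[Install]
WantedBy=multi-user.target
```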


If all you're doing is start/stop/restart on Ubuntu you should take a look at Upstart[0]. See a simple example here: http://askubuntu.com/a/251581

[0] http://upstart.ubuntu.com/


upstart is deprecated.


Not that I know of. ChromeOS still uses it and though Ubuntu is phasing it out, they haven't completely gotten rid of it as a session manager yet AFAIK. They need to support it for a few more years anyway due to 14.04 LTS.


That's exactly the difference between deprecated and obsolete. The former means it's on its way to the latter. Putting work into upstart just means you'll have work migrating to systemd within a year or two.


I just needed to write a simple shell script ...

If one is using Ubuntu Linux, then it is daft to begin with System 5 rc scripts, given that one already has upstart on version 14 and has systemd on version 15. Ironically, those "basics" are the bits that one doesn't need to write at all. See http://unix.stackexchange.com/a/200281/5132 .


I tend to always use daemonize to start a Java program because typically I need to pass several startup options (like Xms, Xmx, etc.) and systemd unit files cannot perform even the simplest variable interpolation needed [1].

Someday systemd will go beyond simple key-value substitution, and on that day I'll stop using daemonize.

[1]: http://www.opsfordevelopers.com/2015/05/startup-scripts-for-...


What's wrong with doing something like the following:

...

[Service]

Type=forking

EnvironmentFile=-/usr/local/myJavaConfig

ExecStart=/bin/sh -c '${JAVA_HOME} ${JAVA_OPTS}'

Or if you're after passing things in, use the other technique that lets you use systemctl myservice@your_sub.service


Two things are wrong, neither of which, however, make daemonize appropriate. First, and trivially, there's a missing /bin/java (-: Second, that's a potential readiness protocol and a main PID mismatch. See http://unix.stackexchange.com/a/200365/5132 for details. And then http://askubuntu.com/a/624871/43344 . Remembering to exec is another lesson to bring along from the daemontools world.


That's a case for creating a wrapper shell script that you launch through ExecStart=. Nothing to do with whether your process does the daemon(7) dance or not.


Correct, I should've been clearer that I use daemonize mostly out of frustration with systemd limitations and inability to tell me precisely what's wrong when there are errors launching the service.


You can run a daemon-like process with its stdin/out firmly attached to a TTY: a virtual TTY inside a screen or tmux session that can be detached and reattached.


As was pointed out a few days ago, if the output stream exerts backpressure, write calls will stall. Stopping scrolling at the output end is enough to do that.


But that's the whole point of screen and tmux—It's always got a terminal to suck up stdout.


I've been using daemon-manager for the past couple of years. Its most interesting point to me is that you can start and stop user daemons without requiring root privileges, which makes it nice for non-root scripts/bots (for the paranoid age) and for multi-user systems (admittedly decreasingly common in the AWS and Docker age).


"systemd offers users the ability to manage services under the user's control with a per-user systemd instance, enabling users to start, stop, enable, and disable their own units."

https://wiki.archlinux.org/index.php/Systemd/User



