[Nix-dev] Avoiding threads in the daemon

Ludovic Courtès ludo at gnu.org
Thu Dec 18 17:32:52 CET 2014


Nix commit 524f89 changed libstore to use fork + unshare instead of
clone(2).  The problem is that, in doing so, it also dropped
CLONE_NEWPID, and thus (1) the build process no longer has PID 1, and
(2) build processes end up in the global PID space.
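
For illustration, here is a minimal sketch of what CLONE_NEWPID buys when
passed to clone(2): the child is the first process of a fresh PID namespace
and sees itself as PID 1.  This is not the libstore code; the names
child_main and child_is_pid1_with_clone are made up for the example, and the
call normally needs CAP_SYS_ADMIN, so the sketch reports failure instead of
assuming it runs as root.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* Runs in the cloned child: report via the exit status whether we are
   PID 1 in our own PID namespace. */
static int child_main(void *arg)
{
    (void) arg;
    return getpid() == 1 ? 0 : 1;
}

/* Hypothetical helper.  Returns 1 if the child saw itself as PID 1, 0 if
   it did not, and -1 if clone(2) or waitpid(2) failed (typically EPERM
   when the caller lacks CAP_SYS_ADMIN). */
int child_is_pid1_with_clone(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward on common architectures, so pass its top. */
    pid_t pid = clone(child_main, stack + CHILD_STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Run with sufficient privileges, child_is_pid1_with_clone() should return 1;
without them it returns -1.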

Adding CLONE_NEWPID to the unshare(2) call appears to break things (for
instance, subsequent calls to pthread_create by that process fail with
EINVAL, and other calls to clone(2) fail with ENOMEM), which may be why
CLONE_NEWPID isn’t used here.

The stated reason for this commit is this:

 commit 524f89f1399724e596f61faba2c6861b1bb7b9c5
 Author: Eelco Dolstra <eelco.dolstra at logicblox.com>
 Date:   Thu Aug 21 14:08:09 2014 +0200

     Use unshare() instead of clone()

     It turns out that using clone() to start a child process is unsafe in
     a multithreaded program. It can cause the initialisation of a build
     child process to hang in setgroups(), as seen several times in the
     build farm:

     The reason is that Glibc thinks that the other threads of the parent
     exist in the child, so in setxid_mark_thread() it tries to get a futex
     that has been acquired by another thread just before the clone(). With
     fork(), Glibc runs pthread_atfork() handlers that take care of this
     (in particular, __reclaim_stacks()). But clone() doesn't do that.

     Fortunately, we can use fork()+unshare() instead of clone() to set up
     private namespaces.

     See also https://www.mail-archive.com/lxc-devel@lists.linuxcontainers.org/msg03434.html.

The more general issue is that fork should not be used in a
multi-threaded process unless the child immediately calls exec* after
fork: POSIX specifies that if a multi-threaded program forks, the child
may only call async-signal-safe functions until it calls one of the exec
functions.  IOW, the daemon should not use threads in the first place.
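
To make the constraint concrete, here is a minimal sketch of the one fork
pattern that stays safe even if the parent has threads: everything is
prepared before fork, and the child calls nothing but execv and _exit, both
of which are async-signal-safe.  The helper name spawn_and_wait is made up
for the example; it is not something the daemon provides.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec 'path' right away, and return the
   program's exit status, or -1 on error or abnormal termination. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: only async-signal-safe calls from here on.  No malloc,
           no stdio, no locks that another thread might have held at the
           time of the fork. */
        execv(path, argv);
        _exit(127);             /* reached only if execv failed */
    }

    /* Parent: wait for the child and decode its status. */
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Anything more elaborate in the child between fork and exec (setting up
namespaces, calling setgroups, allocating memory) falls outside what POSIX
guarantees once threads are involved.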

Thus, I think Nix commit 49fe95 (which introduces monitor-fd.hh, which
uses std::thread just for convenience) should be reverted, along with
the subsequent commits to that file; then commit 524f89 can be reverted.

WDYT?

Thanks,
Ludo’.

