[Nix-dev] Avoiding threads in the daemon
Ludovic Courtès
ludo at gnu.org
Thu Dec 18 17:32:52 CET 2014
Nix commit 524f89 changed libstore to use fork + unshare instead of
clone(2). The problem is that, in doing so, it also removed use of
CLONE_NEWPID and thus, (1) the build process no longer has PID 1, and
(2) build processes end up in the global PID space.
Adding CLONE_NEWPID to the unshare(2) call appears to break things (for
instance, future calls to pthread_create by that process fail with
EINVAL, other calls to clone(2) fail with ENOMEN) which may be why
CLONE_NEWPID isn’t used here.
The stated reason for this commit is this:
commit 524f89f1399724e596f61faba2c6861b1bb7b9c5
Author: Eelco Dolstra <eelco.dolstra at logicblox.com>
Date: Thu Aug 21 14:08:09 2014 +0200
Use unshare() instead of clone()
It turns out that using clone() to start a child process is unsafe in
a multithreaded program. It can cause the initialisation of a build
child process to hang in setgroups(), as seen several times in the
build farm:
The reason is that Glibc thinks that the other threads of the parent
exist in the child, so in setxid_mark_thread() it tries to get a futex
that has been acquired by another thread just before the clone(). With
fork(), Glibc runs pthread_atfork() handlers that take care of this
(in particular, __reclaim_stacks()). But clone() doesn't do that.
Fortunately, we can use fork()+unshare() instead of clone() to set up
private namespaces.
See also https://www.mail-archive.com/lxc-devel@lists.linuxcontainers.org/msg03434.html.
The more general issue is that fork should not be used in a
multi-threaded process, unless the child immediately calls exec* after
fork (POSIX clearly specifies that if a multi-threaded program forks,
the child must only call functions that are async-signal-safe.) IOW,
the daemon should not use threads in the first place.
Thus, I think Nix commit 49fe95 (which introduces monitor-fd.hh, which
uses std::thread just for convenience) should be reverted, along with
the subsequent commits to that file; then commit 524f89 can be reverted.
WDYT?
Thanks,
Ludo’.
More information about the nix-dev
mailing list