[Nix-dev] Builder unexpectedly killed
Ludovic Courtès
ludo at gnu.org
Sun Jan 16 23:17:59 CET 2011
Hello,
I’m observing the builder being killed unexpectedly on my GuruPlug (armv5tel-linux) when running:
strace -o ,,s -f nix-store --print-build-trace -vvvv \
-r /nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv
The relevant part of the log goes like this:
--8<---------------cut here---------------start------------->8---
| | | trying user `nixbld1'
| | | killing all processes running under uid `30001'
| | | executing builder `/nix/store/scaqq4v2g1h3pnygckw2cbn5w2hqi11w-bootstrap-tools/bin/sh'
| | | @ build-started /nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv /nix/store/zqr3b1kj44w84dsr0idvg62ymlccicbx-stdenv-linux-boot armv5tel-linux /nix/var/log/nix/drvs/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv
| | waiting for children
| | building of `/nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv': read 40 bytes
| | | switching to user `nixbld1'
| | waiting for children
| | building of `/nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv': got EOF
| | building of `/nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv': woken up
| | building of `/nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv': build done
| | builder process for `/nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv' finished
| | killing all processes running under uid `30001'
| | recursively deleting path `/tmp/nix-build-pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv-0'
| | | /tmp/nix-build-pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv-0
| | builder for `/nix/store/pykr4wc1amlqhcg50xqc0iydnv91lnnp-stdenv-linux-boot.drv' failed due to signal 9 (Killed)
--8<---------------cut here---------------end--------------->8---
The corresponding strace output:
--8<---------------cut here---------------start------------->8---
1398 close(1022) = -1 EBADF (Bad file descriptor)
1398 close(1023) = -1 EBADF (Bad file descriptor)
1398 write(2, "| | | switching to user `n"..., 40) = 40
1398 geteuid32() = 0
1398 setgroups32(0, []) = 0
1398 setgid32(30000) = 0
1398 getgid32() = 30000
1398 getegid32() = 30000
1398 setuid32(30001) = 0
1398 getuid32() = 30001
1398 geteuid32() = 30001
1398 rt_sigaction(SIGPIPE, {SIG_DFL, [TRAP KILL USR2 PIPE ALRM TERM STOP TTIN XCPU VTALRM PROF WINCH IO PWR RTMIN], 0x4000000 /* SA_??? */}, {SIG_IGN, ~[KILL STOP RTMIN RT_1], 0x4000000 /* SA_??? */}, 8) = 0
1398 execve("/nix/store/scaqq4v2g1h3pnygckw2cbn5w2hqi11w-bootstrap-tools/bin/sh", ["sh", "-e", "/nix/store/6lxp9b633mm1b4jpnm3jx"...], [/* 26 vars */] <unfinished ...>
1398 +++ killed by SIGKILL +++
^
|
`---- The builder appears to get killed immediately, even though both
.../bin/sh and /nix/store/6lxp...-builder.sh are valid.
1394 <... select resumed> ) = 1 (in [12])
1394 --- SIGCHLD (Child exited) @ 0 (0) ---
1394 gettimeofday({1295213267, 448036}, NULL) = 0
1394 read(12, "| | | switching to user `n"..., 4096) = 40
1394 write(2, "| | building of `/nix/store/"..., 103) = 103
1394 write(2, "| | | switching to user `n"..., 40) = 40
1394 write(11, "| | | switching to user `n"..., 40) = 40
1394 write(2, "| | waiting for children\n", 29) = 29
1394 gettimeofday({1295213267, 449873}, NULL) = 0
1394 select(13, [12], NULL, NULL, NULL) = 1 (in [12])
1394 gettimeofday({1295213267, 450309}, NULL) = 0
1394 read(12, "", 4096) = 0
1394 write(2, "| | building of `/nix/store/"..., 97) = 97
1394 write(2, "| | building of `/nix/store/"..., 98) = 98
1394 write(2, "| | building of `/nix/store/"..., 100) = 100
1394 wait4(1398, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], 0, NULL) = 1398
1394 write(2, "| | builder process for `/ni"..., 105) = 105
1394 close(12) = 0
1394 close(11) = 0
1394 geteuid32() = 0
1394 write(2, "| | killing all processes ru"..., 56) = 56
1394 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x40098068) = 1399
1394 wait4(1399, <unfinished ...>
1399 setuid32(30001) = 0
1399 kill(-1, SIGKILL) = 0
1399 exit_group(0) = ?
1394 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1399
--8<---------------cut here---------------end--------------->8---
Any idea what could be going wrong with this execve call?
It seems that this could be related to
<https://patchwork.kernel.org/patch/444071/>, though that would mean a
SIGKILL was sent while processing the execve call.
(This is with Glibc 2.12.2, Linux 2.6.36.3, Nix 1.0pre24855.)
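For anyone who wants to check the parent side of this in isolation, here is a minimal Python sketch of the wait-status decoding seen in the trace (the `wait4` result with `WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL`). It is not Nix code: the function names `describe_wait_status` and `run_and_describe` are made up for illustration, the message wording only approximates the log line, and it does not drop privileges the way the real builder setup does.

```python
import os
import signal


def describe_wait_status(status):
    """Turn a raw wait status into a message resembling the log line above."""
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        # signal.Signals maps the number to its symbolic name, e.g. 9 -> SIGKILL
        return f"failed due to signal {signum} ({signal.Signals(signum).name})"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognized wait status {status:#x}"


def run_and_describe(argv):
    """Fork, exec argv in the child, and report how the child ended."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if execvp itself raised (e.g. program not found).
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return describe_wait_status(status)
```

Running `run_and_describe` on a command that sends itself SIGKILL should take the "failed due to signal" branch, which is the outcome the parent (pid 1394) sees here. The difference in this report is that the signal arrives while the `execve` is still in progress, before the builder has run anything.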
Thanks,
Ludo’.