[Nix-dev] Using Nix for deterministic and verifiable builds
Gregory Szorc
gregory.szorc at gmail.com
Mon Nov 17 01:02:38 CET 2014
Thank you for taking the time to write such a detailed reply! Comments inline.
On 11/16/14 3:21 PM, Michael Raskin wrote:
>> I discovered Nix and NixOS a few weeks ago and love what I see. I love
>> the execution of immutable and standalone environments. IMO Nix is
>> packaging done right. There is so much potential for Nix and NixOS on so
>> many fronts.
>
> A disclaimer from my side: I am not speaking in the name of the entire
> project, and I never had a right to do so. But I am an experienced
> contributor.
>
>> I'm curious about what people in the Nix community feel about the
>> applicability of Nix to the deterministic and verifiable builds problem
>> space.
>
> There are some things that are hard to check, and there are some
> tradeoffs that stop us from using the most deterministic system we have
> manpower to achieve.
>
> I think the biggest of our problems wouldn't impact you as much (and
> maybe, in a sense, your using Nix would _solve_ them by providing
> a large buildfarm exporting the most deterministic variety of core tools
> up to NixPkgs Firefox).
>
>> From my limited experience with Nix, I *think* the determinism part is
>> on solid ground: if you assemble the same environment on different
>> machines and a package is capable of building deterministically, you're
>> in a good position to achieve a bit-identical result. (We really only
>> care that Firefox's output is deterministic - it doesn't matter too much
>> if the toolchain and dependent packages aren't deterministic as long as
>> Firefox builds the same. Although, having bit-identical output
>> everywhere would be awesome.)
>
> Deterministic output means that something has to be done about
> profile-guided optimisation. It is the subject of a long-running
> discussion in our community, as some people would like bit-perfect
> builds and others think that PGO improves performance significantly
> and should be used. In our case the main discussion is about letting
> GCC PGO itself during bootstrap.
Yes, PGO is a nuisance for Mozilla as well. We ship PGOd Firefox because
it matters for performance. Tor (which I forgot to mention is based on
Firefox) has disabled PGO, sacrificing raw speed for
privacy/security/trust. There is a debate of sorts on whether Mozilla
should offer a non-PGOd Firefox for the crowd that cares about these
things. Personally, I'm holding out hope that we can save and distribute
the PGO profile and use the same profile in distributed environments to
achieve the same binaries. AFAIK that hasn't been proved either way.
Whether this is possible, and whether tools exist to sufficiently audit
the PGO profile for malicious intent, are very good questions that need
to be answered. Debian and others are on a big deterministic builds kick right
now, so I hold out hope that smart people will find a way to make PGO
and trust work together.
>> I *think* you can facilitate determinism over time by providing your own
>> channel or version control repository of Nix expressions. Then, you tell
>> people "checkout version X and install the 'firefox-build-env' package."
>> You attain trust via regular auditing of the Nix expressions. Is it
>> really this simple? How do you achieve full isolation of a Nix
>> expressions "database" from other installed channels/sources? Is having
>> a version controlled repository of Nix expressions that can be used to
>> derive the same (hopefully identical) packages over time something that
>> people do? e.g. if I check out a copy of the Nix expressions from a year
>> ago and realize the "firefox-build-env" package, I should compile the
>> environment as it was a year ago, right?
>
> Yes, that's right.
>
> Let me summarize my impression of what Nix reliably provides.
>
> 1) If you check out an old version of the package instruction
> repository, the exact same build instructions will be generated. The
> same tarballs will be fetched from the same lists of mirrors and
> verified against the same checksums, then unpacked and built with the
> same patches applied and the same configure flag order etc.
Great to have confirmation of this!
> Tarball availability is an issue here; I guess Mozilla can solve this
> part of the problem for all the tarballs needed for Firefox bootstrap
> build.
Is there a way to insert your own mirror without changing the fetchurl
{} in the .nix file? Isn't there some magic where the source inputs get
realized and can be fetched from a binary cache, just like the outputs?
> Also there is a problem of «trusting trust» and bootstrap build; our
> basic build environment is built from a binary set of bootstrap tools,
> which is the checksummed build environment you want to avoid.
>
> To solve this, there is a procedure of building these bootstrap tools
> and if you disable all non-determinism it should converge to the same
> tarball when you take two acceptable bootstrap toolsets, build the
> entire working environments from these, and use them to generate two new
> sets of bootstrap tools.
Yup. And it's turtles all the way down to silicon. "How do you know the
NSA didn't backdoor your compiler through adjustment to Intel's CPUs?"
Diverse double compilation and other tricks to help defeat poisoned
binary bootstrap tools are certainly on the TODO list. However, it's so
far down the list of priorities for us right now that it barely
registers. Furthermore, I don't believe Tor/Gitian is doing anything
much more creative than Nix and I don't believe people are sweating over
it. Perhaps they should be. We need to walk before we can run and
terrific toolchain trust can wait.
> As for irrelevant-for-the-job tools:
>
> Binary caches say «if you perform build X you get output Y anyway, you
> can save the time and trust me». You will probably want to only use
> binary caches under your complete control. Hopefully some of these
> would also serve the toolchain and library chain to the world…
I should have mentioned that there are 2 consumers of deterministic
builds of Firefox: the privacy/security camp and developers. We want
developers to have access to the same build environment so they can get
local builds that behave just like the official ones. This also allows
us to create a globally distributed "ccache" to speed up compilation.
(Quick note: Mozilla built an S3-backed version of ccache -
https://github.com/glandium/sccache).
I mention this because the developer group likely only interfaces with
binary caches. We want the overhead to obtain the build environment to
be low. This likely means fetching pre-built Nix packages or packaging
up the full environment in a chroot/Docker container, etc. Trust
verifiers, however, are likely doing everything from source. I like Nix
because it placates both groups.
> Channels give you some snapshots of expressions and also some binary
> caches. The expression snapshots do not have any effect on your results
> if you take care to evaluate only the expressions from the VCS checkout
> during the release process. You don't really need to use channels in
> this specific task. You can provide some, of course.
>
> 2) Even with the same commands there is an issue of build determinism.
>
> We have quite reliable build isolation using chroot environments (and
> hard-to-guess paths). Whatever optional dependencies of a package you
> install, if they are not listed in the package descriptions, it is
> unlikely that the package will be able to notice them (unless the user
> helps it in some way at runtime).
>
> There is the issue of «what /bin/sh gets used in chroot», but in your
> case you can use the simple configuration options there are to fix the
> /bin/sh once for every Firefox version.
>
> We have a few approaches for mitigating time information leak… They
> probably need some testing.
>
> One of the problems is when some debug information gets leaked into the
> final package and it contains build directory paths. We want them to
> correspond to external paths for easier debugging, and so they may leak
> information. With minimal care they turn out to be completely
> deterministic, though.
>
> Many issues are discussed in https://github.com/NixOS/nixpkgs/pull/2281
> which was already mentioned.
In case you haven't seen it, https://wiki.debian.org/ReproducibleBuilds
has a great overview of all the problems and solutions Debian is running
into.
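For the build-directory-path leak mentioned above, here is a minimal sketch of how one might look for it in a finished build output. It only searches for the raw bytes of the build directory, so it would miss paths inside compressed debug sections. The function name and its parameters are hypothetical and are not part of Nix or the Mozilla tooling.

```python
import os


def find_path_leaks(output_root, build_dir):
    """Return sorted relative paths of regular files under output_root
    whose bytes contain build_dir. Hypothetical helper, illustration only."""
    needle = os.fsencode(build_dir)
    leaks = []
    for dirpath, _dirnames, filenames in os.walk(output_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks; their targets are visited on their own
            with open(path, "rb") as f:
                # Reads the whole file into memory; fine for a sketch.
                if needle in f.read():
                    leaks.append(os.path.relpath(path, output_root))
    return sorted(leaks)
```

An empty result does not prove the output is deterministic; it only rules out one known source of variation.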
> With Nix it is relatively easy to check determinism step-by-step
> (rebuilding a single package at a time at different machines and
> multiple times) when you have some buildfarm capacity to spare; so you
> can do some test runs and create a summary. And post it at
> AreWeDeterministicYet.net… When there are no problems that actually
> change the output, it is unlikely they will reappear until upstream
> changes the build system.
Something like this is part of our eventual goal. We want to "break the
build" when Firefox isn't built deterministically. No clue when we'll
get there.
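A rough sketch of what such a "break the build" check could look like, assuming the same package has been built twice (or on two machines) into two output directories. The function names are made up for illustration. The digest covers relative paths and file contents only, so it ignores permissions, timestamps, empty directories, and symlinks to directories.

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 hex digest over relative paths and file contents under root."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            h.update(os.path.relpath(path, root).encode("utf-8") + b"\0")
            if os.path.islink(path):
                # Hash the link target rather than following the link.
                h.update(b"L" + os.readlink(path).encode("utf-8") + b"\0")
                continue
            h.update(b"F")
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 16), b""):
                    h.update(chunk)
            h.update(b"\0")
    return h.hexdigest()


def builds_match(dir_a, dir_b):
    """True if both output trees hash to the same digest."""
    return tree_digest(dir_a) == tree_digest(dir_b)
```

A CI job could call builds_match on the two outputs and fail when it returns False.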
> 3) Deterministically producing a Firefox intended for non-NixPkgs
> installation (i.e. not referring to /nix/ for libraries) is a separate
> question. I would say that it is quite easy: we have a few chroot
> generators, so Firefox can simply be built — or just get its ELF header
> edited — inside a chroot mimicking the layout you want by bind-mounting
> /nix/ and then simply symlinking the libraries into /usr/lib. Here you
> won't get any new determinism problems (or any non-trivial problems at
> all).
Ooh, I didn't know about the different chroot generators. I'll have to
go source diving :)
We already hack ELF headers and do custom debug symbol processing, so
that's well within the realm of possibility.
Thanks again for your very detailed response!