[Nix-dev] GHC pointed at the wrong package
Mateusz Kowalczyk
fuuzetsu at fuuzetsu.co.uk
Mon Sep 8 17:42:12 CEST 2014
On 09/07/2014 10:10 PM, Rickard Nilsson wrote:
> On 08/23/2014 03:15 PM, Mateusz Kowalczyk wrote:
>> On 08/23/2014 01:29 PM, Peter Simons wrote:
>>> Hi Mateusz,
>>>
>>> > There are problems in package regex-tdfa-1.2.0:
>>> > dependency "parsec-3.1.5-ca5ed8f175b69e1a085cfeaf3b95f424" doesn't exist
>>> > There are problems in package regex-tdfa-rc-1.1.8.3:
>>> > dependency "parsec-3.1.5-ca5ed8f175b69e1a085cfeaf3b95f424" doesn't exist
>>>
>>> the process that generates those IDs in GHC is non-deterministic. Two
>>> people can compile the same library with the same version of GHC on the
>>> same type of machine yet end up with two distinct IDs. It doesn't happen
>>> often, but it does happen.
>>>
>>> I'd recommend running
>>>
>>> $ nix-store --delete /nix/store/*-haskell-parsec-ghc7.8.3-3.1.5-shared
>>>
>>> on all your machines. Then the next build will download these packages
>>> from Hydra, and you'll have a consistent build again. Note that you may
>>> have to remove packages from your active profiles to make that deletion
>>> process succeed.
>>>
>>> I hope this helps,
>>> Peter
>>>
>>
>> It's very unfortunate to hear about the package ID stuff. Is there a bug
>> open?
>
> There is https://ghc.haskell.org/trac/ghc/ticket/4012
A patch was actually posted to that ticket about a day before I asked,
so the situation may improve for the simple case.
> Actually, since we started building Haskell packages in parallel
> (https://github.com/NixOS/nixpkgs/commit/817c0e41443a5176baf6dd9b422878fdccecd266),
> this problem might have got more common (but I have no real evidence for
> that).
>
> You can reproduce this by building the haskell http-client package with
> "--cores 4" (non-parallel makes the problem go away). Each build (with
> the exact same dependencies, and hence exact same nix hash) produces a
> package with different package-id (in package-conf.d). It is not only
> the package-id that differs, but the ABI differs, which could make
> linking fail. Look at this:
>
> $ nm -g pkg-1/lib/ghc-7.8.3/http-client-0.3.8.1/libHShttp-client-0.3.8.1.a > hc-1-nm
>
> $ nm -g pkg-2/lib/ghc-7.8.3/http-client-0.3.8.1/libHShttp-client-0.3.8.1.a > hc-2-nm
>
> $ diff hc-1-nm hc-2-nm | tail -n 10
> 8775,8776c8775,8776
> < U httpzmclientzm0zi3zi8zi1_NetworkziHTTPziClientziTypes_zdLrfy8a_closure
> < 0000000000000000 D httpzmclientzm0zi3zi8zi1_NetworkziHTTPziClientziTypes_zdLrfy9a1_closure
> ---
> > U httpzmclientzm0zi3zi8zi1_NetworkziHTTPziClientziTypes_zdLrfxPa_closure
> > 0000000000000000 D httpzmclientzm0zi3zi8zi1_NetworkziHTTPziClientziTypes_zdLrfxQa1_closure
> 8780c8780
> < 0000000000000000 D httpzmclientzm0zi3zi8zi1_NetworkziHTTPziClientziTypes_zdLrfy8a_closure
> ---
> > 0000000000000000 D httpzmclientzm0zi3zi8zi1_NetworkziHTTPziClientziTypes_zdLrfxPa_closure
As pointed out on the Trac ticket, non-determinism in the presence of
parallelism is a known problem, so there's your evidence.
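The comparison quoted above can also be scripted. The following is a minimal sketch, not part of the original thread: it assumes the two `nm -g` listings have been saved as text (as in the `hc-1-nm` and `hc-2-nm` files), and the function names `z_decode`, `exported_symbols` and `symbol_diff` are made up for illustration. The Z-decoding table is deliberately partial and only covers a handful of GHC's escapes, including the ones that appear in the symbols above.

```python
# Sketch: compare the exported symbols of two builds of the same library.
# Input is the text produced by `nm -g <archive>`.

# Partial table of GHC Z-encoding escapes (assumption: only these few are
# handled; any other sequence is passed through unchanged).
Z_ESCAPES = {"zm": "-", "zi": ".", "zd": "$", "zu": "_", "zz": "z", "ZZ": "Z"}


def z_decode(name):
    """Decode the known Z-escapes in a symbol name, left to right."""
    out = []
    i = 0
    while i < len(name):
        pair = name[i:i + 2]
        if pair in Z_ESCAPES:
            out.append(Z_ESCAPES[pair])
            i += 2
        else:
            out.append(name[i])
            i += 1
    return "".join(out)


def exported_symbols(nm_text):
    """Collect symbol names from `nm -g` output, ignoring addresses and types."""
    symbols = set()
    for line in nm_text.splitlines():
        parts = line.split()
        # Skip blank lines and archive member headers such as "Types.o:".
        if len(parts) < 2 or line.rstrip().endswith(":"):
            continue
        symbols.add(parts[-1])
    return symbols


def symbol_diff(nm_a, nm_b):
    """Return (symbols only in A, symbols only in B), each sorted."""
    a, b = exported_symbols(nm_a), exported_symbols(nm_b)
    return sorted(a - b), sorted(b - a)
```

Calling `symbol_diff(open("hc-1-nm").read(), open("hc-2-nm").read())` and passing each result through `z_decode` would list the differing names in readable form, for example `http-client-0.3.8.1_Network.HTTP.Client.Types_$Lrfy8a_closure`. Unlike `diff`, this ignores ordering and addresses, so it only shows symbols that exist in one build but not the other.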
> I really don't know how this could be worked around in Nix. Of course,
> the problem is not very common, since you would have to build one
> package locally, then fetch a package built somewhere else that depends
> on your local package, and finally build a third package that depends on
> that fetched package. But in a build cluster things like that certainly
> do happen.
I think making Haskell packages only build on a single core again would
be a start.
For me the problem is common: my Hydra builds Haskell packages from
nixpkgs HEAD and uses the official Hydra and peti's Hydra as binary
caches. Further, my personal machine uses the official Hydra, my Hydra
and peti's Hydra as caches, and I often build packages locally when
packaging things for nixpkgs master or when I need some patches from
there. With that many sources of binaries it's easy for the problem to
come up. In fact my Hydra suffers from it right now and, weirdly, it
again involves the pandoc/parsec packages.
Right from my Hydra:
package pandoc-1.13.1 is broken due to missing package
pandoc-types-1.12.4.1-917a8ba6e10664f3ab958ef027071e98
My options when this happens are to either:
1. manually go in and try to remove the broken packages
2. garbage collect everything
3. wait for a big rebuild which will cause these to be rebuilt/refetched
Today 3. is actually happening, so hopefully it comes out without any
bogus errors, but ideally this should never happen. If building each
package on a single core makes it more likely to produce non-broken
packages, then I think it should be the default until it can be patched
upstream.
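For option 1, the first step is finding out which package IDs are reported as missing. Below is a minimal sketch, not something from the thread, that pulls them out of a build log or error output. It assumes the IDs have the `name-version-hash` shape seen in the messages quoted in this thread, with a 32-character hexadecimal hash; `PKG_ID` and `missing_package_ids` are hypothetical names. Deciding which store paths to delete afterwards is left out on purpose.

```python
import re

# Assumed shape of a package ID: name, dotted version, 32 hex digits.
PKG_ID = re.compile(
    r"\b([A-Za-z][A-Za-z0-9-]*?)-(\d+(?:\.\d+)*)-([0-9a-f]{32})\b"
)


def missing_package_ids(log_text):
    """Return the distinct (name, version, hash) triples found in the text."""
    return sorted({m.groups() for m in PKG_ID.finditer(log_text)})
```

Plain `name-version` strings such as `pandoc-1.13.1` are not matched because they carry no hash, so only the dependency IDs that the error messages complain about are returned.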
> / Rickard
--
Mateusz K.