[Nix-dev] Dependency Semantics

Thu Jul 12 08:36:44 CEST 2012

Bryce L Nordgren <bnordgren at gmail.com> writes:

> On Wed, Jul 11, 2012 at 2:10 PM, Mathijs Kwik <mathijs at bluescreen303.nl> wrote:
>
>     Bryce L Nordgren <bnordgren at gmail.com> writes:
>    
>     > On Wed, Jul 11, 2012 at 12:53 AM, Mathijs Kwik <mathijs at bluescreen303.nl> wrote:
>     >
>     > There may be a couple of things I didn't articulate well.
>     >
>     > 1] I don't see these new semantics as "either/or". I fully expect you to be able to declare a weak
>     > dependency on the python environment and a "depends on exact build of" for the imaging thingy.
>     > 2] I would envision the semantics of java packages to other java packages to be "depends on exact build
>     > of". It's the dependency on the environment which needs to be weaker.
>    
>     I understand that part, but even using a weak dependency on environments
>     can lead to problems. Essentially these "weak" dependencies are
>     like a flag to tell the resolver "do not rebuild, although you think you
>     should, I know better". I know that in some cases, this indeed is the
>     case, but in many subtle cases, it really isn't.
>
>  
> I believe the current situation is "this package is completely unrelated to the environment in which it runs. Do
> not rebuild. Don't even install the environment automatically." At least, this would be the result of adopting a
> strategy like only referring to the JRE using "$JAVA_HOME/bin/java". The weak dependencies allow a completely
> missing dependency to be specified. I think we have different "starting points" in mind. Do python packages
> currently depend on python?

If that's indeed the case, I'm in favor of documenting dependencies
that the current mechanism doesn't pick up.

But that's something different from your original proposal, which was to
weaken dependencies, so _fewer_ rebuilds would happen.

Manually specifying extra (not already picked up) dependencies will lead
to the opposite and trigger _more_ rebuilds.
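
A toy sketch of that difference, assuming (as a simplification) that a
package gets rebuilt whenever a hash over its declared inputs changes.
The helper `build_key`, the package names, and the version labels are
hypothetical and only serve the illustration; this is not how Nix
itself computes anything.

```python
import hashlib


def build_key(name, inputs):
    """Hash a package name together with the keys of its declared inputs."""
    h = hashlib.sha256(name.encode())
    for dep in sorted(inputs):
        h.update(dep.encode())
    return h.hexdigest()


# Two hypothetical builds of the environment.
python_old = build_key("python-2.7.2", [])
python_new = build_key("python-2.7.3", [])

# Dependency on python not declared: upgrading python is invisible.
undeclared_before = build_key("mypackage", [])
undeclared_after = build_key("mypackage", [])

# Dependency declared by hand: the same upgrade now changes the key,
# which in this model means a rebuild.
declared_before = build_key("mypackage", [python_old])
declared_after = build_key("mypackage", [python_new])

print(undeclared_before == undeclared_after)  # True: no rebuild
print(declared_before == declared_after)      # False: rebuild
```

So in this model, declaring a previously missing input can only add
rebuild triggers, never take them away.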

>  
> Note: with java the compiler which builds a package is packaged as a JDK. The runtime environment could either
> be a jdk or a jre. It doesn't make a lot of sense to force all packages to depend on a specific build of the jdk
> when that might not be where it's running. And to make matters worse: you can compile using JDK 1.6 with a
> "-target 1.4" flag on the compiler. That will be buried in the ant or maven configuration file and will never be
> exposed as a parameter Nix can get its hands on. If this is the case, is it even appropriate to say that the
> result depends on jdk 1.6? Is it beneficial to recompile this package simply because we upgrade to 1.7? Not sure
> I know the answer to these questions.

I agree on the jdk/jre problem: the jdk should only be a build-time
dependency.

On the 1.6/1.7 issue: yes, rebuild all packages. Maybe 1.7 gives faster
code, and because of the -target 1.4 possibility, it seems you can always
use the latest version. If 1.7 doesn't work for certain packages
(a deprecated interface, say), finding out why is useful. If everything
compiles on 1.7, 1.6 can be taken out.

>  
>
>     In the "weak dep on python" case however, MyPackage doesn't rebuild
>     (because only python changed), so its build script will not run to
>     detect breakage. Only far later, when I try to run MyPackage on some
>     production server, I find out this problem.
>
> Sometimes (often?) build scripts don't detect this kind of problem. 

I think that far more often, it's the other way around. Some
debian/arch/gentoo package speaks of a ~ 2.3.* dependency, while in
reality, 2.2 or 2.4 works too. That's just a case where a maintainer
upgraded a minor release and didn't bother to read the full changelog.
Manually duplicating stuff that's in the upstream builder just doesn't
sound like the best strategy.

> Things which are coded against an API and
> just use whatever providers are available at runtime are designed not to break at build time when a particular
> provider is missing. It's only when someone tries to exercise a specific provider (specified via a preferences
> file for database connections, or when autodetecting a file format when opening a particular file) that this
> class of problem spits out an error. I wouldn't count on strong dependencies on particular build artifacts to
> help you out here. ;) Many times the user is responsible for separately downloading and making available the
> specific provider they need at the time of deployment. 

Ok, for environments that are _designed_ to be weakly dependent, and
that don't have tests (make check or other) to find out that stuff is
missing, I agree that specifying these by hand is the only option.
But again, this just means _adding_ dependency info, not weakening
existing info, which leads to _more_ rebuilds.

>  
>
>     To fix this, MyPackage's nix expression should capture the fact that it
>     depends on Python being built with ImageThingy support.
>
> Correct. I do not see any difference between specifying a dependency on optional parts of an environment and
> specifying a dependency on other python libraries. (After all, 3rd party libraries are just optional extensions
> to the non-optional environment.) You'd have to add dependent libraries to the MyPackage parameter list, why
> shouldn't you have to ensure MyPackage sets a flag on the environment?

I'm not saying you shouldn't. But the current build system finds out
about the breakage, so the maintainer can investigate and put the extra
dependency there. You have to understand that most package expressions
are probably created as a bare minimum, just adding the stuff the
builder/installer complains about. This works fine most of the time.
So in this case, where (up till now) just depending on python sufficed,
there's probably only python in the dependency list. For all we know,
the python maintainer will take care that all the optional parts get built
anyway. If this changes in the future, I would like to find out
automatically.

Switching to weak semantics means maintainers really have to do more
work mapping out dependencies by reading code/makefiles/changelogs to
find out if it depends on any optional parts of python. Even if (up till
now) the python maintainer managed to include all optional stuff, you
have to learn about the python build system and which parts may be
optional, so you can specify that your package needs them.
Miss one and this problem won't be seen until runtime, because
specifying a weak dependency on the python env would lead to the
upstream build checks not running anymore when something changes.
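
A small sketch of that failure mode, under the assumption that a "weak"
dependency is simply left out of the comparison that decides whether to
rebuild. All names here (`needs_rebuild`, `build_check`, the "imaging"
feature) are made up for the example and are not part of Nix.

```python
def needs_rebuild(old_inputs, new_inputs, weak=()):
    """Return True if any input that is not marked weak differs."""
    strong_old = {k: v for k, v in old_inputs.items() if k not in weak}
    strong_new = {k: v for k, v in new_inputs.items() if k not in weak}
    return strong_old != strong_new


def build_check(inputs):
    """Stand-in for an upstream configure/test step that needs imaging support."""
    return "imaging" in inputs["python"]["features"]


# Hypothetical python builds: the new one dropped the optional imaging part.
before = {"python": {"build": "a", "features": {"imaging"}}}
after = {"python": {"build": "b", "features": set()}}

# Strong dependency: the change triggers a rebuild and the check fails early.
strong_rebuild = needs_rebuild(before, after)
strong_caught = strong_rebuild and not build_check(after)

# Weak dependency: nothing rebuilds, so the check never runs at all.
weak_rebuild = needs_rebuild(before, after, weak=("python",))
weak_caught = weak_rebuild and not build_check(after)

print(strong_rebuild, strong_caught)  # True True
print(weak_rebuild, weak_caught)      # False False
```

In the weak case the breakage is still there; it is just no longer
anyone's job to notice it before runtime.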

This sounds like a problem to me, one that the current semantics solve
by saying "if any inputs change, we want to make sure everything is
still fine (up to the level that upstream build/test scripts would
accept)".

So it's really a matter of trusting upstream build scripts to capture
most issues vs. having maintainers take over and specify all this
themselves. I think manually keeping a dependency list up-to-date is
error-prone.

Specifying _extra_ deps that the current build system doesn't see is
fine with me, but I fail to see how that would lead to fewer rebuilds.

Mathijs

>  
> So, it looks like maybe a "parameterized weak dependency" may be in order, where you list both the environment
> and the optional components to the environment? The particular build of the environment still does not matter as
> long as the environment includes (at least) the correct set of optional components.
>  
> Bryce
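
One possible reading of such a "parameterized weak dependency",
sketched as a toy model: the package's key mentions the environment and
the optional components it requires, but not the environment's exact
build. The names `satisfies` and `weak_key` and the feature labels are
hypothetical; nothing like this exists in Nix as far as this thread
shows.

```python
def satisfies(environment, required_features):
    """True if the environment offers at least the required optional components."""
    return set(required_features) <= environment["features"]


def weak_key(name, env_name, required_features):
    """Key that ignores the environment's exact build and keeps only its name
    and the required optional components."""
    return (name, env_name, tuple(sorted(required_features)))


# Three hypothetical builds of the same environment.
env_a = {"name": "python", "build": "a", "features": {"imaging", "ssl"}}
env_b = {"name": "python", "build": "b", "features": {"imaging"}}
env_c = {"name": "python", "build": "c", "features": set()}

required = ["imaging"]

# The key is the same whichever build is installed, so no rebuild is triggered.
key = weak_key("mypackage", "python", required)

for env in (env_a, env_b, env_c):
    print(env["build"], satisfies(env, required))
```

Builds "a" and "b" would be accepted and "c" rejected, without
MyPackage being rebuilt in any of the three cases. Whether the declared
component list is complete is still left to the maintainer.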