[Nix-dev] Re: [Nix-commits] SVN commit: nix - 16859 - raskin - in nixpkgs/trunk/pkgs: build-support/builder-defs build-support/upstream-updater development/libraries development/libraries/geoip tools/admin tools/admin/webdruid top-level

Marc Weber marco-oweber at gmx.de
Wed Aug 26 15:52:32 CEST 2009


Hi eelco & list

Excerpts from Eelco Dolstra's message of Wed Aug 26 14:04:16 +0200 2009:
> I'm not convinced that this auto-updating is "worth it".  Having two extra Nix

When thinking about this, maybe the perfect solution is to write a new
software-update notification service, so that the effort of writing these
regexes (which are likely to break) can be shared by most distros.
The service could provide an API that lets clients (such as the
nix-source-update-script) query the latest version of project-id xy
easily. I guess Debian, Gentoo (and whatnot) developers would be glad about
this as well?
Moreover, if a bug is found, you could automatically downgrade again or
"prune" malicious versions, and maintainers could be notified
automatically by email so that they can pick it up.

Let's not forget that we don't want all updates on the main branch,
because they'll force many rebuilds.


  I had a lengthy discussion with MichaelRaskin yesterday. We talked about how
  interfaces can be designed that tell update scripts how to automatically update source locations.

  facts / goals:
  * the name has to be updated
  * the src attribute has to be updated
  * it must be understandable
  * we don't want to add huge chunks of code to the expression files
  * don't invent much special syntax (Michael: no special syntax)
  * (Marc: it should support git/hg/svn directly as well)

  Example package (current style)

    stdenv.mkDerivation {
      name = "foo-2.0";
      src = fetchurl {
        url = "http://.../2.0.tar.gz";
        sha256 = "..";
      };
      patches = [
        (fetchurl { /* .. */ })  # uncommon, but the update system should be able to handle this as well (?) -- whatever that means here
      ];
    }


  short explanation of the (non-final?) implementation committed by Michael,
  focusing on how to use it:
  ================================================================================

  If you're already familiar with it, jump to "Now you can run" below.

    The source information is put into an additional file called src-for-<file>.nix (e.g. src-for-default.nix for default.nix):


      rec {
        advertisedUrl="http://downloads.sourceforge.net/gphoto/files/gphotofs/0.4.0/gphotofs-0.4.0.tar.bz2";
        version = "0.4.0";
        url="http://downloads.sourceforge.net/gphoto/files/gphotofs/0.4.0/gphotofs-0.4.0.tar.bz2";
        hash = "07zxnawkyzy6np9zas6byp9ksjkbi16d11zqrznqarfkf3fkg3yq";
      }


    Another file, src-info-for-file.nix, is added, telling the update script how to find more recent source URLs:

      {
        downloadPage = "http://nightly.webkit.org/";
        versionExtractorSedScript = "s/.*-(r[0-9]+)[.].*/\\1/";              # [X]
        versionReferenceCreator = "s/-(r[0-9.]+)[.]/-\${version}./";
      }


    so in the end we have three files:
      gcc/default.nix              (+4 lines approx )
      gcc/src-for-default.nix      ( 5 lines approx )
      gcc/src-info-for-default.nix ( 5 lines approx )

    I think having many small files is bad for performance reasons. I may be wrong
    here.


    The derivation itself looks like this now:


      let
        s = import ./src-for-default.nix;
        version = lib.attrByPath ["version"] s.version args;  # <- I consider this a boilerplate line
      in

      stdenv.mkDerivation {

        name = "foo-${version}";

        src = fetchurl {                                       # <- how would fetchgit, fetchcvs etc. be used? (see the sketch below)
          url = s.url;
          sha256 = s.hash;
        };

        [...]
      }
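
    One possible answer to the fetchgit question above -- only a sketch, not part
    of the committed implementation. It assumes src-for-default.nix additionally
    carries `type` and `rev` attributes (both made up here for illustration), and
    that the expression dispatches on them:

      let
        s = import ./src-for-default.nix;
        # both fetchers take their arguments from the shared attrset; only the
        # selected one is ever evaluated, because Nix is lazy
        fetchers = {
          url = fetchurl { url = s.url; sha256 = s.hash; };
          git = fetchgit { url = s.url; rev = s.rev; sha256 = s.hash; };
        };
        type = if s ? type then s.type else "url";
      in
      stdenv.mkDerivation {
        name = "foo-${s.version}";
        src = builtins.getAttr type fetchers;
        # [...] rest of the derivation stays unchanged
      }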

    Now you can run update-upstream-data.sh to create an updated version of
    src-for-file.nix, which is written to yet another file. It's then your task to
    replace the existing file with the updated one using cp.


    Looking at the src-info-for-file.nix file [X], I notice that some information can't be shared.
    Example: we have about 296 SourceForge packages:

      nixpkgs/pkgs$ grep -r mirror . | grep sourceforge | wc -l
      296

    So a regex has to be copied 296 times. I consider this a failure.
    Let's have a closer look: the additional files are read by Nix and turned into
    an attrset. That attrset is written to a file, which is read by the update
    script to extract the regular expressions.
    So you can replace the file with something shorter:

      let pkgs = import [..]; in pkgs.srcInfos.sourceforge.default;

    and share the information. Not that bad, is it?
    But don't forget that you still have to create 296 files with the same contents!
    (So it's still ugly :-/ )
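
    A sketch of what such a shared attrset could look like (srcInfos does not
    exist in nixpkgs; every attribute name below is made up for illustration).
    The per-project download page still differs, so it is passed in as a
    parameter:

      # somewhere central, e.g. in the top-level package set (assumption):
      srcInfos.sourceforge = rec {
        # generic version handling for name-1.2.3.tar.* style tarballs
        default = {
          versionExtractorSedScript = "s/.*-([0-9.]+)[.]tar.*/\\1/";
          versionReferenceCreator   = "s/-([0-9.]+)[.]tar/-\${version}.tar/";
        };
        # add the project-specific download page on top of the shared scripts
        forProject = project: default // {
          downloadPage = "http://sourceforge.net/projects/${project}/files/";
        };
      };

    A package's src-info-for-default.nix would then shrink to one line, e.g.

      let pkgs = import [..]; in pkgs.srcInfos.sourceforge.forProject "gphoto"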



  I had in mind improving my bleeding-edge stuff, which serves a similar purpose
  for repositories, in the following way:
  =============================================================================

    let

    ### START BLOCK SRC_UPDATE
      # update-script: default-update http://the-page-containing-links http://..../name-(.*).tar.gz
      version = "..";
      name = "foo-${version}"
      src = fetchurl {
        url = "...${version}...";
        hash = "...";
      };
    ### END BLOCK SRC_UPDATE

    in
    stdenv.mkDerivation {
      inherit src name;
      [...]
    }

    Note that the parentheses in the regex indicate which part of the matched URL is the version, i.e. the captured group becomes the new version string.


    I assume that default-update is smart enough to figure out that a URL is a
    sourceforge URL and to rewrite it as mirror://..., so this information lives
    in one place instead of 296.
    You don't have to care about this yourself at all, which is good.
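
    For illustration, the rewrite the update script would have to perform for the
    gphotofs example above (the exact mirror path layout is an assumption here):

      # URL as found on the download page:
      #   http://downloads.sourceforge.net/gphoto/files/gphotofs/0.4.0/gphotofs-0.4.0.tar.bz2
      # what the script should write into the generated block instead:
      src = fetchurl {
        url = "mirror://sourceforge/gphoto/gphotofs-0.4.0.tar.bz2";
        sha256 = "...";  # e.g. obtained via nix-prefetch-url
      };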

    You can start by writing the block without contents

    ### START BLOCK SRC_UPDATE
      # update-script: default-update http://the-page-containing-links http://..../name-(.*).tar.gz
    ### END BLOCK SRC_UPDATE

    and let the update script fill in the missing attributes:
    version, name and src = fetchurl { ... };

    differences:

      * one file only

      * it's more likely to fail, because the required smartness may not have been
        implemented yet. False positives are possible.
        (E.g. a URL could be rewritten as mirror://sourceforge/...
        even though it belongs to another mirror system!)

      * Michael: some automatic code-rewriting / comment-formatting tools may break the
        special syntax.
        Marc: I think we can make the chunk-extraction tool smart enough to cope with
        that as well.

      summary: less additional code (in this *perfect* example only 5 lines) and no additional files.
      This means fewer chances to get something wrong.


  The biggest difference we'd like to get your thoughts on:

    What do you prefer:

    a) having everything in one block (containing the update information and the
       src, name and version attributes) which can be edited without opening
       additional files

    b) putting the same information into external files living in the same
       directory

    c) a totally new option: don't even add block markers; write a script smart
       enough to do everything on its own

  I think everything else is an implementation detail.

  You've probably already noticed that I prefer the block style, because it's
  faster to write, commit, merge, run git status on, etc.

Marc Weber


