[Nix-dev] Re: [Nix-commits] SVN commit: nix - 17661 - raskin - in nixpkgs/trunk/pkgs: development/libraries/avahi development/libraries/consolekit development/libraries/gstreamer development/libraries/gstreamer/gst-ffmpeg development/libraries/gstreamer/gst-pl

Tue Oct 6 16:05:32 CEST 2009

Hi list,

If you don't have time, jump to the last headline:
  "How could a playground look like?"

I feel that we're missing some kind of playground as well.
Proposing patches which are then accepted or reverted by the list is
natural, but it may cause bad feelings for a short time.

It's also natural that some bugs are found later on.

For example it happened to me:
In rev 17316 I added a patch making it possible to pass a directory
telling alsa where to find plugins so that sound can be routed to jackd
or pulseaudio. Two days later I noticed that I had to add the same patch
for mixer controls as well, because alsamixer fails once those special
items are added to .asoundrc. I haven't committed that yet because it
would cause everything depending on alsa to be rebuilt again, and
because probably nobody but me is using it right now.

It was important to me to commit the patch to show others that I did
some work.

Another typical example is:

  commit rev 17638 : added sitecopy
  commit rev 17639 : ssl support for sitecopy (kind of a fix for the previous commit)
  commit rev 17640 : openssl support for sitecopy (kind of a fix for the previous commit)

So for each topic there is a kind of evaluation period:

added (very likely to change) -> becoming stable -> stable (very unlikely to change)

It's very important that we all know about the first step. This way
work won't be done twice, and if there are conflicts we are aware of
them and can resolve them by communicating (that's what Eelco Dolstra
did when he told me I should hold on). I met a man on irc who told me
he dislikes git because stacking commits and pushing them all at once
can make the work of teams obsolete. However, he worked in an office on
systems where more than 100 commits are submitted each day.

So no matter what happens I think we want

- a playground to show others recent work. It is suitable for
  distributing proposals so that they can be discussed. After some
  discussion and bug fixing things can be merged upstream.
  Right now this takes place locally. E.g. Michael Raskin said that his
  patch was the result of heavy testing and bugfixing.
  Because traffic is not that high and things change slowly in general,
  this works very well in almost all cases for the nix* repositories.

- Some kind of more stable, cleaned-up upstream path to follow.
  As Eelco Dolstra said, the nixos descriptions are visible to the
  users, and so is the history. Maybe that's less important, but still:
  having a clean history can make things such as bisecting some trouble
  with git easier. We also want fewer commits because this means less
  effort to keep up to date.

For nixpkgs things get even more complicated because some updates
cause a lot of trouble for others, e.g. when much rebuilding has to take
place. If you have a slower machine this is very likely to take 24h or
more if you want to build OpenOffice, X, kde and whatnot.

I can think about two development styles:
A) Doing heavy updates in a branch (such as stdenv-updates) keeping
   trunk stable
B) Doing heavy updates in trunk/master adding release branches which are
   more stable.

What do we care about?
- Don't rebuild the system if it's not necessary to get a job done.

  looking at A) (stdenv-updates)
    commits causing rebuilds go to the stdenv-updates branch. So this is fine

  looking at B) (release-branches)
    you would work on release-branch X so you don't have rebuilds

- Security updates. If there is one we want to know about it.

  looking at A) (stdenv-updates)
    security stuff is merged into trunk. So everyone is using it and
    gets those important updates whether he uses them or not.

  looking at B) (release-branches)

    You could add something like this to release branches:
      assert getConfig ["security" "i_know_about" "firefox_flaw_detected" ] false;

    to tell users of a release branch that this package should no longer
    be used. Then they switch to a more recent branch containing those
    updates. So you have fine-grained control. How long such a release
    branch lives depends on how many users are back-porting updates from
    trunk. If a branch is considered dead you could add an assertion
    telling users to update.

    On servers you could run update and nixos-rebuild build. If it fails
    a new assertion has been added. This could be reported to the admin
    by email.

- least effort:

  A) (stdenv-updates)
    There are only two branches. So resources are spent on either one

  B) There are more branches. Resources go to them all.
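
The server-side check mentioned under security updates (run the build,
report to the admin if it fails) could be sketched roughly like this.
This is only an illustration: the function name check_build, the mail
address and the choice to quote the tail of stderr are my assumptions,
and actually sending the mail is left out.

```python
import subprocess
from email.message import EmailMessage
from typing import Optional, Sequence


def check_build(command: Sequence[str], admin: str) -> Optional[EmailMessage]:
    """Run the build command; return a mail for the admin if it fails.

    `command` would be something like ["nixos-rebuild", "build"] on a
    real server. Returns None when the command exits successfully.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        return None  # build went through, nothing to report

    mail = EmailMessage()
    mail["To"] = admin
    mail["Subject"] = "build check failed: " + " ".join(command)
    # Quote only the tail of stderr; a failing assertion may show up there.
    mail.set_content(result.stderr[-2000:] or "(no error output)")
    return mail
```

A nightly job could call check_build(["nixos-rebuild", "build"],
"root@localhost") after updating the checkout and pass any returned
message on to the local mail system.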

How do other projects handle this?
The "git" way using topic branches which are merged into "next" or "master":

  git: Most patches are posted to the mailinglist. They are reviewed and
      updated. Finally someone commits them upstream.
  top-git: the same
  upstart: I think they have topic branches as well.

I called it the git way because creating topic/feature branches is
a lightweight operation in git compared to svn.

What is nice about this? The people who know their feature branches best
are responsible for keeping them up to date. Compare this to nixpkgs:
Someone (Eelco Dolstra ?) has to merge stdenv-updates and trunk.
Of course we all try to help where we can. This can still cause some
headache. We already had this in the past when merging modular-nixos and
nixos. What makes this approach consume more time? Multiple parties
are involved. Care has to be taken that commit messages aren't lost, to
honor the primary authors of those patches.
If you have feature branches people will care about this themselves. If
they don't, they just rebase. The difference is that this means less work
for the one person who merges stdenv-updates and trunk right now.
But maybe Eelco Dolstra should say himself how much trouble this
actually causes him.

I'd like to compare SVN and git on the features I think are
important:

History:
  SVN: only on server. So a clean history isn't that important because you
    usually don't download it. You only have the current copy.

  git: you always have the whole history present. So ensuring that it's
    clean is more important. You can also play with the history.

    Example: you intentionally rewrite history to see which commit
          introduced a problem using git bisect. Typical issue:

           commit 1:
           commit 2: source location update
           commit 3: causing trouble
           commit 4: version bump
           commit 5:..

    Now you know that 1 was fine and that 5 was bad. Using git you can
    easily amend the source update into commit 1, rewriting the whole
    history. Result: you can run bisect without problems. I don't even
    know how to do that with svn. Maybe conditionally applying local
    patches after running svn checkout -r $REVISION ?

    However things will be more complicated when more branches are
    present..
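
To make the bisect argument above a bit more concrete, here is a
simplified model of what bisecting does. It is not git's actual
implementation, just a plain binary search. The name first_bad and the
is_good callback are made up for this sketch.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit in between can actually be built and tested.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(commits[mid]):
            good = mid   # the problem was introduced later
        else:
            bad = mid    # the problem is already present here
    return commits[bad]
```

The last assumption is the important one: if a commit in the middle
can't be built (for instance because the source location is stale),
is_good has no answer for it. Folding the source location update into
the earliest commit is what makes every step testable.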

speed:
  SVN: slow. Compared to git it just sucks. You have to wait until
        things are transferred etc.
  git: Much faster to work with. One reason is that you can commit on a
      local copy without connecting to the server.

size:
  SVN: constant
  git: will grow over time (disks are getting cheaper. Right now it's
    not a concern. git compresses contents really well)

author verification:

  SVN: you are identified by your account (login + password)

  git: anyone can commit using anyone's identity. Maybe there is a
       way to sign patches. I'm not familiar with this

collaboration, reviewing:

  SVN: you have to send patches. This is limiting. Creating branches on
        the main server isn't done in practice.

  git: You can split a patch into chunks so that others can follow the
        changes introduced by a feature branch more easily.
        You can have your own repo and ask others to pull from it in
        order to review a change. You can follow others' work before they
        commit it.

who edited a file (thus who to ask if you have questions)

  SVN: svn blame: It works, you have to connect to the server -> slow

  git: git blame: Much faster

Why did I compare svn with "git" only?
- it's used by most users here not using svn. That's because git-svn is
  very stable.
- Probably it's going to be very mainstream. E.g. people are working on
  a Java and a .net implementation of git.
- I know it best. That's a personal reason though.

How could a playground look like?
==================================
First of all a playground should be very easy to use.
Thus there should be a command such as

put-rev-into-playground

which should trigger a mail to a (new) mailinglist telling users that
there is a new patch which the author is going to commit soon.

If we used git this would be very easy to accomplish:
Everyone could take care of creating a repository (maybe on github).
Then this location could be registered using a web site by the author.
A nixos.org cron job could periodically fetch changes from those
repositories sending a mail to a new "playground" mailinglist.

This should be done once a night only. There should be a delay sending
notifications about playground updates to the new mailinglist to reduce
noise.
Example: I update foo. I ask someone on irc to review it. Someone spots that I
forgot to update the name. I repush my commit to the playground. 6 hours
later it will be included in a playground mailinglist message saying
that this will cause a rebuild of XX packages on hydra.
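
The delay logic from the example above could look roughly like the
following sketch. Everything here is hypothetical: the Push record, the
function name pushes_to_announce and the 6 hour default are only taken
from the example, not from any existing tool. The idea is that only the
newest push of a branch is announced, and only once the branch has been
quiet for a while, so a quick repush after a review replaces the earlier
one instead of generating a second mail.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Push:
    branch: str          # playground branch name (hypothetical)
    commit: str          # commit id that was pushed
    pushed_at: datetime  # when the cron job saw the push


def pushes_to_announce(pushes: Iterable[Push], now: datetime,
                       quiet: timedelta = timedelta(hours=6)) -> List[Push]:
    """Newest push per branch, limited to branches quiet for `quiet`."""
    latest: Dict[str, Push] = {}
    for push in pushes:
        seen = latest.get(push.branch)
        if seen is None or push.pushed_at > seen.pushed_at:
            latest[push.branch] = push  # a repush replaces the older one
    ready = [p for p in latest.values() if now - p.pushed_at >= quiet]
    return sorted(ready, key=lambda p: p.branch)
```

A real job would also have to remember which commits were already
announced; that bookkeeping is left out here.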

On top of that we could introduce things like:
  nix-get-all-playground-branches
  nix-show-unmerged-playground-commits-older-than-2days
  nix-get-hydra-rebuild-amount-estimation  $commit

The idea is to think about a policy where the author asks someone else to
commit his patch. This ensures that
- each patch is reviewed (spelling could be corrected etc)
- using irc this will be very fast for trivial patches.
- the author of the patch has to convince the committer of the value
  of the patch. (This catches things like my attempts to add the patches
  to nix and git which both were reverted)
- I think this system will scale very well in the future
- It will enhance the community feeling, which is good.
  We should try to communicate and improve.
- This is very nice for newcomers. They can add their patches to
  playground. They ask for reviews. They'll get help.
  They feel integrated from the very beginning, which leads to more
  people working on nix* ?

This is a vision only. I hope it boosts your creativity when thinking
about alternatives even if we don't use them.

I also see that this is a great idea which naturally adds quite a lot of
overhead. Maybe it causes even more overhead than committing two patches
fixing a previous commit. So I'm not sure I really want to switch. But
I'd like to improve awareness that we still can add more features making
it easier to decide whether a patch should go to trunk or stdenv-updates
for instance.

I think there are many case studies showing that pair programming leads
to fewer errors in code. So moving in a direction that causes more
collaboration is good.

I'm curious about your feedback.

Marc Weber