[Nix-dev] Use Haskell for Shell Scripting

Sun Feb 1 00:33:58 CET 2015

2015-01-31 10:22 GMT-02:00 Ertugrul Söylemez <ertesx at gmx.de>:
>> At this current point in time, GHC is packaged in a poor manner, with
>> GHC being unbelievably huge. Dynamic linking is the answer, which
>> isn't done by default.
>
> I have actually experimented with using Haskell (and a few other FP
> languages) as a substitute for shells.  It is feasible if you disable
> dynamic linking.  The non-Haskell libraries are still linked
> dynamically, but the reference to the GHC derivation is then gone.  This
> brings the closure of a Haskell hello-world "script" from a huge 1.1 GiB
> down to a mere 131 MiB (on my x86_64 system), which makes it on par with
> shell scripts.
>
> However, static linking is probably not a good idea.  The resulting
> "scripts" are on the order of megabytes and can quickly approach a few
> tens of them.  To really fix this and make Haskell viable as a shell
> substitute we need to split the GHC derivation.  There should be a pure
> library derivation and a separate compiler derivation.  The former
> should be as small as possible.  Ideally there would be one derivation
> per library.
>
> The other languages I have tried are Scheme (via Chicken), Curry (via
> PAKCS), SML (via mlton) and Idris.
>
> Before I present my results, let me clarify what I think a "script" is:
> It is a string that I can run through a simple Nix function, which gives
> me a derivation that contains a runnable version of that string, either
> binary or shebanged.  This derivation pulls a reasonably sized closure
> along with it.  I can choose to combine many such runnable scripts into a
> single derivation using buildEnv, which is often very useful.  In other
> words:  For the language "blah" there is a simple, deterministic,
> unconfigurable function that would have the following signature in a
> hypothetical typed Nix:
>
>     blahScript : String -> Derivation
>
> This function can be a special case of a slightly more powerful function
> that takes a directory and a main entry point, because if we choose to
> use a better language, we might as well choose to utilise its module
> system, if it has one, for some of our larger scripts.
>
> Now to my results:  All of the above languages, except Curry, work more
> or less, if all you need to do is to start programs or move files
> around.  As soon as you need to do operating-system-specific stuff
> (e.g. `unshare` on Linux) it gets less juicy, because unless someone has
> written a nice high-level library you need to touch the FFI.
>
> Chicken Scheme worked best for that, because rather than trying to model
> the syscall in the language, you can just dump C code into it.  Not a
> nice and clean solution, but a working one for the many cases when you
> just need to -- you know -- get stuff done.
>
> Haskell works, because lots of the OS bindings can be found on Hackage,
> including Linux-specific libraries.  But it does require a slightly more
> expressive 'haskellScriptWith' function.  You need to be able to tell it
> what you depend on.
>
> SML works and produces surprisingly small executables.  It loses at the
> library end, because there aren't many OS-specific libraries around (or
> I couldn't find them).  Also some of the advanced FFI tooling that I'm
> used to from Haskell seems to be missing.  Finally I would say that the
> syntax is too verbose for quick scripting (but that's subjective -- I
> have seen people use VB.NET for scripting).
>
> You might be interested in why Curry didn't work.  Simple: I couldn't
> figure out how to write a program.  Actually I went through the whole
> tutorial, did all the exercises (they aren't really difficult to a
> Haskell programmer) and then skimmed through the whole PAKCS manual.  I
> could write extremely elegant algorithmic code and was quite amazed at
> the beauty of this language, even compared to Haskell.  But in the end I
> still didn't know how to turn all this beautiful Curry code into an
> executable file that I can run without invoking PAKCS explicitly.
> Something with a shebang or ideally something binary.  It would probably
> be possible to write wrapper scripts, but let's just wait until one of
> the implementations becomes mature enough for systems programming.
>
> Finally there is Idris.  It is a beautiful language that comes with
> reasonable editor integration and a lightweight syntax.  It compiles to
> executable binary code and has a carefully designed yet useful FFI.
> Sounds good for scripting.  On the other hand it is very young and
> documentation is far from mature.  Not that I would mind its youth, but
> I do mind the barrier to entry at this point.  At the very least when
> other authors don't understand my code, it should be reasonably obvious
> where to look for answers.  Also the library landscape is very flat, so
> bootstrapping might use most of your time, if you choose to use Idris
> for systems-level scripting at this point.
>
> The most viable options seem to be Chicken Scheme and Haskell.  Both are
> well documented and have a usable FFI.  Chicken produces much smaller
> executables, and programs are very memory-efficient.  By design it
> compiles via C; because of that instead of providing a carefully
> designed FFI it simply allows you to dump C code into it in the spirit
> of inline assembly.  This may seem poor, but it is very useful in
> practice for systems programming, because even in 2015 our operating
> systems are very C-centric.
>
> Haskell performs better on the large scale.  It comes with lots of
> well-designed and safe abstractions, usually gets along with shorter
> code, has a good run-time system (e.g. for concurrency), etc.
>
> All in all while I would use Scheme for small quick-and-dirty batch
> scripts, I would use Haskell for larger scripts or services that
> potentially run for a long time.  But there is no formal line to help
> with this choice.  It would take a while of experimentation to provide a
> more educated answer on when to use which language.

Well, as far as I know, Guix is our Guile-Scheme counterpart. Using a
language like Chicken is almost like using Guix :)

What about other languages such as Python, Perl, etc.? I know it is
against our purity standards, but they are far superior to Bash scripting.

In my humble opinion, we can just stick to your choice: a
Chicken-Haskell duet.
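
To make the Haskell side concrete, here is a minimal sketch of the kind of
"script" discussed above, one that only moves a file around and starts a
program. It assumes nothing beyond the directory, filepath and process
packages that ship with GHC. The file names, the directory name and the
backupTarget helper are made up for illustration and do not come from the
thread.

```haskell
-- Illustrative sketch only; nothing here is taken from the thread.
-- Uses directory, filepath and process, which ship with GHC.
import System.Directory (createDirectoryIfMissing, getTemporaryDirectory, renameFile)
import System.FilePath ((</>))
import System.Process (readProcess)

-- Hypothetical helper: where a file ends up inside the target directory.
backupTarget :: FilePath -> FilePath -> FilePath
backupTarget dir name = dir </> name

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  let src = tmp </> "notes.txt"            -- made-up file name
      dir = tmp </> "haskell-script-demo"  -- made-up directory name
  writeFile src "hello\n"
  createDirectoryIfMissing True dir
  -- "Move files around": rename within the same file system.
  renameFile src (backupTarget dir "notes.txt")
  -- "Start programs": run ls and capture its standard output.
  listing <- readProcess "ls" [dir] ""
  putStr listing
```

Anything OS-specific beyond this (such as `unshare`) would need a binding
from Hackage or the FFI, which is where a function that accepts a list of
dependencies comes in.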
