Duck! Another blag incoming!

Haskell and Dummy (1)

2017-02-27

Part one: Some essentials

If something is seriously wrong, please file an issue.

I know a tiny bit of haskell, I learned it on paper or in my head only. I read a book about haskell, but most of the time I didn't have a computer with me. I watched a lot of talks about haskell and never bothered to try anything. So I decided to write a program I find very useful: The deduplicator. It recurses over a directory and finds all duplicated files and replaces them with hard links. In these posts I'm just doing the first part of it and I'll scrutinise all basics I only heard of.

The first part is a haskell program that takes a directory as an argument and returns the absolute path and the sha256sum of each file in it.

I know, it's going to be hard, because haskell is easy as long as you don't interface with the world. Haskell on paper and in my head is much easier than real world haskell.

Let's search for it:

Haskell read directory recursively

It seems FilePath is what we are looking for, but checking some other links tells me it's not going to be easy.

Let's take a step back. Print the first argument of the haskell program. That's gonna be easy, right?

Easy

You just have to accept the following or watch Brian Beckman's video if you don't:

  1. Data and functions are the same thing
  2. = defines a function/data
  3. The function called main is called when a haskell program is started

import System.Environment

main = do
    arg <- getArgs
    print arg

Well, that was fairly easy, but what the hell am I doing? First of all, I know that IO uses monads and if you use monads you have to use the do-notation. But why exactly? Let's ask Miran:

In a do expression, every line is a monadic value. To inspect its result, we use <-. If we have a Maybe String and we bind it with <- to a variable, that variable will be a String. [1]
[1]Replace Maybe with Monad.

In the do-notation we chain each function to the next one.
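To see Miran's sentence at work outside of IO, here is a minimal sketch with Maybe. The names firstWord and shout are made up for this example; only words, map and toUpper come from the standard library.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: the first word of a string, if there is one.
firstWord :: String -> Maybe String
firstWord s = case words s of
    []      -> Nothing
    (w : _) -> Just w

-- firstWord returns a Maybe String, but after <- the variable w
-- is a plain String that map can work on.
shout :: String -> Maybe String
shout s = do
    w <- firstWord s
    return (map toUpper w)

main :: IO ()
main = do
    print (shout "hello world")   -- prints Just "HELLO"
    print (shout "")              -- prints Nothing
```

If firstWord comes back with Nothing, the rest of the do block is skipped and the whole thing is Nothing. That skipping is the "something" the Maybe monad manages for us.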

I ask myself:

Are do and <- basic haskell syntax? I think this is a pile of syntactic sugar, and since I am on a diet, I want it without sugar.

The haskell-wiki agrees:

Do
Syntactic sugar for use with monadic expressions. For example:
do { x ; result <- y ; foo result }

is shorthand for:

x >>
y >>= \result ->
foo result

presto!!

main = getArgs >>= \arg -> print arg
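The wiki example also uses >>, which the one-liner doesn't need. Here is a small sketch, as far as I understand it, that puts the sugared and the sugar-free version of the full x / y / foo pattern side by side. The describe helper and the printed strings are my own invention, just to have something to run.

```haskell
import System.Environment (getArgs)

-- Hypothetical helper so there is something to do with the result.
describe :: [String] -> String
describe args = "got " ++ show (length args) ++ " argument(s)"

-- With sugar: x ; result <- y ; foo result
sugared :: IO ()
sugared = do
    putStrLn "counting"
    args <- getArgs
    putStrLn (describe args)

-- Without sugar: >> throws the result of the left side away,
-- >>= hands it to the lambda on the right.
desugared :: IO ()
desugared =
    putStrLn "counting" >>
    getArgs >>= \args ->
    putStrLn (describe args)

main :: IO ()
main = sugared >> desugared
```

Both halves print the same two lines, so running it shows the output twice.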

But what on earth are >>= and ->? What about the backslash?

\arg -> print arg

Well, the backslash gave haskell its logo; it's used to define a lambda function. This is fairly easy: arg goes in, we print arg. And >>= is the bind function: Sequentially compose two actions, passing any value produced by the first as an argument to the second. It chains two functions, making them composable; more about that below.

Why the heck isn't it:

main = >>= getArgs (\arg -> print arg)

>>= is infix notation, the dirty little fucker. [2]

[2]Pardon my Irish

mad challenge!

Can we make >>= prefix notation?

Of course!

bind x y = x >>= y

main = bind getArgs (\arg -> print arg)

In this case you actually don't need the lambda. The following is perfectly fine.

main = getArgs >>= print
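It turns out the bind helper isn't strictly needed: wrapping an infix operator in parentheses makes it a prefix function. A short sketch, with the type signature for bind written out (it is the same type >>= has) and a Maybe example thrown in to show it isn't tied to IO:

```haskell
import System.Environment (getArgs)

-- The helper from above, with its type written down.
bind :: Monad m => m a -> (a -> m b) -> m b
bind x y = x >>= y

main :: IO ()
main = do
    -- Three spellings of the same thing.
    getArgs >>= print
    bind getArgs print
    (>>=) getArgs print
    -- Works for other monads too; both lines print Just 4.
    print (bind (Just (3 :: Int)) (\n -> Just (n + 1)))
    print ((>>=) (Just (3 :: Int)) (Just . (+ 1)))
```

The reverse trick exists as well: backticks turn a prefix function into an infix one, so getArgs `bind` print is also fine.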

Since I want to print only the first argument, I have to extract it using the head function, but I very much dislike the parens around (head arg). I think, I remember a way to get rid of them.

main = do
   arg <- getArgs
   print (head arg)

becomes

main = do
   arg <- getArgs
   (print . head) arg

This is function composition. We combine print and head, where whatever head returns is something print accepts, and create a new print_head function that takes the same argument as head. That is the reason for binding monads: monoids, here functions that take and return the same type, can be composed easily. But monadic functions return the type plus something, and bind reduces that type plus something back to the type. Voila, function composition is possible again, while that something allows us to manage nasty things like state.
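Writing the types down helped me see why the pieces fit. A minimal sketch with pure functions only; showHead and addThenDouble are names I made up, and show stands in for print so the result can be inspected without IO.

```haskell
-- head :: [a] -> a and show :: Show a => a -> String,
-- so the output of head is exactly what show wants as input.
showHead :: Show a => [a] -> String
showHead = show . head

-- Functions from a type to the same type compose in any order
-- and any number of times; the right-hand one runs first.
addThenDouble :: Int -> Int
addThenDouble = (* 2) . (+ 1)

main :: IO ()
main = do
    putStrLn (showHead [1, 2, 3 :: Int])   -- prints 1
    print (addThenDouble 5)                -- prints 12
```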

print_head = print . head

main = do
   arg <- getArgs
   print_head arg

And then the haskell god called me: "We use camelCase in haskell!" I was like: "Serious bitches, that reminds me of the languages I hate most, can we please switch to underscores, like all proper languages do?". That was the first time a god struck me with lightning, so I gave in.

printHead = print . head

main = do
   arg <- getArgs
   printHead arg

So finally we see that monadic composition is the same thing with different scribbles.

printHead = print   .   head        -- function composition
main      = getArgs >>= printHead   -- monadic  composition

And it is a program that compiles and runs.
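One catch: head throws an exception on an empty list, so starting the program without arguments crashes it. Below is a hedged variation that uses listToMaybe and fromMaybe from Data.Maybe instead. It also shows >=> from Control.Monad, which, as far as I can tell, is the closer twin of the dot: it composes two monadic functions into a new one, while >>= feeds a value into one. safeHead, firstOfFirst and the hint text are my own names.

```haskell
import Control.Monad ((>=>))
import Data.Maybe (fromMaybe, listToMaybe)
import System.Environment (getArgs)

-- Like head, but returns Nothing instead of crashing on [].
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

-- Monadic composition with >=>: the first element of the first
-- element, or Nothing if either step finds an empty list.
firstOfFirst :: [[a]] -> Maybe a
firstOfFirst = safeHead >=> safeHead

main :: IO ()
main = do
    args <- getArgs
    -- First argument, or a hint if there is none.
    putStrLn (fromMaybe "no argument given" (safeHead args))
    -- First character of the first argument, if any.
    print (firstOfFirst args)
```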

Brian Beckman gives a very light introduction to functions, monoids and monads. Check it out. [3]

[3]Because this video is a gem and in the spirit of Jason Scott I don't trust the Internet, I archived it.

Why haskell matters (to me).

If you write good software in any language, you are going to use these concepts. For example, generics and monads have a lot in common, and LINQ is actually monadic. So if you use these concepts, it would be nice to know how they come about. Even if I am never going to write any production software in haskell, I can use the concepts in any language that has functions. Haskell just makes sure I really have to gain a deeper understanding. Besides, I really believe that I will use haskell in the future.

This entry was tagged as haskell dedupe