Haskell: The Good, Bad, and Ugliness of Types

I've started to learn Haskell. For those who don't know, Haskell is a wonderful little language built on lazy evaluation, pure functional programming, and a type calculus.

Effectively, this means that, like Erlang and its sister languages, if I write a function foo in Haskell and evaluate it at line 10 in my program, then evaluate it again at line 10000, or 10000000, or any other point in my code, it will always return the same value given the same input. Furthermore, if I write a function to generate an arbitrarily long list of ones, like this:

listOfOnes = 1 : listOfOnes

Haskell just accepts it as valid. No questions asked. Schemers and ML'ers of the world are probably cowering in fear. Recursive definitions like this are scary in an eager language, but Haskell is lazy. Whereas the equivalent definition in Scheme:

(define list-of-ones
  (cons 1 list-of-ones))

would explode and probably crash your computer (that is, if the interpreter didn't catch it first), in Haskell it's not evaluated until it's needed, so until I ask Haskell to start working on the listOfOnes structure, it won't. I like languages like that. IMO, if a language is at least as lazy as I am, it's good.
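To see the laziness in action, here's a minimal sketch (the take 5 demand is my own example, not from the post, and I repeat the definition so the snippet stands alone). Only the five cells we actually ask for ever get built:

listOfOnes :: [Int]
listOfOnes = 1 : listOfOnes

-- Demand only the first five elements; the rest of the infinite list
-- is never constructed.
main :: IO ()
main = print (take 5 listOfOnes)  -- prints [1,1,1,1,1]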

The third really neat thing about Haskell, and what really drew me to it in the first place, is the type checker. I've used Scheme for a while now, and I love it to death. Sometimes, though, Scheme annoys me. For instance, I was once working on a function like this:

;count-when-true : [bool] x [num] -> num
;supposed to be a helper for filter: I want to do a conditional sum. So I pass in
;(filter foo some-list-of-numbers) and some-list-of-numbers, and I should get out
;a sum of the selected elements.
;or-list : ([a] -> bool) x [[a]] -> bool
;applies a predicate across a list of lists and ors the results
(define (count-when-true list-of-bools list-of-numbers)
  (cond [(or-list null? (list list-of-bools list-of-numbers)) 0]
        [(car list-of-bools)
         (+ (car list-of-numbers)
            (count-when-true (cdr list-of-bools) (cdr list-of-numbers)))]
        [else (count-when-true (cdr list-of-bools) (cdr list-of-numbers))]))

This probably has bugs in it, doesn't work right, etc., but the idea is to return a conditional sum. Now, I want to use this on lists, that's how it's defined, but sometimes the calling function would try to call it on atoms instead of lists. Big problem? Not really. Pain in the ass to find? You bet. The issue was, when I was trying to figure out what was wrong, Scheme didn't realize that the types of the inputs were wrong. That would have made the error obvious, but Scheme doesn't care about types; that's its principal strength, until it starts making bugs hard to find. I HATE it when it's hard to find bugs.
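For contrast, here's a rough Haskell sketch of the same idea (the name countWhenTrue and the zip-based body are mine, not a translation of my Scheme above). The payoff is that handing it an atom instead of a list never even gets the chance to run:

countWhenTrue :: [Bool] -> [Int] -> Int
countWhenTrue bools nums = sum [n | (b, n) <- zip bools nums, b]

-- countWhenTrue [True, False, True] [1, 2, 3]  ==> 4
-- countWhenTrue True 3  ==> rejected by the type checker, since True is not a [Bool]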

Let's face it: as programmers, we suck. We write lots of buggy functions; things are generally done wrong the first (two or three… thousand) times. Programming is a recursive process: we write some code, run it, check for bugs, fix bugs, run it, check, fix, etc., until we get tired of finding bugs or the program stops coughing any up. IMO, languages should not be designed to force programmers to write bug-free code, which seems to be the consensus today, at least from what I gather from the interweb and such. The goal should be to make all bugs so blatantly obvious that when the programmer sits down and tries to debug his program, he can't help but smack himself in the face and proclaim, "!@#$, I missed that!" This is where Haskell shines.

When I write Scheme, I typically don't want to be burdened by knowing which types go where. Scheme is great at this; however, it takes things too far, I think, in that it forces you to never have types. Sure, typed Schemes exist, but most of them suck, because Scheme isn't designed for types. Don't get me wrong, typed Schemes are wicked cool, and I've used types in CL too; they're great, especially when you want to compile. So to solve the problem of not having types, we invented contracts, which are cool. For the unenlightened: a contract is a specification of what a given data structure or function does in terms of its arguments, e.g.:

+ : num * num -> num
toASCII : string -> num
toCHAR : num -> string

etc.

These can be read as follows:

Literally:

+ is num cross num to num
etc.

In English:

+ is a function which takes two numbers and returns another number.

In Scheme, these contracts are basically comments, so type checking is left to the programmer. This is all well and good, but I find it often leads to the practice of what I like to call single-typing, in which the programmer attempts to force all of his data to have the same type, or lists of the same type, or lists of lists, or etc. Typically, this results in convoluted data structures which give FP in general a bad name. I've seen some horrible code written by single-typers; it's bad, horrific even. It makes me want to gouge out my eyes with a pencil and tear my brain out… okay, maybe it's not that bad. Still, single-typing is most often bad. So how does Haskell fix it?

By not changing a thing.

Contracts are a wonderful idea. They work; they just don't work in Scheme, because Scheme was designed that way. Haskell has type inference, so you don't ever need to touch the type calculus capabilities of Haskell; you can, more or less, literally translate Scheme to Haskell with minimal difficulty. (Though it may be easier just to write Scheme in Haskell.) But the brilliance of Haskell is this:

Here's the standard factorial function in Scheme:

;fac : int -> int

(define (fac x)
  (cond [(= 0 x) 1]
        [else (* x (fac (- x 1)))]))

Here it is in Haskell:

fac :: Int -> Int
fac x
  | x == 0    = 1
  | otherwise = x * fac (x - 1)

(I used a ML style to make things look the same.)

The only real difference (besides some syntax changes) is the lack of the semicolon in front of the contract.

But what does all this do? Well, the difference comes during evaluation. Watch this:

In Scheme:

(fac 1.414)

we have an infinite recursion, because:

(fac 1.414) -> 1.414 * (fac 0.414) -> 1.414 * 0.414 * (fac -0.586) -> …

In Haskell:

fac 1.414

is a type error, and the whole thing kersplodes. Over. Evaluation done. Haskell has denied your function the right to evaluate.
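A minimal sketch you can feed to the compiler to watch the refusal happen (the main wrapper is my own; the exact wording of the error varies by GHC version, but it will complain that a fractional literal can't be an Int):

fac :: Int -> Int
fac x
  | x == 0    = 1
  | otherwise = x * fac (x - 1)

-- This whole program is rejected at compile time; fac 1.414 never gets
-- the chance to recurse.
main :: IO ()
main = print (fac 1.414)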

In short, you have been rejected.

Enough about the wonderfulness of the type system. My title says the Good -> Bad -> Ugliness; obviously we've seen the good. How about the bad?

Type Errors in Haskell:

Type errors in Haskell suck, easy as that. They're hard to understand and, in general, not very helpful. Further, a lot of the differences between types are very subtle. For instance, consider the factorial function again (just the type contracts, for succinctness):

fac0 :: Int -> Int
fac1 :: Num a => a -> a

They look equivalent, right? Wrong. Num is not Int; it includes the Reals too.* So no lovely type errors here. These things are unfortunate, yes, but nothing's really perfect. I could deal with this, but what I can't deal with is exactly the problem I hoped to solve with Haskell: my bugs are hard to find. Not only that, they're not hard to locate; I know exactly where they are, I just can't decipher the cryptic text from the Haskell error stream to know exactly what the bug is. So I have to resort to piecing through the code bit by bit, trying to figure it out.

Silly.
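To make that Int vs. Num subtlety concrete, here is a sketch of the polymorphic version (with the Eq constraint the guard actually needs; this definition is my own illustration):

fac1 :: (Eq a, Num a) => a -> a
fac1 x
  | x == 0    = 1
  | otherwise = x * fac1 (x - 1)

-- fac1 (1.414 :: Double) type-checks just fine, then recurses without ever
-- hitting the base case, since x steps past 0 without ever equaling it.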

Type Signatures are Ugly:

I like contracts, but Haskell doesn't technically use them. Haskell has type signatures, which are different.

So far, I’ve written contracts like this:

F : S * T * U * … -> D

I could also have:

F : S * T * … -> (D1, D2, …)

or, if I wanted HOFs:

F : (G : X -> Y) * … -> (D, …)

these are all pretty easy to understand (if you know how to read the shorthand). We know exactly what the arguments should be: elements of the set of elements of type S, or T, etc. We also know exactly what the return types are: elements of the typed set D, or ordered k-tuples of elements of the type sets D1 through Dn, etc. Equivalent signatures in Haskell are:

(assuming f = F, that any capital letter is a valid type, and that the …'s would be replaced with concrete types in the end result)**

f :: S -> T -> U -> … -> D
f :: S -> T -> … -> (D1, D2, …)
f :: (X -> Y) -> … -> (D, …)

Now, I understand that Haskell functions are curried, hence the load of arrows in the signatures. Honestly, though, how hard is it to convert all that to a form Haskell can use? I'm not saying get rid of the arrow version; maybe just add an option to provide a "normal form" version. I shouldn't have to add these to my code as comments solely so I can understand what's going on. I understand that the implication-style notation more accurately reflects what the compiler is doing, but as a programmer, I don't really give a rat's ass what the compiler is doing. As a mathematician,

foo :: Int -> String -> Num -> Bool

looks ugly. Do I know what it means? Yes. Do I like the way it looks? No. I grasp that, as a Haskell compiler, reading these kinds of signatures makes things easier, and further, that these definitions make things easier to prove correct,*** but damnit Haskell, I'm a mathematician, not a miracle worker. I want to be able to read those definitions intuitively, and not have to muddle around trying to figure out exactly what a signature represents. It's ugly, fix it.

On that note, I am beginning to work on some Haskell code which will convert a type signature of the form:

f :: S^n1 * T^n2 * … -> (D1,D2, … Dn)

to the form:

f :: S -> S -> .. n1 times .. -> T -> T -> ..n2 times.. -> (D1, D2, … Dn)

and hopefully, given some user input, the latter to the former as well. (That direction is not harder, sort of, but I can't know what the normal form of the type signature should be without some user input about the in-arity (arity) and out-arity (ority) of the function.)
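I haven't gotten far, but the expansion direction is simple enough that a sketch fits here. Everything below (the Arg type, the expand function, the example shorthand) is my own stand-in for whatever the real tool ends up looking like:

import Data.List (intercalate)

-- One argument group from the shorthand: a type name and how many times
-- it repeats, so S^2 becomes Arg "S" 2.
data Arg = Arg String Int

-- Expand [S^n1, T^n2, ...] plus a result type into the curried arrow form.
expand :: [Arg] -> String -> String
expand args result = intercalate " -> " (concatMap repeatArg args ++ [result])
  where
    repeatArg (Arg name n) = replicate n name

-- expand [Arg "S" 2, Arg "T" 1] "(D1, D2)"  ==>  "S -> S -> T -> (D1, D2)"
main :: IO ()
main = putStrLn (expand [Arg "S" 2, Arg "T" 1] "(D1, D2)")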

Anywho, Haskell is awesome, go play with it.

~~Joe

*= Aside: I'm quite glad Haskell calls them Reals and not something silly like Float (though that is allowed) or Double. We mathematicians have had these names for years; IEEE can call the format double-precision floating point or whatever the hell they want, they're reals, not doubles. Silly computer scientists…

Edit: Note that I do in fact understand that floats != reals, but it's about state of mind. I know I'm working on a computer, and so I'm not going to treat things as reals, but I want to be thinking as if I'm not limited, so that when I work with my code, I'm not tuning the algorithm to work with the computer, I'm tuning the computer to work with my algorithm. In this way, the problem becomes a problem of making the compiler better, rather than hacking my algorithm to work.

**= Haskell doesn’t really like capitalized function names.

***= Proofs of correctness are done through the Curry-Howard isomorphism, which effectively states that if the contract of a given function corresponds to a valid statement of logic, then a function of that type can exist; otherwise it can't. Note that this requires the signature to be correctly written, i.e.:

concatString :: String -> String -> String as a signature for a function which zipped two strings together would be "correct," but only in the sense that the contract would be satisfied. A proof of correctness in this sense just means that a function of that type can exist; there are other methods related to this isomorphism which allow for a better proof of semantic correctness, as opposed to the more syntactic flavor of Curry-Howard.
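To illustrate that gap (both definitions below are my own examples): these two functions inhabit the same type, so the type alone can't tell you which behaviour you've got.

concatStrings :: String -> String -> String
concatStrings xs ys = xs ++ ys

-- Interleave the characters of the two strings instead of appending them.
zipStrings :: String -> String -> String
zipStrings xs ys = concat [[x, y] | (x, y) <- zip xs ys]

-- Both have type String -> String -> String; a Curry-Howard-style reading
-- only says such a function can exist, not which one you wrote.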

Published May 1, 2007 at 9:14 pm


Comments:

  1. If you think floating point numbers are reals you are very sadly mistaken. And you will be punished for it when you try to use them in any serious way. 🙂
Floating point numbers are a finite subset of the rationals and they lack most nice properties we take for granted about numbers. The redeeming quality that floating point has is that it is fast.

  2. I understand Reals are not floating points. My point was that in languages like Haskell, which are (relatively speaking) arbitrarily precise, it's better to treat them not as doubles/floats but rather as Reals. Obviously we can only ever approximate decimal numbers; I guess the point I was really trying to make was that I don't want my brain to have to make any more jumps than it has to. The half dozen cycles I spend reminding myself that a float and a double are really just approximations of Reals could be spent actually thinking about the problem.

    I suppose it’s really just a pet peeve.

    ~~Joe

    PS. Wow, someone reads this? I feel important now. Meh, I’m sure it’ll pass.

  3. In your semi-rant about the ->-syntax of functions of multiple arguments, you seem to only think of the case of fully applying the function, i.e.

    f :: X -> Y -> Z -> A

    f foo bar baz

    But in fact, I noticed that /very often/ (!) one applies functions only partially, i.e.

    map (f foo bar) […]

    More higher-orderness is common too:

    map f […]

    And think about user-defined control structures, maybe :: b -> (a -> b) -> Maybe a -> b, monads etc.

    So, to make the long story short: I think that the ->-notation is indeed Good (TM), because it is very common to apply functions only partially.

  4. If you need to remember what’s going on, you can write your type signatures like this:

    f :: A -> (B -> C)

    instead of

    f :: A -> B -> C

    Alternatively, you could write functions like this:

    f :: Num n => (n, n) -> n
    f (x, y) = x + y

    (though this example could have been written as f = uncurry (+) if those pesky monomorphismists hadn’t infiltrated the Haskell 98 committee…)

  5. I have, since this post, rather come to like the -> notation, so I guess that point is moot. It’s just culture shock at first, coming from what little I knew of logic to what little I knew of haskell.

    ~~Joe

