DFAs, Categories, and Typecheckers.

I’ve recently started reading (in parallel) “Type and Programming Languages” and “Basic Category for Computer Scientists.” The latter of which is really only interesting if you’re a math junky, like me. It’s somewhat dry, and very matter-of-fact, but the subject is terribly interesting.

Whilst reading these books, I’ve also been working on a library for Haskell called “HFA” (pronounced: “Huffa”, or if your feeling silly, “Hoffa”), for “Haskell Finite Automata.” The library’s purpose is to create a simple to use, generic, relatively fast implementation of various Automata (Notably (D|N)FA’s, PDAs, Turing Machines, etc.), so that anyone intending to use these abstractions will be able to without knowing much about the internal theory, eg how to minimize a DFA, or how to convert an NFA to a DFA, etc. It’s also intended to be available as a easy to understand tool for learning/teaching about automata, it will eventually have the ability to display Automata as Graphviz graphs, and is currently capable of generating state diagrams (with some extensions to mark final, initial, and initial-final states).

Recently, I had just finished writing some refactor code for HFA, and decided to take a break and read “Basic Category Theory” for a while, it dawned on me upon looking at the diagram of a category that what I was looking at was essentially a DFA, with all states final, and the arrows between them being processing parts of the Delta Functions. That is, if a Category is defined as a a set of objects, and a set of arrows (where an arrow is defined as f : A -> B, where A and B are objects in the category), then the equivalency is as follows:

Category DFA
Objects States
Arrows Transitions

with delta(S,x) = S' iff there is an arrow between x is an arrow between S and S’.

Notably, we can also define a DFA as a Category by simply reversing the definition. I’m pretty sure this fact has been discovered before, its to obvious to believe otherwise (though it would be cool I could name this “The Fredette Isomorphism”, ^_^). The interesting thing about this Isomorphism is that, if we can generalize a DFA, whats to say that we couldn’t generalize the category in the same way? Take Imperative languages for instance. I don’t know if it works out (and I certainly don’t have the skill to prove it if it does work out, at least not yet), but it is a hypuothesis of mine that an imperative program can be represented in a category with multiple arrows going from one object to another simultaneously, that is, an imperative program is a kind of “Nondeterministic” category. Ring any bells? We know (by the simple fact of Turing completeness) that a TC imperative language program can be written in a TC Pure Functional language (assuming there’s a Pure Functional way to deal with State, eg Monads). Similarly (and this is almost to much of a guess to even _think_ of it as a hypothesis) if a TC Imperative Language is a “Nondeterministic” (ND) category, then if a ND Category is isomorphic to a NFA, then we know that NFA’s are isomorphic to DFA’s, and we know that Pure Functional Languages are isomorphic to operations withing a “Deterministic” Category, eg a “DFA”, so that would “prove” (I use that term _very_ loosely) that any old TC Imperative program has an equivalent TC Pure Functional Program.

Pretty Neat, huh? It probably doesn’t work somewhere, but still- it’s cool.

We can further use the DFACategory relationship as a kind of simple “composition” checker.

If the States of our DFA are types in a language, and the transitions functions mapping one type to another, then we can say that if the delta function is defined as above, and in the case there is no defined transition between some state S and some other state S’, and if such a transition is called for in the delta function, then we simply send the output to a non-accepting “fail” state.

Here, the simple language consists of the following.

The Category contains:

Objects = {Int, Float, Char, Bool, Unit}
Arrows = {isZero :: Int -> Bool,
,ord :: Char -> Int
,asc :: Int -> Char
,sin :: Float -> Float
,not :: Bool -> Bool}
Values = {zero :: Int, true :: Bool,
,false :: Bool, unit :: Unit}

The corresponding DFA works something like this

f1,f2, ... fn are symbols which have type an -> bn,
where n is the same value as the nth symbol, and a and b are not type variables, eg: f1 :: Int -> Char, a1 = Int, b1 = Char
v is a type

(f1 . f2 . f3 . ... . fn) v
=> ([f1, f2, ..., fn], v)
=> [a1,b1,a2,b2, ..., an,bn,v]
=> [(init,a1),(b1,a2),(b2,a3),...,(bn,v)]

given this list of pairs, we define the DFA trace function as follows, this presumes a list like the one from above.

trace :: State -> [(Sym,Sym)] -> Bool
trace st [] = (st /= failState)
trace st [(s1,s2):syms]
| s1 /= s2 = False
| otherwise = trace (delta st (head syms)) syms

where failState is a pseudonym for whatever the magic non-accepting failure state is

and where delta simply points to the next state (be it the fail state, or otherwise). I’ll cover that in more detail in my next post (I’m actually going to build this little guy for a simple language like the one above.)

I’ve digressed a bit from my topic, my goal was to show that Categories are really terribly neat, and apparently related to automata, which most people understand pretty easily, if they are explained well. I don’t pretend to be an authority here, but hell, implementing a (very) simple type checker is a pretty cool feat, considering It was only a few months ago I started learning Haskell. I know that this isn’t a robust, powerful mechanism, but as far as I know, given Compose (the (.) function in Haskell) apply (($) in Haskell) and a few other functions, you have a TC language, a la Backus’ FP or FL systems.

Anyway, next time I intend to implement this little type checker, and (hopefully) eventually implement a (full) type checker for something akin to FP or FL, using this DFA style approach. Heck, I’d also be able to play with parsing (another automata rich subject).

Also, for those interesting in looking at HFA, it’s available at


you can just do a darcs get to pull everything down.

I don’t intend to present any of this as proven, either formally or by any other means, the ideas and conjectures in this post are just that, conjectures. Please, don’t believe I’m an expert, I’m still learning about all these things, and I don’t want to lead anyone down the wrong paths under the assumption I actually _know_ what I’m doing.

That said, I do think the conjectures made have some validity, if you know otherwise, please inform me. Thanks


Published in: on August 5, 2007 at 4:43 pm  Comments (3)  

The URI to TrackBack this entry is: https://lowlymath.wordpress.com/2007/08/05/dfas-categories-and-typecheckers/trackback/

RSS feed for comments on this post.

3 CommentsLeave a comment

  1. I think your definition of Category is a little off. A category has a class of objects, and for each pair (A,B) of objects, a set Hom(A,B), of arrows from A to B. (There may even be infinitely many such arrows.)

    Further, if f: A -> B is an arrow in Hom(A,B), and g: B -> C is an arrow in Hom(B,C), then their composite g.f exists and is an arrow in Hom(A,C).

    Additionally, for every object A, there is an arrow id_A: A -> A such that f . id_A = f and id_A . g = g for any f and g such that the composition is defined.

    Also, whenever the compositions are defined, we have (f . g) . h = f . (g . h).

    These laws are actually rather constraining in terms of which graph structures can actually be made into categories. The fact that compositions must be defined implies a certain kind of closure property, and that each object has at least one self-loop is also fairly constraining.

    For example, if you want a single DFA to be a category, with state transitions as arrows, it must also have all compositions of state transitions as arrows as well. You’d also have to have identity arrows, which in the case of a DFA, wouldn’t correspond to real transitions. So the process of turning a DFA into a category will in some sense modify its alphabet into strings on the original alphabet, add transitions on the empty string to every state, and then a transition for every composition of transitions whose symbol is the corresponding string of symbols. As a result, we could read off the language that the DFA accepts simply by collecting up the strings on each of the arrows from the start state to each of the accept states.

    Another natural sort of category to form involving DFAs would be the category which has as objects all DFAs, and as arrows, has DFA-morphisms, where a DFA-morphism f:(Q,S,A,q_0,delta) -> (Q’,S’,A’,q_0′,delta’), consists of a pair of functions, f_Q: Q -> Q’, assigning states of the source DFA to states of the target, and f_S: S -> S’, input symbols of the source DFA to input symbols of the target, such that transitions are preserved:

    delta'(f_S(s),f_Q(q)) = f_Q(delta(s,q))

    initial state is preserved:

    f_Q(q_0) = q_0′

    and accept states are preserved:

    If q is in A, then f_Q(q) is in A’.

    One can form similar definitions in order to form, for example, a category of Moore machines, or a category for most any other kind of automaton.

  2. Interesting, I’ll definitely have to study this some more, like I said I’m no expert, but the idea popped into my head, and I figured it’d be an interesting idea to share.

    I am a little confused, I think that you can still use the DFA I described towards the end as a kind of a type-checker, because it is designed to specifically deal with composition of arrows. You certainly seem to be more knowledgable about this than me. I suppose the idea is just this:

    a valid function created by:
    (f . g . h . i)
    has a signature that looks like

    a -> b
    b -> c
    c -> d
    d -> e
    a -> e

    So if the states are the types, then the functions are arrows between types, composition is then implicitly defined, I think.

    Anyway,thanks for the info.


  3. You might find it interesting to look up “presheaf categories”. Cool things with categories are actually functors, rather than morphisms. A good place to start is http://math.ucr.edu/home/baez/week115.html

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: