Final Tagless

A continuation for Object Algebra.

Final Tagless

Still the expression problem, without stronger type system, this is as far as we can go. But what if we have some extra bonus ? We will show it with Haskell below.

Initial

“Initial” is from initial algebra, which is an initial object of F-Algebra category of a given endofunctor-F. Here the initial approach is to encode the expression as values of an ADT :

1
2
3
4
data Exp 
= Lit Int
| Neg Exp
| Add Exp Exp

An instance of this Exp can be :

1
2
ti1 :: Exp
ti1 = Add (Lit 8) (Neg (Add (Lit 1) (Lit 2)))

We can implement its evaluator and printer as

1
2
3
4
5
6
7
8
9
eval :: Exp -> Int
eval (Lit n) = n
eval (Neg e) = - (eval e)
eval (Add e1 e2) = eval e1 + eval e2

view :: Exp -> String
view (Lit n) = show n
view (Neg e) = "(−" ++ view e ++ ")"
view (Add e1 e2) = "(" ++ view e1 ++ " + " ++ view e2 ++ ")"

But we can not add new variants to it like the naïve implementation we mentioned above.

Final

The final approach is quite different. If all we ever need is the value of an expression, we can represent the term in our arithmetic language by its value, or by a Haskell expression that computes that value.

A typeclass is perfect for the situation:

1
2
3
4
class ExpSYM repr where
lit :: Int -> repr
neg :: repr -> repr
add :: repr -> repr -> repr

and the same example can be expressed with a different type signature

1
2
tf1 :: ExpSYM a => a
tf1 = add (lit 8) (neg (add (lit 1) (lit 2)))

What we need now is just different ExpSYM instance :

1
2
3
4
5
6
7
8
9
instance ExpSYM Int where
lit = id
neg e = - e
add e1 e2 = e1 + e2

instance ExpSYM String where
lit n = show n
neg e = "(-" ++ e ++ ")"
add e1 e2 = "(" ++ e1 ++ " + " ++ e2 ++ ")"

and their evaluator :

1
2
3
4
5
view :: String -> String 
view = id

eval :: Int -> Int
eval = id

eval tf1 gets 5 and view tf1 gets (8 + (- (1 + 2)))

Extensibility of final embedding

Like object algebra, final embedding can extend easily for data variants :

1
2
3
4
5
6
7
8
class MulSYM exp where
mul :: exp -> exp -> exp

instance MulSYM Int where
mul = (*)

instance MulSYM String where
mul e1 e2 = "(" ++ e1 ++ " * " ++ e2 ++ ")"

and the following expression will have another restriction on its polymorphic type

1
2
tfm1 :: (MulSYM a, ExpSYM a) => a 
tfm1 = add (lit 7) (neg (mul (lit 1) (lit 2)))

Pattern match in final embedding

Consider an operation of pushing Neg down to the leaf in the initial embedding:

1
2
3
4
5
6
push_neg :: Exp -> Exp
push_neg e@(Lit _) = e
push_neg e@(Neg (Lit _)) = e
push_neg (Neg (Neg e)) = push_neg e
push_neg (Neg (Add e1 e2)) = Add (push_neg (Neg e1)) (push_neg (Neg e2))
push_neg (Add e1 e2) = Add (push_neg e1) (push_neg e2)

but we can not match such structure directly in final approach, we need to find another way.

The former instances we implemented in final approach have something in common, is that all the information it needs is inside itself, which is , in other words , compositional. But push_neg is not, it needs extra information from above. So here we offer such information explicitly :

1
2
3
4
5
6
7
8
data Context = Positive | Negative

instance ExpSYM a => ExpSYM (Context -> a) where
lit n Positive = lit n
lit n Negative = neg (lit n)
neg e Positive = e Negative
neg e Negative = e Positive
add e1 e2 ctx = add (e1 ctx) (e2 ctx)

and finally :

1
2
push_neg' :: ExpSYM a => a
push_neg' e = e Positive

Relation between initial and final

We can find the isomorphism between Exp and ExpSYM a

1
2
3
4
5
6
7
8
9
10
11
12
instance ExpSYM Exp where
lit = Lit
neg = Neg
add = Add

initialize :: Exp -> Exp
initialize = id

finalize :: ExpSYM a => Exp -> a
finalize (Lit n) = lit n
finalize (Neg e) = neg (finalize e)
finalize (Add e1 e2) = add (finalize e1) (finalize e2)

With these two morphisms, we can build push_neg' in another way :

1
push_neg' = finalize . push_neg . initialize

Why Tagless ?

Type tag may leads to problem. Here is an example of a simple typed lambda calculi :

1
2
3
4
5
6
data Exp 
= V Var
| B Bool
| L Exp -- lambda
| A Exp Exp -- application
data Var = VZ | VS Var

Var is De Bruijn index representing the parameter of lambda expression.

And here is an example

1
2
ti1 :: Exp
ti1 = A (L (V VZ)) (B True) -- or: ((\x. x) True)

If we try to write an evaluator, we will see that the evaluator may raise runtime error (or inexhaustive pattern-match). At least we should write a type checker and use universe type to get the result of an Exp .

But are we able to use type-system to restrict Exp and ensure the polymorphism of the result, without using dependent type ? Fortunately, there is such a way.

Initial and Final Tagless Interpreter

Algebraic data type is too loose to ensure the result of Exp is safe, so like what we do in “Theorem Proving in Haskell”, we can use Genialized Algebraic Data Types (GADTs) to remove ill-typed expressions.

As usual, we start with the initial embedding :

1
2
3
4
5
6
7
8
9
10
{-# LANGUAGE GADTs #-}
data Exp env t where
B :: Bool -> Exp env Bool
V :: Var env t -> Exp env t
L :: Exp (a, env) b -> Exp env (a -> b)
A :: Exp env (a -> b) -> Exp env a -> Exp env b

data Var env t where
VZ :: Var (t, env) t
VS :: Var env t -> Var (a, env) t

Now we have :

1
2
ti1 :: Exp env Bool
ti1 = A (L (V VZ)) (B True)

And this time a straight-forward evaluator will work fine :

1
2
3
4
5
6
7
8
9
eval :: env -> Exp env t -> t
eval env (V v) = lookp v env
eval env (B b) = b
eval env (L e) = \x -> eval (x, env) e
eval env (A e1 e2) = (eval env e1) (eval env e2)

lookp :: Var env t -> env -> t
lookp VZ (x, _) = x
lookp (VS v) (_, env) = lookp v env

Initial solution is based on GADTs, which is a “lightweight dependent types”. But is that really necessary for a tagless interpreter ?

Here comes the final approach.

Finally Tagless Interpreter

We write tf1 = a (l vz) (b True) as our final embedded expression. As we have found the isomorphism between initial and final, we can try to write the final interpreter just as the initial one :

1
2
3
4
5
6
vz (t, _) = t
vs vp (_, env) = vp env

b bv env = bv
l body env = \x -> body (x, env)
a e1 e2 env = (e1 env) (e2 env)

With the definition above, giving tf1 an environment () and it will get the result True as expected.

These five lines are the entire final interpreter with no type tags. It associates the type of the language with Haskell type system. The signature of these functions can be inferred as :

1
2
3
b :: t -> env -> t
l :: ((t1, env) -> t) -> env -> t1 -> t
a :: (env -> t1 -> t) -> (env -> t1) -> env -> t

Now we rebuild a typeclass Symantics to describe the typed lambda calculi with integer addition :

1
2
3
4
5
6
7
class Symantics repr where
int :: Int -> repr h Int
add :: repr h Int -> repr h Int -> repr h Int
z :: repr (a, h) a
s :: repr h a -> repr (any, h) a
lam :: repr (a, h) b -> repr h (a -> b)
app :: repr h (a -> b) -> repr h a -> repr h b

repr here is a higher-order type * -> * -> * , the first type is the environment and the second is the result type of the expression.

Now we can implement the instance for (->)

1
2
3
4
5
6
7
instance Symantics (->) where
int = const
add e1 e2 = \env -> e1 env + e2 env
z = fst
s var = var . snd
lam e = \x a -> e (a, x)
app l v = \env -> (l env) (v env)

Replace De Bruijn index with native lambda

De Bruijn index is very abstract and actually it is not necessary. Haskell built-in type system is strong enough to construct legal lambda terms.

Instead of retrieve parameters from environment, we can store a Haskell lambda term in it :

1
2
3
4
5
6
7
8
class Symantics repr where
int :: Int -> repr Int
add :: repr Int -> repr Int -> repr Int
lam :: (repr a -> repr b) -> repr (a -> b)
app :: repr (a -> b) -> repr a -> repr b

example :: Symantics a => a (Int -> Int)
example = lam (\x -> add x x)

Now we can implement something more :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class MulSYM repr where
mul :: repr Int -> repr Int

class BoolSYM repr where
bool :: Bool -> repr Bool
leq :: repr Int -> repr Int -> repr Bool
if_ :: repr Bool -> repr a -> repr a -> repr a

class FixSYM repr where
fix :: (repr a -> repr a) -> repr a

example :: (FixSYM a, BoolSYM a, MulSYM a, Symantics a) => a (Int -> Int)
example = fix $ \self ->
lam $ \n ->
if_ (leq n (int 1))
(int 1)
(mul n (app self (add n (int (-1)))))

Instance is trivial for most of them but the fix, we will need fix :: (a -> a) -> a in Haskell

1
2
3
4
5
6
7
8
9
newtype Id a = Id {runId :: a}
instance FixSYM Id where
fix f = let fix' h = h (fix' h) in
Id $ fix' $ runId . f . Id

newtype Printer a = Printer { p :: Int -> String }
instance FixSYM Printer where
fix f = let fix' h = h (fix' h) in
Printer $ \n -> "(fix " ++ (fix' $ p . f . Printer) n ++ ")"

Conclusion

Final tagless or object algebra solves the problem of polymorphism of an ADT, allowing easily extension without modification on the legacy code by encoding the object language term into the host language term in a composition way.

Reference