Thursday, 30 October 2008

2 Minute intro to Associated Types / Type Families

Type Families are an important recent addition to GHC which have been developed over the past couple of years (and indeed are still being developed). They promise to address some of the same issues addressed by functional dependencies whilst avoiding some of the nastier corner cases and allowing for a more solid theoretical underpinning and implementation.

There is a lot of material available on Type Families, in fact it seemed somewhat bewildering to me initially. There doesn't seem to be an "idiot's guide" overview - so that's what this post attempts to do - to provide a just enough background to help you work out which papers etc you want to read next.

Terminology and Syntax

The thing to note is that there are essentially four different concepts, each of which have a couple of different terms for them:

Associated (Data) Type
class ArrayElem e where
data Array e
index :: Array e -> Int -> e

instance ArrayElem Int where
data Array Int = IntArray UIntArr
index (IntArray a) i = ...
Associated (Type) Synonym
class Collection c where
type Elem c
insert :: Elem c -> c -> c

instance Eq e => Collection [e] where
type Elem [e] = e
...
Data (Type) Family
data family Array e

data instance Array Int = IntArray UIntArr

data instance Array Char = MyCharArray a b
(Type) Synonym Family
type family Elem c

type instance Elem [e] = e

type instance Elem BitSet = Char


Associated or Family?


The first "axis" of categorization is "Associated" vs "Family". The "Associated ..." variants (which were invented first) are those which are declared inside a standard typeclass declaration, the "... Family" variants are stand-alone, top-level, use the "family" keyword and were invented a year or so later. The Family variants (collectively known as Type Families) are strict generalizations of the associated ones, and the associated ones are simple syntactic sugar for the family variants.


Data or Type Synonym?


The other "axis" of categorization is "Data" vs "Type Synonym". This distinction mirrors that of normal Haskell "data" and "type" declarations. The key point is that associated data types and data families let you create a bijection (ie one-to-one mapping) from source type to destination type. Associated type synonyms and synonym families on the other hand allow you to map two different souce types onto the same destination type. The best reference for more details on the difference is the first part of section 7 of the "Associated Type Synonyms" paper.


Equality Constraints
The final piece of syntax allows us to assert type-level equalities using the above:

sumCollection :: (Collection c, Elem c ~ Int) => c -> Int
sumCollection c = sum (toList c)

This (as with the other examples) is slightly modified from one of the papers - in this case the "Associated Type Synonyms" paper where the syntax uses "=" rather than the "~" which was ultimately used.


Compatability
Many of the above features are actually available in the GHC 6.8 branch but they are only really supported under 6.10 onwards. (I understand that the main reason the code is in 6.8 at all is to facilitate merging of other, unrelated fixes between those branches). I've had success with both kinds of associated types under 6.8 but ran into problems using equality constraints.