Nattermorphisms

FTP dangers

2015-02-11T11:55:00.000+00:00

I am concerned about the Foldable Traverable Proposal (FTP) (https://ghc.haskell.org/trac/ghc/wiki/Prelude710) :

My first and biggest concern is simply that it's harder to read code which uses highly polymorphic functions unnecessarily.

You can see that even by considering plain-old fmap vs map: it's harder to read "fmap f . fmap g" than "map f . map g" because with the former you're having to manually search for more information about what Functor is being used in each case.

My second concern is that the more you overload, the more you risk having something unexpected happen:

Say we have two variables:

*Borders.Base.Utils> let a' = ["alice", "bob"]
*Borders.Base.Utils> let a  = (True, a')

we want to count the characters so we type:

*Borders.Base.Utils> length $ concat a

..but we've accidentally forgotten the prime character...

... right now we get:

:6:17:
    Couldn't match expected type ‘[[a0]]’
                with actual type ‘(Bool, [[Char]])’
    In the first argument of ‘concat’, namely ‘a’
    In the second argument of ‘($)’, namely ‘concat a’

so we fix it and get our desired result:

*Borders.Base.Utils> length $ concat a'
8

...but under the FTP proposals (where concat would become Data.Foldable.concat) we get:

*Borders.Base.Utils> length $ Data.Foldable.concat a
2

(because pairs are Foldable in their second argument).

This cannot be a good thing.

I believe that the more generalised functions of FTP should be opt-in (i.e. you should - as at present - need to import them explicitly).

Unfolding with View Patterns

2012-06-15T06:55:00.000+01:00

A while back I across a way of looking at fold / unfold duality which I've not seen anywhere else. It makes use of view patterns to highlight the symmetry in the implementation of the two combinators.

Firstly, for reference, the standard implementation:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f b [] = b
foldr f b (x : xs) = f x $ foldr f b xs

Then we rewrite the inputs slightly:

foldr3 :: (() -> b,(a,b) -> b) -> [a] -> b
foldr3 (b,f) [] = b ()
foldr3 (b,f) (x : xs) = f (x, foldr3 (b,f) xs)

...and rewrite them a little more...:

-- | (+) -| Delta (Coproduct bifunctor is left adjoint to Diagonal functor)
foldr4 :: (Either () (a,b) -> b) -> [a] -> b
foldr4 f [] = f $ Left ()
foldr4 f (x : xs) = f $ Right (x, foldr4 f xs)

...now we can create 'unfoldr' just by swapping the LHS and RHS of the definitions of 'foldr4':

-- | Now just swap the LHS and RHS of the '=' !!!
unfoldr2 :: (b -> Either () (a,b)) -> b -> [a]
unfoldr2 f (f -> Left () ) = []
unfoldr2 f (f -> Right (x, unfoldr2 f -> xs)) = (x : xs)

Perspective

2012-02-23T06:47:00.001+00:00

Last night, on the BBC News, I saw the worst thing I ever have.

Whilst covering the deaths of Marie Colvin and Remi Ochlik, they showed footage of a two-year-old child dying from injuries sustained in the bombardment of Homs.

A child. Two years old.

Suddenly my world seemed very different, and much smaller.

The Category Theory of Appendages

2009-01-18T15:32:00.006+00:00

There has been much discussion recently on haskell-cafe about Monoids, much (too much!) focusing on the possibility of renaming of the existing monoid typeclass to "Appendable"... which is a daft idea, not least because then we'd need to rename Monad to be "Appendage".

Anyway, I wanted to focus a bit on the link between the two concepts (monoid and monad), and how that's treated from the Category Theoretic perspective. (I'm not aiming at any kind of rigour ... simply trying to gain some basic intuitions). In terms of assumptions, I assume familiarity with the definitions of Category, Functor and ideally categorical products / co-products, initiality and finality.

This article aims to cover similar ground to one of sigfpe's here, but at a slightly higher (hopefully simpler) level.

Traditional Monoids

The basic concept of Monoid is treated from the standard mathematical (abstract algebra) perspective here, and from the practical (Haskell) perspective sigfpe has a nice article here.

So, from a traditional point of view, a monoid is:

* A set, along with...
* an associative binary function over that set, accompanied by ...
* an identity element

The basic (non-category theoretic) view of monoids in Haskell is pretty much the standard one, viewing Haskell types as sets. So in Haskell common examples of monoids are: (String, (++), []) and (Int, (+), 0) and (Int, (*), 1) and (Bool, (&&), True) and (Bool, (||), False).

In the Haskell setting this means that the function has both arguments and result with the same type - ie the function's type is "a -> a -> a", and the identity element is a distinguished value of type "a". And, of course, it is exactly this which is embodied in the Data.Monoid typeclass.

Category takes this standard view of monoids and generalises it somewhat. This generalisation has two parts - not only does it generalise so that we can talk about monoids in different categories, but it also generalises so that we can have multiple ways of identifying monoids within a single category.

Categories which can contain Monoids

From the category theory perspective, a category can only "play host" to monoids under the following conditions:

* It must be equipped with a functor over the category which maps any pair of objects to another object, and any pair of arrows to another arrow. (This type of functor is similar to a normal functor apart from operating on pairs - and is known as a Bifunctor). Let us call this functor "B".

* It must have a distinguished object (which we'll call "e") which acts as an identity for this functor.

This then gives us what is known as a Monoidal Category. (Note that by making different choices for the (bi)functor and distinguished object we may have several different ways to view an underlying category as monoidal).

The simplest example is on Set - we take the standard product as the (bi)functor (which maps any pair of sets to the set which is their product), and a single-element set (ie terminal object) as the distinguished object.

In the context of Haskell, we take as the (bi)functor, the product (bi)functor (which maps any pair of types, say 'a' and 'b', to their product type "(a,b)", and any pair of functions, say 'a->c' and 'b->d' to the function '(a,b)->(c,d)'). As the distinguished object we take "()" - the unit type whose sole value "()" is written syntactically identically to its type.

Category Theoretical Monoids

Once we have those two components, we have a category in which it is possible to identify categorical monoids, but we don't actually have the definition of a categorical monoid itself. A monoid within a monoidal category is defined to be:

* An object (which we'll call "c")
* An arrow from B (c,c) to c.
* An arrow from e to c.

...such that some basic diagrams commute. Now, recall that "B" is the (bi)functor which we selected when creating our monoidal category, so "B (c,c)" is just another object.

So, to make this concrete - if we consider Haskell as a monoidal category as above, then we can take "c" to be "String", our first arrow to be "(++) :: (String,String) -> String", and our second arrow to be the function ":: () -> String" which takes "()" as its argument and returns "[]".

If we compare this with our first, non-categorical, definition of (String, (++), []) as a monoid above, the parallels are quite clear. The differences are firstly that we now need to pretend that (++) has type "(String,String) -> String" rather than "String -> String -> String", which we can do simply by viewing it as an uncurried function, and secondly that we're representing "[]" by a function rather than a value. This extra baggage only really becomes useful when we look at other categorical monoids (which aren't plain, normal monoids).

Monads

So far, when discussing Haskell, we've implicitly had in mind the category "Hask" which has types as its objects, and Haskell functions between those types as its arrows. What we can now do, is to consider another category which is closely related to "Hask" - namely the category formed by taking as objects all Haskell functors ("Maybe", "[]", etc...) and as arrows all functions from one functor type to another ("listToMaybe :: [a] -> Maybe [a]", "maybeToList :: Maybe a -> [a]", etc...). This is the "endofunctor" category over Hask, and these arrows are natural transformations.

Next we're going to go looking for monoids in this "endofunctor" category using our above definitions. We need to keep a slightly clear head at this point, because we need to remember that the objects of this category are functors over another category - hence when we just say "functor" we need to be clear whether we're talking about one of these objects (ie a functor in the underlying category), or about a functor over this category.

First we need to choose a (bi)functor over this category - we'll choose composition of functors (so this takes a pair of objects - say "Maybe" and "[]" and maps them to their composition "[Maybe]"). Secondly we need to choose a distinguished object - we'll choose the identity functor "Id".

Finally, using our second definition, we can see that a monoid in this category must be an object along with two suitable arrows (ie this will be a functor and two natural transformations in Hask). We can take 'Maybe' as the object, 'join :: Maybe (Maybe a) -> Maybe a' as the first arrow, and 'Just :: a -> Maybe a' as our second arrow.

Thus equipped, "Maybe" can be seen as a monoid in the endofunctor category over Hask. And that, of course, is what a monad is - a monad over a category 'C' is a (categorical) monoid in the endofunctor category over 'C'.

Back to Set (...and Hask?)

We can now go back and take another look at the Set category. This time we can look for some monoids which are monoids from the category theoretic perspective, but not from the traditional perspective.

We can do this by viewing Set as a monoidal category using co-product as the (bi)functor (ie disjoint union rather than product) and taking the initial object (the empty set) as the distinguished object. Under this definition, "monoids" would be objects (ie Sets - as before), equipped with (a) a function to the set from the disjoint union of the set with itself, and (b) a function to the set from the empty set - ie the empty function.

Now, I have to confess I don't remotely understand the implications of this. I haven't ever seen any reference to what such objects would be called in "Set". In Hask, the coproduct of 'a' and 'b' is 'Either a b', and on functions takes 'a->c' and 'b->d' to 'Either a b -> Either c d'. Also, I'm not sure how much sense it makes to talk about an empty type in Hask, or an empty function in Hask.

It seems like there ought to be something useful (back in Haskell land) to drop out of this... I'd be very interested in anyone has any ideas / pointers...

Christmas Profiling Tip

2008-12-24T15:50:00.002+00:00

If you're profiling some code and you get something like this from hp2ps ...


c:/ws/fpfdev/depot/QA/EDG/EDG_priv/FPF_Dev.br/src $ hp2ps -c
GenBarriers.hp
c:\ws\fpfdev\depot\QA\EDG\EDG_priv\FPF_Dev.br\ThirdParty\ghc\ghc-6.10.1\
bin\hp2ps.exe:  Disaster! (bucket out of range)

...then the chances are it's because you have a partially truncated .hp file (eg if your program didn't exit cleanly). This is easily fixed, by searching for the last BEGIN_SAMPLE line in the file, and deleting everything from there onwards.

Why does Functional Programming matter?

2008-11-09T20:15:00.002+00:00

Of course we all know the answer...

...but recently I've been wondering how to explain what I feel is important about FP in a pithy, succinct way. (I've frequently found myself failing to explain it well).

I think in the future I'm going to say this:

"Functional Programming makes programs easier to understand. And that means they're less likely to go wrong."

For my money this is why functional programming is vitally important - because the biggest problem we have right now isn't concurrency ... it's the fact that we can't even write single-threaded programs that work properly. (Of course "go wrong" is used above in the standard rather than the Milner sense).

One other comment I'd make is that I think the "easier to understand" bit becomes more apparent with bigger programs.

I'd be interested in hearing if anyone's got a better way of describing it.

2 Minute intro to Associated Types / Type Families

2008-10-30T09:16:00.005+00:00

Type Families are an important recent addition to GHC which have been developed over the past couple of years (and indeed are still being developed). They promise to address some of the same issues addressed by functional dependencies whilst avoiding some of the nastier corner cases and allowing for a more solid theoretical underpinning and implementation.

There is a lot of material available on Type Families, in fact it seemed somewhat bewildering to me initially. There doesn't seem to be an "idiot's guide" overview - so that's what this post attempts to do - to provide a just enough background to help you work out which papers etc you want to read next.

Terminology and Syntax

The thing to note is that there are essentially four different concepts, each of which have a couple of different terms for them:

Associated (Data) Type

class ArrayElem e where
    data Array e
    index :: Array e -> Int -> e

instance ArrayElem Int where
    data Array Int = IntArray UIntArr
    index (IntArray a) i = ...

Associated (Type) Synonym

class Collection c where
    type Elem c
    insert :: Elem c -> c -> c

instance Eq e => Collection [e] where
    type Elem [e] = e
    ...

Data (Type) Family

data family Array e

data instance Array Int = IntArray UIntArr

data instance Array Char = MyCharArray a b

(Type) Synonym Family

type family Elem c

type instance Elem [e] = e

type instance Elem BitSet = Char

Associated or Family?

The first "axis" of categorization is "Associated" vs "Family". The "Associated ..." variants (which were invented first) are those which are declared inside a standard typeclass declaration, the "... Family" variants are stand-alone, top-level, use the "family" keyword and were invented a year or so later. The Family variants (collectively known as Type Families) are strict generalizations of the associated ones, and the associated ones are simple syntactic sugar for the family variants.

Data or Type Synonym?

The other "axis" of categorization is "Data" vs "Type Synonym". This distinction mirrors that of normal Haskell "data" and "type" declarations. The key point is that associated data types and data families let you create a bijection (ie one-to-one mapping) from source type to destination type. Associated type synonyms and synonym families on the other hand allow you to map two different souce types onto the same destination type. The best reference for more details on the difference is the first part of section 7 of the "Associated Type Synonyms" paper.

Equality Constraints
The final piece of syntax allows us to assert type-level equalities using the above:


sumCollection :: (Collection c, Elem c ~ Int) => c -> Int
sumCollection c = sum (toList c)

This (as with the other examples) is slightly modified from one of the papers - in this case the "Associated Type Synonyms" paper where the syntax uses "=" rather than the "~" which was ultimately used.

Compatability

Many of the above features are actually available in the GHC 6.8 branch but they are only really supported under 6.10 onwards. (I understand that the main reason the code is in 6.8 at all is to facilitate merging of other, unrelated fixes between those branches). I've had success with both kinds of associated types under 6.8 but ran into problems using equality constraints.

References

Join-ing the Blogosphere

2008-08-03T21:05:00.000+01:00

I'd like to share a little Haskell snippet (I wanted to say "idiom" but actually I'm not sure if I've seen it used anywhere else...) that I quite like:

join (,) 9

What does this do? On its own, not much - in fact you just get:

(9,9)

...which you'd have been much better off typing in yourself. Where it can come-in handy though is when writing code in a pointfree style if you have a need to duplicate an anonymous value into both parts of a pair:

functionWhichTakesAPair . join (,) . functionWhichProducesASingleValue

Anyway, how / why does this construct work....?

The (,) is just the standard data constructor for a 2-tuple, a pair. (This constructor has type a -> b -> (a,b) so you can always write (,) 9 9 as an alternative notation for (9,9) if you wish).

The join is the standard function :: Monad m => m (m a) -> m a we know from monadic programming (Control.Monad). But this just raises the question of how it can make sense to apply it to (,) which as we've seen is a function of type a -> b -> (a,b).

The answer is of course that this function may be considered to be of type m (m a). This is because Control.Monad.Instances declares any partially applied function (eg a -> ...) to be a monad (the type a can be seen as representing an encapsulated environment which is threaded through by the monadic bind). This then means that a function of two arguments (such as (,)) can be seen as having type m (m a) which is just what we need to be able to apply join. The net result is that the single argument of the resulting function (join (,)) is passed in twice as desired.

This of course generalises to functions other than (,) - whenever we use join f where f is a function of two arguments, we end up with a function of one argument which is passed into f twice.