Re: [pl-seminar] A monad is just a monoid in the category of endofunctors. What's the problem?


Date: Mon, 24 Jun 2019 16:55:56 +0000
From: JOHN CYPHERT <jcyphert@xxxxxxxx>
Subject: Re: [pl-seminar] A monad is just a monoid in the category of endofunctors. What's the problem?
Now in 4310.


From: JOHN CYPHERT
Sent: Sunday, June 23, 2019 5:54:02 PM
To: pl-seminar@xxxxxxxxxxx
Subject: A monad is just a monoid in the category of endofunctors. What's the problem?
 

Hi, everyone,

Monday's the day. I plan to explain the quote "A monad is just a monoid in the category of endofunctors" in 3310 at noon. This statement is sort of a joke. When someone, usually from a programming background, is trying to understand what a monad really is, the dismissive, high-minded retort is: "A monad is just a monoid in the category of endofunctors. What's the problem?".

Here's the thing about this statement: even when we do understand it, the quote gives no insight into what monads do and how monads are used in programming. In short, it's pretty useless and completely unhelpful.

So why, then, am I going to devote a whole lecture to the quote?
 Two reasons:
   - I feel like, just due to the prevalence of the statement, I have to talk about it. If I am successful on Monday, I want to give you the ability to respond to some other fresh student with an equally high-sounding retort.
   - The better reason for going through the quote is that on the journey to understanding the statement, we are going to generalize a lot of concepts you might be familiar with from an abstract algebra class. That is, on the way to understanding what this monoid is, we are going to see how category theory can effectively be used to generalize many mathematical concepts.

A little primer. Consider the functor category [C, C] (for more details on functor categories, see the definition below). This is the second half of the quote: it is the category of endo-functors. Great! Now we just need to understand what a monoid is in this category. This is not so easy; we have to go through some steps to understand what a monoid means in this context. Here's my plan for how we are going to get there:
 1) (Monoid) We are first going to define a regular old monoid using the standard definition you would see in an abstract algebra class.
 2) (Universal algebra) We are then going to give an alternative definition in the style of universal algebra (don't worry, I will explain). This will still just define the same old object though.
 3) (Object in Set) We are going to take the universal algebra definition and define a (still regular-old) monoid as a particular object in the category Set.
 4) (Object in a cartesian monoidal category) We are going to relax the definition a bit, and say that we can generalize our Set definition with any cartesian monoidal category.
 5) (Object in a monoidal category) We are then going to relax even further and say we don't even need the category to be cartesian monoidal. Monoidal is good enough.

After all this, I will hopefully exhibit the monoidal structure of [C, C], and we will see that a monad is exactly a monoid in this category.

That's the goal for Monday. If anyone has any questions just let me know. I have also attached brief notes for the previous lectures below.

John


Lecture 1 (Categories)
A category has
 - objects
 - morphisms
Morphisms compose, composition is associative, and each object has an identity morphism.

Many equations in category theory can be represented via commuting diagrams:
Nodes of the diagram are objects in the category, and edges are morphisms. The defining property of a commuting diagram is that any two directed paths with the same source and target objects yield, by composition, the same morphism.

We use category theory as a generalization of algebra. That is, we understand mathematical objects as objects in a category, and morphisms as structure-preserving transformations of those objects. Even though it is possible to do so, we don't embed the elements of a group, for example, as objects or morphisms of a category. Instead, we study groups through the category of all groups, Grp, whose objects are groups and whose morphisms are group homomorphisms. We then learn about groups not through their elements, which are opaque in the category, but through the relationships the morphisms express.

We briefly talked about a few examples of categories:
Set - Objects sets, morphisms total functions
Set_{\bottom} - Objects sets with bottom, morphisms are partial functions
Grp - Objects groups, morphism group homomorphisms
Partial order (P, <=) - Objects elements of P, morphism between a and b if a<=b.
Cat - Objects (small) categories, morphisms functors

We talked about small vs large categories.
Small categories are ones whose objects and morphisms form a set, as opposed to a proper class.
Large categories are not small.
I also mentioned locally small categories. A category is locally small if for every pair of objects the class of morphisms between them form a set.
For the above examples only the partial order is a small category. All the other categories are large.

I also want people to make the connection between category theory and functional programming. We talked about the category Hask (which may not be a real category due to some programming constructs). We want to think of objects in this category as types, and morphisms as functions between these types. We will try to use many examples from programming to understand category theory. We may see later how deep this connection goes.
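
To make this concrete, here is a quick sketch of how the category laws read in Hask: morphisms are ordinary functions, composition is (.), and the identity morphism at every type is id. (double, describe, and pipeline are just made-up example functions.)

-- The category laws become plain equations between functions:
--   id . f      == f              (identity)
--   f . id      == f
--   (h . g) . f == h . (g . f)    (associativity)

double :: Int -> Int
double x = x * 2

describe :: Int -> String
describe n = "got " ++ show n

-- A composite morphism Int -> String.
pipeline :: Int -> String
pipeline = describe . double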

To get in the spirit of category theory, we then talked about some patterns of objects and morphisms. I will list them here:

               Initial object   Terminal object   Product            Co-product
Set            {}               Singleton set     Cartesian product  Disjoint union
Hask           Void             ()                (,)                Sum type
Partial order  \bottom          \top              meet               join
Grp            Singleton group  Singleton group   Group product      Free product
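
To make the Hask row of the table concrete, here is a sketch using standard types (Void comes from Data.Void; toUnit, fromVoid, pairUp, and caseOf are made-up names for the canonical maps):

import Data.Void (Void, absurd)

-- Terminal object (): every type has exactly one (total) function into it.
toUnit :: a -> ()
toUnit _ = ()

-- Initial object Void: there is exactly one function out of it to any type.
fromVoid :: Void -> a
fromVoid = absurd

-- Product (a, b): the universal pairing map built from the two "legs".
pairUp :: (c -> a) -> (c -> b) -> c -> (a, b)
pairUp f g x = (f x, g x)

-- Co-product Either a b: the universal case analysis built from the two "legs".
caseOf :: (a -> c) -> (b -> c) -> Either a b -> c
caseOf f _ (Left x)  = f x
caseOf _ g (Right y) = g y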


Lecture 2 (Functors)
In the next lecture we talked about functors. A functor is a structure preserving transformation.

A functor F : C -> D maps objects in C to objects in D, and morphisms f : c -> d of C to  morphisms F f : F c -> F d of D with
 - (unit) F (id_x) = id_{F x}
 - (composition) F (g . f) = F(g) . F(f) for morphisms g and f of C.

Intuitively, a functor embeds the category C in the category D. It can smash objects and morphisms together, but it cannot break any connections. Functors also preserve commuting diagrams.

The above definition defines a covariant functor. A contra-variant functor G : C -> D maps objects of C to objects of D and morphisms f : c -> d of C to a morphism G f : G d -> G c with
 - (unit) G (id_x) = id_{G x}
 - (composition) G (g . f) = (G f) . (G g)
A contravariant functor reverses the arrows of C. We can equivalently say a contravariant functor G is just a covariant functor G : C^{op} -> D, where C^{op} is the opposite category of C.

A functor F : C -> C is called an endo-functor.

In programming we only have one category, say Hask, so all functors are endo-functors. In programming, functors are containers, such as Maybe, List, Stream, etc. These functors build a new type from any given type, such as [a] for a type a, and they also provide an fmap :: (a -> b) -> (f a -> f b) for the functor f. fmap for lists is just map. A good intuition is that endo-functors are containers.
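
As a quick illustration (Box is a made-up one-slot container), a functor in Haskell is a type constructor together with a lawful fmap:

-- A hypothetical container type and its Functor instance.
data Box a = Box a deriving Show

instance Functor Box where
  fmap f (Box x) = Box (f x)

-- The functor laws from the definition above, as equations on fmap:
--   fmap id      == id
--   fmap (g . f) == fmap g . fmap f

-- Usage: fmap (+1) (Box 41) gives Box 42, and fmap (*2) [1,2,3] gives [2,4,6]
-- (fmap for lists is just map).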

Finally, we talked about a particular functor from category theory, the hom functor, denoted Hom(-, -) or C(-, -), which for a locally small category C lands in Set. This functor is actually a bi-functor, which maps a pair of objects of C to the set of morphisms between them. Remember a bi-functor is a functor from a product category C x D, whose objects are pairs of objects and whose morphisms are pairs of morphisms. The hom functor is covariant in the second argument and contravariant in the first. Equivalently, the hom functor is a functor C(-, -) : C^{op} x C -> Set.
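
In Hask the hom-sets are just function types, so the two actions of the hom functor are pre- and post-composition. A small sketch (covMap and contraMap are my own names):

-- The hom-"set" C(a, b) in Hask is the function type a -> b.

-- Covariant action in the second argument: post-compose with g.
covMap :: (b -> b') -> (a -> b) -> (a -> b')
covMap g h = g . h

-- Contravariant action in the first argument: pre-compose with f
-- (note the flipped arrow a' -> a).
contraMap :: (a' -> a) -> (a -> b) -> (a' -> b)
contraMap f h = h . f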


Lecture 3 (Natural Transformations)
Natural transformations relate functors having the same domain and co-domain. Many later definitions in category theory rely on natural transformations. Saunders Mac Lane supposedly stated, "I didn't invent categories to study functors; I invented them to study natural transformations."

Suppose we have two categories C and D, and functors F : C -> D and G : C -> D. A natural transformation \alpha : F -> G is a family of morphisms such that
 - For each object a of C, \alpha picks a morphism of D \alpha_a : F a -> G a. \alpha_a is called the component of \alpha at object a.
 - For each morphism f : a -> b of C we have (G f) . alpha_a = alpha_b . (F f). This is called the naturality condition.

This last condition is usually represented as a commuting diagram.

Natural transformations have a specific connection to functional programming. In Haskell at least, as part of some theorems for free, we get the property that any parametrically polymorphic function with an appropriate type signature is a natural transformation.

That is, suppose we have endofunctors F and G in our programming environment. A natural transformation is then a polymorphic function of type alpha : forall a . F a -> G a: if you plug in a type a, you get the component of the natural transformation at a. For alpha to be a natural transformation, naturality must be satisfied. That is, suppose we have a function f from type a to type b. Then we must have
  (fmap f) . alpha = alpha . (fmap f)
This equation is a little confusing because in Haskell syntax the type annotations are dropped, so it is not obvious that the two occurrences of fmap belong to different functors. Let's annotate the parts of the equality:
  f : a -> b, for some a and b
  Left side of the equals
    fmap f : G a -> G b
    alpha : F a -> G a
    (fmap f). alpha : F a -> G b

  Right side of the equals
    alpha : F b -> G b
    fmap f : F a -> F b
    alpha . (fmap f) : F a -> G b

The point about theorems for free is that if alpha is parametrically polymorphic, then the naturality condition is automatically satisfied. In high-level terms, a function is parametrically polymorphic if it is defined by one formula for all types. That is, the definition of alpha : forall a . F a -> G a shouldn't depend on what the type a is. Parametric polymorphism contrasts with ad-hoc polymorphism, which does allow multiple formulas depending on the input type. An example of ad-hoc polymorphism is an overloaded operator, say add. The actual computation of the operator will depend on whether the input type is a complex or int or float or whatever.

An example of a parametric polymorphic function is 
  safeHead :: [a] -> Maybe a
  safeHead [] = Nothing
  safeHead (x : xs) = Just x

The definition of this function does not depend on what the type a is. As part of theorems for free safeHead is a natural transformation, though verifying naturality is a good exercise.
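
To complement the exercise, here is a sketch that just spells out the naturality square for the safeHead above, with F the list functor and G the Maybe functor (checkNaturality is a made-up helper):

-- fmap on the left-hand side is Maybe's, fmap on the right-hand side is the list's.
checkNaturality :: Eq b => (a -> b) -> [a] -> Bool
checkNaturality f xs = fmap f (safeHead xs) == safeHead (fmap f xs)

-- e.g. checkNaturality show [1, 2, 3 :: Int] and checkNaturality show ([] :: [Int])
-- both evaluate to True.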

The intuition here is that endofunctors are containers, and a natural transformation is going from one container to another without modifying the contents during the transformation.


Lecture 4 (Composition of Natural Transformations and Monads)


Vertical Composition

Suppose we have two categories C and D, and three functors F : C -> D, G : C -> D, and H : C -> D. We also have two natural transformations \alpha : F -> G and \beta : G -> H. There is a way to construct a natural transformation (\beta . \alpha) : F -> H called the vertical composition of \beta and \alpha.


 - We define (\beta . \alpha) via components. That is, consider some object a of C. By definition, there are morphisms of D \alpha_a : F a -> G a and \beta_a : G a -> H a. Thus, we create the component of (\beta . \alpha) at a simply by composing \beta_a and \alpha_a using the composition operator of D.

   - That is  (\beta . \alpha)_a = \beta_a \circ \alpha_a, where \circ is the composition of morphisms in D.

 - It remains to check that this definition of components yields a natural transformation. That is, naturality must be satisfied. Naturality of (\beta . \alpha) is implied by naturality of \alpha and \beta, and a relatively simple 3x2 commutative diagram. (A small Haskell sketch of vertical composition follows.)
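
Here is a sketch of vertical composition in Haskell, assuming the RankNTypes extension (Nat, vertComp, idNat, and roundTrip are my own names): a natural transformation is a polymorphic function, and vertical composition is just composition of components.

{-# LANGUAGE RankNTypes #-}

import Data.Maybe (listToMaybe, maybeToList)

-- A natural transformation between endofunctors f and g: one polymorphic function,
-- whose instantiation at a type a is the component at a.
type Nat f g = forall a. f a -> g a

-- Vertical composition: compose the components with ordinary function composition.
vertComp :: Nat g h -> Nat f g -> Nat f h
vertComp beta alpha = beta . alpha

-- The identity natural transformation 1_F.
idNat :: Nat f f
idNat = id

-- Example: maybeToList (Maybe -> []) and listToMaybe ([] -> Maybe) are natural
-- transformations, and their vertical composite Maybe -> Maybe is again one.
roundTrip :: Nat Maybe Maybe
roundTrip = vertComp listToMaybe maybeToList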


Using vertical composition one can define a new category called the functor category.

Let C and D be categories. Denote by [C, D] (also written Fun(C, D) or D^C) the functor category from C to D, where

  - Objects of [C, D] are functors F : C -> D

  - Morphisms are natural transformations

  - Composition of morphisms is vertical composition of natural transformations.


For [C, D] to be a category, composition needs to be associative. This holds because \beta . \alpha is defined in terms of composition of morphisms in D; since D is a category, composing the components of natural transformations is associative, and therefore vertical composition is associative. For [C, D] to be a category we also need identity morphisms. Consider an object F (a functor) of [C, D]. Define the identity natural transformation at F, 1_F, by (1_F)_a = id_{F a}. It is straightforward to check that 1_F satisfies the appropriate identity laws. With these two properties checked, [C, D] is indeed a category.


Horizontal Composition

There is another situation where we can define the composition of natural transformations. Suppose we have three categories C, D, and E, functors F : C -> D, G : C -> D, F' : D -> E, and G' : D -> E, and natural transformations \alpha : F -> G and \alpha' : F' -> G'. We can compose functors, so there are functors F' . F : C -> E and G' . G : C -> E. Is there a natural transformation between F' . F and G' . G? Yes, it is called the horizontal composition of \alpha and \alpha', often denoted (\alpha' \circ \alpha). 


  - Defined in terms of components: (\alpha' \circ \alpha)_a = \alpha'_{G a} \circ (F' \alpha_a) = (G' \alpha_a) \circ \alpha'_{F a}

    - The equality of the definitions is because \alpha' satisfies naturality.

    - Stepping through the pieces of the definition

      - a; object of C

      - \alpha_a : F a -> G a; morphism of D

      - F' \alpha_a : F' F a -> F' G a; morphism of E

      - G a; object of D

      - \alpha'_{G a} : F' G a -> G' G a; morphism of E

      - F a; object of D

      - \alpha'_{F a} : F' F a -> G' F a; morphism of E

      - G' \alpha_a : G' F a -> G' G a; morphism of E


  Naturality can be verified with a 2x3 diagram.
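
A sketch of horizontal composition in Haskell, again assuming RankNTypes and repeating the Nat synonym from the previous sketch (horizComp and flattenHeads are my own names). The first form of the component, \alpha'_{G a} \circ (F' \alpha_a), becomes "alpha' after fmap alpha":

{-# LANGUAGE RankNTypes #-}

import Data.Maybe (listToMaybe)

type Nat f g = forall a. f a -> g a

-- Horizontal composition: from alpha' : F' -> G' and alpha : F -> G, build a natural
-- transformation from the composite functor F' . F to G' . G.
horizComp :: Functor f' => Nat f' g' -> Nat f g -> (forall a. f' (f a) -> g' (g a))
horizComp alpha' alpha = alpha' . fmap alpha

-- Example: composing listToMaybe ([] -> Maybe) horizontally with itself turns a list
-- of lists into the head of the head.
flattenHeads :: [[a]] -> Maybe (Maybe a)
flattenHeads = horizComp listToMaybe listToMaybe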


Interchange law

Consider the situation where we have categories C, D, and E, functors F : C -> D, G : C -> D, H : C -> D, F' : D -> E, G' : D -> E, and H' : D -> E, and natural transformations \alpha : F -> G, \beta : G -> H, \alpha' : F' -> G', and \beta' : G' -> H'. There are two ways of constructing a natural transformation from F' F to H' H. It turns out, based on the definitions, that both of these constructions give the same result. This is known as the interchange law, and it is written as

  (\beta' . \alpha') \circ (\beta . \alpha) = (\beta' \circ \beta) . (\alpha' \circ \alpha)

The interchange law can be proven with a 3x3 diagram.


I then started talking about monads. In the context of functional programming we can think of a monad as just a design pattern.


Suppose we have some functions in a program, f : a -> b and g : b -> c. Intuitively, we can create a computation where f is executed and then g is executed by composing f and g: g . f : a -> c. Now suppose we want to create an "embellished" version of this computation via some container. The container is an endo-functor, say m. So suppose we create appropriate embellished functions f_m : a -> m b and g_m : b -> m c. The goal is to find a way to compose f_m and g_m, and this is what a monad accomplishes.


Call this composition operator (<=<) : (b -> m c) -> (a -> m b) -> a -> m c. We can then implement <=< with the use of a polymorphic function join : m (m a) -> m a.

  - g <=< f = join . (fmap g) . f

For a monad we also need a polymorphic function which wraps a value of any type in some default context. We call this function return : a -> m a.


This gives us one way to define a monad:

class Functor m => Monad m where

  join :: m (m a) -> m a

  return :: a -> m a


I then showed two examples of monads: one based on a logger type, and another using Maybe.
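
For reference, here is a sketch of the Maybe example in the join/return style. I use standalone functions with made-up names (returnMaybe, joinMaybe, composeMaybe, safeRecip, safeSqrt) so nothing clashes with the Prelude's Monad class:

returnMaybe :: a -> Maybe a
returnMaybe = Just

joinMaybe :: Maybe (Maybe a) -> Maybe a
joinMaybe (Just (Just x)) = Just x
joinMaybe _               = Nothing

-- Kleisli composition, exactly as  g <=< f = join . (fmap g) . f
composeMaybe :: (b -> Maybe c) -> (a -> Maybe b) -> a -> Maybe c
composeMaybe g f = joinMaybe . fmap g . f

-- Embellished functions that may fail.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Their composite: take the reciprocal, then the square root, failing if either fails.
recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = composeMaybe safeSqrt safeRecip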



Lecture 5 (More Monads)

In the previous lecture we talked about defining a monad with join. For this lecture I gave alternative and more standard definitions.


We could also define a monad by defining <=< directly:

class Monad m where

  (<=<) :: (b -> m c) -> (a -> m b) -> a -> m c

  return :: a -> m a


The more common way of defining a monad is with an operator >>=, pronounced bind.

class Monad m where

  (>>=) :: m a -> (a -> m b) -> m b

  return :: a -> m a


All three definitions are equivalent. Given one you can derive the others. However, the last one is the most common because it gives code a more imperative style, where we give names to intermediate values.
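
Here is a sketch of the translations between the three styles, written against the Prelude's Monad class with primed names of my own so nothing is shadowed:

import Control.Monad (join)

-- >>= in terms of join:
bind' :: Monad m => m a -> (a -> m b) -> m b
bind' ma f = join (fmap f ma)

-- join in terms of >>=:
join' :: Monad m => m (m a) -> m a
join' mma = mma >>= id

-- Kleisli composition <=< in terms of >>=:
kleisli' :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
kleisli' g f x = f x >>= g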


Imagine f_m : a -> m b and g_m : b -> m c and process = g_m <=< f_m. Coding in this completely point-free way is overly terse and may be confusing to readers who are not sure what is going on during the computation.


An alternative would be

  process x = (f_m x) >>= (\y -> g_m y)


Here we have given intermediate names to the input of f_m and the input of g_m. This pattern is common enough that Haskell allows an alternative syntax using a do block.

  process x = do

                 y <- f_m x

                 g_m y


Monadic computations can be strung together using the do notation.
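
A small illustrative example (safeDiv is made up): two Maybe computations strung together in a do block, where a division by zero anywhere makes the whole computation Nothing.

safeDiv :: Double -> Double -> Maybe Double
safeDiv _ 0 = Nothing
safeDiv x y = Just (x / y)

-- Computes (x / y) / z, aborting with Nothing if either division fails.
chainedDiv :: Double -> Double -> Double -> Maybe Double
chainedDiv x y z = do
  a <- safeDiv x y
  safeDiv a z

-- chainedDiv 12 3 2 == Just 2.0
-- chainedDiv 12 0 2 == Nothing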


I then went on to mention that for an endo-functor to actually be a monad, we want a few monad laws to be satisfied. There are really two conceptual laws: one says that composition (whether expressed with <=<, join, or >>=) is associative, and the other says that return is the unit with respect to composition.


What follows are the monad laws for each of the definitions, in Haskell syntax.


  join . (fmap return) . f = f

  join . (fmap f) . return = f

  join . (fmap h) . (join . (fmap g) . f) = join . (fmap (join . (fmap h) . g)) . f


  return <=< f = f

  f <=< return = f

  h <=< (g <=< f) = (h <=< g) <=< f


  \x -> (return x >>= \y -> f y) = \x -> f x

  ma >>= return = ma

  ma >>= (\x -> f x >>= \y -> g y) = (ma >>= \x -> f x) >>= \y -> g y


In terms of programming these laws ensure statements in do blocks are associative and return gives an identity context.
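
As a sketch of what associativity buys us in do notation (half, leftSide, and rightSide are made-up names), the two nestings below are guaranteed by the law to agree:

half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

-- "Run the first two steps as a sub-computation, then the third."
leftSide :: Maybe Int -> Maybe Int
leftSide mx = do
  y <- do { x <- mx; half x }
  half y

-- "Run all three steps in one flat block."
rightSide :: Maybe Int -> Maybe Int
rightSide mx = do
  x <- mx
  y <- half x
  half y

-- leftSide (Just 12) == rightSide (Just 12) == Just 3
-- leftSide (Just 6)  == rightSide (Just 6)  == Nothing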


I then showed some more examples of monads in programming.

[← Prev in Thread] Current Thread [Next in Thread→]