Brandon Ling

Natural transformations: mappings between mappings

March 08, 2021category theory, Typescript

Today’s post introduces the third fundamental concept of category theory: natural transformations. We’ve already talked about how a functor F:CDF: \mathcal{C} \to \mathcal{D} relates a category C\mathcal{C} to another category D\mathcal{D}; going one step further, if we have another functor G:CDG: \mathcal{C} \to \mathcal{D}, can we relate the functor FF to the functor GG?

A functor FF maps an object AA of C\mathcal{C} to an object FAFA1 in D\mathcal{D} and maps a morphism f:ABf: A \to B in C\mathcal{C} to a morphism Ff:FAFBFf: FA \to FB in D\mathcal{D}, and similarly with GG. Here’s a picture2 to start:

The images of functors F and G on objects A and B and a morphism f from A to B. There’s a vertical arrow from FA to FB for Ff and another one from GA to GB for Gf.
The images of functors F and G on objects A and B and a morphism f from A to B. There’s a vertical arrow from FA to FB for Ff and another one from GA to GB for Gf.

If we want to relate functor FF to functor GG, we need to first relate FAFA to GAGA and FBFB to GBGB. The only things we have to relate objects are morphisms, so for each pair of objects FAFA and GAGA we choose a morphism αA:FAGA\alpha_A: FA \to GA3, which is a morphism in D\mathcal{D}. So our picture becomes a square:

Relating FA to GA and FB to GB. A natural transformation requires the two paths (the composition of the morphisms shown) to be equal.
Relating FA to GA and FB to GB. A natural transformation requires the two paths (the composition of the morphisms shown) to be equal.

What about the morphisms FfFf and GfGf? We don’t have morphisms between morphisms like we do for objects. Instead, we want the transformation from FF to GG to respect the structure of the category, and that means we need to respect the composition of morphisms.

Looking at the diagram above, we see that we can compose GfαAGf \circ \alpha_A and αBFf\alpha_B \circ Ff. Respecting composition means we want these to be equal to each other:

GfαA=αBFf.Gf \circ \alpha_A = \alpha_B \circ Ff.

We say that the diagram commutes. A commuting diagram is a diagram in which all morphisms that have the same sources and the same targets are equal (we call them parallel morphisms); these of course include morphisms that are the composition of other morphisms, as is the case here. Commuting diagrams are used extensively in category theory, as we’ll see later in the big example.

We can think of this as saying we can either map FAFA to GAGA first with αA\alpha_A and then map to GBGB with GfGf, or we can map FAFA to FBFB with FfFf first and then translate to GBGB with αB\alpha_B, and we get the same result either way. It’s a consistency condition, and I’ll refer to it as the “naturality condition.”

The important point is that the same αA\alpha_A and αB\alpha_B have to work for any f:ABf: A \to B. This is the “natural” part of “natural transformation.” John D. Cook writes

Natural transformations are meant to capture the idea that a transformation is “natural” in the sense of not depending on any arbitrary choices. If a transformation does depend on arbitrary choices, the arrows αA\alpha_A and αB\alpha_B would not be reusable but would have to change when ff changes.

A conceptual diagram of a natural transformation alpha relating functor F to functor G.
A conceptual diagram of a natural transformation alpha relating functor F to functor G.

Putting it all together, a natural transformation α:FG\alpha: F \Rightarrow G is a collection of morphisms αA:FAGA\alpha_A: FA \to GA in D\mathcal{D}, one for each object AA of C\mathcal{C}, such that GfαA=αBFfGf \circ \alpha_A = \alpha_B \circ Ff for all morphisms f:ABf: A \to B in C\mathcal{C}. The morphism αA\alpha_A is called the AAcomponent of α\alpha.

Example: categories in programming

In programming, particularly in functional programming, the model is that the types of a programming language are the objects of a category and functions are the morphisms between them.4

For example, we might have a string and int type, and a function f:f: string \to int that given a string returns the length of a string.

We can also talk about functors, and since we only have one category C\mathcal{C} to talk about (for a given programming language), our functors map from C\mathcal{C} to itself (such a functor is known as an endofunctor).

Since types are the objects in our category our functors map types to types. These correspond to type constructors, which we can think of a type that is parameterized by another type, like generic types in Typescript.

As a simple example, we can think of a generic list as a functor. So for example, string is mapped to string[], which is just a list of strings.

But functors also map morphisms to morphisms, so we also need to map functions to functions. There needs to be a higher-order function fmap that maps f: A => B to fmap(f): A[] => B[], for any function f that takes an argument of type Aand returns something of typeB.

Of course, fmap needs to obey the properties that define a functor, namely that identity functions are mapped to identity functions and that composition of functions is preserved.

For our list functor, fmap(f) simply takes each item in the list of A’s and applies f to it, giving us a list of B’s. You should check that identities and function composition is preserved. In Typescript we could write

function fmap<A, B>(f: (a: A) => B) {
  return (aArr: A[]) => aArr.map(f)
}

In fact, this is a bit unnecessary, since we can directly use the map method on any particular instance aArr instead of calling fmap(f)(aArr). The point is that for each function between types A and B there is a corresponding function between A[] and B[].

Another example of a functor in Typescript is the Maybe type5:

const Nothing = Symbol("nothing")
type Nothing = typeof Nothing
type Maybe<T> = T | Nothing

Try implementing the corresponding fmap. It should map f:A => B to fmap(f): Maybe<A> => Maybe<B> while making sure identity functions and composition are respected.

Now that we’ve talked about functors, we can talk about natural transformations between them. To be concrete, let’s talk about a natural transformation from [] to Maybe. A natural transformation in this context means for each type A we need to choose a function alpha<A>: (aArr: A[]) => Maybe<A> such that the naturality condition holds.

The naturality condition says that for any functions f: (aArr: A[]) => B[] and g: (mA: Maybe<A>) => Maybe<B>, fmap(g)(alpha<A>(aArr)) equals alpha<B>(fmap(f)(aArr)).

You can check that the following works for the alpha<A>’s:

function alpha<A>(aArr: A[]) {
  return aArr.length === 0 ? Nothing : aArr[0]
}

alpha returns Nothing if aArr is empty, otherwise it returns the first element of aArr. It’s a parametrically polymorphic, or generic, function. In fact, any generic function f<A> between A[] and Maybe<A> is a natural transformation between [] and Maybe.

Some people, like Bartosz Milewski, like to think of functors (in the programming context) as containers, like the list functor we looked at. Natural transformations don’t modify the contents of their arguments, but simply repackage them, like alpha above. The idea is that it doesn’t matter whether you repackage first and then apply your function, or apply your function first and then repackage, which is exactly the naturality condition.

If you want to read more about category theory in programming, I highly suggest checking out:

Composing natural transformations and functor categories

At the end of my earlier post on functors, I mentioned that functors from a category C\mathcal{C} to a category D\mathcal{D} form a category with the functors themselves comprising the objects. This functor category is often denoted by DC\mathcal{D}^{\mathcal{C}}. By now you probably see that natural transformations should be the morphisms between functors.

Recalling the definition of a category, we need to identify identity morphisms and define composition of morphisms.

Suppose we have two natural transformations α:FG\alpha: F \Rightarrow G and β:GH\beta: G \Rightarrow H. We can define the composition βα\beta \circ \alpha by composing the components of α\alpha and β\beta:

(βα)A=βAαA(\beta \circ \alpha)_A = \beta_A \circ \alpha_A
A diagram of composing natural transformations to get another natural transformation.
A diagram of composing natural transformations to get another natural transformation.

I leave it to you to check that βα\beta \circ \alpha is indeed a natural transformation from FF to HH (hint: try drawing a diagram that combines the naturality diagram “squares” for α\alpha and β\beta that we saw earlier when talking about the definition of a natural transformation).

As for identity natural transformations, the components of idF:FFid_F: F \Rightarrow F are simply the identity morphisms (idF)A=idFA(id_F)_A = id_{FA}, as you probably guessed.

Natural isomorphisms and equivalence of categories

If all the components of a natural tranformation α:FG\alpha: F \Rightarrow G are isomorphisms, then we say that α\alpha is a natural isomorphism.

Remember that an isomorphism is just a morphism f:XYf: X \to Y that can be “undone” in the sense that there is another morphism f1:YXf^{-1}: Y \to X such that f1f=idXf^{-1} \circ f = id_X and ff1=idYf \circ f^{-1} = id_Y. f1f^{-1} is called the inverse of ff.

The notion of isomorphism is category theory’s preferred notion of sameness (as opposed to strict equality). See the section on isomorphisms here for more on this point.

An example of an isomorphism between two sets A and B. For example, f(a_1) = b_4.
An example of an isomorphism between two sets A and B. For example, f(a_1) = b_4.

A natural isomorphism, then, is a natural transformation that can be reversed (try proving that the inverses αA1:GAFA\alpha_A^{-1}: GA \to FA form a natural transformation α1:GF\alpha^{-1}: G \Rightarrow F).

Natural isomorphisms allow us to talk about equivalent categories. Suppose we have two different categories C\mathcal{C} and D\mathcal{D} that we want to say are the same without actually being the same category. One possibility is to say that C\mathcal{C} and D\mathcal{D} are isomorphic, meaning that there exist inverse functors F:CDF: \mathcal{C} \to \mathcal{D} and G:DCG: \mathcal{D} \to \mathcal{C} (that is, G=F1G = F^{-1}).

But this is often too restrictive of a statement for categories. C\mathcal{C} and D\mathcal{D} might be the same for all practical purposes, but aren’t isomorphic. Instead, we consider a more general notion of isomorphism (for categories) called equivalence.

Another way to think about this is that with objects in a general category, we only have morphisms between objects, so the best we can do is say that two objects are isomorphic for our notion of sameness.

But in the case where our objects are categories, our morphisms are functors, and functors between two categories form a category, so we have morphisms between functors: natural transformations.

In other words, there is more structure in the category of categories than in a general category (there are morphisms of morphisms), and we can use this extra structure to refine our notion of sameness for categories.

The idea is that instead of requiring the existence of inverse functors FF and GG such that the composition GF=idCGF = id_\mathcal{C} and FG=idDFG = id_\mathcal{D} (note strict equality), we just require that there are natural isomorphisms between GFGF and idCid_\mathcal{C} and between FGFG and idDid_\mathcal{D}.

This is a more general notion of sameness for categories than isomorphism, because isomorphic categories are also equivalent categories, but equivalent categories don’t need to be isomorphic, as we’re about to see.

When I first came across this idea of equivalence, I didn’t quite understand the significance. I didn’t see how one category could be “the same” as another but not actually be isomorphic. I finally understood it once I worked through the following example.

Big example

This example also allows us to review the concepts we’ve discussed: categories, functors, isomorphisms, and natural transformations. This example is a bit of a challenge, but if you get confused, take another look at the relevant definitions. I highly suggest following along by writing everything out on paper.

We’re gonna show that the categories C\mathcal{C} and D\mathcal{D} are equivalent. C\mathcal{C} is the category of finite6 sets n\underline{n} of the form {1,2,3,,n}\{1, 2, 3, \ldots, n \} for all positive integers nn. The morphisms of C\mathcal{C} are just the functions between n\underline{n} and m\underline{m} for all nn and mm.

The category D\mathcal{D} is the category FinSet of all finite sets and functions between them. For example, A={A = \{‘apple’, ‘tomato’, ‘red’}\} and B={3,2,π}B = \{3, 2, \pi\} are finite sets and f:ABf: A \to B defined by f(f(‘apple’)=3) = 3, f(f(‘tomato’)=2) = 2, and f(f(‘red’)=π) = \pi is a function from AA to BB (in fact ff is an isomorphism).

For each nn, FinSet has an infinite number of sets of size nn, whereas C\mathcal{C} has just one: n={1,2,3,,n}\underline{n} = \{1, 2, 3, \ldots, n \}.

Sets of the same size are isomorphic, meaning each element of one set can be mapped uniquely to an element of the other set, and no elements in either set are missed, as with ff in the example above between AA and BB. Isomorphic sets are essentially the same in the sense that they are interchangeable.

The intuition behind the equivalence between C\mathcal{C} and FinSet is that if we group all sets of the same size in FinSet, we get C\mathcal{C}. Let’s prove this.

What we need to show is that there are functors F:CF: \mathcal{C} \to FinSet and G:G: FinSetC\to \mathcal{C} such that there is a natural isomorphism η\eta7 between GFGF and the identity functor idCid_\mathcal{C}, and a natural isomorphism ϵ\epsilon8 between FGFG and idFinSet.id_{\text{FinSet}}.

Defining functors F and G

Our first step is to define the functor F:CF: \mathcal{C} \to FinSet. We have to decide where to map each n\underline{n} to in FinSet. You could choose your favorite set of size nn, but since n\underline{n} is also a set in FinSet, let’s just map it to itself:

Fn=n.F\underline{n} = \underline{n}.

As for the morphisms of C\mathcal{C}, which are functions f:nmf: \underline{n} \to \underline{m}, FF just maps ff to itself:

Ff=f.Ff = f.

Identity functions are obviously preserved, as is composition: FgFf=gf=F(gf)Fg \circ Ff = gf = F (gf).

That’s all we need to define FF. FF just recognizes the fact that C\mathcal{C} is a subcategory of FinSet9. Now let’s define G:G: FinSetC\to \mathcal{C}.

For each set XmX_m of size mm of FinSet, we simply map it to m={1,2,3,,m}\underline{m} = \{ 1,2,3, \ldots, m\}:

GXm=m.G X_m = \underline{m}.

The functions are a bit trickier. For each function f:XmYnf: X_m \to Y_n, we need to map it to a function Gf:nmGf: \underline{n} \to \underline{m}, while ensuring that identities are mapped to identities, and composition is preserved.

Since each XmX_m is a finite set, we can list (which is the same as ordering) all its elements. If you think about it, this is the same as defining a function iXm:Xmmi_{X_m}: X_m \to \underline{m}. Moreover, iXmi_{X_m} is an isomorphism, so there exists iXm1:mXmi_{X_m}^{-1}: \underline{m} \to X_m such that iXm1iXm=idXmi_{X_m}^{-1} \circ i_{X_m} = id_{X_m} and iXmiXm1=idmi_{X_m} \circ i_{X_m}^{-1} = id_{\underline{m}}. Note that there are usually many different ways to list the elements of a given XmX_m, so there are many such iXmi_{X_m}. We just need to choose one for each XmX_m.

An example ordering for a set X_4. This picture defines i_X_4 and its inverse.
An example ordering for a set X_4. This picture defines i_X_4 and its inverse.

Then, GG maps each α:XmYn\alpha: X_m \to Y_n to

Gα=iYnαiXm1G\alpha = i_{Y_n} \circ \alpha \circ i_{X_m}^{-1}

which is a function from m\underline{m} to n\underline{n}. Here’s how we can interpret this: first map m\underline{m} to XmX_m using our ordering function iXm1i_{X_m}^{-1}, then map XmX_m to YnY_n with α\alpha, then finally map YnY_n to n\underline{n} with iYni_{Y_n}.

We need to check that identities are preserved:

GidXm=iXmidXmiXm1=(iXmidXm)iXm1=iXmiXm1=idmG id_{X_m} = i_{X_m} \circ id_{X_m} \circ i_{X_m}^{-1} \\[5pt] = (i_{X_m} \circ id_{X_m}) \circ i_{X_m}^{-1} \\[5pt] = i_{X_m} \circ i_{X_m}^{-1} \\[5pt] = id_{\underline{m}}

The third equality above follows from the fact that identities act like the number 1 with multiplication: it doesn’t change anything. The last equality follows from the fact that iXmi_{X_m} is an isomorphism.

We also need to check that composition of functions is preserved. Let α:XmYn\alpha: X_m \to Y_n and β:YnZp\beta: Y_n \to Z_p be functions. Then GG maps the composition βα\beta \alpha to

G(βα)=iZp(βα)iXm1=iZp(βidYnα)iXm1=iZpβ(iYn1iYn)αiXm1=(iZpβiYn1)(iYnαiXm1)=GβGαG(\beta \alpha) = i_{Z_p} \circ (\beta \alpha) \circ i_{X_m}^{-1} \\[5pt] = i_{Z_p} \circ (\beta \circ id_{Y_n} \circ \alpha) \circ i_{X_m}^{-1} \\[5pt] = i_{Z_p} \circ \beta \circ (i_{Y_n}^{-1} \circ i_{Y_n}) \circ \alpha \circ i_{X_m}^{-1} \\[5pt] = (i_{Z_p} \circ \beta \circ i_{Y_n}^{-1}) \circ (i_{Y_n} \circ \alpha \circ i_{X_m}^{-1}) \\[5pt] = G\beta \circ G\alpha

The second equality follows from the fact that I can insert an identity function anywhere since it doesn’t change anything. The third equality follows from rewriting idYnid_{Y_n} in terms of iYni_{Y_n} and its inverse iYn1i_{Y_n}^{-1}.

Whew. We’re halfway done. Now we need to prove that there are natural isomorphisms between GFGF and idCid_{\mathcal{C}} and between FGFG and idFinSetid_{\text{FinSet}}.

The natural isomorphisms η\eta and ϵ\epsilon

Here’s the picture to have in mind (remember that idCm=mid_\mathcal{C} \underline{m} = \underline{m} and idCf=fid_\mathcal{C} f = f):

We need to show this diagram commutes to prove that GF and the identity functor of C are naturally isomorphic.
We need to show this diagram commutes to prove that GF and the identity functor of C are naturally isomorphic.

We need to show the existence of an isomorphism ηm:GFmm\eta_m: GF \underline{m} \to \underline{m} for every m\underline{m} such that the above diagram commutes. In particular, this means that

fηm=ηnGFff \circ \eta_{\underline{m}} = \eta_{\underline{n}} \circ GF f

and also

GFfηm1=ηn1f.:GFf \circ \eta_{\underline{m}}^{-1} = \eta_{\underline{n}}^{-1} \circ f.:

The idea of the natural isomorphism between GFGF and idCid_\mathcal{C} is that the objects GFmGF\underline{m} and m\underline{m} are isomorphic, and those isomorphisms allow us to express every ff and GFfGFf in terms of each other:

GFf=ηn1fηmf=ηnGFfηm1:GFf = \eta_{\underline{n}}^{-1} \circ f \circ \eta_{\underline{m}} \\[5pt] f = \eta_{\underline{n}} \circ GFf \circ \eta_{\underline{m}}^{-1:}

Looking at the diagram above, this just says that, for example, I can go from GFmGF\underline{m} to GFnGF\underline{n} directly with GFfGFf, or I can first map to m\underline{m} with ηm\eta_{\underline{m}}, then map to n\underline{n} with ff, then map to GFnGF\underline{n} with ηn1\eta_{\underline{n}}^{-1}, and the result is the same either way.

Let’s think a little more about the functor GF:CCGF: \mathcal{C} \to \mathcal{C}. It sends each n\underline{n} to

GFn=G(Fn)=Gn=n.GF \underline{n} = G(F\underline{n}) = G\underline{n} = \underline{n}.

For each f:mnf: \underline{m} \to \underline{n}, GFGF sends it to

GFf=G(Ff)=Gf=infim1GFf = G (Ff) = Gf = i_{\underline{n}} \circ f \circ i_{\underline{m}}^{-1}

Remember that im:mmi_{\underline{m}}: \underline{m} \to \underline{m} is just an ordering of the elements of m={1,2,,m}\underline{m} =\{1,2,\ldots,m\}, and we can just choose to order them in the obvious way, which is the same as saying imi_{\underline{m}} is the identity function idmid_{\underline{m}}.

Then

GFf=idnfidm1=fGFf = id_{\underline{n}} \circ f \circ id_{\underline{m}}^{-1} = f

(remember that an identity morphism is its own inverse). So in fact (with these choices for ini_{\underline{n}}), GFGF is the identity functor idCid_\mathcal{C}. Our naturality square above collapses into just a vertical line.

The natural isomorphism η\eta is then really simple: just choose ηm=ηm1=idm\eta_{\underline{m}} = \eta_{\underline{m}}^{-1} = id_{\underline{m}} to be the components of η\eta for all m\underline{m}.

The natural isomorphism between FGFG and idFinSetid_{\text{FinSet}} is less trivial. Here’s the diagram that we need to commute:

The diagram that we need to show commutes to prove that FG and the identity functor of FinSet are naturally isomorphic.
The diagram that we need to show commutes to prove that FG and the identity functor of FinSet are naturally isomorphic.

Let’s think about what FGFG does. It maps any set XmX_m with mm elements to

FGXm=F(n)=n.FGX_m = F(\underline{n}) = \underline{n}.

It sends a function α:XmYn\alpha: X_m \to Y_n to

FGα=F(iYnαiXm1)=iYnαiXm1FG\alpha = F(i_{Y_n} \circ \alpha \circ i_{X_m}^{-1}) = i_{Y_n} \circ \alpha \circ i_{X_m}^{-1}

since FF just maps each morphism in C\mathcal{C} to itself in FinSet.

For the components of our natural isomorphism, we need an isomorphism ϵXm:FGXmXm\epsilon_{X_m}: FGX_m \to X_m, which is an isomorphism from m\underline{m} to XmX_m since FGXm=mFGX_m = \underline{m}. In fact we’ve already been using an isomorphism between XmX_m and m\underline{m}: the function iXm:Xmmi_{X_m}: X_m \to \underline{m} (remember this is our ordering of the elements of XmX_m). It seems we should choose ϵXm\epsilon_{X_m} to be

ϵXm=iXm1.\epsilon_{X_m} = i_{X_m}^{-1}.

We just have to check that the naturality conditions hold, which requires αϵXm=ϵYnFGα\alpha \circ \epsilon_{X_m} = \epsilon_{Y_n} \circ FG\alpha and FGαϵXm1=ϵYn1αFG\alpha \circ \epsilon_{X_m}^{-1} = \epsilon_{Y_n}^{-1} \circ \alpha for every α:XmYn.\alpha: X_m \to Y_n.

Let’s check the first condition:

ϵYnFGα=ϵYniYnαiXm1=iYn1iYnαiXm1=idYnαiXm1=αiXm1=αϵXm.\epsilon_{Y_n} \circ FG\alpha = \epsilon_{Y_n} \circ i_{Y_n} \circ \alpha \circ i_{X_m}^{-1} \\[5pt] = i_{Y_n}^{-1} \circ i_{Y_n} \circ \alpha \circ i_{X_m}^{-1} \\[5pt] = id_{Y_n} \circ \alpha \circ i_{X_m}^{-1} \\[5pt] = \alpha \circ i_{X_m}^{-1} \\[5pt] = \alpha \circ \epsilon_{X_m}.

Finally,

FGαϵXm1=iYnαiXm1ϵXm1=iYnαiXm1iXm=iYnαidXm=iYnα=ϵYn1α.FG\alpha \circ \epsilon_{X_m}^{-1} = i_{Y_n} \circ \alpha \circ i_{X_m}^{-1} \circ \epsilon_{X_m}^{-1}\\[5pt] = i_{Y_n} \circ \alpha \circ i_{X_m}^{-1} \circ i_{X_m} \\[5pt] = i_{Y_n} \circ \alpha \circ id_{X_m} \\[5pt] = i_{Y_n} \circ \alpha \\[5pt] = \epsilon_{Y_n}^{-1} \circ \alpha.

That completes the proof that C\mathcal{C} and FinSet are equivalent categories. But they are not isomorphic, intuitively because FinSet has a lot more objects than C\mathcal{C} does10.

If you’re able to follow this whole proof, then I’d say you have a pretty good grasp of the basic concepts of category theory!

Final notes

With this post we complete our crash course on the basics of category theory (categories, functors, and natural transformations). There is a lot more to category theory; there are some very important concepts we haven’t even mentioned, like limits, colimits, and adjunctions. We could also talk about special types of categories, like monoidal and compact closed categories, or we could generalize the notion of categories to enriched categories, for example.

Some of the resources I have been using to learn category theory that you might want to check out include:

The last two resources are great, but they do assume a higher level of mathematical knowledge than the first three.


  1. From now on I will generally drop parentheses to save space and clutter unless it’s unclear without them, so in this case FAFA is the same as F(A)F(A). I will also sometimes drop the composition symbol \circ for the same reasons, so for example gfgf is the same as gfg \circ f.
  2. The diagrams in today’s post were made with quiver.
  3. α\alpha is the greek letter “alpha.”
  4. Probably the most well-known programming language that explicitly incorporates these ideas is Haskell. But there is debate on whether Haskell’s types and functions form a true category.
  5. I saw this particular implementation first by Estus Flask here.
  6. By finite we mean that each set only has a finite number of elements.
  7. η\eta is the Greek letter “eta”.
  8. ϵ\epsilon is the Greek letter “epsilon”.
  9. Mathematicians say FF is an inclusion of C\mathcal{C} in FinSet.
  10. A more technical statement would be that C\mathcal{C} has a countable set of objects, while FinSet has an uncountable collection of objects.

Written by Brandon Ling — physicist now developer.