7 Basic category theory for programmers
Programming is fundamentally about building layers of abstraction, and category theory offers a high-level, unifying way to reason about those abstractions. Rather than studying numbers, it studies relationships and composition, earning the tongue-in-cheek label “generalized abstract nonsense.” The chapter motivates its use both practically (as compositional design patterns) and theoretically (as a way to see deeper structure in everyday programming), then sets out to connect category-theoretic ideas—categories, objects, morphisms, and functors—to real language design questions like whether a sequence of giraffes can be treated as a sequence of animals.
A category consists of objects and morphisms (arrows) with two key properties: each object has an identity arrow, and arrows compose (if you can go from A to B and B to C, you can go from A to C). Simple “posetal” categories model familiar partial orders, like numbers ordered by ≥ or people ordered by ancestry. Beyond ordinary functions, the chapter introduces functors—mappings between categories that preserve morphisms. When such a mapping goes from a category to itself it’s an endofunctor; when it preserves both the existence and direction of arrows, it’s covariant.
Viewing types as objects and “assignment compatibility” as morphisms links category theory directly to variance in type systems. Covariant interfaces like sequences model a covariant endofunctor T ↦ IE<T> when all nested generic constructions are included; this justifies why IEnumerable<Giraffe> can be used where IEnumerable<Animal> is expected, while mutable lists are not safely covariant. Contravariant interfaces (e.g., comparers) reverse arrow direction: if Giraffe → Animal, then IC<Animal> → IC<Giraffe>, explaining “contravariant.” The chapter closes by noting how this lens clarifies safety-driven language rules and hints at lattice-theoretic structure in the space of types.
A small category with three objects, represented by the nodes, and six morphisms, represented by the arrows.
A strangely familiar small category with four objects and ten morphisms.
The function F is represented as dashed arrows mapping from one category to another
The category of four types with the assignment compatibility morphism; the identity morphisms are elided for clarity. Note that both Giraffe and Tiger have morphisms to Animal but not to each other. A Giraffe can’t be assigned to a variable of type Tiger and a Tiger can’t be assigned to a variable of type Giraffe, but both can be assigned to a variable of type Animal or Object.
Adding four constructed generic types to our category. Notice how the structure of the morphisms of the interface types looks very similar to that of the original four types.
The mapping of a function from types to types, shown as dashed arrows.
Part of another infinite category of types, this time with a contravariant interface.
Summary
- Category theory is a modern mathematical discipline that studies relationships between arbitrary objects. It’s cheekily characterized as “generalized abstract nonsense”.
- A category is a collection of objects that have reflexive, transitive morphisms; we can think of a morphism as an arrow going from one object to another.
- A function that maps objects of one category to another such that morphisms are preserved is a covariant functor. One that preserves but reverses morphisms is a contravariant functor. If the two categories are the same, it is an endofunctor.
- The assignment compatibility relation “an expression of type X may be assigned to a variable of type Y” is a common morphism on a category where types are objects.
- Covariant and contravariant interfaces are called that because generic interfaces can be thought of as covariant or contravariant functors: functions from types to types that preserve assignment compatibility morphisms.
- Posetal categories, that define a partial order on a set of objects, are common and particularly useful when objects are types in a programming language. Semilattices are posets that have the additional nice property that you can always find the most specific type that is assignment compatible with any two types.
- The relationship between category theory and analysis of programming languages is very deep; we’ve just scratched the surface. We’ll go a little deeper in subsequent interludes.
FAQ
What is category theory, and why do some call it “generalized abstract nonsense”?
Category theory studies relationships between things, regardless of what the “things” are. Because it abstracts away the nature of the objects and focuses purely on structure and relationships, practitioners jokingly call it “generalized abstract nonsense.”What are objects and morphisms in a category?
Objects are the “things” in a category (they can be anything). Morphisms are directed arrows between objects that capture “you can get from here to there.” Two rules apply: every object has an identity morphism to itself (reflexive), and if there’s A→B and B→C then there’s A→C (transitive).How do functions, morphisms, and functors differ?
- A function maps an object to a single related object (possibly across categories) without necessarily respecting the category’s arrows.- A morphism is an arrow inside a category that represents a permissible move from one object to another.
- A functor maps objects (and arrows) from one category to another, preserving the existence of morphisms. Covariant functors preserve arrow direction; contravariant functors reverse it.
What is a posetal category?
A posetal category arises from a partial order: there’s a morphism X→Y exactly when X ≤ Y. Examples include numbers ordered by “≥” and a family tree ordered by “is an ancestor of (or the same person).”What is a covariant functor? Can you give an intuitive example?
A covariant functor maps objects between categories and preserves arrow direction: if X→Y then F(X)→F(Y). Example: map 0,1,2 to family members Bob,Bruce,Eric in a way that preserves “≤” arrows as “is ancestor of” arrows. Every arrow in the source is mirrored in the target with the same direction.What is an endofunctor, and why does mapping T ↦ IE<T> require infinitely many types?
An endofunctor maps a category to itself while preserving morphisms. The mapping T ↦ IE<T> (a simplified enumerable) is an endofunctor only if it’s defined for every object in the category, including nested types like IE<IE<T>>, IE<IE<IE<T>>>, and so on—hence an infinite collection of constructed types.How does category theory explain covariance and contravariance in generic interfaces?
- Covariance (e.g., IE<out T>) preserves direction: if Giraffe → Animal, then IE<Giraffe> → IE<Animal> is allowed.- Contravariance (e.g., IC<in T>) reverses direction: if Giraffe → Animal, then IC<Animal> → IC<Giraffe> is allowed. A comparer that can handle animals can also handle giraffes.
Fabulous Adventures in Data Structures and Algorithms ebook for free