Table of contents

1 Thinking in distributed systems: Models, mindsets, and mechanics

1.1 Software engineering and mental models

1.1.1 Mental models: The foundation of reasoning

1.1.2 Correct mental models

1.1.3 Complete mental models

1.2 Mental model of software systems

1.3 Different types of models

1.3.1 Different models describing the same aspects

1.3.2 Different models describing different aspects of a system

1.4 Thinking about distributed systems

1.4.1 Correctness

1.4.2 Scalability and reliability

1.4.3 Responsiveness

1.5 Two big ideas

1.5.1 Systems of systems

1.5.2 Global view vs. local view

1.6 Distributed Systems Incorporated

1.7 Navigating complexity

1.7.1 Simple yet complex

1.7.2 Emergent behavior

1.7.3 Changing perspective

1.7.4 Think globally; act locally

1.8 Thinking above the code

2 System models, order, and time

2.1 System models

2.1.1 Theory and practice

2.1.2 Synchronous distributed systems

2.1.3 Asynchronous distributed systems

2.1.4 Partially synchronous systems

2.1.5 Component and network behavior

2.1.6 Realistic system models

2.2 Order and time

2.2.1 The happened-before relationship

2.2.2 Time and clocks

2.2.3 Physical time and physical clocks

2.2.4 Logical time and logical clocks

2.2.5 Physical clocks vs. logical clocks

3 Failure tolerance

3.1 In theory

3.2 Types of failure tolerance

3.2.1 Masking failure tolerance

3.2.2 Nonmasking failure tolerance

3.2.3 Fail-safe failure tolerance

3.2.4 None of the above

3.3 In practice

3.3.1 System model

3.3.2 Failure handling

3.3.3 Failure classification

3.3.4 Failure detection

3.3.5 Failure mitigation

3.3.6 Putting everything together

4 Message delivery and processing

4.1 Exchanging messages

4.2 The uncertainty principle of message delivery and processing

4.2.1 Before sending the request

4.2.2 After sending the request and before receiving a response

4.2.3 After receiving a response

4.3 Silence and chatter

4.4 Exactly-once processing semantics

4.5 Idempotence

4.6 Case study: Charging a credit card

5 Transactions

5.1 Abstractions

5.2 The magic of transactions

5.2.1 Concurrency

5.2.2 Failure

5.3 The model of transactions

5.3.1 Correctness

5.3.2 Serializability

5.3.3 Completeness

5.3.4 Application-level abort

5.3.5 Platform-level abort

6 Distributed transactions

6.1 Atomic commitment: From a single RM to multiple RMs

6.1.1 Transaction on a single RM

6.1.2 Transaction on multiple RMs

6.1.3 Blocking and nonblocking

6.2 The essence of distributed transactions

6.3 Two-Phase Commit protocol

6.3.1 In the absence of failure

6.3.2 In the presence of failure

6.3.3 Improvement

7 Partitioning

7.1 Encyclopedias and volumes

7.2 Thinking in partitions

7.3 The mechanics of partitioning and balancing

7.4 (Re)partitioning

7.4.1 Types of partitioning

7.4.2 Data item to partition assignment strategies

7.5 Common item-based assignment strategies

7.5.1 Range partitioning

7.5.2 Hash partitioning

7.6 Repartitioning

7.6.1 Range partitioning

7.6.2 Hash partitioning

7.7 Consistent hashing

7.8 (Re)balancing and overpartitioning

8 Replication

8.1 Redundancy

8.2 Thinking about replication and consistency

8.3 Replication

8.4 The mechanics of replication

8.4.1 System model

8.4.2 Replication lag

8.4.3 Synchronous vs. asynchronous replication

8.4.4 State-based vs. log-based replication

8.4.5 Single-leader, multileader, and leaderless systems

9 Consistency

9.1 Consistency models

9.1.1 Common consistency models

9.1.2 Virtues and limitations

9.2 Linearizability

9.2.1 Queue and stack

9.2.2 Formal definition of linearizability

9.3 Eventual consistency

9.3.1 The shopping cart

9.3.2 Variants of eventual consistency

9.3.3 Implementation

9.4 Consistency, availability, and partition tolerance

9.4.1 History

9.4.2 Conjecture vs. theorem

9.4.3 CAP theorem

10 Distributed consensus

10.1 The challenge of reaching agreement

10.2 System model

10.3 State machine replication

10.4 The origin—and irony—of consensus

10.5 Implementing consensus

10.5.1 Leader-based consensus

10.5.2 Quorum-based consensus

10.5.3 Combining leader and quorum

10.6 Raft

10.6.1 The log

10.6.2 Terms

10.6.3 Leader Election protocol

10.6.4 Log Replication protocol

10.6.5 State machine safety

10.7 Raft puzzles

10.7.1 Puzzle 1

10.7.2 Puzzle 2

10.7.3 Puzzle 3

11 Durable executions

11.1 The pitfalls of partial executions

11.2 System model

11.2.1 Process definition

11.2.2 Process execution

11.3 The concept of failure-transparent recovery

11.4 Strategies of failure-transparent recovery

11.4.1 Restart

11.4.2 Resume

11.5 Implementation of failure-transparent recovery

11.5.1 Application-level implementation: Sagas

11.5.2 Platform-level implementation: Durable execution

12 Cloud and services

12.1 From proactive to reactive

12.2 Cloud computing

12.3 Cloud-native computing

12.4 Serverless computing

12.4.1 Traditional

12.4.2 Serverless

12.4.3 Cold path vs. hot path

12.5 Service

12.5.1 Global view vs. local view

12.5.2 Example recommendation service

12.6 Final thoughts

Overview

1 Thinking in distributed systems: Models, mindsets, and mechanics

Modern software is unavoidably distributed: multiple concurrent components communicate over a network to deliver functionality that a single machine cannot provide alone. This chapter motivates why we distribute—so systems can remain correct, scalable, and reliable as load grows and failures occur—and frames the core challenge: complexity emerges from many parts and their interactions. It argues that progress comes from strong mental models, not just terminology or tool familiarity, and defines good models as both correct (no falsehoods) and complete (no relevant omissions). With an emphasis on moving from “knowing” to “understanding,” it sets the goal of reasoning about distributed systems confidently through precise, shared abstractions.

The chapter builds a baseline model of a distributed system as a state machine that advances in discrete steps, each taken by a component or the network, through internal work and message exchanges. It clarifies global versus local viewpoints—observers can see the whole system; components see only their local state and their network channel—and shows how correctness can be specified via safety (nothing bad happens) and liveness (something good eventually happens). Scalability and reliability are treated as responsiveness under load and failures, respectively, formalized with service-level indicators, objectives, and error budgets. The text highlights that multiple, even contrasting, models can be valid depending on focus, warns about overreliance on analogies, and introduces “systems of systems” thinking (holons/holarchies) to flexibly zoom in and out of complex architectures.

To make abstractions tangible, the chapter presents an office-building metaphor for components, network, and external interfaces, which cleanly captures crash and message-delivery semantics such as loss, duplication, and reordering. It advocates “thinking above the code,” modeling concurrency through interleavings to generalize race conditions and connect to database serializability, and distills the central design problem: think globally, act locally—craft global guarantees via only local observations and actions. Throughout, short “AHA!” moments reinforce that guarantees are application-specific and emergent, encouraging readers to juggle multiple mental models, embrace changing resolutions of view, and build the disciplined reasoning needed to design functional, scalable, and reliable distributed systems.
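The idea of modeling concurrency through interleavings can be made concrete with a small sketch. The two processes, the shared counter, and the helper names `interleavings` and `run` below are illustrative assumptions, not taken from the book. Each process reads the counter and then writes back the value it read plus one. The sketch enumerates every order in which the four steps can occur.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        chosen = set(positions)
        ai, bi = iter(a), iter(b)
        yield [next(ai) if i in chosen else next(bi) for i in range(n)]

def run(schedule):
    """Execute a schedule of (process, operation) steps on a shared counter."""
    shared = 0
    local = {}  # value each process last read
    for proc, op in schedule:
        if op == "read":
            local[proc] = shared
        else:  # "write": store the previously read value plus one
            shared = local[proc] + 1
    return shared

# Hypothetical processes: each performs a read followed by a write.
p1 = [("P1", "read"), ("P1", "write")]
p2 = [("P2", "read"), ("P2", "write")]

results = [run(schedule) for schedule in interleavings(p1, p2)]
print(results)
```

Of the six interleavings, only the two in which one process finishes before the other starts leave the counter at 2. The remaining four lose an update. Those two schedules are the serial executions, which is the link between race conditions and serializability that the overview mentions.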

Mental model and system

Different models describing the same aspects of a system (the set of facts of each model totally overlaps)

The network as the buffer of inflight messages

The components as the buffer for inflight messages

Different models describing different aspects of a system (the set of facts of each model partially overlaps)

A distributed system as a set of concurrent, communicating components (local state of network not shown)

Behavior of a system as a sequence of states

Safety and liveness

Behavior space of a distributed transaction with two participants

A distributed system as a set of concurrent, communicating subsystems

Holons and holarchies

Two different holarchies, representing the same system

Global point of view

C1’s point of view

Distributed Systems Incorporated

Black box versus white box, a global point of view

Local point of view

Splitbrain

Reasoning about race conditions

Reasoning about serializability

Summary

A mental model is the internal representation of the target system and is the basis of comprehension and communication.
Striving for a deep understanding of distributed systems is better than merely knowing about their concepts.
A distributed system is a set of concurrent components that communicate by sending and receiving messages over a network.
The core challenge in designing distributed systems is creating a coherent system that functions as a whole despite each component having only local knowledge.
Ultimately, we are interested in the guarantees a system provides. We reason about these guarantees in terms of correctness—that is, in terms of safety and liveness guarantees as well as scalability and reliability guarantees.
Distributed systems can be visualized as a corporation, where rooms represent concurrent components, pneumatic tubes represent the network, and a mailbox represents the external interface.

FAQ

What is a distributed system, and how should we reason about its behavior?

A distributed system is a set of collaborating, concurrent components that communicate by exchanging messages over a network. A clear way to reason about its behavior is as a state machine: the system advances in discrete steps, each taken by exactly one component or the network, via external actions (send/receive) or internal actions (local compute/state access).
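As a rough illustration of this state-machine view, the sketch below keeps the local state of two components and a buffer of in-flight messages, and records one trace entry per discrete step. The class name `System`, the component names, and the counter payload are assumptions made for the example. Steps taken by the network itself (such as losing or duplicating a message) are left out for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class System:
    # Local state of each component (here: one counter per component).
    components: dict = field(default_factory=lambda: {"C1": 0, "C2": 0})
    # In-flight messages, modeled as the state of the network.
    network: list = field(default_factory=list)
    # The behavior so far: one entry per discrete step.
    trace: list = field(default_factory=list)

    def internal(self, comp):
        """Internal action: the component changes only its own local state."""
        self.components[comp] += 1
        self.trace.append((comp, "internal"))

    def send(self, src, dst, payload):
        """External action: the component hands a message to the network."""
        self.network.append((dst, payload))
        self.trace.append((src, "send"))

    def receive(self, comp):
        """External action: the component takes its oldest pending message."""
        index = next(i for i, (dst, _) in enumerate(self.network) if dst == comp)
        _, payload = self.network.pop(index)
        self.components[comp] += payload
        self.trace.append((comp, "receive"))

system = System()
system.internal("C1")                              # C1: 0 -> 1
system.send("C1", "C2", system.components["C1"])   # message carrying 1
system.receive("C2")                               # C2: 0 -> 1
print(system.components, system.trace)
```

Each method call is exactly one step, so the trace is the behavior of the system as a sequence of steps, and the pair of `components` and `network` is the global state that only an outside observer can see in full.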

Why build distributed systems if they are notoriously hard to design?

To meet fitness goals under real-world conditions. A single component can be functional, but cannot handle unbounded load or survive inevitable failures. Distributing enables correctness at scale: scalability (responsive under load) and reliability (responsive under failure) emerge from multiple components and their interactions.

What’s the difference between knowing and understanding in this context?

Knowing captures facts (e.g., definitions and rules). Understanding comes from dependable mental models that let you predict and explain behavior. Like learning chess rules versus mastering strategy, the chapter emphasizes building accurate, concise models so you can reason with confidence.

What makes a good mental model of a system?

A good model is both correct and complete:
- Correctness: It contains no falsehoods (every fact in the model is true in the system).
- Completeness: It omits no relevant facts (every fact needed for the purpose at hand is included).
Relevance is application-specific.
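One way to read these two definitions is as subset relations between sets of facts. The sketch below assumes that facts can be written as plain strings. The fact lists and the function names `is_correct` and `is_complete` are invented for illustration.

```python
def is_correct(model, system_facts):
    """Correct: every fact in the model is true in the system."""
    return model <= system_facts

def is_complete(model, relevant_facts):
    """Complete: every fact relevant to the purpose is in the model."""
    return relevant_facts <= model

# Hypothetical facts about a system and about what matters for one purpose.
system_facts = {
    "messages may be lost",
    "messages may be duplicated",
    "messages may be reordered",
    "components may crash",
}
relevant_facts = {"messages may be lost", "messages may be duplicated"}

model_a = {"messages may be lost", "messages may be duplicated"}
model_b = {"messages may be lost"}
model_c = model_a | {"messages always arrive in order"}

for name, model in [("a", model_a), ("b", model_b), ("c", model_c)]:
    print(name, is_correct(model, system_facts), is_complete(model, relevant_facts))
```

Here `model_a` is correct and complete, `model_b` is correct but omits a relevant fact, and `model_c` is complete but contains a falsehood. Changing `relevant_facts` changes which models count as complete, which is what it means for relevance to be application-specific.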

Can different models of the same system all be valid?

Yes. Multiple models can be equivalent (express the same facts differently), such as modeling in-flight messages as part of the network versus part of each component. Other models deliberately focus on different aspects (e.g., an abstract transaction model that ignores messages versus a protocol model that includes message loss/duplication/reordering). Choose the model that best fits the point you need to make.

How do safety and liveness define correctness?

Correctness = every possible behavior satisfies safety and liveness:
- Safety: nothing bad ever happens (e.g., no two participants make conflicting decisions).
- Liveness: something good eventually happens (e.g., every participant eventually decides).
Together, they prevent both inconsistency and getting stuck.
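The two properties can be sketched as checks over a behavior, taken here to be a list of states in which each participant has decided `"commit"`, `"abort"`, or nothing yet (`None`). The representation and the names `safe` and `live` are assumptions for this example. Because liveness is a statement about what eventually happens, checking it on a finite trace is only an approximation: the sketch inspects the final state.

```python
def safe(behavior):
    """Safety: in no state have two participants made conflicting decisions."""
    for state in behavior:
        decisions = {d for d in state.values() if d is not None}
        if len(decisions) > 1:
            return False
    return True

def live(behavior):
    """Liveness (finite approximation): in the last state, everyone has decided."""
    return all(d is not None for d in behavior[-1].values())

# Hypothetical behaviors of a transaction with two participants.
start = {"P1": None, "P2": None}
good = [start, {"P1": "commit", "P2": None}, {"P1": "commit", "P2": "commit"}]
stuck = [start, {"P1": "commit", "P2": None}]
conflicting = [start, {"P1": "commit", "P2": "abort"}]

for name, behavior in [("good", good), ("stuck", stuck), ("conflicting", conflicting)]:
    print(name, safe(behavior), live(behavior))
```

The `stuck` behavior never does anything wrong but never finishes, and the `conflicting` behavior finishes but in an inconsistent state. Only `good` satisfies both properties.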

How are scalability and reliability defined in this chapter?

Through responsiveness to Service Level Objectives (SLOs):
- Scalability: ability to meet SLOs under increasing load.
- Reliability: ability to meet SLOs in the presence of failures.
The chapter formalizes responsiveness using SLIs (measurements), SLOs (predicates), error rate, and error budget (upper bound on tolerated errors).
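A small numeric sketch may help relate these four terms. It assumes that the error rate is the fraction of measurements that violate the SLO and that the system counts as responsive when this rate stays within the error budget. The latency values, the 100 ms threshold, and the function names are made up for the example and may differ from the chapter's formal definitions.

```python
def error_rate(slis, slo):
    """Fraction of measurements (SLIs) that violate the objective (SLO)."""
    violations = sum(1 for measurement in slis if not slo(measurement))
    return violations / len(slis)

def responsive(slis, slo, error_budget):
    """Responsive if the error rate does not exceed the error budget."""
    return error_rate(slis, slo) <= error_budget

# Hypothetical SLIs: request latencies in milliseconds.
latencies_ms = [20, 35, 50, 120, 40, 30, 45, 60, 25, 55]

# Hypothetical SLO: a predicate over a single measurement.
def within_100_ms(latency_ms):
    return latency_ms <= 100

rate = error_rate(latencies_ms, within_100_ms)
print(rate)                                            # 0.1
print(responsive(latencies_ms, within_100_ms, 0.2))    # True
print(responsive(latencies_ms, within_100_ms, 0.05))   # False
```

Under this reading, scalability asks whether `responsive` stays true as load grows, and reliability asks whether it stays true while failures occur.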

Why is the global versus local viewpoint important?

Analysts often reason with a global, omniscient view of system state, but each component only sees its local state and its channel to the network. Designing distributed algorithms is therefore “think globally, act locally”: local steps, limited knowledge, and unreliable communication must still yield global guarantees (e.g., avoid split-brain leadership).

What are holons and holarchies, and how do they help model systems of systems?

A holon is an entity that is both a whole and a part; a holarchy is a hierarchy of such entities. This lens lets you zoom in/out: a “database” can be treated as an atomic service or as a higher-order composition of nodes; a “cluster” can be one unit or many subsystems. It matches how real distributed systems are organized and reasoned about at different resolutions.
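Zooming in and out of a holarchy can be sketched with a recursive structure in which every element is both a whole (it has parts) and a part (it appears in another element's parts). The class name `Holon`, the `view` method, and the example system are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Holon:
    name: str
    parts: list = field(default_factory=list)

    def view(self, depth):
        """Return the names visible after zooming in `depth` levels."""
        if depth == 0 or not self.parts:
            return [self.name]  # treat this holon as one atomic unit
        names = []
        for part in self.parts:
            names.extend(part.view(depth - 1))
        return names

# Hypothetical system: an application plus a database made of three nodes.
database = Holon("database", [Holon("node-1"), Holon("node-2"), Holon("node-3")])
system = Holon("system", [Holon("application"), database])

print(system.view(0))  # ['system']
print(system.view(1))  # ['application', 'database']
print(system.view(2))  # ['application', 'node-1', 'node-2', 'node-3']
```

At depth 1 the database is an atomic service. At depth 2 the same database appears as a composition of nodes, while the application, which has no parts in this sketch, stays unchanged.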

What is the “Distributed Systems Inc.” analogy, and what concerns does it capture?

The system is an office building: rooms = components with local state; pneumatic tubes/mailroom = network; mailbox = external interface. It vividly captures:
- Message delivery semantics: loss, duplication, reordering.
- Crash semantics: transient (short absence), intermittent (vacation), permanent (departure).
The analogy helps explore consequences and countermeasures (e.g., retries, idempotency, coordination).
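To illustrate one of the countermeasures named above, the sketch below shows a receiver that tolerates duplicated and reordered messages by remembering the identifiers it has already processed. The `Account` class, the message identifiers, and the `unreliable_delivery` helper are hypothetical. The helper only simulates duplication and reordering, not loss, and the deposits commute, so reordering alone would be harmless here.

```python
import random

class Account:
    def __init__(self):
        self.balance = 0
        self.seen = set()  # identifiers of messages already processed

    def receive(self, message_id, amount):
        """Apply a deposit at most once, however often it is delivered."""
        if message_id in self.seen:
            return  # duplicate delivery: ignore
        self.seen.add(message_id)
        self.balance += amount

def unreliable_delivery(messages, rng):
    """Simulate a network that duplicates every message and reorders them."""
    delivered = messages + messages
    rng.shuffle(delivered)
    return delivered

messages = [("m1", 10), ("m2", 20), ("m3", 30)]
account = Account()
for message_id, amount in unreliable_delivery(messages, random.Random(0)):
    account.receive(message_id, amount)

print(account.balance)  # 60
```

Without the `seen` set, the balance would end at 120. With it, retrying a message that might have been lost becomes safe, because a retry that turns out to be a duplicate has no further effect.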

