Overview

1 Getting to know Kafka as an architect

Apache Kafka has evolved from a high‑throughput message bus into a foundational platform for real‑time, event‑driven systems. This chapter introduces why architects increasingly favor events over synchronous, point‑to‑point integrations: producers publish once, many consumers react, and systems become more autonomous and resilient. It sets expectations for an architect’s perspective—understanding how Kafka works, the tradeoffs it imposes, and the patterns and anti‑patterns that determine long‑term value—focusing on design, data modeling, schema evolution, integration strategies, and the balance between performance, ordering, and fault tolerance rather than on client API details.

The chapter outlines core concepts and components that shape architectural decisions. Kafka implements publish‑subscribe with persistent, replicated storage; producers write events, consumers pull and can replay them, and reliability is achieved through acknowledgments, retries, and fault‑tolerant clustering. Controllers using KRaft manage cluster metadata and broker health, while the commit‑log model preserves ordering per log and immutability under configurable retention. Beyond the broker, a broader ecosystem supports enterprise needs: Schema Registry formalizes data contracts and compatibility; Kafka Connect moves data between Kafka and external systems through configuration; and streaming frameworks such as Kafka Streams or Apache Flink provide stateful transformations, routing, joins, and exactly‑once processing—enabling low‑latency analytics and operational workflows at scale.
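As a concrete illustration of the acknowledgment-and-retry behavior described above, here is a minimal Java producer sketch. The broker address, topic name (customer-updates), key, and payload are illustrative assumptions, not examples taken from the chapter:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CustomerEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // Wait for all in-sync replicas to acknowledge, and retry transient failures
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key by customer ID so all events for one customer land on one partition
            producer.send(new ProducerRecord<>("customer-updates", "customer-42",
                    "{\"event\":\"customer_updated\",\"id\":\"customer-42\"}"),
                (metadata, exception) -> {
                    if (exception != null) {
                        System.err.println("Send failed: " + exception.getMessage());
                    } else {
                        System.out.printf("Acked at %s-%d@%d%n",
                            metadata.topic(), metadata.partition(), metadata.offset());
                    }
                });
        } // close() flushes any in-flight sends
    }
}
```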

Applying Kafka in the enterprise requires clear fit and operational readiness. The chapter contrasts two common uses: durable event delivery for decoupled microservices and long‑lived logs for event‑sourcing and real‑time enrichment—while noting Kafka’s limits as a general query store compared with databases. It highlights non‑functional considerations—requirements gathering, sizing, SLAs, security, observability, testing, and disaster recovery—as well as deployment choices between on‑premises and managed cloud services, each with cost, control, and tooling implications, and acknowledges alternative streaming platforms. Overall, it frames a practical roadmap for initiatives such as customer 360 views and modernization programs, emphasizing governance, reliability, and sustainable operations as cornerstones of successful Kafka adoption.

  • Request-response design pattern: a service calls another directly and waits for the reply.
  • The EDA style of communication: systems communicate by publishing events that describe changes, allowing others to react asynchronously.
  • The key components in the Kafka ecosystem: producers, brokers, and consumers.
  • Structure of a Kafka cluster: brokers handle client traffic; KRaft controllers manage metadata and coordination.
  • Publish-subscribe example: CustomerService publishes a “customer updated” event to a channel; all subscribers receive it independently.
  • Acknowledgments: once the cluster accepts a message, it sends an acknowledgment to the service. If no acknowledgment arrives within the timeout, the service treats the send as failed and retries.
  • Working with Schema Registry: schemas are managed by a separate Schema Registry cluster; messages carry only a schema ID, which clients use to fetch (and cache) the writer schema.
  • The Kafka Connect architecture: connectors integrate Kafka with external systems, moving data in and out.
  • A streaming application example: RoutingService implements content-based routing, consuming messages from Addresses and, based on their contents (e.g., address type), publishing them to ShippingAddresses or BillingAddresses.
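The RoutingService in the last item maps naturally onto Kafka Streams' branching API. The sketch below keeps the topic names from the example; the assumption that values are JSON strings carrying a type field is illustrative:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.KStream;

public class RoutingTopology {
    public static StreamsBuilder build() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> addresses = builder.stream("Addresses");

        addresses.split()
            // Route shipping addresses to their own topic
            .branch((key, value) -> value.contains("\"type\":\"shipping\""),
                    Branched.withConsumer(ks -> ks.to("ShippingAddresses")))
            // Route billing addresses to theirs
            .branch((key, value) -> value.contains("\"type\":\"billing\""),
                    Branched.withConsumer(ks -> ks.to("BillingAddresses")))
            // Drop anything that matches neither predicate
            .noDefaultBranch();
        return builder;
    }
}
```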

Summary

  • There are two primary communication patterns between services: request-response and event-driven architecture.
  • In the event-driven approach, services communicate by publishing events that others react to asynchronously.
  • The key components of the Kafka ecosystem include brokers, producers, consumers, Schema Registry, Kafka Connect, and streaming applications.
  • Cluster metadata management is handled by KRaft controllers.
  • Kafka is versatile and well-suited for various industries and use cases, including real-time data processing, log aggregation, and microservices communication.
  • Kafka components can be deployed both on-premises and in the cloud.
  • The platform supports two main use cases: message delivery and state storage.

FAQ

Why move from synchronous REST integrations to an event-driven architecture?
Chained synchronous calls increase fragility, coordination complexity, and the risk of cascading failures. Event-driven architecture decouples producers and consumers, enables asynchronous communication and fan-out, and allows systems to operate and evolve independently. The tradeoff is managing eventual consistency, idempotency, and out-of-order events.
How does Kafka’s publish-subscribe model decouple services?
Producers publish events to topics without knowing who will consume them. Any number of consumers can subscribe and process the same event independently, at their own pace. Because delivery is asynchronous and durable, services can be offline temporarily without disrupting others, improving flexibility and resilience.
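The fan-out falls out of consumer groups: each service subscribes under its own group ID and receives its own copy of the stream. A minimal consumer sketch, with hypothetical service and topic names:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LoyaltyServiceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Each service uses its own group ID, so every group gets its own copy of the stream
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "loyalty-service");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("customer-updates"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("loyalty-service saw %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```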
Who are the key players in the Kafka ecosystem and what do they do?Producers send messages; brokers persist and replicate them; consumers pull and process them. KRaft controllers manage cluster metadata and broker health. Schema Registry governs message contracts and compatibility. Kafka Connect moves data between Kafka and external systems without custom code. Streaming frameworks (Kafka Streams, Apache Flink) transform and enrich data in motion.
How does Kafka ensure reliable and durable message delivery?
Producers receive acknowledgments and retry on failure; brokers replicate messages across the cluster for fault tolerance; consumers track progress and can resume after outages. Retention policies allow replay so consumers can reprocess missed or historical events as needed.
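A common at-least-once pattern on the consumer side is to disable auto-commit and record progress only after processing succeeds, so a crashed consumer resumes from its last committed offset. A sketch, assuming a consumer already configured with enable.auto.commit=false and subscribed to its topics:

```java
import java.time.Duration;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceLoop {
    static void run(KafkaConsumer<String, String> consumer) {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                handle(record); // process before committing
            }
            // Commit offsets only after processing, so a crash replays
            // the uncommitted batch instead of losing it (at-least-once)
            consumer.commitSync();
        }
    }

    static void handle(ConsumerRecord<String, String> record) {
        System.out.printf("processed %s@%d%n", record.topic(), record.offset());
    }
}
```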
What is Kafka’s commit log, and why does immutability matter?
Kafka appends messages to an ordered, immutable log, preserving arrival order per partition. Messages aren’t edited or deleted individually; corrections are new events. This design enables replay, state reconstruction, and event-sourcing patterns, though Kafka isn’t a general-purpose query store.
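Replay itself is simple on the consumer side: seek back to the start of a partition and re-read retained history. A sketch, with illustrative topic and partition:

```java
import java.time.Duration;
import java.util.List;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayFromStart {
    // Rewind a consumer to the start of one partition and re-read history
    static void replay(KafkaConsumer<String, String> consumer) {
        TopicPartition partition = new TopicPartition("customer-updates", 0);
        consumer.assign(List.of(partition));          // manual assignment, no group rebalance
        consumer.seekToBeginning(List.of(partition)); // next poll starts at the oldest retained offset
        consumer.poll(Duration.ofSeconds(1))
                .forEach(r -> System.out.printf("%d: %s%n", r.offset(), r.value()));
    }
}
```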
How should architects handle eventual consistency, ordering, and idempotency with Kafka?
Design for eventual consistency by tolerating delays between producers and consumers. Use keys and partitions to achieve per-key ordering, and implement idempotent processing to handle retries or duplicates safely. Plan for out-of-order events with timestamps, versioning, and reconciliation logic where needed.
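Per-key ordering and duplicate suppression are largely producer-side settings: key the records and enable idempotence. A sketch with hypothetical topic and key names:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderedKeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // Idempotence prevents broker-side duplicates when the producer retries
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key -> same partition -> events for one customer stay in order
            producer.send(new ProducerRecord<>("customer-updates", "customer-42", "address_changed"));
            producer.send(new ProducerRecord<>("customer-updates", "customer-42", "email_changed"));
        }
    }
}
```

Note that keying only orders events within a partition; consumers still need their own dedupe or versioning logic for end-to-end idempotency.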
Why do I need a Schema Registry, and how does schema evolution work?
Kafka treats messages as opaque bytes, so structure isn’t enforced by brokers. Schema Registry stores and versions schemas; producers embed a schema ID in messages, and consumers fetch the matching schema to deserialize. Compatibility rules ensure changes don’t break existing consumers.
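With Confluent's Avro serializer (one common Schema Registry client, assumed here rather than the only option), registration and ID embedding happen inside the serializer. A sketch with an illustrative two-field schema and placeholder URLs:

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroCustomerProducer {
    public static void main(String[] args) {
        Schema schema = new Schema.Parser().parse("""
            {"type":"record","name":"Customer","fields":[
              {"name":"id","type":"string"},
              {"name":"email","type":"string"}]}""");

        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // The Avro serializer registers the schema and embeds only its ID in each message
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        GenericRecord customer = new GenericData.Record(schema);
        customer.put("id", "customer-42");
        customer.put("email", "jane@example.com");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("customer-updates", "customer-42", customer));
        }
    }
}
```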
When should I use Kafka Connect instead of writing custom producers and consumers?
Use Kafka Connect for configuration-driven data replication to and from databases, data warehouses, and object storage. Source and sink connectors (for example, JDBC) eliminate boilerplate code and standardize operations. Connect supports simple stateless transforms; complex logic belongs in a streaming application.
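Registering a connector really is just configuration: a JSON document posted to the Connect REST API. The sketch below registers a JDBC source against a fictional Postgres database; the connector class and config keys follow Confluent's JDBC connector, but all names and URLs are placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterJdbcSource {
    public static void main(String[] args) throws Exception {
        // The connector is pure configuration: Connect polls the table
        // and publishes new rows to the "jdbc-customers" topic
        String connector = """
            {"name": "customers-source",
             "config": {
               "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
               "connection.url": "jdbc:postgresql://db:5432/shop",
               "table.whitelist": "customers",
               "mode": "incrementing",
               "incrementing.column.name": "id",
               "topic.prefix": "jdbc-"}}""";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8083/connectors")) // Connect REST API
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(connector))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```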
Where should message transformation logic live, and how do streaming frameworks help?
Options include producers, each consumer, or a dedicated processing layer. Frameworks like Kafka Streams and Apache Flink implement the processing layer, supporting filtering, joins, aggregations, windowing, and content-based routing—often with exactly-once semantics—so multiple services get data in the form they need.
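As one example of processing-layer logic beyond routing, the Kafka Streams sketch below counts each customer's orders in tumbling five-minute windows. The topic names and the assumption that records are keyed by customer ID are illustrative:

```java
import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class OrdersPerCustomerTopology {
    public static StreamsBuilder build() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("Orders");

        orders.groupByKey() // key is assumed to be the customer ID
              // Count each customer's orders in tumbling five-minute windows
              .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
              .count()
              // Drop the window wrapper, keep the customer ID as the key
              .toStream((windowedKey, count) -> windowedKey.key())
              .mapValues(String::valueOf)
              // Publish running counts for downstream services to consume
              .to("OrderCountsPerCustomer");
        return builder;
    }
}
```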
Should we run Kafka on-premises or use a managed cloud service?
On-premises offers full control and tuning but requires significant operations, monitoring, and maintenance. Managed services simplify provisioning and provide SLAs but may restrict version choice, low-level tuning, tool options, or ecosystem components. Choose based on cost, skills, compliance, and scalability needs.
