Overview

1 Introducing event streams

Businesses can be viewed as generators of, and responders to, a continuous flow of events. People tend to think in terms of tasks, systems, and teams, but computers thrive on modeling the world as event streams. Framing operations this way unlocks fresher insights, a single version of the truth, faster reactions, and simpler architectures, replacing ad hoc integrations with well-modeled streams.

An event is a discrete occurrence at a specific point in time; it’s not an ongoing state, a generic recurring phenomenon, a collection of events, or a duration. A continuous event stream is an unterminated, time-ordered succession of such events: it may begin before we start observing and extends into the future. Many familiar tools already work this way—application logs produce time-stamped events, web analytics tags emit interaction events, and publish/subscribe messaging distributes event messages by topic—though their schemas and uses are often siloed and inconsistent.
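The definition above can be made concrete as a small data structure. This is a minimal sketch, not any particular platform's schema; the field names (`subject`, `verb`, `occurred_at`) are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Event:
    """A discrete occurrence at a specific point in time."""
    subject: str           # who or what the event concerns
    verb: str              # what happened
    occurred_at: datetime  # the moment it happened (timezone-aware)

# "Shopper abandoned cart at 16:02 UTC" is an event;
# "the checkout service is slow" is ongoing state, not an event.
e = Event(subject="shopper-123", verb="abandoned_cart",
          occurred_at=datetime(2018, 4, 3, 16, 2, tzinfo=timezone.utc))
```

The test of event-ness is the timestamp: if an occurrence cannot be pinned to a moment in time, it is state or a duration, and should be decomposed into point-in-time events (start, end, occurrence).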

The chapter traces an evolution across three eras. In the classic era, on-prem transactional systems fed a batch-loaded data warehouse: reliable, but plagued by high latency, point-to-point “spaghetti,” and rigid modeling assumptions. The hybrid era added SaaS platforms, Hadoop “log everything” stacks, and selective low-latency pipelines, but created new issues: no single source of truth, fragmented decision loops, proliferating brittle integrations, and a tradeoff between latency and coverage.

The unified era centers on a unified log—an append-only event log read at low latency by multiple consumers at their own pace, retaining a rolling window with historical archives elsewhere. All systems write their event streams to this log and, unless strict transactional guarantees are needed, communicate through it rather than via bespoke connections. This yields a practical single version of the truth upstream of warehouses, dramatically fewer point-to-point links, and “unbundled” local loops where applications collaborate through shared streams.
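The two defining properties of a unified log, append-only writes and independent read positions per consumer, can be sketched in a few lines. This is a toy in-memory model for illustration only (real implementations such as Kafka or Kinesis partition and persist the log); the class and method names are assumptions:

```python
class UnifiedLog:
    """Toy append-only log: producers append, consumers read at their own pace."""

    def __init__(self):
        self._entries = []

    def append(self, event):
        self._entries.append(event)        # append-only: entries are never mutated
        return len(self._entries) - 1      # offset of the new entry

    def read(self, offset, max_events=10):
        """Return up to max_events entries starting at the given offset."""
        return self._entries[offset:offset + max_events]

log = UnifiedLog()
for name in ("order_placed", "payment_taken", "order_shipped"):
    log.append(name)

# Two consumers track their own positions independently:
# the warehouse loader is still at the start, the monitor is nearly caught up.
warehouse_offset, monitor_offset = 0, 2
behind = log.read(warehouse_offset)   # all three events
ahead = log.read(monitor_offset)      # only the newest event
```

Because each consumer owns its offset, a slow batch loader and a low-latency dashboard can read the same stream without coordinating with each other or with the producers.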

  • Fresher insights: stream processing reduces time to analysis and action from hours to seconds.
  • Single truth: the unified log plus archive align operational and analytical views on the same data.
  • Simpler architectures: write-once, read-many streams replace bespoke integrations.
  • Use cases: real-time customer feedback loops, holistic systems monitoring across all signals, and hot-swapping data application versions for zero-downtime upgrades and A/B comparisons.

Overall, adopting event streams and a unified log turns fragmented, latency-prone data flows into a coherent, real-time fabric that supports consistent analytics and rapid, intelligent reactions across the business.


FAQ

What is an “event” in this chapter’s terms?
An event is anything you can observe at a specific point in time and attach a timestamp to—for example, “order placed,” “API error at 12:03,” or “user clicked play.” If you can tie it to a moment, it’s likely an event.

What kinds of things are not events?
Common pitfalls include describing ongoing state (“the service is slow”), summarizing recurring occurrences (“the market opens every day”), bundling many happenings into one (“the entire campaign”), or time-spanning activities (“the sale ran all day”). Instead, express them as point-in-time events (start, end, occurrence).

What is a continuous event stream, and why is it called “unterminated”?
A continuous event stream is a sequence of individual events ordered by their occurrence time. It’s “unterminated” because the stream’s start may predate our observation and its end lies in the future; we process a moving window while older data is archived.

Where might I already be using event streams without calling them that?
Three familiar places: application logging (logs are time-ordered events), web analytics tags (page views, clicks, custom events), and publish/subscribe messaging (topics carrying event messages to subscribers).

What is a unified log?
A unified log is an append-only, low-latency stream that all systems write their events to and many applications can read from independently at their own pace. It usually retains a rolling window, with historical events archived (for example, to HDFS or S3). Platforms like Kafka and Kinesis implement this pattern.

How does a unified log relate to a data warehouse?
In the unified era, the unified log is the upstream “single version of the truth,” and the warehouse becomes a consumer of that same event stream for analytics. This aligns operational apps and analytical reporting on the same facts while lowering latency.

What problems in classic and hybrid architectures does the unified log address?
It tackles high batch latency, fragile point-to-point “spaghetti” integrations, siloed local decision loops, lack of a single source of truth, and the trade-off between low latency and wide data coverage in hybrid setups.

How does the unified log reduce point-to-point integrations and simplify architectures?
Systems publish their events to the log and consume others’ events from it, decoupling producers from consumers. This dramatically lowers the number of bespoke connections and unbundles local processing loops into shared, near-real-time flows.

What new or improved use cases does this approach enable?
  • Real-time customer feedback loops (e.g., cart-abandonment offers based on live behavior)
  • Holistic systems monitoring (correlating client, server, and infra events in one stream)
  • Hot-swapping data applications (run multiple versions against the same stream for seamless upgrades and A/B comparisons)

How does hot-swapping data application versions work with a unified log?
Each consumer tracks its own position (cursor) in the log. You can start a new version, let it replay and catch up to the current position, switch traffic to it, and retire the old version—enabling zero-downtime upgrades and side-by-side algorithm tests.
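The replay-and-catch-up mechanics can be sketched with plain lists and integer cursors. This is an illustrative sketch under simplifying assumptions (a single in-memory log, synchronous processing); the helper names are hypothetical:

```python
def run_to_head(log_entries, start_offset, handler):
    """Drive a consumer from its cursor to the head of the log; return the new cursor."""
    offset = start_offset
    while offset < len(log_entries):
        handler(log_entries[offset])
        offset += 1
    return offset

def counting_handler(counts):
    """Build a handler that tallies events into the given dict."""
    def handle(event):
        counts[event] = counts.get(event, 0) + 1
    return handle

log = ["view:home", "view:product", "add_to_cart"]

v1_counts, v2_counts = {}, {}
# v1 has been running and is at the head of the log.
v1_cursor = run_to_head(log, 0, counting_handler(v1_counts))
# v2 starts fresh at offset 0 and replays the whole log to catch up.
v2_cursor = run_to_head(log, 0, counting_handler(v2_counts))
# Once v2's cursor reaches v1's, outputs can be compared side by side
# and traffic switched to v2 with no downtime.
```

Because the log retains its rolling window regardless of who reads it, the new version's replay neither disturbs the old version nor requires the producers to resend anything.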
