SE-Radio Episode 272: Frances Perry on Apache Beam
Episode Synopsis
Jeff Meyerson talks with Frances Perry about Apache Beam, a unified batch and stream processing model. Topics include the history of batch and stream processing, from MapReduce to the Lambda Architecture to the more recent Dataflow model, originally defined in a Google paper. Dataflow addresses the problem of event-time skew with watermarks and other methods that Jeff and Frances discuss. Apache Beam lets users define pipelines that are agnostic of the underlying execution engine, similar to how SQL provides a unified language for databases. This aims to reduce the churn and repeated work that have occurred in the rapidly evolving stream processing ecosystem.
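To make the idea of event-time skew and watermarks a little more concrete, here is a minimal sketch in plain Python. It is not Apache Beam's API and is not taken from the episode: the names `WINDOW_SIZE`, `ALLOWED_SKEW`, `window_start`, and `process` are hypothetical, and the watermark is a simple heuristic (the largest event time seen so far minus an assumed skew bound). Events arrive out of order, are grouped into fixed windows by their event time, and a window is emitted once the watermark passes its end.

```python
from collections import defaultdict

WINDOW_SIZE = 10   # assumed: seconds per fixed event-time window
ALLOWED_SKEW = 5   # assumed: how far behind the newest event an arrival may lag


def window_start(event_time):
    """Return the start of the fixed window that contains event_time."""
    return event_time - (event_time % WINDOW_SIZE)


def process(events):
    """Sum values per event-time window, emitting each window when the
    heuristic watermark passes the window's end.

    events: iterable of (event_time, value) pairs in arrival order.
    Returns (emitted, dropped): emitted is a list of (window_start, total)
    in emission order; dropped holds events that arrived after their
    window had already been closed by the watermark.
    """
    open_windows = defaultdict(int)
    emitted = []
    dropped = []
    watermark = float("-inf")

    for event_time, value in events:
        start = window_start(event_time)
        if start + WINDOW_SIZE <= watermark:
            # The watermark already passed this window: the event is too late.
            dropped.append((event_time, value))
            continue

        open_windows[start] += value

        # Heuristic watermark: assume nothing older than this will still arrive.
        watermark = max(watermark, event_time - ALLOWED_SKEW)

        # Emit every open window whose end is at or before the watermark.
        for s in sorted(open_windows):
            if s + WINDOW_SIZE <= watermark:
                emitted.append((s, open_windows.pop(s)))

    # End of input: flush whatever is still open.
    for s in sorted(open_windows):
        emitted.append((s, open_windows.pop(s)))

    return emitted, dropped


if __name__ == "__main__":
    arrivals = [(1, 1), (3, 1), (12, 1), (8, 1), (17, 1), (4, 1), (25, 1)]
    print(process(arrivals))
```

In this sketch the event with time 8 arrives after the event with time 12 but is still counted, because the watermark has not yet passed the end of its window. The event with time 4 arrives after that window has been emitted and is dropped. A real engine would typically offer more options for such late data than simply discarding it.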