Accelerating Multi-stream Join by Stream Graph Computing

Zhao Qingwen

Chinese Session #streaming

Stream computing is becoming increasingly critical in domains such as anomaly detection, search, recommendation systems, and financial transactions. Traditional stream computing engines like Flink and Spark Streaming typically handle multi-stream join scenarios using table-based join operations. However, as the analysis spans more dimensions and each query has to join more streams, these table-based engines face significant performance limitations. GeaFlow, an open-source streaming graph computing engine, takes a different approach. It executes multi-hop queries efficiently by traversing the relationships already stored in the graph, and it includes built-in incremental graph algorithms. These algorithms drastically reduce the size of the subgraphs involved in each computation while still delivering results equivalent to those of a full-graph computation. This talk explores the limitations of traditional stream engines in multi-stream join analysis and explains how GeaFlow, as a streaming graph engine, accelerates analytical performance through graph-native features and its incremental algorithms.
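
The idea of incremental computation matching a full recomputation can be illustrated with a minimal Python sketch. It is not GeaFlow code and does not use the GeaFlow API; the names `TwoHopIndex`, `add_edge`, and `full_recompute` are hypothetical and exist only for this illustration. The sketch assumes a directed graph and a two-hop path query. When an edge arrives, only the neighbors of its two endpoints are visited, whereas the table-style baseline self-joins the entire edge set.

```python
from collections import defaultdict


class TwoHopIndex:
    """Hypothetical helper: maintains all 2-hop paths (a, b, c) incrementally."""

    def __init__(self):
        self.out_nbrs = defaultdict(set)  # vertex -> successors
        self.in_nbrs = defaultdict(set)   # vertex -> predecessors
        self.edges = set()                # all edges seen so far
        self.paths = set()                # all 2-hop paths found so far

    def add_edge(self, u, v):
        """Insert edge u -> v and return only the newly created 2-hop paths."""
        if (u, v) in self.edges:
            return set()  # duplicate edge: nothing changes
        self.edges.add((u, v))
        self.out_nbrs[u].add(v)
        self.in_nbrs[v].add(u)
        # Only paths that use the new edge can be new, so the search is
        # limited to predecessors of u and successors of v.
        new_paths = {(p, u, v) for p in self.in_nbrs[u]}
        new_paths |= {(u, v, w) for w in self.out_nbrs[v]}
        self.paths |= new_paths
        return new_paths


def full_recompute(edges):
    """Baseline: self-join of the whole edge set, as a table join would do."""
    return {(a, b, c) for (a, b) in edges for (b2, c) in edges if b == b2}


if __name__ == "__main__":
    index = TwoHopIndex()
    stream = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
    for u, v in stream:
        delta = index.add_edge(u, v)
        # The incremental state always equals the full-graph result.
        assert index.paths == full_recompute(index.edges)
        print((u, v), "->", sorted(delta))
```

The work per arriving edge is proportional to the degrees of its endpoints rather than to the total number of edges, which is the kind of saving the abstract attributes to incremental graph algorithms.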

Speakers:


Zhao Qingwen joined Ant Group in 2019 and is one of the core developers of GeaFlow.