Accelerating Multi-stream Join by Stream Graph Computing
Zhao Qingwen
Chinese Session #streaming
Stream computing is becoming increasingly critical in domains such as anomaly detection, search, recommendation systems, and financial transactions. Traditional stream computing engines like Flink and Spark Streaming typically handle multi-stream join scenarios using table-based join operations. However, as analytical dimensions deepen, these table-based stream engines face significant performance limitations. GeaFlow, an open-source streaming graph computing engine, breaks these constraints with a novel approach. It not only executes multi-hop queries efficiently by leveraging the inherent relationships within graph data, but also incorporates built-in incremental graph algorithms. These algorithms drastically reduce the scale of the subgraphs involved in computation while still delivering results equivalent to those obtained from full-graph computation. This talk explores the limitations of traditional stream engines in multi-stream join analysis and explains how GeaFlow, as a streaming graph engine, accelerates analytical performance through graph-native features and unique incremental algorithms.
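To make the contrast concrete, the following is a minimal Python sketch of the general idea, not GeaFlow's API or implementation. It assumes a two-hop query (a -> b -> c) over a stream of edges. The class `IncrementalTwoHop` and the function `full_two_hop` are hypothetical names introduced only for illustration. The baseline recomputes a self-join over the whole edge list, as a table-based engine might, while the incremental version touches only the neighbors of the newly arrived edge and still arrives at the same result set.

```python
from collections import defaultdict


def full_two_hop(edges):
    """Baseline: self-join over the full edge list (table-style recomputation)."""
    return {(a, b, c) for (a, b) in edges for (b2, c) in edges if b == b2}


class IncrementalTwoHop:
    """Hypothetical illustration: maintains all 2-hop paths as edges stream in."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # vertex -> successors
        self.in_edges = defaultdict(set)   # vertex -> predecessors
        self.paths = set()                 # all (a, b, c) paths seen so far

    def add_edge(self, src, dst):
        """Insert one edge and return only the 2-hop paths it creates."""
        if dst in self.out_edges[src]:
            return set()  # duplicate edge, nothing changes
        self.out_edges[src].add(dst)
        self.in_edges[dst].add(src)

        new_paths = set()
        # The new edge acts as the second hop: p -> src -> dst
        for p in self.in_edges[src]:
            new_paths.add((p, src, dst))
        # The new edge acts as the first hop: src -> dst -> n
        for n in self.out_edges[dst]:
            new_paths.add((src, dst, n))

        self.paths |= new_paths
        return new_paths


# Usage: feed edges one at a time and compare with full recomputation.
stream = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
engine = IncrementalTwoHop()
for edge in stream:
    delta = engine.add_edge(*edge)

assert engine.paths == full_two_hop(stream)
```

In this sketch each arriving edge only inspects the predecessors of its source and the successors of its destination, so the work per update depends on the local neighborhood rather than on the total number of edges, which is the intuition behind the smaller subgraphs mentioned in the abstract.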
Speakers:
Zhao Qingwen joined AntGroup in 2019 and is one of the core developers of GeaFlow.