Celeborn’s Revolution in Multi-Engine Support, Performance Mastery, and Enterprising Innovation
Jiashu Xiong, Ethan Feng
Chinese Session 2025-07-26 16:15 GMT+8 (ROOM : Mtn WanShou Hall) #datastorage
Apache Celeborn has made significant progress over the past year, introducing new capabilities, performance optimizations, and expanded engine support.
Functional enhancements include end-to-end validation for data integrity, multi-layer storage and HybridShuffle for flexible data management, CLI tools and a RESTful API for enhanced usability, multi-level quotas for resource governance, worker tags, and dynamic configuration, among others.
Performance improvements include a Spark skew optimization that eliminates extra sorting in skewed scenarios, SortShuffle partition splitting that prevents performance degradation from uneven partitions, and reduced latency in the Commit and Fetch phases through easing synchronization bottlenecks.
Engine support now includes Blaze and Flink’s HybridShuffle, alongside existing support for MapReduce (MR) and Spark.
Additionally, stability has been strengthened, and the community remains active, with Celeborn becoming the preferred shuffle service for many organizations globally.
Speakers:
Jiashu Xiong: Alibaba Cloud, Senior Development Engineer
Apache Celeborn PMC member, mainly focused on optimizing Apache Celeborn and integrating it with engines such as Flink and Spark.
Ethan Feng: Aliyun, Senior Developer
PMC member of Apache Celeborn, mainly focused on optimizing Apache Celeborn.