Replicas as a Strength, Not a Burden: Scalable Consensus for OLAP Engines with LSM Storage

Junzhi Peng, Xinyu Tan

Chinese Session 2025-07-27 15:45 GMT+8  (ROOM : Mtn BaiWang Hall) #iot

This session challenges the norm by presenting a new way of thinking about consensus as a scalability solution. In this session, we first analyze the mismatch between conventional consensus designs like Raft or Paxos and the unique semantics of LSM-based OLAP systems. Then we introduce a state replication framework tailored to LSM’s hierarchical storage structure to address this gap. By leveraging the high compression ratio of LSM’s sorted string tables (SSTables), this approach enables parallel state propagation across replicas, reducing consensus overhead and allowing performance to scale with the number of replicas rather than decline. We’ve implemented this innovative consensus approach within Apache IoTDB. Testing has shown remarkable results. Our consensus engine scales linearly with replica count, turning what was once a performance burden into a throughput booster and enhancing both system durability and throughput without sacrificing performance.

Speakers:


Junzhi Peng: Student at Tsinghua University, Core Contributor to Apache IoTDB

Graduated from Nanjing University, currently studying at the School of Software, Tsinghua University. Interested in distributed systems, consensus protocols, and performance optimization. Participated in Apache IoTDB open source contribution in 2023, was responsible for the improvement and optimization of Apache IoTDB Metric framework, and now mainly works on Apache IoTDB’s distributed system framework. Responsible for the design, implementation and iteration of IoTDB’s new generation scalable consensus protocol.


Xinyu Tan: TianMou Technology Database Kernel Development Engineer

Apache IoTDB/TsFile PMC