Optimization and Practice of Doris on Paimon at Xiaomi
Wang Long
Chinese Session #olap-
How Doris on Paimon fits into Xiaomi’s OLAP architecture and works alongside the other OLAP engines (Spark, Presto)
a. Unified SQL gateway proxy: a unified proxy-layer protocol and client SDK provide traffic control and unified authentication, and simplify access to Doris.
b. Automatic SQL routing: rule-based (RBO) and cost-based (CBO) analysis routes the queries best suited to Doris to it for execution, which speeds up queries and reduces cost.
c. SQL rewriting and syntax compatibility: a unified SQL parsing layer converts incompatible SQL into the Doris dialect, resolving syntax compatibility issues.
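The routing idea in item b can be illustrated with a minimal sketch. It is not Xiaomi's gateway implementation: the `QueryProfile` fields, the thresholds, and the fallback to Spark are all hypothetical, chosen only to show a rule-based check followed by a cost-based check.

```python
from dataclasses import dataclass


@dataclass
class QueryProfile:
    # Hypothetical features a gateway might extract from a parsed query.
    estimated_scan_bytes: int
    join_count: int
    uses_unsupported_function: bool


def route(profile, scan_limit_bytes=50 * 1024**3, max_joins=3):
    """Return the name of the engine chosen for a query (illustrative only)."""
    # Rule-based check: a query Doris cannot run is never routed to it.
    if profile.uses_unsupported_function:
        return "spark"
    # Cost-based check: small, simple queries are assumed to benefit most
    # from Doris; the thresholds here are made-up defaults.
    if (profile.estimated_scan_bytes <= scan_limit_bytes
            and profile.join_count <= max_joins):
        return "doris"
    # Everything else falls back to a batch engine.
    return "spark"
```

A real router would derive these features from the parsed plan and table statistics rather than take them as given.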
-
Optimizations of Doris on Paimon
a. Merge optimization for Paimon aggregation tables: Paimon’s merge-on-read performance is poor, so the merge is pushed down to Doris, which skips the merge sort and unnecessary merge operations. Doris' massively parallel aggregation then greatly improves read speed on Paimon tables.
b. Real-time materialized views on Paimon: Doris supports changelog-level real-time materialized view updates, speeding up business queries by 2x or more.
c. Snapshot metadata cache optimization: a scheduled refresh of the Paimon snapshot cache was added to address stale cached metadata and slow metadata retrieval from HDFS.
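The intuition behind item a can be sketched as follows, assuming an aggregation table whose only aggregate function is a sum. Instead of merge-sorting rows from several data files by primary key, the engine can feed all rows into a hash aggregation, which needs no ordering and parallelizes naturally. The function name and the `(key, value)` row layout are hypothetical, and this is not Doris' actual operator.

```python
from collections import defaultdict


def merge_by_hash_aggregation(file_rows):
    """Combine rows from several data files without sorting them.

    file_rows: iterable of files, each a list of (primary_key, value) rows.
    Assumes the table's aggregate function is sum (an assumption of this sketch).
    """
    totals = defaultdict(int)
    for rows in file_rows:
        for key, value in rows:
            # Rows sharing a primary key are folded together in any order,
            # so no merge sort across files is required.
            totals[key] += value
    return dict(totals)
```

For example, `merge_by_hash_aggregation([[("a", 1), ("b", 2)], [("a", 3)]])` folds the two rows for key `"a"` into a single total. Order-sensitive aggregate functions would need extra handling that this sketch leaves out.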
-
Production results of Doris on Paimon
a. Query performance has improved fivefold compared with before the optimizations.
Speakers:
Senior Development Engineer in the OLAP Engine Group at Xiaomi, mainly responsible for kernel development on the open source projects Doris, Spark, and Kyuubi