Apache Ozone: Balance Data Through Disk Balancer

Sammi Chen

Chinese Session 2025-07-26 15:45 GMT+8 (Room: Mtn WanShou Hall) #datastorage

Apache Ozone is a distributed storage system in the Hadoop ecosystem. In any distributed storage system, data should be evenly distributed across Datanodes and disks so that storage space and resources are used efficiently and fully. To achieve this, Ozone provides two balancers: Container Balancer, which evens out data across Datanodes, and Disk Balancer, which evens out data across the disks within each Datanode. In this session, we will share how the Disk Balancer feature is designed and how to use it to keep disk utilization even within a Datanode.

Speakers:


Sammi Chen: Cloudera Principal Storage Engineer

Principal storage engineer at Cloudera, focusing on Apache Hadoop and Apache Ozone kernel development. Currently Chair of the Ozone PMC and Hadoop PMC; formerly big data storage tech lead at Tencent and Intel.