Building a Unified Lakehouse Solution with Apache Cloudberry
Rose Duan
Chinese Session 2025-07-26 15:45 GMT+8 (Room: WanChun Hall) #datalake
Data warehouses excel at fast analytics, while data lakes focus on scalable storage and flexible data management. The lakehouse architecture aims to combine the best of both, seamlessly integrating data across lakes and warehouses for efficient analysis and unified governance.
Apache Cloudberry, a next-generation open-source MPP database, is extending beyond the traditional warehouse to offer an open lakehouse solution. This talk introduces Cloudberry’s key capabilities in enabling a unified lakehouse architecture:
- Accelerated lake queries on Parquet/ORC without data movement
- Unified data gateway for querying and writing across heterogeneous sources
- Integrated data processing and sync pipeline, enabling end-to-end flow from ingestion to analytics
- Open metadata and storage formats for easier ecosystem integration and reduced migration cost
Speakers:
Rose Duan: Apache Cloudberry Database Developer
Apache Cloudberry contributor, database kernel developer at HashData.