Building a Unified Lakehouse Solution with Apache Cloudberry
Rose Duan
Chinese Session #datalake
Data warehouses excel at fast analytics, while data lakes focus on scalable storage and flexible data management. The lakehouse architecture aims to combine the best of both, seamlessly integrating data across lakes and warehouses for efficient analysis and unified governance.
As a next-generation open-source MPP database, Apache Cloudberry extends its technical boundaries to build an open lakehouse solution. This talk introduces Cloudberry’s key capabilities in enabling a unified lakehouse architecture:
- Accelerated lake queries on Parquet/ORC without data movement
- Unified data gateway for querying and writing across heterogeneous sources
- Integrated data processing and sync pipeline, enabling end-to-end flow from ingestion to analytics
- Open metadata and storage formats for easier ecosystem integration and reduced migration cost
Speakers:
Apache Cloudberry contributor and database kernel developer at HashData.