From Data to AI: Building a Unified Analytics Platform with Apache Cloudberry

Chuanxin Bian

Chinese Session 2025-07-25 16:15 GMT+8 (Room: JingMing Hall) #ai

Enterprises today struggle to harness AI’s full potential because of fragmented data systems, inefficient pipelines, and silos between analytics and machine learning. Apache Cloudberry, an open-source MPP data warehouse, addresses these problems by deeply integrating data processing with AI, eliminating barriers and accelerating innovation. In this session, we’ll demonstrate how Cloudberry enables:

Unified Execution – Run native AI/ML models (e.g., PyTorch, Scikit-learn) directly on warehouse data.
Multi-Modal Analytics – Process structured and unstructured data (PDFs, images, and other documents) in a unified framework.
Smart Data Applications – Build RAG-enhanced QA, ChatBI, and multimodal search.

You will learn how to converge data and intelligence into one platform, reducing complexity while scaling AI workloads.

Speakers:

Chuanxin Bian: HashData, Data & AI Engineer

Dr. Chuanxin Bian is a data scientist and applied mathematician specializing in deep learning, NLP, and time-series modeling. He holds a PhD in Applied Mathematics from The Hong Kong Polytechnic University. Currently at HashData, he develops AI tools such as HashML and ChatData and works on AIGC applications. Previously, at Baidu, he contributed to Ernie Bot, built time-series models with PaddleTS, and advanced user profiling systems. Proficient in Python and deep learning frameworks, he bridges theory and practice to drive AI innovation.