Catalogs as Context: Using metadata to power and govern the next wave of AI development
Jerry Shao
Chinese Session 2025-07-25 15:45 GMT+8 (Room: JingMing Hall) #ai
Developing powerful AI tooling has been our theme of the year, with agents and foundation models picking up steam across the board. The question remains, though: how do we serve data so that these applications work effectively? What about at enterprise scale? What even is context? In this talk we discuss the current big data landscape, the challenges of building data platforms for AI, and why data catalogs and metadata are the only viable path forward to effective, governed AI development. We use the open source framework Apache Gravitino as a key example of why such a solution needs vendor neutrality.
Speakers:
Jerry Shao: Datastrato, CTO
Jerry Shao is the co-founder and CTO of Datastrato and has focused on the open source big data area for more than 10 years. He is an Apache member, a committer and PMC member of Apache Spark and Apache InLong, and the creator of the Apache Gravitino (incubator) project.