Apache DolphinScheduler at Yili Group: Best Practices in Customization, Monitoring, and Operations

Xuetong Zhu

Chinese Session #dataops

This session covers Yili Group’s experience implementing Apache DolphinScheduler, an open-source workflow orchestration tool. We will share practical insights into how we tailored DolphinScheduler to the needs of large-scale dairy production and supply chain management, including:

  1. Custom Development:
    • Extended functionality for complex ETL workflows, resource quota management, and integration with internal systems (e.g., ERP, IoT platforms).
    • Optimization strategies for high-concurrency task scheduling in hybrid cloud environments.
  2. Monitoring & Alerting:
    • Enhanced monitoring dashboards for real-time tracking of workflow health, task latency, and resource utilization.
    • Custom alerting mechanisms integrated with enterprise communication tools (e.g., DingTalk, WeCom).
  3. Operational Best Practices:
    • Lessons learned in cluster performance tuning, disaster recovery, and version upgrade strategies.
    • Metrics-driven approaches to improve system stability and reduce O&M costs.
  4. Community Collaboration:
    • Contributions back to the Apache DolphinScheduler community, including feature enhancements and bug fixes.

Speakers:


Xuetong Zhu is a Big Data Architect & Tech Lead with 10+ years of experience, specializing in OLAP optimization (StarRocks, Spark), real-time data lakes (Flink, Paimon), and multi-cloud integration. At Yili Group, he led the adoption of StarRocks, improving query speeds by 10x, and cut ETL development time by a factor of 20 using DolphinScheduler and DataX.