iFLYTEK Wins CNCF End User Case Study Contest for Scalable AI Infrastructure Breakthroughs with Volcano

10 June 2025, 03:00

Company to present large-scale Kubernetes model training success at KubeCon + CloudNativeCon China 2025

HONG KONG, June 9, 2025 /PRNewswire/ -- The Cloud Native Computing Foundation® (CNCF®), which builds sustainable ecosystems for cloud native software, today announced iFLYTEK as the winner of the CNCF End User Case Study Contest. Selected for its impactful implementation of Volcano, iFLYTEK will present its success scaling large AI model training at KubeCon + CloudNativeCon China 2025, 10–11 June in Hong Kong.

iFLYTEK, a Chinese tech firm focused on speech and language AI, faced scaling issues as its workloads grew. Inefficient scheduling left GPUs underused, workflows became harder to manage, and teams competed for resources. These challenges slowed progress and strained infrastructure. With Volcano, iFLYTEK adopted elastic scheduling, DAG-based workflows, and multi-tenant isolation, resulting in simplified operations and improved resource usage.

"Before Volcano, coordinating training under large-scale GPU clusters across teams meant constant firefighting, from resource bottlenecks and job failures to debugging tangled training pipelines," said DongJiang, senior platform architect, iFLYTEK. "Volcano gave us the flexibility and control to scale AI training reliably and efficiently. We're honored to have our work recognized by CNCF, and we're excited to share our journey with the broader community at KubeCon + CloudNativeCon China."

Volcano is a cloud native batch system built on Kubernetes, designed for high-performance workloads such as AI/ML training, big data processing, and scientific computing. It offers advanced scheduling capabilities such as job orchestration, resource fairness, and queue management, which are essential for managing large-scale, distributed tasks efficiently. Accepted into the CNCF Sandbox in 2020 and promoted to Incubating maturity level in 2022, Volcano has become a foundational tool for organizations running compute-intensive workloads.

As AI demand increased, iFLYTEK turned to Volcano to support the growing complexity and scale of its training infrastructure. The engineering team was looking for a way to allocate resources more efficiently, manage complex multi-stage training workflows, and minimize job disruptions, all while ensuring fair access for different teams. With Volcano, the team can now streamline operations, make better use of GPUs, and stabilize long-running jobs:

  • 40% increase in GPU utilization, cutting infrastructure costs and reducing idle compute.
  • 70% faster recovery from job failures, ensuring uninterrupted training processes.
  • 50% acceleration in hyperparameter search, enabling faster iteration and innovation.

"iFLYTEK's case study shows how open source can solve complex, high-stakes challenges at scale," said Chris Aniszczyk, CTO of CNCF. "By using Volcano to boost GPU efficiency and streamline training workflows, they've cut costs, sped up development, and built a more reliable AI platform on top of Kubernetes, which is essential for any organization striving to lead in AI."

As AI workloads grow more complex and resource-intensive, iFLYTEK's experience shows how cloud native tools like Volcano can help teams simplify operations and improve scalability. Their upcoming KubeCon + CloudNativeCon China presentation will share practical insights on managing distributed training more effectively in Kubernetes environments.

For more information and the full event schedule, including iFLYTEK's session "Scaling Large Model Training in Kubernetes Clusters with Volcano" on 11 June, visit: https://events.linuxfoundation.org/kubecon-cloudnativecon-china/

About Cloud Native Computing Foundation
Cloud native computing empowers organizations to build and run scalable applications with an open source software stack in public, private, and hybrid clouds. The Cloud Native Computing Foundation (CNCF) hosts critical components of the global technology infrastructure, including Kubernetes, Prometheus, and Envoy. CNCF brings together the industry's top developers, end users, and vendors and runs the largest open source developer conferences in the world. Supported by more than 800 members, including the world's largest cloud computing and software companies, as well as over 200 innovative startups, CNCF is part of the nonprofit Linux Foundation. For more information, please visit www.cncf.io.

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page. Linux is a registered trademark of Linus Torvalds.

Media Contact
Kaitlin Thornhill
The Linux Foundation
pr@cncf.io

View original content to download multimedia: https://www.prnewswire.com/news-releases/iflytek-wins-cncf-end-user-case-study-contest-for-scalable-ai-infrastructure-breakthroughs-with-volcano-302476880.html

SOURCE Cloud Native Computing Foundation