Novita AI Partners with SGLang to Power Next-Gen AI Inference

May 22, 2025, 17:00

SAN FRANCISCO, May 22, 2025 /PRNewswire/ -- Novita AI, a leading global artificial intelligence (AI) cloud platform, is proud to announce a strategic partnership with SGLang, a fast serving engine for large language models and vision language models. Through this collaboration, Novita AI will provide high-performance GPU cloud resources for SGLang's ongoing research, benchmarking, and optimization efforts.

SGLang is a leading inference engine that co-designs a structured generation language with a highly optimized runtime, enabling performance gains such as efficient RadixAttention cache reuse and zero-overhead batch scheduling for large language and vision-language models. By aligning language-level control with backend optimizations, it lets developers build complex generation workflows, multi-modal applications, and parallel inference pipelines reliably and at scale. SGLang is supported by leading organizations including NVIDIA, AMD, xAI, Oracle Cloud, Google Cloud, LinkedIn, and Cursor, alongside research groups at Stanford, the University of California, Berkeley, and the University of California, Los Angeles, which reflects strong community engagement and broad industry adoption.

"SGLang's integration of language-level primitives with runtime optimizations demonstrates the value of aligning software and hardware to unlock new performance levels," said Junyu Huang, Co-Founder & COO at Novita AI. "By contributing our infrastructure and expertise, we've already supported the development of SGLang's first end-to-end multi-turn reinforcement learning (RL) framework and the Prism multi-large language model serving system, and remain committed to fueling its ongoing innovations for developers everywhere."

"We're thrilled to partner with the SGLang team," added Junyu Huang. "Having supported their RL framework and multi-LLM serving system, we're excited to see these achievements accelerate their work and bring powerful inference performance to applications across industries."

Novita AI is also partnering with SGLang on its large-scale expert parallelism project, an open-source implementation designed to approach the throughput benchmarks detailed in the official DeepSeek blog.

This collaboration reflects Novita AI's ongoing commitment to advancing an open ecosystem of inference engines and supporting diverse research initiatives through shared infrastructure and joint development efforts.

Through collaborations with pioneering open-source projects like SGLang, Novita AI continues to advance its mission of democratizing AI, making cutting-edge inference capabilities readily available to developers worldwide.

About Novita AI

Novita AI is an AI cloud platform that helps developers easily deploy AI models through a simple API, backed by affordable and reliable GPU cloud infrastructure. By supporting open-source libraries for LLM inference and serving, Novita AI is driving the future of AI and encouraging innovation across the industry. Learn more about Novita AI at www.novita.ai.

View original content to download multimedia: https://www.prnewswire.com/news-releases/novita-ai-partners-with-sglang-to-power-nextgen-ai-inference-302461266.html

SOURCE Novita AI