Release v0.4.5 #5119
ispobock announced in Announcements
[BREAKING CHANGES] Quantization support requires a separate installation of
Highlights
The SGLang team is excited to announce the release of v0.4.5! This version introduces several significant features, including Llama 4 support, a FlashAttention 3 backend, EAGLE3 speculative decoding, DeepEP integration, and disaggregated prefill and decoding.
New Features
Llama 4 Support: We support the Llama 4 models with accuracy matching official benchmark numbers, achieving a zero-shot score of 75.2 on the MMLU Pro dataset for the Llama-4-Scout-17B-16E-Instruct model and 80.7 for the Llama-4-Maverick-17B-128E-Instruct model.
FlashAttention 3 Backend: Our implementation of the FlashAttention 3 backend delivers significant acceleration for long-context tasks. (Illustrative launch sketches follow this feature list.)
EAGLE3 Speculative Decoding: We’re proud to be the first to support EAGLE3 speculative decoding, offering substantial gains in decoding throughput. Learn more in our documentation and the EAGLE3 paper.
DeepEP Integration: By incorporating DeepEP, we enhanced performance for MoE inference.
Disaggregated Prefill and Decoding: We introduced a prototype for disaggregated prefill and decoding, with plans for further optimizations.
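As a rough illustration of how the Llama 4 and FlashAttention 3 items above can be tried out, here is a minimal sketch using SGLang's offline Engine API. The specific model path, tensor-parallel size, context length, and the "fa3" backend value are assumptions based on the usual server arguments, not a configuration taken from the release notes, so please verify them against the current documentation.

```python
import sglang as sgl

# Hedged sketch (not from the release notes): starting the offline Engine with
# Llama 4 and the FlashAttention 3 backend. The model path, tp_size,
# context_length, and the "fa3" backend value are assumptions mirroring the
# usual SGLang server arguments; check the documentation for the exact flags.
llm = sgl.Engine(
    model_path="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # assumed HF path
    tp_size=8,                 # assumed: Scout's weights need several GPUs
    attention_backend="fa3",   # assumed value selecting the FlashAttention 3 backend
    context_length=131072,     # assumed; long-context workloads are where FA3 helps most
)

outputs = llm.generate(
    ["Give a one-sentence summary of speculative decoding."],
    {"temperature": 0.0, "max_new_tokens": 64},
)
print(outputs[0]["text"])
```

The same arguments can be passed as flags to `python -m sglang.launch_server` when running SGLang as a standalone server instead of an embedded engine.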
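Similarly, a hedged sketch of enabling EAGLE3 speculative decoding: the `speculative_*` arguments mirror the corresponding server flags, and the target model, draft model, and step/top-k/draft-token numbers below are illustrative assumptions rather than the release's reference configuration.

```python
import sglang as sgl

# Hedged sketch: enabling EAGLE3 speculative decoding through the offline Engine.
# The draft-model path and the step/top-k/draft-token numbers below are
# illustrative assumptions, not tuned values from the release.
llm = sgl.Engine(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    speculative_algorithm="EAGLE3",
    speculative_draft_model_path="yuhuili/EAGLE3-LLaMA3.1-Instruct-8B",  # assumed draft model
    speculative_num_steps=5,          # draft steps per verification round (illustrative)
    speculative_eagle_topk=8,         # draft-tree branching factor (illustrative)
    speculative_num_draft_tokens=32,  # draft tokens verified per round (illustrative)
)

out = llm.generate(
    ["Explain EAGLE-style speculative decoding in one sentence."],
    {"temperature": 0.0, "max_new_tokens": 64},
)
print(out[0]["text"])
```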
Thanks very much to the NVIDIA team, LinkedIn team, EAGLE team, Oracle team, Meituan team, and our incredible open-source community for their invaluable contributions!
Coming Soon
Disaggregated Prefill and Decoding: [Roadmap] Prefill and Decoding Disaggregation #4655
Llama 4 Optimization: [Roadmap] Llama 4 Support #5118
EP Enhancement: [Roadmap] EP Enhancement #4734
FA3 Enhancement: [Roadmap] FlashAttention3 Support as SGLang Attention Backend #4709
We’re thrilled about these advancements and eager to hear your feedback! Join us on our Slack channel at slack.sglang.ai to connect and share your thoughts. Cheers!