Release v0.2.4 · THUDM/slime

v0.2.4 is here! Thanks to everyone who contributed to this release.

Major Updates

In addition to a broad set of bug fixes and stability improvements, v0.2.4 brings several major updates:

Profiling and observability improvements: added a rollout trace timeline viewer and W&B reporting for dynamic ITL / TTFT percentile metrics.
Router stack unified on sgl-router: consolidated the router stack onto sgl-router and removed slime-router.
Expanded multimodal and model support: improved support for GLM-4.6V / GLM4V, Multimodal OPD, and Qwen3.5-related workflows.
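
For readers unfamiliar with the latency metrics mentioned above: TTFT is the time to first token and ITL is the inter-token latency. The following is a minimal sketch of how such percentiles could be computed from per-token timestamps. It is not slime's implementation; the function names `percentile` and `ttft_and_itl` are hypothetical, and the nearest-rank percentile method is an assumption chosen for simplicity.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list.

    Hypothetical helper for illustration; real reporting code may
    interpolate between ranks instead.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least q% of the data at or below it.
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def ttft_and_itl(request_start, token_times):
    """Split one request's token arrival times into TTFT and a list of ITLs.

    request_start and token_times are timestamps in seconds (assumed inputs).
    """
    ttft = token_times[0] - request_start
    itls = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    return ttft, itls


if __name__ == "__main__":
    ttft, itls = ttft_and_itl(0.0, [0.5, 0.75, 1.25])
    print("TTFT:", ttft)
    print("ITL p50:", percentile(itls, 50), "ITL p99:", percentile(itls, 99))
```

In practice the TTFT and ITL samples would be pooled across many requests before taking percentiles, and the resulting values logged to the metrics backend.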

Other Notable Changes

Fixed CUDA IPC cache leaks during weight updates
Fixed SP/CP gradient inflation in FLA layers

What's Changed

feat: add GLM-4.6V MoE VL bridge with CP support by @zhuzilin in #1715
fix: resolve rope_theta from rope_parameters dict in HF config validation by @zhuzilin in #1720
[docker] patches for glm4.6v, kimi k2.5 and dsa cp only by @zhuzilin in #1722
Fix CUDA IPC cache leaks during weight updates by @zhuzilin in #1731
[docker] update megatron by @zhuzilin in #1729
[docker] Fix IndexCache with mla model by @zhuzilin in #1736
[slime-router] support pd disaggregation and remove radix tree middleware by @zhuzilin in #1735
Fix glm4v megatron bridge by @zhuzilin in #1738
[docker] update sglang patch by @zhuzilin in #1743
feat: GLM4V multimodal support improvements by @zhuzilin in #1745
feat: placeholder worker type, metrics router, and GPQA letter range by @zhuzilin in #1746
always enable_metrics and remove dp context by @zhuzilin in #1747
fix: resolve SP/CP gradient inflation in FLA (linear attention) layers by @zhuzilin in #1748
Update MTP example configs, rename GLM-4.5 to GLM-4.7, clean scripts by @zhuzilin in #1749
Support qwen3.5 loss mask for multi-turn SFT by @huang3eng in #1742
fix: propagate moe_token_dispatcher_type in bridge model provider by @nanjiangwill in #1737
fix: resolve rope_theta from rope_parameters in DeepseekV32Bridge by @stevewx in #1734
chore: translate remaining Chinese comments to English by @WangHong-yang in #1726
feat: add Qwen3.5-4B model support by @shihaohou in #1721
fix: http_utils. disable system proxy for internal SGLang httpx clients by @DongzhuoranZhou in #1714
fix: auto-detect GPUs in qwen3-4b script by @ailuntz in #1700
fix: quote $MOE_LAYER_FREQ by @lawrence-harmonic in #1689
disable router health_check and allow prompt_data is None by @zhuzilin in #1751
small fix on qwen3-235b-a22b launch script by @Zhuohao-Li in #1719
sync internal bugfix by @zhuzilin in #1765
Fix uploading sglang metrics to wandb by @zhuzilin in #1768
use zhuzilin/sgl-router for sglang-router by @zhuzilin in #1770
[docker] update sgl-router by @zhuzilin in #1772
[Multimodal] Add Multimodal OPD support by @coding-famer in #1760
refactor: remove slime router by @zhuzilin in #1773
Add rollout trace timeline viewer by @zhuzilin in #1776
[Fix] Fix duplicate Megatron LR scheduler resume when optimizer state is not loaded by @kaysonyu in #1775
Support FP8 conversion for Qwen3.5 by @peterjc123 in #1769
fix typo by @albaNnaksqr in #1759
[Fix]Fix some bugs/clean up by @coding-famer in #1756
(fix):not have encoder_only attr cause run failed by @wangyufak in #1741
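
Two of the fixes listed above (#1720 and #1734) concern reading `rope_theta` from a `rope_parameters` dict on the Hugging Face config. A minimal sketch of that kind of lookup is shown here, assuming a config object that may carry either a nested `rope_parameters` dict or a top-level `rope_theta` attribute. The helper name `resolve_rope_theta` and the lookup order are assumptions for illustration, not the code from those PRs.

```python
from types import SimpleNamespace


def resolve_rope_theta(config, default=None):
    """Return rope_theta from a config-like object.

    Hypothetical helper: prefers config.rope_parameters["rope_theta"] when
    present, then falls back to a top-level config.rope_theta attribute.
    """
    rope_parameters = getattr(config, "rope_parameters", None)
    if isinstance(rope_parameters, dict) and "rope_theta" in rope_parameters:
        return rope_parameters["rope_theta"]
    return getattr(config, "rope_theta", default)


if __name__ == "__main__":
    # Stand-in config objects; a real run would use a loaded HF config instead.
    nested = SimpleNamespace(rope_parameters={"rope_theta": 1000000.0})
    flat = SimpleNamespace(rope_theta=10000.0)
    print(resolve_rope_theta(nested), resolve_rope_theta(flat))
```

Checking the nested dict first keeps the helper working for configs that only expose the nested form, while older configs with a top-level attribute still resolve.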

New Contributors

@stevewx made their first contribution in #1734
@WangHong-yang made their first contribution in #1726
@shihaohou made their first contribution in #1721
@DongzhuoranZhou made their first contribution in #1714
@ailuntz made their first contribution in #1700
@peterjc123 made their first contribution in #1769
@albaNnaksqr made their first contribution in #1759
@wangyufak made their first contribution in #1741

Full Changelog: v0.2.3...v0.2.4
