Pull requests: sgl-project/sglang
Pull requests list
[fix]: Disable ASCII escaping for Chinese characters to prevent redundant backslashes in tool_call outputs during streaming responses. #6156
#6174
opened May 10, 2025 by
tdeng521
6 tasks
Fix OpenAI Client error with single request via batch api
#6170
opened May 10, 2025 by
ravi03071991
6 tasks
fix: handle None multimodal_inputs during merging and filtering batches in disaggregation decode mode
#6169
opened May 10, 2025 by
GaoYusong
1 of 6 tasks
[Docs] [QUANT] Install vLLM for specific quant methods
#6167
opened May 10, 2025 by
JiangJiaWei1103
2 of 6 tasks
[Fix] fix assert error in disaggregation decoder
#6155
opened May 9, 2025 by
zeroorhero
1 of 6 tasks
[doc] add a note for --n-share-experts-fusion args
#6154
opened May 9, 2025 by
BBuf
6 tasks
[Feat] optimize Qwen3 on H20 by hybrid Attention Backend
#6151
opened May 9, 2025 by
TianQiLin666666
6 tasks
[Bug Fixed] fixed the triton kernel bug of assign_draft_cache_locs for page_size > 1 in eagle mode
#6150
opened May 9, 2025 by
DavidChan0519
Enable native ModelOpt quantization support (1/3)
#6142
opened May 9, 2025 by
Edwardf0t1
1 of 6 tasks
Implement return_hidden_states for the OpenAI API
#6137
opened May 9, 2025 by
kyle-pena-kuzco
2 of 6 tasks
Support precomputed multimodal features for qwen-vl models.
#6136
opened May 9, 2025 by
ysulsky
4 of 6 tasks
Support multi-round conversations in bench_serving
#6135
opened May 9, 2025 by
fzyzcjy
6 tasks
Tiny refactor bench_serving to improve extensibility
#6134
opened May 9, 2025 by
fzyzcjy
6 tasks
[ROCm][CI]: add VLM PR CI for parity with NVIDIA
visIon-LM
#6130
opened May 8, 2025 by
OrenLeung
4 of 6 tasks