Pull requests: vllm-project/vllm
[XPU] Fix spec-decode UTs under tests/v1/spec_decode
intel-gpu
Related to Intel GPU
speculative-decoding
v1
#38491
opened Mar 30, 2026 by
yma11
[Misc] Always use forward_mulmat for Conv3d on newer versions of torch.
#38487
opened Mar 29, 2026 by
ywang96
5 tasks
[Build] Add SM121 (DGX Spark / GB10) to published build targets
ci/build
nvidia
#38484
opened Mar 29, 2026 by
JCorners68
3 tasks
fix(v1): Handle max_model_len overflow gracefully instead of crashing
v1
#38483
opened Mar 29, 2026 by
machov
(security) Fix SSRF in batch runner download_bytes_from_url
documentation
Improvements or additions to documentation
frontend
#38482
opened Mar 29, 2026 by
jperezdealgaba
Fix potential infinite loop in SonnetDataset.sample when using short input-len
performance
Performance-related issues
#38481
opened Mar 29, 2026 by
frankie-ys
1 of 5 tasks
[Attention Backend] TurboQuant: 2-bit KV cache compression with 4x capacity
nvidia
v1
#38479
opened Mar 29, 2026 by
vibhavagarwal5
[Bug fix][Quantization] Fix dummy weight loading
bug
Something isn't working
needs-rebase
#38478
opened Mar 29, 2026 by
Josephasafg
3 of 5 tasks
[WIP] Add TRITON_MLA_SPARSE backend for SM80 sparse MLA support
documentation
Improvements or additions to documentation
nvidia
rocm
Related to AMD ROCm
v1
fix(p2p_nccl): free KV recv_store entries immediately to prevent OOM (#38472)
kv-connector
v1
#38475
opened Mar 29, 2026 by
saifmb0
fix: Add apply_with_spec_decode() method to LogitBiasLogitsProcessor
v1
#38469
opened Mar 29, 2026 by
ranger2571
5 tasks
Add platform manual_seed_all API
intel-gpu
Related to Intel GPU
nvidia
performance
Performance-related issues
rocm
Related to AMD ROCm
speculative-decoding
v1
#38468
opened Mar 29, 2026 by
yma11
[Feature] Add apply_with_spec_decode() to LogitBiasLogitsProcessor
v1
#38467
opened Mar 29, 2026 by
NJX-njx
[Bugfix] Fix limit_mm_per_prompt being ignored for encoder cache profiling
bug
Something isn't working
multi-modality
Related to multi-modality (#4194)
#38465
opened Mar 29, 2026 by
NJX-njx
[Logging] Improve DCP error message to suggest VLLM_ATTENTION_BACKEND
v1
#38464
opened Mar 29, 2026 by
WJYuuuu
3 of 5 tasks
[Quantization] Consolidate experts_int8 with fp8 online quantization
needs-rebase
#38463
opened Mar 29, 2026 by
Josephasafg
Draft
3 of 5 tasks
[Logging] Add JIT compilation progress log for FlashInfer
nvidia
v1
#38462
opened Mar 29, 2026 by
WJYuuuu
3 of 5 tasks
[Perf] Batch KV cache swap copies via cuMemcpyBatchAsync
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#38460
opened Mar 29, 2026 by
Etelis
[Docs] Add vLLM CI overview documentation for contributors
documentation
Improvements or additions to documentation
#38458
opened Mar 29, 2026 by
khluu
3 tasks
[ROCm] [DOC] Update the Documentation to include ROCm Nightly Wheel support
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
#38457
opened Mar 29, 2026 by
tjtanaa
5 tasks
[CI] Fix online FP8 quantization materializing tensors on CPU
bug
Something isn't working
needs-rebase
#38456
opened Mar 29, 2026 by
haosdent