vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
74511
0
0
Python
BRANCHES
20 branches
main (default): 497e234
claude/refactor-cmake-includes-XK2Xl: 7d41785
copilot/add-sp-min-token-to-e2e-tests: 6cf77e4
cursor/test-quality-improvements-eeea: f0b888f
fix-pixtral-lora: 019afb3
integrate-deepgemm-cmake: 9a86a53
khluu/mig: 5e87f99
khluu/mig-small-model-swaps: c0be8b1
lucas/sparse-indexer-logits-budget: f950710
luka/vllm-ir/rms-norm: 08709ef
releases/v0.18.1: 9fdc0f3
sm103: 257c0c5
vadim/qwen35-no-deppgemm: a9bf5d2
wentao-fix-qwen3.5-batch-invariant: 1c794bf
wentao-optimize-async-scheduling-copy: 63dd4db
wentao-remove-redundant-prompt-copy: 72f577b
wentao-skip-work-when-empty: 200bef2
wentao-sp-support-for-v2: aff224d
woosuk/ds-exp: a17a1f1
woosuk/mrv2-expert-indices: 58b0c78