A high-throughput and memory-efficient inference and serving engine for LLMs
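
For illustration, a minimal offline-inference sketch using vLLM's Python API; the model name `facebook/opt-125m` is an arbitrary example, not a recommendation:

```python
from vllm import LLM, SamplingParams

# Prompts to complete and sampling settings for generation.
prompts = ["Hello, my name is", "The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Load a model into the vLLM engine (downloads weights on first use).
llm = LLM(model="facebook/opt-125m")

# Generate completions for all prompts in a single batched call.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r} -> {output.outputs[0].text!r}")
```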