GeeeekExplorer / nano-vllm Public

Pull requests: GeeeekExplorer/nano-vllm

21 Open 40 Closed

Pull requests list

[ADD] Add TTFT, TPOT metrics in tqdm bars.

#133 opened Nov 16, 2025 by mumupika

Add Qwen3-VL multimodal support

#132 opened Nov 11, 2025 by 86MaxCao

attention without flash-attn for windows with GTX1650

#124 opened Nov 5, 2025 by tiehexue

feat: Add support for Qwen2.5vl model

#123 opened Nov 4, 2025 by SoulSniper1212

Docs: Clarify and requirements in README (Fixes #57)

#122 opened Nov 3, 2025 by SoulSniper1212

Add code comments for block_manager, scheduler, and sequence

#119 opened Nov 3, 2025 by Ekk0hardter

Prevented fully cached sequences from being processed in prefill phase

#117 opened Oct 16, 2025 by codingliuyg

Add FuseMoeLinear and support Qwen3-Moe

#116 opened Oct 16, 2025 by Tokisakix

Add Multi-Environment Support for Nano-vLLM

#113 opened Oct 10, 2025 by BaoZhuhan

fix uv sync bug

#104 opened Sep 17, 2025 by CzealChen

Fix crash on prefill-after-preemption: conditionally serialize token_ids

#103 opened Sep 15, 2025 by Terry-Uv

Add MiniCPM4 support and a new model registration feature

#95 opened Aug 11, 2025 by LDLINGLINGLING

future: add qwen2 and llama support

#88 opened Jul 28, 2025 by leo-hancock

[ROCm] add amd gpu guide and performance

#84 opened Jul 25, 2025 by billishyahao

Update README.md to add AMD GPU instructions

#83 opened Jul 25, 2025 by zhangnju

fix: update can_append and may_append logic for block allocation

#71 opened Jul 6, 2025 by skyloevil • Draft

add Qwen2 model support

#70 opened Jul 6, 2025 by Zlzzzupup

Optimize block management in decode phase

#68 opened Jul 4, 2025 by xiaohajiayou

Fix bug in block manager's may_append

#66 opened Jul 3, 2025 by yue-zhang-2025

Fix: can_append function returns incorrect result

#65 opened Jul 2, 2025 by YjyJeff

Add Serving Benchmark Script

#29 opened Jun 21, 2025 by tiannuo-yang
