Open
Labels
enhancement (New feature or request)
Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Use speculative prefill to improve time to first token (TTFT).
Motivation
This matters most on hardware like the Ryzen AI Max+ 395, where the prefill stage is slow enough to make agentic workflows impractical. Ideally it would work with GLM 4.7 Flash.
Possible Implementation
https://github.com/Jingyu6/speculative_prefill
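To make the request concrete, here is a rough sketch of the token-selection step as I understand the general idea: a small draft model scores how important each prompt token is, and only the highest-scoring tokens (with their original positions) are prefilled by the main model. This is an illustration under assumptions, not code from the linked repository or from llama.cpp. The function name `select_prompt_tokens`, the `keep_ratio` parameter, and the plain top-k rule are all hypothetical, and the importance scores are assumed to come from somewhere else (for example, the draft model's attention).

```python
import math
from typing import List, Sequence, Tuple


def select_prompt_tokens(
    token_ids: Sequence[int],
    importance: Sequence[float],
    keep_ratio: float = 0.25,
) -> Tuple[List[int], List[int]]:
    """Keep the most important prompt tokens, preserving their order.

    Hypothetical helper: `importance` is assumed to be one score per
    prompt token, produced by a small draft model. Returns the kept
    token ids and their original positions, so the main model could
    still be given the original position ids.
    """
    if len(token_ids) != len(importance):
        raise ValueError("token_ids and importance must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    n = len(token_ids)
    if n == 0:
        return [], []

    # Always keep at least one token.
    k = max(1, math.ceil(n * keep_ratio))

    # Rank positions by score (highest first), take the top k,
    # then restore the original prompt order.
    ranked = sorted(range(n), key=lambda i: importance[i], reverse=True)
    kept_positions = sorted(ranked[:k])
    kept_tokens = [token_ids[i] for i in kept_positions]
    return kept_tokens, kept_positions


if __name__ == "__main__":
    tokens, positions = select_prompt_tokens(
        [10, 11, 12, 13], [0.1, 0.9, 0.2, 0.8], keep_ratio=0.5
    )
    print(tokens, positions)
```

The TTFT gain would come from the main model prefilling only the kept tokens, at the cost of one extra pass through the draft model and some risk of dropping context that turns out to matter.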