Try:

- Reducing --mem-fraction-static.
- Using an NVFP4 checkpoint (https://huggingface.co/llmat/Mistral-Small-Instruct-2409-NVFP4), which has a lower memory footprint.
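
To get a rough sense of how much the NVFP4 checkpoint might save, here is a back-of-the-envelope sketch for weight memory only. The numbers are assumptions, not measurements: roughly 22B parameters for Mistral-Small-Instruct-2409, 16 bits per weight for a BF16 checkpoint, and about 4.5 bits per weight for NVFP4 (4-bit values plus one 8-bit scale per block of 16). Real checkpoints often keep some layers in higher precision, and the KV cache and activations come on top, so treat the result as a lower bound.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate weight memory in GiB for a given parameter count and bit width."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: ~22B parameters for Mistral-Small-Instruct-2409.
NUM_PARAMS = 22e9

# Assumption: BF16 weights use 16 bits each.
bf16_gib = weight_memory_gib(NUM_PARAMS, 16)

# Assumption: NVFP4 stores 4-bit values plus one 8-bit scale per 16 values,
# i.e. about 4 + 8/16 = 4.5 bits per weight.
nvfp4_gib = weight_memory_gib(NUM_PARAMS, 4.5)

print(f"BF16 weights:  ~{bf16_gib:.1f} GiB")
print(f"NVFP4 weights: ~{nvfp4_gib:.1f} GiB")
```

Whatever memory the weights no longer occupy is what lets a lower --mem-fraction-static value still leave room for the model.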

Answer selected by b8zhong