Pull requests: ggml-org/llama.cpp
Pull requests list
server: fix json_schema response_format handling
#18963
opened Jan 20, 2026 by
khasinski
3 tasks done
common, server : use the same User-Agent by default
examples
server
#18957
opened Jan 20, 2026 by
angt
jinja : implement mixed type object keys
jinja parser
Issues related to the jinja parser
testing
Everything test related
#18955
opened Jan 20, 2026 by
CISC
model-conversion : add tensor-info.py utility
examples
python
python script changes
#18954
opened Jan 20, 2026 by
danbev
CUDA: add gqa_ratio 4 for GLM 4.7 flash
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
#18953
opened Jan 20, 2026 by
am17an
quant : manual overrides of tensor types take precedence
#18952
opened Jan 20, 2026 by
ggerganov
vulkan: Enable topk_moe fusion for GLM-4.7-Flash
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18947
opened Jan 20, 2026 by
jeffbolznv
vulkan: Remove transfer_ctx, do everything in compute_ctx.
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18945
opened Jan 20, 2026 by
jeffbolznv
vulkan: support flash attention GQA/split_k with small batches
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#18938
opened Jan 19, 2026 by
jeffbolznv
chore: Update outdated GitHub Actions versions
devops
improvements to build systems and github actions
#18935
opened Jan 19, 2026 by
pgoslatara
ggml-cuda: enable cuda-graphs for n-cpu-moe
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
script
Script related
#18934
opened Jan 19, 2026 by
am17an
ggml-rpc: Add graceful error handling for graph compute operations
ggml
changes relating to the ggml tensor library for machine learning
#18933
opened Jan 19, 2026 by
mfahsold
ggml-cpu: add RISC-V Vector support for SSM scan (f32) in Mamba-2
ggml
changes relating to the ggml tensor library for machine learning
#18926
opened Jan 19, 2026 by
ixgbe
metal : support virtual devices
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#18919
opened Jan 18, 2026 by
ggerganov
vulkan: Use mul_mat_vec_id for small values of n
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18918
opened Jan 18, 2026 by
jeffbolznv
fix: Use tabular-nums for chat message statistics
examples
server
#18915
opened Jan 18, 2026 by
nathanlesage
introduce legacy-torch flag for backward compatibility on older Intel…
python
python script changes
#18908
opened Jan 18, 2026 by
csabakecskemeti
graph : utilize ggml_build_forward_select() to avoid reallocations
devops
improvements to build systems and github actions
model
Model specific
#18898
opened Jan 17, 2026 by
ggerganov
HIP: add mmf for CDNA
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18896
opened Jan 17, 2026 by
zhang-hui-yulo
2 of 3 tasks
llama: fix integer type consistency in split helpers
#18894
opened Jan 17, 2026 by
MaheshJakkala
examples: llama evaluation tool for mmlu, aime, gsm8k
examples
python
python script changes
#18892
opened Jan 17, 2026 by
gatbontonpc
Draft
fit-params : Handle n_ctx 0 for models that entirely fit with n_ctx_train
#18890
opened Jan 17, 2026 by
65a