Pull requests: ggml-org/llama.cpp

Pull requests list

server: fix json_schema response_format handling
#18963 opened Jan 20, 2026 by khasinski
3 tasks done
cli : fix reasoning responses in CLI
#18961 opened Jan 20, 2026 by ngxson
CUDA: use mmvq for mul-mat-id for small batch sizes
#18958 opened Jan 20, 2026 by am17an
jinja : implement mixed type object keys
Labels: jinja parser, testing
#18955 opened Jan 20, 2026 by CISC
model-conversion : add tensor-info.py utility
Labels: examples, python
#18954 opened Jan 20, 2026 by danbev
CUDA: add gqa_ratio 4 for GLM 4.7 flash
Labels: ggml, Nvidia GPU, python
#18953 opened Jan 20, 2026 by am17an
vulkan: Enable topk_moe fusion for GLM-4.7-Flash
Labels: ggml, Vulkan
#18947 opened Jan 20, 2026 by jeffbolznv
vulkan: Remove transfer_ctx, do everything in compute_ctx.
Labels: ggml, Vulkan
#18945 opened Jan 20, 2026 by jeffbolznv
vulkan: support flash attention GQA/split_k with small batches
Labels: ggml, testing, Vulkan
#18938 opened Jan 19, 2026 by jeffbolznv
chore: Update outdated GitHub Actions versions
Labels: devops
#18935 opened Jan 19, 2026 by pgoslatara
ggml-cuda: enable cuda-graphs for n-cpu-moe
Labels: ggml, Nvidia GPU, python, script
#18934 opened Jan 19, 2026 by am17an
ggml-rpc: Add graceful error handling for graph compute operations
Labels: ggml
#18933 opened Jan 19, 2026 by mfahsold
ggml-cpu: add RISC-V Vector support for SSM scan (f32) in Mamba-2
Labels: ggml
#18926 opened Jan 19, 2026 by ixgbe
common: add reranking server configuration preset
#18923 opened Jan 18, 2026 by ingyukoh
metal : support virtual devices
Labels: Apple Metal, ggml
#18919 opened Jan 18, 2026 by ggerganov
vulkan: Use mul_mat_vec_id for small values of n
Labels: ggml, Vulkan
#18918 opened Jan 18, 2026 by jeffbolznv
graph : utilize ggml_build_forward_select() to avoid reallocations
Labels: devops, model
#18898 opened Jan 17, 2026 by ggerganov
HIP: add mmf for CDNA
Labels: ggml, Nvidia GPU
#18896 opened Jan 17, 2026 by zhang-hui-yulo
2 of 3 tasks