🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for the OpenAI API, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: Generate Text, MCP, Audio, Video, Images, Voice Cloning, Distributed, P2P and decentralized inference.
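The "drop-in replacement" above refers to LocalAI's OpenAI-compatible REST API. A minimal sketch of building a chat-completion request with only the Python standard library, assuming LocalAI's default port 8080 and a hypothetical installed model named `my-model` (adjust both to your setup); the request is built but not sent, so no running server is needed:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a request for LocalAI's OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        # Assumes LocalAI listening on its default port 8080.
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("my-model", "Hello!")
# urllib.request.urlopen(req)  # uncomment against a live LocalAI instance
```

Because the endpoint follows the OpenAI wire format, existing OpenAI client libraries can be pointed at it by overriding their base URL.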
Commit history for backend/cpp/llama-cpp/grpc-server.cpp

March 29, 2026
feat: add distributed mode (#9124) (Ettore Di Giacinto)

March 21, 2026
feat: inferencing default, automatic tool parsing fallback and wire min_p (#9092) (Ettore Di Giacinto)

March 20, 2026
chore(deps): bump llama-cpp to 'a0bbcdd9b6b83eeeda6f1216088f42c33d464e38' (#9079) (Ettore Di Giacinto)

March 12, 2026
fix(llama-cpp): Set enable_thinking in the correct place (#8973) (Richard Palethorpe)

March 8, 2026
feat(functions): add peg-based parsing and allow backends to return tool calls directly (#8838) (Ettore Di Giacinto)

March 5, 2026
feat: pass-by metadata to predict options (#8795) (Ettore Di Giacinto)

February 27, 2026
chore(deps): bump llama.cpp to 'ecbcb7ea9d3303097519723b264a8b5f1e977028' (#8672) (Ettore Di Giacinto)

February 17, 2026
fix(llama-cpp): Pass parameters when using embedded template (#8590) (Richard Palethorpe)
January 28, 2026
chore(llama.cpp): bump to 'f6b533d898ce84bae8d9fa8dfc6697ac087800bf' (#8275) (Ettore Di Giacinto)

January 22, 2026
feat: detect thinking support from backend automatically if not explicitly set (#8167) (Ettore Di Giacinto)

January 20, 2026
chore(deps): Bump llama.cpp to '1c7cf94b22a9dc6b1d32422f72a627787a4783a3' (#8136) (Ettore Di Giacinto)

January 9, 2026
chore(llama.cpp): propagate errors during model load (#7937) (Ettore Di Giacinto)
chore(deps): Bump llama.cpp to '480160d47297df43b43746294963476fc0a6e10f' (#7933) (Ettore Di Giacinto)

January 2, 2026
fix(llama.cpp/mmproj): fix loading mmproj in nested sub-dirs different from model path (#7832) (Ettore Di Giacinto)
December 23, 2025
chore(deps): Bump llama.cpp to '5b6c9bc0f3c8f55598b9999b65aff7ce4119bc15' and refactor usage of base params (#7706) (Ettore Di Giacinto)

December 22, 2025
chore(deps): bump llama.cpp to '0e1ccf15c7b6d05c720551b537857ecf6194d420' (#7684) (Ettore Di Giacinto)

December 15, 2025
chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584) (Ettore Di Giacinto)

December 14, 2025
fix(7355): Update llama-cpp grpc for v3 interface (#7566) (Simon Redman)

December 12, 2025
fix(llama.cpp): handle corner cases with tool array content (#7528) (Ettore Di Giacinto)
December 9, 2025
chore(deps/llama-cpp): bump to '2fa51c19b028180b35d316e9ed06f5f0f7ada2c1' (#7484) (Ettore Di Giacinto)

December 4, 2025
chore(deps): bump llama.cpp to 'bde188d60f58012ada0725c6dd5ba7c69fe4dd87' (#7434) (Ettore Di Giacinto)

December 1, 2025
chore: ⬆️ Update ggml-org/llama.cpp to `7f8ef50cce40e3e7e4526a3696cb45658190e69a` (#7402) (Ettore Di Giacinto)

November 29, 2025
chore(deps): bump llama.cpp to 'd82b7a7c1d73c0674698d9601b1bbb0200933f29' (#7392) (Ettore Di Giacinto)

November 26, 2025
chore(deps): bump llama.cpp to '583cb83416467e8abf9b37349dcf1f6a0083745a' (#7358) (Ettore Di Giacinto)
November 21, 2025
fix(llama.cpp): handle corner cases with tool content (#7324) (Ettore Di Giacinto)

November 16, 2025
feat: add support to logitbias and logprobs (#7283) (Ettore Di Giacinto)

November 14, 2025
fix: handle tool errors (#7271) (Ettore Di Giacinto)
chore(deps): bump llama.cpp to `c4abcb2457217198efdd67d02675f5fddb7071c2` (#7266) (Ettore Di Giacinto)

November 12, 2025
feat: import models via URI (#7245) (Ettore Di Giacinto)
fix(reranker): llama-cpp sort score desc, crop top_n (#7211) (Mikhail Khludnev)

November 9, 2025
feat: respect context and add request cancellation (#7187) (Ettore Di Giacinto)

November 7, 2025
feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120) (Ettore Di Giacinto)

November 2, 2025
feat(llama.cpp): allow to set cache-ram and ctx_shift (#7009) (Ettore Di Giacinto)
October 23, 2025
fix: properly terminate llama.cpp kv_overrides array with empty key + updated doc (#6672) (Chakib Benziane)

October 10, 2025
fix(llama.cpp): correctly set grammar triggers (#6432) (Ettore Di Giacinto)
chore(deps): bump llama.cpp to '1deee0f8d494981c32597dca8b5f8696d399b0f2' (#6421) (Ettore Di Giacinto)

September 26, 2025
chore(deps): bump llama.cpp to '835b2b915c52bcabcd688d025eacff9a07b65f52' (#6347) (Ettore Di Giacinto)

September 25, 2025
fix: reranking models limited to 512 tokens in llama.cpp backend (#6344) (jongames)

September 13, 2025
fix(llama-cpp): correctly calculate embeddings (#6259) (Ettore Di Giacinto)
August 31, 2025
feat(flash_attention): set auto for flash_attention in llama.cpp (#6168) (Ettore Di Giacinto)

August 23, 2025
chore(deps): bump llama.cpp to '45363632cbd593537d541e81b600242e0b3d47fc' (#6122) (Ettore Di Giacinto)

August 15, 2025
chore(deps): bump llama.cpp to 'df36bce667bf14f8e538645547754386f9516326' (#6062) (Ettore Di Giacinto)

August 6, 2025
fix(llama.cpp): do not default to linear rope (#5982) (Ettore Di Giacinto)

July 18, 2025
feat: do not bundle llama-cpp anymore (#5790) (Ettore Di Giacinto)