COMMITS
llama_cpp/server/model.py

March 22, 2026
  misc: Add Ruff formatting (#2148) (Andrei)
July 3, 2025
  feat: Add support for new mtmd api, add Qwen2.5-VL chat handler (Andrei Betlen)
September 20, 2024
  feat: Add option to configure n_ubatch (Andrei Betlen)
August 29, 2024
  feat: Add server chat_format minicpm-v-2.6 for MiniCPMv26ChatHandler (Andrei Betlen)
July 17, 2024
  fix(server): Use split_mode from model settings (#1594) (Grider)
June 13, 2024
  feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513) (Junpei Kawamoto)
June 4, 2024
  feat: adding `rpc_servers` parameter to `Llama` class (#1477) (nullname)
May 29, 2024
  fix: fix string value kv_overrides. Closes #1487 (Andrei Betlen)
May 3, 2024
  fix(server): Propagate `flash_attn` to model load. (#1424) (Daniel Thuerck)
May 2, 2024
  feat: Add llama-3-vision-alpha chat format (Andrei Betlen)
April 30, 2024
April 1, 2024
  feat: add support for KV cache quantization options (#1307) (Limour)
February 28, 2024
  misc: Format (Andrei Betlen)
February 26, 2024
February 8, 2024
  fix: broken import (Andrei Betlen)
January 31, 2024
  Add speculative decoding (#1120) (Andrei)
January 21, 2024
January 19, 2024
January 15, 2024
  Implement GGUF metadata KV overrides (#1011) (Phil H)
December 22, 2023
  [Feat] Multi model support (#931) (Dave)
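Several of the commits listed above add per-model settings that the server can read from a config file introduced by the multi model support change (#931). A minimal sketch of such a config, assuming the server's JSON config format with a `models` list of per-model settings; the file paths and aliases are hypothetical, and the field names (`chat_format`, `n_gpu_layers`, `flash_attn`) are taken from the commit messages above:

```json
{
  "host": "0.0.0.0",
  "port": 8000,
  "models": [
    {
      "model": "models/mistral-7b-instruct.Q4_K_M.gguf",
      "model_alias": "mistral",
      "chat_format": "chatml",
      "n_gpu_layers": -1
    },
    {
      "model": "models/minicpm-v-2.6.Q4_K_M.gguf",
      "model_alias": "minicpm-v",
      "chat_format": "minicpm-v-2.6",
      "flash_attn": true
    }
  ]
}
```

With a config like this, requests can select a model by its `model_alias`, and each entry carries its own load-time options rather than sharing one global set.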