COMMITS
/ llama_cpp/llama.py
March 24, 2026
feat: expose attention_type parameter in Llama.__init__ (#2143)
Victor Biederbeck committed
March 23, 2026
fix: Qwen 3.5 support (#2152)
Andrei committed
March 22, 2026
misc: Add Ruff formatting (#2148)
Andrei committed
August 7, 2025
fix: rename op_offloat to op_offload in llama.py (#2046)
sergey21000 committed
July 5, 2025
fix: Update reference to in Llama.embed. Closes #2037
Andrei Betlen committed
July 1, 2025
fix(minor): Fix type hint for older versions of python
Andrei Betlen committed
misc: Fix support for new parameters, deprecate rpc_servers parameter
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
January 29, 2025
fix: error showing time spent in llama perf context print (#1898)
Shaka Huang committed
feat: Update llama.cpp
Andrei Betlen committed
December 6, 2024
fix logit-bias type hint (#1802)
ddh0 committed
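The `logit_bias` type hint fixed in #1802 describes a mapping from integer token IDs to additive bias values. A minimal sketch of building such a mapping, assuming the `Dict[int, float]` shape used by llama-cpp-python's completion API (the token IDs below are purely illustrative, not real vocabulary entries):

```python
from typing import Dict

# logit_bias maps token IDs to float biases added to the corresponding
# logits before sampling. Large negative values effectively ban a token;
# large positive values strongly favor it.
logit_bias: Dict[int, float] = {
    15043: 5.0,     # favor this (illustrative) token
    29871: -100.0,  # effectively ban this (illustrative) token
}

# The mapping would then be passed to a completion call, e.g.:
#   llm.create_completion("Hello", logit_bias=logit_bias)
```

The fix in this commit is exactly about the key type: token IDs are `int`, not `str`.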
fix: Fix pickling of Llama class by setting seed from _seed member. Closes #1769
Andrei Betlen committed
October 31, 2024
feat: Update llama.cpp
Andrei Betlen committed
September 26, 2024
fix: Additional fixes for speculative decoding
Andrei Betlen committed
misc: Rename all_text to remaining_text (#1658)
Xu Song committed
fix: Fix speculative decoding
Andrei Betlen committed
September 20, 2024
feat: Add option to configure n_ubatch
Andrei Betlen committed
September 19, 2024
fix: Fix memory allocation of ndarray (#1704)
Xu Song committed
misc: Format
Andrei Betlen committed
feat: Update sampling API for llama.cpp (#1742)
Andrei committed
August 31, 2024
feat: Update llama.cpp
Andrei Betlen committed
August 29, 2024
feat: Enable detokenizing special tokens with `special=True` (#1596)
benniekiss committed
August 15, 2024
fix: Llama.close didn't free lora adapter (#1679)
Junpei Kawamoto committed
feat: Update llama.cpp
Andrei Betlen committed
August 12, 2024
fix: only print 'cache saved' in verbose mode (#1668)
Laurent Sorber committed
August 7, 2024
feat: Enable recursive search of HFFS.ls when using `from_pretrained` (#1656)
Benedikt Heidrich committed
feat: Add more detailed log for prefix-match (#1659)
Xu Song committed
July 31, 2024
fix : Missing LoRA adapter after API change (#1630)
Shamit Verma committed
July 17, 2024
feat: Update llama.cpp
Andrei Betlen committed
July 9, 2024
fix(minor): Minor ruff fixes
Andrei Betlen committed
fix(misc): Format
Andrei Betlen committed
June 19, 2024
fix: Make destructor to automatically call .close() method on Llama class.
Andrei Betlen committed
June 13, 2024
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513)
Junpei Kawamoto committed
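The `.close()` method added in #1513 lets callers free model memory deterministically rather than waiting for garbage collection (which the later destructor commit wires up as a fallback). A minimal sketch of the usage pattern with `contextlib.closing`, using a hypothetical `FakeLlama` stand-in so the example runs without loading a real model:

```python
from contextlib import closing


class FakeLlama:
    """Hypothetical stand-in for llama_cpp.Llama that tracks close() calls."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # In the real class this releases the model and context memory.
        self.closed = True


# closing() guarantees close() runs even if the body raises.
with closing(FakeLlama()) as llm:
    assert not llm.closed  # model still "loaded" inside the block
assert llm.closed  # memory released deterministically on exit
```

With the real class the same pattern applies: wrap the `Llama` instance in `closing(...)`, or call `llm.close()` explicitly when you are done with the model.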
feat: Support SPM infill (#1492)
Sigbjørn Skjæret committed
June 4, 2024
feat: adding `rpc_servers` parameter to `Llama` class (#1477)
nullname committed
fix: fix logprobs when BOS is not present (#1471)
Asghar Ghorbani committed
fix: Avoid duplicate special tokens in chat formats (#1439)
Sigbjørn Skjæret committed
May 29, 2024
fix: fix string value kv_overrides. Closes #1487
Andrei Betlen committed
May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476)
Linghan Zhong committed
May 16, 2024
fix: segfault for models without eos / bos tokens. Closes #1463
Andrei Betlen committed
May 14, 2024
misc: Remove unnecessary metadata lookups (#1448)
Sigbjørn Skjæret committed
May 9, 2024
feat: Support multiple chat templates - step 1 (#1396)
Sigbjørn Skjæret committed
May 8, 2024
feat: fill-in-middle support (#1386)
Sigbjørn Skjæret committed
fix: chat_format log where auto-detected format prints `None` (#1434)
Bruno Alvisio committed