COMMITS
/ llama_cpp/llama.py
March 24, 2026
feat: expose attention_type parameter in Llama.__init__ (#2143)
Victor Biederbeck committed
March 23, 2026
fix: Qwen 3.5 support (#2152)
Andrei committed
March 22, 2026
misc: Add Ruff formatting (#2148)
Andrei committed
August 7, 2025
fix: rename op_offloat to op_offload in llama.py (#2046)
sergey21000 committed
July 5, 2025
fix: Update reference to in Llama.embed. Closes #2037
Andrei Betlen committed
July 1, 2025
fix(minor): Fix type hint for older versions of python
Andrei Betlen committed
misc: Fix support for new parameters, deprecate rpc_servers parameter
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
January 29, 2025
fix: error showing time spent in llama perf context print (#1898)
Shaka Huang committed
feat: Update llama.cpp
Andrei Betlen committed
December 6, 2024
fix logit-bias type hint (#1802)
ddh0 committed
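The `logit_bias` type hint fixed in #1802 describes a mapping from integer token IDs to additive bias values. A minimal sketch of building such a mapping, assuming the `Dict[int, float]` shape used by llama-cpp-python's completion API (the token IDs below are purely illustrative, not real vocabulary entries):

```python
from typing import Dict

# logit_bias maps token IDs to float biases added to the corresponding
# logits before sampling. Large negative values effectively ban a token;
# large positive values strongly favor it.
logit_bias: Dict[int, float] = {
    15043: 5.0,     # favor this (illustrative) token
    29871: -100.0,  # effectively ban this (illustrative) token
}

# The mapping would then be passed to a completion call, e.g.:
#   llm.create_completion("Hello", logit_bias=logit_bias)
```

The fix in this commit is exactly about the key type: token IDs are `int`, not `str`.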
fix: Fix pickling of Llama class by setting seed from _seed member. Closes #1769
Andrei Betlen committed
October 31, 2024
feat: Update llama.cpp
Andrei Betlen committed
September 26, 2024
fix: Additional fixes for speculative decoding
Andrei Betlen committed
misc: Rename all_text to remaining_text (#1658)
Xu Song committed
fix: Fix speculative decoding
Andrei Betlen committed
September 20, 2024
feat: Add option to configure n_ubatch
Andrei Betlen committed
September 19, 2024
fix: Fix memory allocation of ndarray (#1704)
Xu Song committed
misc: Format
Andrei Betlen committed
feat: Update sampling API for llama.cpp (#1742)
Andrei committed
August 31, 2024
feat: Update llama.cpp
Andrei Betlen committed
August 29, 2024
feat: Enable detokenizing special tokens with `special=True` (#1596)
benniekiss committed
August 15, 2024
fix: Llama.close didn't free lora adapter (#1679)
Junpei Kawamoto committed
feat: Update llama.cpp
Andrei Betlen committed
August 12, 2024
fix: only print 'cache saved' in verbose mode (#1668)
Laurent Sorber committed
August 7, 2024
feat: Enable recursive search of HFFS.ls when using `from_pretrained` (#1656)
Benedikt Heidrich committed
feat: Add more detailed log for prefix-match (#1659)
Xu Song committed
July 31, 2024
fix : Missing LoRA adapter after API change (#1630)
Shamit Verma committed
July 17, 2024
feat: Update llama.cpp
Andrei Betlen committed
July 9, 2024
fix(minor): Minor ruff fixes
Andrei Betlen committed
fix(misc): Format
Andrei Betlen committed
June 19, 2024
fix: Make destructor to automatically call .close() method on Llama class.
Andrei Betlen committed
June 13, 2024
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513)
Junpei Kawamoto committed
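The `.close()` method added in #1513 lets callers free model memory deterministically rather than waiting for garbage collection (which the later destructor commit wires up as a fallback). A minimal sketch of the usage pattern with `contextlib.closing`, using a hypothetical `FakeLlama` stand-in so the example runs without loading a real model:

```python
from contextlib import closing


class FakeLlama:
    """Hypothetical stand-in for llama_cpp.Llama that tracks close() calls."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # In the real class this releases the model and context memory.
        self.closed = True


# closing() guarantees close() runs even if the body raises.
with closing(FakeLlama()) as llm:
    assert not llm.closed  # model still "loaded" inside the block
assert llm.closed  # memory released deterministically on exit
```

With the real class the same pattern applies: wrap the `Llama` instance in `closing(...)`, or call `llm.close()` explicitly when you are done with the model.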
feat: Support SPM infill (#1492)
Sigbjørn Skjæret committed
June 4, 2024
feat: adding `rpc_servers` parameter to `Llama` class (#1477)
nullname committed
fix: fix logprobs when BOS is not present (#1471)
Asghar Ghorbani committed
fix: Avoid duplicate special tokens in chat formats (#1439)
Sigbjørn Skjæret committed
May 29, 2024
fix: fix string value kv_overrides. Closes #1487
Andrei Betlen committed
May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476)
Linghan Zhong committed
May 16, 2024
fix: segfault for models without eos / bos tokens. Closes #1463
Andrei Betlen committed
May 14, 2024
misc: Remove unnecessary metadata lookups (#1448)
Sigbjørn Skjæret committed
May 9, 2024
feat: Support multiple chat templates - step 1 (#1396)
Sigbjørn Skjæret committed
May 8, 2024
feat: fill-in-middle support (#1386)
Sigbjørn Skjæret committed
fix: chat_format log where auto-detected format prints `None` (#1434)
Bruno Alvisio committed