Commit history for llama_cpp/llama.py

September 19, 2024
Improve LlamaState test and fix rng / seed
Andrei Betlen committed
Merge branch 'main' into patch-3
Andrei committed
misc: Format
Andrei Betlen committed
Merge branch 'main' into patch-3
Andrei committed
feat: Update sampling API for llama.cpp (#1742)
Andrei committed
September 18, 2024
Merge branch 'main' into patch-3
Andrei committed
August 31, 2024
feat: Update llama.cpp
Andrei Betlen committed
August 29, 2024
feat: Enable detokenizing special tokens with `special=True` (#1596)
benniekiss committed
August 25, 2024
Fix memory allocation of ndarray
Xu Song committed
August 15, 2024
fix: Llama.close didn't free lora adapter (#1679)
Junpei Kawamoto committed
feat: Update llama.cpp
Andrei Betlen committed
August 12, 2024
fix: only print 'cache saved' in verbose mode (#1668)
Laurent Sorber committed
August 7, 2024
feat: Enable recursive search of HFFS.ls when using `from_pretrained` (#1656)
Benedikt Heidrich committed
feat: Add more detailed log for prefix-match (#1659)
Xu Song committed
July 31, 2024
fix: Missing LoRA adapter after API change (#1630)
Shamit Verma committed
July 17, 2024
feat: Update llama.cpp
Andrei Betlen committed
July 9, 2024
fix(minor): Minor ruff fixes
Andrei Betlen committed
fix(misc): Format
Andrei Betlen committed
June 19, 2024
fix: Make the destructor automatically call the `.close()` method on the `Llama` class.
Andrei Betlen committed
June 13, 2024
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513)
Junpei Kawamoto committed
feat: Support SPM infill (#1492)
Sigbjørn Skjæret committed
June 4, 2024
feat: adding `rpc_servers` parameter to `Llama` class (#1477)
nullname committed
fix: fix logprobs when BOS is not present (#1471)
Asghar Ghorbani committed
fix: Avoid duplicate special tokens in chat formats (#1439)
Sigbjørn Skjæret committed
May 29, 2024
fix: fix string value kv_overrides. Closes #1487
Andrei Betlen committed
May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476)
Linghan Zhong committed
May 16, 2024
fix: segfault for models without eos / bos tokens. Closes #1463
Andrei Betlen committed
May 14, 2024
misc: Remove unnecessary metadata lookups (#1448)
Sigbjørn Skjæret committed
May 9, 2024
feat: Support multiple chat templates - step 1 (#1396)
Sigbjørn Skjæret committed
May 8, 2024
feat: fill-in-middle support (#1386)
Sigbjørn Skjæret committed
fix: chat_format log where auto-detected format prints `None` (#1434)
Bruno Alvisio committed
May 3, 2024
fix: Use memmove to copy str_value kv_override. Closes #1417
Andrei Betlen committed
April 30, 2024
fix: wrong parameter for flash attention in pickle __getstate__
Andrei Betlen committed
feat: Add option to enable `flash_attn` to Llama params and ModelSettings
Andrei Betlen committed
April 28, 2024
feat: Add support for str type kv_overrides
Andrei Betlen committed
April 26, 2024
feat: Allow for possibly non-pooled embeddings (#1380)
Douglas Hanley committed
April 22, 2024
feat: Use new llama_token_is_eog in create_completions
Andrei Betlen committed
April 17, 2024
feat: Make saved state more compact on-disk (#1296)
tc-wolf committed
feat: Use all available CPUs for batch processing (#1345)
ddh0 committed
April 10, 2024
fix: pass correct type to chat handlers for chat completion logprobs
Andrei Betlen committed
April 3, 2024
fix: segfault when logits_all=False. Closes #1319
Andrei Betlen committed
April 1, 2024
feat: add support for KV cache quantization options (#1307)
Limour committed
March 31, 2024
feat: Add logprobs support to chat completions (#1311)
windspirit95 committed
March 14, 2024
fix: set default pooling type to unspecified
Andrei Betlen committed
fix: Set default pooling_type to mean, check for null pointer.
Andrei Betlen committed
March 9, 2024
feat: Switch embed to llama_get_embeddings_seq (#1263)
Douglas Hanley committed