COMMITS: History for llama_cpp/llama.py (as of July 18, 2024)
July 17, 2024
feat: Update llama.cpp
Andrei Betlen committed
July 9, 2024
fix(minor): Minor ruff fixes
Andrei Betlen committed
fix(misc): Format
Andrei Betlen committed
June 19, 2024
fix: Make destructor automatically call .close() method on Llama class.
Andrei Betlen committed
June 13, 2024
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513)
Junpei Kawamoto committed
feat: Support SPM infill (#1492)
Sigbjørn Skjæret committed
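The `.close()` and destructor commits above add deterministic cleanup to the `Llama` class. A minimal sketch of the resulting usage pattern; a stdlib stand-in class is used in place of a real model so the snippet runs without a model file, and the real call shown in the comment is illustrative:

```python
# Sketch of the cleanup pattern enabled by the new .close() method (#1513).
# A real call would be `llm = llama_cpp.Llama(model_path=...)`; ModelStandIn
# mimics an object holding native resources until .close() is called.
from contextlib import closing


class ModelStandIn:
    """Stand-in for a Llama instance that frees memory in .close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


with closing(ModelStandIn()) as llm:
    assert not llm.closed  # resources live inside the block
assert llm.closed          # freed deterministically when the block exits
```

With the June 19 change, forgetting the explicit call is also safe: the destructor invokes `.close()` when the object is garbage collected.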
June 4, 2024
feat: adding `rpc_servers` parameter to `Llama` class (#1477)
nullname committed
fix: fix logprobs when BOS is not present (#1471)
Asghar Ghorbani committed
fix: Avoid duplicate special tokens in chat formats (#1439)
Sigbjørn Skjæret committed
May 29, 2024
fix: fix string value kv_overrides. Closes #1487
Andrei Betlen committed
May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476)
Linghan Zhong committed
May 16, 2024
fix: segfault for models without eos / bos tokens. Closes #1463
Andrei Betlen committed
May 14, 2024
misc: Remove unnecessary metadata lookups (#1448)
Sigbjørn Skjæret committed
May 9, 2024
feat: Support multiple chat templates - step 1 (#1396)
Sigbjørn Skjæret committed
May 8, 2024
feat: fill-in-middle support (#1386)
Sigbjørn Skjæret committed
fix: chat_format log where auto-detected format prints `None` (#1434)
Bruno Alvisio committed
May 3, 2024
fix: Use memmove to copy str_value kv_override. Closes #1417
Andrei Betlen committed
April 30, 2024
fix: wrong parameter for flash attention in pickle __getstate__
Andrei Betlen committed
feat: Add option to enable `flash_attn` to Llama params and ModelSettings
Andrei Betlen committed
April 28, 2024
feat: Add support for str type kv_overrides
Andrei Betlen committed
April 26, 2024
feat: Allow for possibly non-pooled embeddings (#1380)
Douglas Hanley committed
April 22, 2024
feat: Use new llama_token_is_eog in create_completions
Andrei Betlen committed
April 17, 2024
feat: Make saved state more compact on-disk (#1296)
tc-wolf committed
feat: Use all available CPUs for batch processing (#1345)
ddh0 committed
April 10, 2024
fix: pass correct type to chat handlers for chat completion logprobs
Andrei Betlen committed
April 3, 2024
fix: segfault when logits_all=False. Closes #1319
Andrei Betlen committed
April 1, 2024
feat: add support for KV cache quantization options (#1307)
Limour committed
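Several of the commits above extend the `Llama` constructor: flash attention (April 30), string-typed `kv_overrides` (April 28), RPC servers (#1477), and KV-cache quantization (#1307). A hedged sketch of how these options might be combined; the parameter names are taken from the commit titles and the enum value is assumed, so verify both against your installed llama-cpp-python version:

```python
# Illustrative constructor arguments collected from the commits above; names
# (flash_attn, rpc_servers, type_k/type_v, kv_overrides) are assumptions.
GGML_TYPE_Q8_0 = 8  # ggml quantization enum value, assumed for illustration

llama_kwargs = dict(
    model_path="model.gguf",                    # placeholder path
    flash_attn=True,                            # flash attention toggle (Apr 30)
    rpc_servers="host1:50052,host2:50052",      # offload over RPC (#1477)
    type_k=GGML_TYPE_Q8_0,                      # quantize KV-cache keys (#1307)
    type_v=GGML_TYPE_Q8_0,                      # quantize KV-cache values (#1307)
    kv_overrides={"some.key": "a string"},      # str values now supported (Apr 28)
)
# llama_cpp.Llama(**llama_kwargs)  # would load the model with these options
```

Quantizing the KV cache trades a small amount of accuracy for a roughly halved cache footprint at Q8_0, which matters most at long context lengths.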
March 31, 2024
feat: Add logprobs support to chat completions (#1311)
windspirit95 committed
March 14, 2024
fix: set default pooling type to unspecified
Andrei Betlen committed
fix: Set default pooling_type to mean, check for null pointer.
Andrei Betlen committed
March 9, 2024
feat: Switch embed to llama_get_embeddings_seq (#1263)
Douglas Hanley committed
March 6, 2024
Update llama.cpp
Andrei Betlen committed
March 1, 2024
docs: Add information re: auto chat formats. Closes #1236
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
February 28, 2024
fix: eos/bos_token set correctly for Jinja2ChatFormatter and automatic chat formatter (#1230)
Sigbjørn Skjæret committed
February 25, 2024
feat: Update llama.cpp
Andrei Betlen committed
February 23, 2024
fix: LlamaHFTokenizer now receives pre_tokens
Andrei Betlen committed
fix: module 'llama_cpp.llama_cpp' has no attribute 'c_uint8'
Andrei Betlen committed
February 22, 2024
fix: Update from_pretrained defaults to match hf_hub_download
Andrei Betlen committed
February 21, 2024
feat: Pull models directly from huggingface (#1206)
Andrei committed
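The `from_pretrained` commit above (#1206) lets a model be pulled straight from the Hugging Face Hub, with its defaults later aligned to `hf_hub_download` by the February 22 fix. A hypothetical usage sketch; the repo id and filename are placeholders, not values from the commit log, and the call is wrapped in a function so nothing is downloaded when the snippet is defined:

```python
# Hypothetical use of Llama.from_pretrained (#1206): downloads a GGUF file
# from the Hugging Face Hub and loads it. Repo id and filename are placeholders.
def load_from_hub():
    from llama_cpp import Llama  # requires llama-cpp-python to be installed
    return Llama.from_pretrained(
        repo_id="someuser/some-model-GGUF",  # placeholder repo id
        filename="*Q4_K_M.gguf",             # glob matched against repo files
    )
```

The glob-style `filename` is convenient because GGUF repos typically publish one file per quantization level.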
February 17, 2024
fix: self.numa missing
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
February 15, 2024
fix: create_embedding broken response for input type str
Andrei Betlen committed
fix: Incorporate embedding pooling layer fixes (#1194)
Douglas Hanley committed
February 14, 2024
feat: Support batch embeddings (#1186)
Douglas Hanley committed
February 13, 2024
fix: sample idx off-by-one error for logit_processors (#1179)
Andrew Lapp committed