Commit history for llama_cpp/llama.py

September 19, 2024
Improve LlamaState test and fix rng / seed
Andrei Betlen committed
Merge branch 'main' into patch-3
Andrei committed
misc: Format
Andrei Betlen committed
Merge branch 'main' into patch-3
Andrei committed
feat: Update sampling API for llama.cpp (#1742)
Andrei committed
September 18, 2024
Merge branch 'main' into patch-3
Andrei committed
August 31, 2024
feat: Update llama.cpp
Andrei Betlen committed
August 29, 2024
feat: Enable detokenizing special tokens with `special=True` (#1596)
benniekiss committed
August 25, 2024
Fix memory allocation of ndarray
Xu Song committed
August 15, 2024
fix: Llama.close didn't free lora adapter (#1679)
Junpei Kawamoto committed
feat: Update llama.cpp
Andrei Betlen committed
August 12, 2024
fix: only print 'cache saved' in verbose mode (#1668)
Laurent Sorber committed
August 7, 2024
feat: Enable recursive search of HFFS.ls when using `from_pretrained` (#1656)
Benedikt Heidrich committed
feat: Add more detailed log for prefix-match (#1659)
Xu Song committed
July 31, 2024
fix: Missing LoRA adapter after API change (#1630)
Shamit Verma committed
July 17, 2024
feat: Update llama.cpp
Andrei Betlen committed
July 9, 2024
fix(minor): Minor ruff fixes
Andrei Betlen committed
fix(misc): Format
Andrei Betlen committed
June 19, 2024
fix: Make the destructor automatically call the `.close()` method on the `Llama` class.
Andrei Betlen committed
June 13, 2024
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513)
Junpei Kawamoto committed
feat: Support SPM infill (#1492)
Sigbjørn Skjæret committed
June 4, 2024
feat: adding `rpc_servers` parameter to `Llama` class (#1477)
nullname committed
fix: fix logprobs when BOS is not present (#1471)
Asghar Ghorbani committed
fix: Avoid duplicate special tokens in chat formats (#1439)
Sigbjørn Skjæret committed
May 29, 2024
fix: fix string value kv_overrides. Closes #1487
Andrei Betlen committed
May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476)
Linghan Zhong committed
May 16, 2024
fix: segfault for models without eos / bos tokens. Closes #1463
Andrei Betlen committed
May 14, 2024
misc: Remove unnecessary metadata lookups (#1448)
Sigbjørn Skjæret committed
May 9, 2024
feat: Support multiple chat templates - step 1 (#1396)
Sigbjørn Skjæret committed
May 8, 2024
feat: fill-in-middle support (#1386)
Sigbjørn Skjæret committed
fix: chat_format log where auto-detected format prints `None` (#1434)
Bruno Alvisio committed
May 3, 2024
fix: Use memmove to copy str_value kv_override. Closes #1417
Andrei Betlen committed
April 30, 2024
fix: wrong parameter for flash attention in pickle __getstate__
Andrei Betlen committed
feat: Add option to enable `flash_attn` to Llama params and ModelSettings
Andrei Betlen committed
April 28, 2024
feat: Add support for str type kv_overrides
Andrei Betlen committed
April 26, 2024
feat: Allow for possibly non-pooled embeddings (#1380)
Douglas Hanley committed
April 22, 2024
feat: Use new llama_token_is_eog in create_completions
Andrei Betlen committed
April 17, 2024
feat: Make saved state more compact on-disk (#1296)
tc-wolf committed
feat: Use all available CPUs for batch processing (#1345)
ddh0 committed
April 10, 2024
fix: pass correct type to chat handlers for chat completion logprobs
Andrei Betlen committed
April 3, 2024
fix: segfault when logits_all=False. Closes #1319
Andrei Betlen committed
April 1, 2024
feat: add support for KV cache quantization options (#1307)
Limour committed
March 31, 2024
feat: Add logprobs support to chat completions (#1311)
windspirit95 committed
March 14, 2024
fix: set default pooling type to unspecified
Andrei Betlen committed
fix: Set default pooling_type to mean, check for null pointer.
Andrei Betlen committed
March 9, 2024
feat: Switch embed to llama_get_embeddings_seq (#1263)
Douglas Hanley committed