COMMITS: History for llama_cpp/llama.py (as of July 18, 2024)
July 17, 2024
feat: Update llama.cpp
Andrei Betlen committed
July 9, 2024
fix(minor): Minor ruff fixes
Andrei Betlen committed
fix(misc): Format
Andrei Betlen committed
June 19, 2024
fix: Make destructor automatically call .close() method on Llama class.
Andrei Betlen committed
June 13, 2024
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513)
Junpei Kawamoto committed
feat: Support SPM infill (#1492)
Sigbjørn Skjæret committed
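The `.close()` and destructor commits above add deterministic cleanup to the `Llama` class. A minimal sketch of the resulting usage pattern; a stdlib stand-in class is used in place of a real model so the snippet runs without a model file, and the real call shown in the comment is illustrative:

```python
# Sketch of the cleanup pattern enabled by the new .close() method (#1513).
# A real call would be `llm = llama_cpp.Llama(model_path=...)`; ModelStandIn
# mimics an object holding native resources until .close() is called.
from contextlib import closing


class ModelStandIn:
    """Stand-in for a Llama instance that frees memory in .close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


with closing(ModelStandIn()) as llm:
    assert not llm.closed  # resources live inside the block
assert llm.closed          # freed deterministically when the block exits
```

With the June 19 change, forgetting the explicit call is also safe: the destructor invokes `.close()` when the object is garbage collected.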
June 4, 2024
feat: adding `rpc_servers` parameter to `Llama` class (#1477)
nullname committed
fix: fix logprobs when BOS is not present (#1471)
Asghar Ghorbani committed
fix: Avoid duplicate special tokens in chat formats (#1439)
Sigbjørn Skjæret committed
May 29, 2024
fix: fix string value kv_overrides. Closes #1487
Andrei Betlen committed
May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476)
Linghan Zhong committed
May 16, 2024
fix: segfault for models without eos / bos tokens. Closes #1463
Andrei Betlen committed
May 14, 2024
misc: Remove unnecessary metadata lookups (#1448)
Sigbjørn Skjæret committed
May 9, 2024
feat: Support multiple chat templates - step 1 (#1396)
Sigbjørn Skjæret committed
May 8, 2024
feat: fill-in-middle support (#1386)
Sigbjørn Skjæret committed
fix: chat_format log where auto-detected format prints `None` (#1434)
Bruno Alvisio committed
May 3, 2024
fix: Use memmove to copy str_value kv_override. Closes #1417
Andrei Betlen committed
April 30, 2024
fix: wrong parameter for flash attention in pickle __getstate__
Andrei Betlen committed
feat: Add option to enable `flash_attn` to Llama params and ModelSettings
Andrei Betlen committed
April 28, 2024
feat: Add support for str type kv_overrides
Andrei Betlen committed
April 26, 2024
feat: Allow for possibly non-pooled embeddings (#1380)
Douglas Hanley committed
April 22, 2024
feat: Use new llama_token_is_eog in create_completions
Andrei Betlen committed
April 17, 2024
feat: Make saved state more compact on-disk (#1296)
tc-wolf committed
feat: Use all available CPUs for batch processing (#1345)
ddh0 committed
April 10, 2024
fix: pass correct type to chat handlers for chat completion logprobs
Andrei Betlen committed
April 3, 2024
fix: segfault when logits_all=False. Closes #1319
Andrei Betlen committed
April 1, 2024
feat: add support for KV cache quantization options (#1307)
Limour committed
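Several of the commits above extend the `Llama` constructor: flash attention (April 30), string-typed `kv_overrides` (April 28), RPC servers (#1477), and KV-cache quantization (#1307). A hedged sketch of how these options might be combined; the parameter names are taken from the commit titles and the enum value is assumed, so verify both against your installed llama-cpp-python version:

```python
# Illustrative constructor arguments collected from the commits above; names
# (flash_attn, rpc_servers, type_k/type_v, kv_overrides) are assumptions.
GGML_TYPE_Q8_0 = 8  # ggml quantization enum value, assumed for illustration

llama_kwargs = dict(
    model_path="model.gguf",                    # placeholder path
    flash_attn=True,                            # flash attention toggle (Apr 30)
    rpc_servers="host1:50052,host2:50052",      # offload over RPC (#1477)
    type_k=GGML_TYPE_Q8_0,                      # quantize KV-cache keys (#1307)
    type_v=GGML_TYPE_Q8_0,                      # quantize KV-cache values (#1307)
    kv_overrides={"some.key": "a string"},      # str values now supported (Apr 28)
)
# llama_cpp.Llama(**llama_kwargs)  # would load the model with these options
```

Quantizing the KV cache trades a small amount of accuracy for a roughly halved cache footprint at Q8_0, which matters most at long context lengths.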
March 31, 2024
feat: Add logprobs support to chat completions (#1311)
windspirit95 committed
March 14, 2024
fix: set default pooling type to unspecified
Andrei Betlen committed
fix: Set default pooling_type to mean, check for null pointer.
Andrei Betlen committed
March 9, 2024
feat: Switch embed to llama_get_embeddings_seq (#1263)
Douglas Hanley committed
March 6, 2024
Update llama.cpp
Andrei Betlen committed
March 1, 2024
docs: Add information re: auto chat formats. Closes #1236
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
February 28, 2024
fix: eos/bos_token set correctly for Jinja2ChatFormatter and automatic chat formatter (#1230)
Sigbjørn Skjæret committed
February 25, 2024
feat: Update llama.cpp
Andrei Betlen committed
February 23, 2024
fix: LlamaHFTokenizer now receives pre_tokens
Andrei Betlen committed
fix: module 'llama_cpp.llama_cpp' has no attribute 'c_uint8'
Andrei Betlen committed
February 22, 2024
fix: Update from_pretrained defaults to match hf_hub_download
Andrei Betlen committed
February 21, 2024
feat: Pull models directly from huggingface (#1206)
Andrei committed
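The `from_pretrained` commit above (#1206) lets a model be pulled straight from the Hugging Face Hub, with its defaults later aligned to `hf_hub_download` by the February 22 fix. A hypothetical usage sketch; the repo id and filename are placeholders, not values from the commit log, and the call is wrapped in a function so nothing is downloaded when the snippet is defined:

```python
# Hypothetical use of Llama.from_pretrained (#1206): downloads a GGUF file
# from the Hugging Face Hub and loads it. Repo id and filename are placeholders.
def load_from_hub():
    from llama_cpp import Llama  # requires llama-cpp-python to be installed
    return Llama.from_pretrained(
        repo_id="someuser/some-model-GGUF",  # placeholder repo id
        filename="*Q4_K_M.gguf",             # glob matched against repo files
    )
```

The glob-style `filename` is convenient because GGUF repos typically publish one file per quantization level.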
February 17, 2024
fix: self.numa missing
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
February 15, 2024
fix: create_embedding broken response for input type str
Andrei Betlen committed
fix: Incorporate embedding pooling layer fixes (#1194)
Douglas Hanley committed
February 14, 2024
feat: Support batch embeddings (#1186)
Douglas Hanley committed
February 13, 2024
fix: sample idx off-by-one error for logit_processors (#1179)
Andrew Lapp committed