COMMITS: llama_cpp/llama.py (as of April 22, 2024)

April 20, 2024
- feat: Use new llama_token_is_eog in create_completions (Andrei Betlen)

April 17, 2024
- feat: Make saved state more compact on-disk (#1296) (tc-wolf)
- feat: Use all available CPUs for batch processing (#1345) (ddh0)

April 10, 2024
- fix: pass correct type to chat handlers for chat completion logprobs (Andrei Betlen)

April 3, 2024
- fix: segfault when logits_all=False. Closes #1319 (Andrei Betlen)

April 1, 2024
- feat: add support for KV cache quantization options (#1307) (Limour)

March 31, 2024
- feat: Add logprobs support to chat completions (#1311) (windspirit95)

March 14, 2024
- fix: set default pooling type to unspecified (Andrei Betlen)
- fix: Set default pooling_type to mean, check for null pointer. (Andrei Betlen)

March 9, 2024
- feat: Switch embed to llama_get_embeddings_seq (#1263) (Douglas Hanley)

March 6, 2024
- Update llama.cpp (Andrei Betlen)

March 1, 2024
- docs: Add information re: auto chat formats. Closes #1236 (Andrei Betlen)
- feat: Update llama.cpp (Andrei Betlen)

February 28, 2024
- fix: eos/bos_token set correctly for Jinja2ChatFormatter and automatic chat formatter (#1230) (Sigbjørn Skjæret)

February 25, 2024
- feat: Update llama.cpp (Andrei Betlen)

February 23, 2024
- fix: LlamaHFTokenizer now receives pre_tokens (Andrei Betlen)
- fix: module 'llama_cpp.llama_cpp' has no attribute 'c_uint8' (Andrei Betlen)

February 22, 2024
- fix: Update from_pretrained defaults to match hf_hub_download (Andrei Betlen)

February 21, 2024
- feat: Pull models directly from huggingface (#1206) (Andrei)
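The #1206 feature lets a model be fetched by repo id plus a filename glob rather than a local path. The file-selection step can be sketched library-free: given a repo's file listing, match the requested pattern against it. This is a minimal sketch of the idea only; `pick_gguf_file` is a hypothetical helper, not the library's code, and the real `Llama.from_pretrained` also handles downloading and caching via huggingface_hub.

```python
from fnmatch import fnmatch

def pick_gguf_file(repo_files, filename_pattern):
    """Return the repo files whose names match a glob pattern
    (e.g. '*Q4_K_M.gguf'), the way a specific quantization can be
    chosen out of a multi-file model repo."""
    return [f for f in repo_files if fnmatch(f, filename_pattern)]

files = [
    "model-Q2_K.gguf",
    "model-Q4_K_M.gguf",
    "README.md",
]
print(pick_gguf_file(files, "*Q4_K_M.gguf"))  # ['model-Q4_K_M.gguf']
```

If the pattern matches zero or several files, a real implementation would raise an error rather than guess, which is one reason the follow-up commit above aligns the defaults with `hf_hub_download`.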
February 17, 2024
- fix: self.numa missing (Andrei Betlen)
- feat: Update llama.cpp (Andrei Betlen)

February 15, 2024
- fix: create_embedding broken response for input type str (Andrei Betlen)
- fix: Incorporate embedding pooling layer fixes (#1194) (Douglas Hanley)

February 14, 2024
- feat: Support batch embeddings (#1186) (Douglas Hanley)

February 13, 2024
- fix: sample idx off-by-one error for logit_processors (#1179) (Andrew Lapp)

February 12, 2024
- fix: Always set logits_all = True when using speculative decoding (Andrei Betlen)
- feat: Generic chatml Function Calling (#957) (Andrei)

February 9, 2024
- Merge branch 'main' of github.com:abetlen/llama_cpp_python into main (Andrei Betlen)
- fix: revert _create_completions. (Andrei Betlen)

February 8, 2024
- feat: Move tokenizer to own module (Andrei Betlen)

February 6, 2024
- fix: Use llama_log_callback to avoid suppress_stdout_stderr (Andrei Betlen)

January 31, 2024
- Add speculative decoding (#1120) (Andrei)
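The speculative decoding added in #1120 works through a pluggable draft model (assumption: exposed as a `draft_model=` argument on `Llama`, with a bundled prompt-lookup decoder in `llama_cpp.llama_speculative`). The core prompt-lookup idea, proposing draft tokens by finding where the trailing n-gram last occurred earlier in the sequence, can be sketched without the library; `prompt_lookup_draft` below is a hypothetical illustration, not the library's implementation.

```python
def prompt_lookup_draft(tokens, ngram_size=2, num_pred=4):
    """Propose up to num_pred draft tokens by matching the trailing
    ngram_size tokens against an earlier occurrence in the sequence
    and copying what followed it."""
    if len(tokens) < ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Scan right-to-left, excluding the trailing n-gram itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            draft = tokens[start + ngram_size:start + ngram_size + num_pred]
            if draft:
                return draft
    return []

print(prompt_lookup_draft([1, 2, 3, 4, 2, 3], ngram_size=2, num_pred=2))  # [4, 2]
```

The drafted tokens are then verified in a single batched forward pass by the full model, which is why the February 12 fix above forces `logits_all = True` when a draft model is active.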
January 29, 2024
- Automatically set chat format from gguf (#1110) (Andrei)

January 24, 2024
- fix: Check order (Andrei Betlen)
- fix: format (Andrei Betlen)
- fix: GGUF metadata KV overrides, re #1011 (#1116) (Phil H)

January 19, 2024
- feat: Expose gguf model metadata in metadata property (Andrei Betlen)
- Fix mirostat sampling (Andrei Betlen)

January 18, 2024
- Offload KQV by default (Andrei Betlen)

January 17, 2024
- Re-order classes in llama.py (Andrei Betlen)
- Move helper classes to _internals submodule (Andrei Betlen)
- Move cache classes to llama_cache submodule. (Andrei Betlen)

January 15, 2024
- Add split_mode option. Closes #1085 (Andrei Betlen)
- Implement GGUF metadata KV overrides (#1011) (Phil H)
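KV overrides (#1011, fixed up in #1116 above) let a caller replace individual GGUF metadata values at load time. Upstream llama.cpp encodes such overrides as `KEY=TYPE:VALUE` strings on the command line; the parsing step can be sketched as below. This is a hedged illustration of that string format only; `parse_kv_override` is a hypothetical helper, and the Python binding's actual `kv_overrides` parameter may accept already-typed values instead.

```python
def parse_kv_override(spec):
    """Parse a llama.cpp-style 'KEY=TYPE:VALUE' override string into a
    (key, typed_value) pair. Supported type tags here: int, float,
    bool, str."""
    key, _, typed = spec.partition("=")
    type_name, _, raw = typed.partition(":")
    casts = {
        "int": int,
        "float": float,
        "bool": lambda v: v.lower() == "true",
        "str": str,
    }
    return key, casts[type_name](raw)

print(parse_kv_override("tokenizer.ggml.add_bos_token=bool:false"))
```

Typed parsing matters because the override must land in the matching GGUF value slot (integer, float, boolean, or string) rather than always being stored as text.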
January 10, 2024
- Add ability to pass in penalize_nl param (#1068) (Stephen Hankinson)

December 22, 2023