COMMITS: llama_cpp/server/settings.py

September 20, 2024
  feat: Add option to configure n_ubatch (Andrei Betlen)

July 9, 2024
  fix(misc): Format (Andrei Betlen)

July 2, 2024

June 4, 2024
  feat: adding `rpc_servers` parameter to `Llama` class (#1477) (nullname)

May 5, 2024
  feat(server): Add support for setting root_path. Closes #1420 (Andrei Betlen)

April 30, 2024
  feat: Add option to enable `flash_attn` to Llama params and ModelSettings (Andrei Betlen)
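Settings such as `flash_attn` and `n_ubatch` above are fields of the server's ModelSettings, which pydantic-settings exposes as command-line flags. As a hedged sketch (the model path is a placeholder, and the flag spellings assume a one-to-one mapping from field names), enabling flash attention when launching the server might look like:

```
# Hypothetical invocation; --flash_attn and --n_ubatch are assumed to
# mirror the ModelSettings field names added by the commits above.
python3 -m llama_cpp.server \
  --model ./models/example.gguf \
  --flash_attn True \
  --n_ubatch 512
```

The same fields can typically also be set via environment variables, since ModelSettings derives from a pydantic settings class.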

April 26, 2024
  fix: pydantic deprecation warning (Andrei Betlen)

April 23, 2024

April 17, 2024
  feat: add `disable_ping_events` flag (#1257) (khimaros)
  feat: Use all available CPUs for batch processing (#1345) (ddh0)

April 1, 2024
  feat: add support for KV cache quantization options (#1307) (Limour)
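The KV cache quantization options from #1307 take ggml type ids as integers (in llama.cpp's type enum, 1 is f16 and 8 is q8_0). A hedged sketch, assuming the flags follow the `type_k`/`type_v` setting names:

```
# Assumed flags; the value 8 corresponds to GGML_TYPE_Q8_0 in llama.cpp.
python3 -m llama_cpp.server \
  --model ./models/example.gguf \
  --type_k 8 \
  --type_v 8
```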

February 28, 2024
  misc: Format (Andrei Betlen)
  feat: Update llama.cpp (Andrei Betlen)

February 26, 2024
  fix: remove prematurely committed change (Andrei Betlen)

February 25, 2024
  feat: Update llama.cpp (Andrei Betlen)

February 17, 2024
  feat: Update llama.cpp (Andrei Betlen)

January 31, 2024
  Add speculative decoding (#1120) (Andrei)

January 29, 2024
  Automatically set chat format from gguf (#1110) (Andrei)

January 19, 2024

January 18, 2024
  Offload KQV by default (Andrei Betlen)

January 16, 2024

January 15, 2024
  Add split_mode option. Closes #1085 (Andrei Betlen)
  Implement GGUF metadata KV overrides (#1011) (Phil H)

December 22, 2023
  docs: add server config docs (Andrei Betlen)
  [Feat] Multi model support (#931) (Dave)
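PR #931 lets one server process host several models, selected per request, via a JSON config file with a `models` list of per-model settings. A minimal sketch of such a config (the file paths, aliases, and port are placeholders):

```
{
  "host": "0.0.0.0",
  "port": 8000,
  "models": [
    {
      "model": "models/llama-2-7b.Q4_K_M.gguf",
      "model_alias": "llama-2"
    },
    {
      "model": "models/mistral-7b.Q4_K_M.gguf",
      "model_alias": "mistral"
    }
  ]
}
```

The server would then be launched with something like `python3 -m llama_cpp.server --config_file config.json`, and a client picks a model by passing its alias in the request's `model` field.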