COMMITS: llama_cpp/server/settings.py

September 20, 2024
  feat: Add option to configure n_ubatch (Andrei Betlen)

July 9, 2024
  fix(misc): Format (Andrei Betlen)

July 2, 2024

June 4, 2024
  feat: adding `rpc_servers` parameter to `Llama` class (#1477) (nullname)

May 5, 2024
  feat(server): Add support for setting root_path. Closes #1420 (Andrei Betlen)

April 30, 2024
  feat: Add option to enable `flash_attn` to Llama params and ModelSettings (Andrei Betlen)
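Settings such as `flash_attn` and `n_ubatch` above are fields of the server's ModelSettings, which pydantic-settings exposes as command-line flags. As a hedged sketch (the model path is a placeholder, and the flag spellings assume a one-to-one mapping from field names), enabling flash attention when launching the server might look like:

```
# Hypothetical invocation; --flash_attn and --n_ubatch are assumed to
# mirror the ModelSettings field names added by the commits above.
python3 -m llama_cpp.server \
  --model ./models/example.gguf \
  --flash_attn True \
  --n_ubatch 512
```

The same fields can typically also be set via environment variables, since ModelSettings derives from a pydantic settings class.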

April 26, 2024
  fix: pydantic deprecation warning (Andrei Betlen)

April 23, 2024

April 17, 2024
  feat: add `disable_ping_events` flag (#1257) (khimaros)
  feat: Use all available CPUs for batch processing (#1345) (ddh0)

April 1, 2024
  feat: add support for KV cache quantization options (#1307) (Limour)
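The KV cache quantization options from #1307 take ggml type ids as integers (in llama.cpp's type enum, 1 is f16 and 8 is q8_0). A hedged sketch, assuming the flags follow the `type_k`/`type_v` setting names:

```
# Assumed flags; the value 8 corresponds to GGML_TYPE_Q8_0 in llama.cpp.
python3 -m llama_cpp.server \
  --model ./models/example.gguf \
  --type_k 8 \
  --type_v 8
```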

February 28, 2024
  misc: Format (Andrei Betlen)
  feat: Update llama.cpp (Andrei Betlen)

February 26, 2024
  fix: remove prematurely committed change (Andrei Betlen)

February 25, 2024
  feat: Update llama.cpp (Andrei Betlen)

February 17, 2024
  feat: Update llama.cpp (Andrei Betlen)

January 31, 2024
  Add speculative decoding (#1120) (Andrei)

January 29, 2024
  Automatically set chat format from gguf (#1110) (Andrei)

January 19, 2024

January 18, 2024
  Offload KQV by default (Andrei Betlen)

January 16, 2024

January 15, 2024
  Add split_mode option. Closes #1085 (Andrei Betlen)
  Implement GGUF metadata KV overrides (#1011) (Phil H)

December 22, 2023
  docs: add server config docs (Andrei Betlen)
  [Feat] Multi model support (#931) (Dave)
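PR #931 lets one server process host several models, selected per request, via a JSON config file with a `models` list of per-model settings. A minimal sketch of such a config (the file paths, aliases, and port are placeholders):

```
{
  "host": "0.0.0.0",
  "port": 8000,
  "models": [
    {
      "model": "models/llama-2-7b.Q4_K_M.gguf",
      "model_alias": "llama-2"
    },
    {
      "model": "models/mistral-7b.Q4_K_M.gguf",
      "model_alias": "mistral"
    }
  ]
}
```

The server would then be launched with something like `python3 -m llama_cpp.server --config_file config.json`, and a client picks a model by passing its alias in the request's `model` field.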