Commit history: llama_cpp/server/app.py (March 22, 2026)
misc: Add Ruff formatting (#2148)
Andrei committed
January 8, 2025
fix: streaming resource lock (#1879)
Graeme Power committed
December 9, 2024
fix: add missing await statements for async exit_stack handling (#1858)
Graeme Power committed
December 6, 2024
fix: added missing exit_stack.close() to /v1/chat/completions (#1796)
Ignaz "Ian" Kraft committed
July 9, 2024
fix(misc): Format
Andrei Betlen committed
July 2, 2024
fix(misc): Fix type errors
Andrei Betlen committed
May 14, 2024
feat(server): Add support for setting root_path. Closes #1420
Andrei Betlen committed
April 17, 2024
feat: add `disable_ping_events` flag (#1257)
khimaros committed
April 10, 2024
feat: Add support for yaml based configs
Andrei Betlen committed
March 31, 2024
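A hypothetical sketch of what such a yaml config might look like; the field names below are assumptions for illustration, not a documented schema:

```yaml
# Hypothetical server config sketch; host, port, models, model,
# model_alias, and n_gpu_layers are assumed field names.
host: 0.0.0.0
port: 8000
models:
  - model: ./models/llama-2-7b.Q4_K_M.gguf
    model_alias: llama-2
    n_gpu_layers: -1
```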
feat: Add logprobs support to chat completions (#1311)
windspirit95 committed
March 23, 2024
fix(server): minor type fixes
Andrei Betlen committed
March 19, 2024
docs: Add chat examples to openapi ui
Andrei Betlen committed
March 9, 2024
feat: Add endpoints for tokenize, detokenize and count tokens (#1136)
Felipe Lorenz committed
February 28, 2024
misc: Format
Andrei Betlen committed
February 15, 2024
fix: Use '\n' seperator for EventSourceResponse (#1188)
khimaros committed
January 25, 2024
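For context on the separator fix above: a Server-Sent Events message is one or more field lines, each terminated by a separator, with an extra separator ending the event. A minimal framing sketch (hypothetical helper, not the server's code):

```python
import json

def sse_event(payload, sep="\n"):
    """Frame one Server-Sent Events data message.

    The field line ends with `sep`, and a second `sep` (blank line)
    terminates the event. Plain "\n" is the separator the fix above
    switched to, instead of "\r\n".
    """
    return f"data: {json.dumps(payload)}{sep}{sep}"

event = sse_event({"choices": [{"text": "hi"}]})
# event is: 'data: {"choices": [{"text": "hi"}]}\n\n'
```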
feat(server): include llama-cpp-python version in openapi spec
Andrei Betlen committed
January 16, 2024
[Feat] Multi model support (#931)
Dave committed
December 21, 2023
Implement openai api compatible authentication (#1010)
docmeth02 committed
December 18, 2023
Add offload_kqv option to llama and server
Andrei Betlen committed
Bugfix: Remove f16_kv, add offload_kqv field (#1019)
Brandon Roberts committed
December 12, 2023
Add support for running the server with SSL (#994)
Radoslav Gerganov committed
November 24, 2023
docs: Update openapi endpoint names
Andrei Betlen committed
November 21, 2023
Add support for logit_bias outside of server api. Closes #827
Andrei Betlen committed
Added support for min_p (#921)
TK-Master committed
November 10, 2023
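The min_p parameter added above restricts sampling to tokens whose probability is at least min_p times that of the most likely token. A minimal standalone sketch of that filter (an illustration of the technique, not the server's actual implementation):

```python
import math

def min_p_filter(logits, min_p=0.05):
    """Keep tokens whose probability is at least min_p times the
    probability of the most likely token (the min-p sampling rule)."""
    # Numerically stable softmax over the logits.
    m = max(logits.values())
    exps = {tok: math.exp(x - m) for tok, x in logits.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    threshold = min_p * max(probs.values())
    return {tok: p for tok, p in probs.items() if p >= threshold}

# With min_p=0.5, only tokens at least half as likely as the top token
# survive: here "c" is far less likely than "a" and is dropped.
filtered = min_p_filter({"a": 3.0, "b": 2.9, "c": 0.0}, min_p=0.5)
```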
Fix: default max_tokens matches openai api (16 for completion, max length for chat completion)
Andrei Betlen committed
November 8, 2023
Fix destructor NoneType is not callable error
Andrei Betlen committed
Add JSON mode support. Closes #881
Andrei Betlen committed
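JSON mode here follows the OpenAI-compatible request shape, where the client opts in via response_format. A sketch of such a request body (the model alias is a placeholder, and only the payload is built; no request is sent):

```python
# OpenAI-style chat completion payload enabling JSON mode.
payload = {
    "model": "llama-2",  # placeholder model alias
    "messages": [
        {"role": "user", "content": "List three colors as a JSON object."}
    ],
    # response_format follows the OpenAI convention the server mimics.
    "response_format": {"type": "json_object"},
}
```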
Add seed parameter support for completion and chat_completion requests. Closes #884
Andrei Betlen committed
Multimodal Support (Llava 1.5) (#821)
Damian Stewart committed
November 3, 2023
Update llama.cpp
Andrei Betlen committed
Add functionary support (#784)
Andrei committed
November 2, 2023
Update llama.cpp
Andrei Betlen committed
fix: tokenization of special characters: (#850)
Antoine Lizee committed
November 1, 2023
Iterate over tokens that should be biased rather than the entire vocabulary. (#851)
David Ponce committed
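The change above applies logit_bias by iterating over the (small) set of biased token ids rather than scanning the entire vocabulary, so cost scales with the number of biases. A standalone sketch of the idea (hypothetical helper, not the repository's code):

```python
def apply_logit_bias(logits, logit_bias):
    """Add per-token biases to a list of logits in place.

    Iterating over the bias map, not the whole vocabulary, keeps the
    work proportional to the number of biased tokens.
    """
    for token_id, bias in logit_bias.items():
        if 0 <= token_id < len(logits):
            logits[token_id] += bias
    return logits

# Suppress token 1 and boost token 3 in a toy 4-token vocabulary.
scores = apply_logit_bias([0.0, 1.0, 2.0, 3.0], {1: -100.0, 3: 5.0})
```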
Pass-Through grammar parameter in web server. (#855) Closes #778
Daniel Thuerck committed
October 18, 2023
update value check for n_gpu_layers field (#826)
Xiaoyu Kevin Hu committed
October 10, 2023
Print traceback on server error
Andrei Betlen committed
September 30, 2023
Log server exceptions to stdout
Andrei Betlen committed
September 29, 2023
Update server params
Andrei Betlen committed
September 25, 2023
Adds openai-processing-ms response header (#748)
Viacheslav/Slava Tradunsky committed
September 14, 2023
Update app.py (#705)
earonesty committed