Commit history: llama_cpp/server/app.py (March 22, 2026)
misc: Add Ruff formatting (#2148)
Andrei committed
January 8, 2025
fix: streaming resource lock (#1879)
Graeme Power committed
December 9, 2024
fix: add missing await statements for async exit_stack handling (#1858)
Graeme Power committed
December 6, 2024
fix: added missing exit_stack.close() to /v1/chat/completions (#1796)
Ignaz "Ian" Kraft committed
July 9, 2024
fix(misc): Format
Andrei Betlen committed
July 2, 2024
fix(misc): Fix type errors
Andrei Betlen committed
May 14, 2024
feat(server): Add support for setting root_path. Closes #1420
Andrei Betlen committed
April 17, 2024
feat: add `disable_ping_events` flag (#1257)
khimaros committed
April 10, 2024
feat: Add support for yaml based configs
Andrei Betlen committed
March 31, 2024
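A hypothetical sketch of what such a yaml config might look like; the field names below are assumptions for illustration, not a documented schema:

```yaml
# Hypothetical server config sketch; host, port, models, model,
# model_alias, and n_gpu_layers are assumed field names.
host: 0.0.0.0
port: 8000
models:
  - model: ./models/llama-2-7b.Q4_K_M.gguf
    model_alias: llama-2
    n_gpu_layers: -1
```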
feat: Add logprobs support to chat completions (#1311)
windspirit95 committed
March 23, 2024
fix(server): minor type fixes
Andrei Betlen committed
March 19, 2024
docs: Add chat examples to openapi ui
Andrei Betlen committed
March 9, 2024
feat: Add endpoints for tokenize, detokenize and count tokens (#1136)
Felipe Lorenz committed
February 28, 2024
misc: Format
Andrei Betlen committed
February 15, 2024
fix: Use '\n' seperator for EventSourceResponse (#1188)
khimaros committed
January 25, 2024
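For context on the separator fix above: a Server-Sent Events message is one or more field lines, each terminated by a separator, with an extra separator ending the event. A minimal framing sketch (hypothetical helper, not the server's code):

```python
import json

def sse_event(payload, sep="\n"):
    """Frame one Server-Sent Events data message.

    The field line ends with `sep`, and a second `sep` (blank line)
    terminates the event. Plain "\n" is the separator the fix above
    switched to, instead of "\r\n".
    """
    return f"data: {json.dumps(payload)}{sep}{sep}"

event = sse_event({"choices": [{"text": "hi"}]})
# event is: 'data: {"choices": [{"text": "hi"}]}\n\n'
```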
feat(server): include llama-cpp-python version in openapi spec
Andrei Betlen committed
January 16, 2024
[Feat] Multi model support (#931)
Dave committed
December 21, 2023
Implement openai api compatible authentication (#1010)
docmeth02 committed
December 18, 2023
Add offload_kqv option to llama and server
Andrei Betlen committed
Bugfix: Remove f16_kv, add offload_kqv field (#1019)
Brandon Roberts committed
December 12, 2023
Add support for running the server with SSL (#994)
Radoslav Gerganov committed
November 24, 2023
docs: Update openapi endpoint names
Andrei Betlen committed
November 21, 2023
Add support for logit_bias outside of server api. Closes #827
Andrei Betlen committed
Added support for min_p (#921)
TK-Master committed
November 10, 2023
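The min_p parameter added above restricts sampling to tokens whose probability is at least min_p times that of the most likely token. A minimal standalone sketch of that filter (an illustration of the technique, not the server's actual implementation):

```python
import math

def min_p_filter(logits, min_p=0.05):
    """Keep tokens whose probability is at least min_p times the
    probability of the most likely token (the min-p sampling rule)."""
    # Numerically stable softmax over the logits.
    m = max(logits.values())
    exps = {tok: math.exp(x - m) for tok, x in logits.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    threshold = min_p * max(probs.values())
    return {tok: p for tok, p in probs.items() if p >= threshold}

# With min_p=0.5, only tokens at least half as likely as the top token
# survive: here "c" is far less likely than "a" and is dropped.
filtered = min_p_filter({"a": 3.0, "b": 2.9, "c": 0.0}, min_p=0.5)
```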
Fix: default max_tokens matches openai api (16 for completion, max length for chat completion)
Andrei Betlen committed
November 8, 2023
Fix destructor NoneType is not callable error
Andrei Betlen committed
Add JSON mode support. Closes #881
Andrei Betlen committed
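JSON mode here follows the OpenAI-compatible request shape, where the client opts in via response_format. A sketch of such a request body (the model alias is a placeholder, and only the payload is built; no request is sent):

```python
# OpenAI-style chat completion payload enabling JSON mode.
payload = {
    "model": "llama-2",  # placeholder model alias
    "messages": [
        {"role": "user", "content": "List three colors as a JSON object."}
    ],
    # response_format follows the OpenAI convention the server mimics.
    "response_format": {"type": "json_object"},
}
```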
Add seed parameter support for completion and chat_completion requests. Closes #884
Andrei Betlen committed
Multimodal Support (Llava 1.5) (#821)
Damian Stewart committed
November 3, 2023
Update llama.cpp
Andrei Betlen committed
Add functionary support (#784)
Andrei committed
November 2, 2023
Update llama.cpp
Andrei Betlen committed
fix: tokenization of special characters: (#850)
Antoine Lizee committed
November 1, 2023
Iterate over tokens that should be biased rather than the entire vocabulary. (#851)
David Ponce committed
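The change above applies logit_bias by iterating over the (small) set of biased token ids rather than scanning the entire vocabulary, so cost scales with the number of biases. A standalone sketch of the idea (hypothetical helper, not the repository's code):

```python
def apply_logit_bias(logits, logit_bias):
    """Add per-token biases to a list of logits in place.

    Iterating over the bias map, not the whole vocabulary, keeps the
    work proportional to the number of biased tokens.
    """
    for token_id, bias in logit_bias.items():
        if 0 <= token_id < len(logits):
            logits[token_id] += bias
    return logits

# Suppress token 1 and boost token 3 in a toy 4-token vocabulary.
scores = apply_logit_bias([0.0, 1.0, 2.0, 3.0], {1: -100.0, 3: 5.0})
```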
Pass-Through grammar parameter in web server. (#855) Closes #778
Daniel Thuerck committed
October 18, 2023
update value check for n_gpu_layers field (#826)
Xiaoyu Kevin Hu committed
October 10, 2023
Print traceback on server error
Andrei Betlen committed
September 30, 2023
Log server exceptions to stdout
Andrei Betlen committed
September 29, 2023
Update server params
Andrei Betlen committed
September 25, 2023
Adds openai-processing-ms response header (#748)
Viacheslav/Slava Tradunsky committed
September 14, 2023
Update app.py (#705)
earonesty committed