Large Language Model Text Generation Inference
COMMITS
docs/openapi.json
September 16, 2025
Patch version 3.3.6 (#3329)
Alvaro Moran committed
September 2, 2025
chore: prepare version 3.3.5 (#3314)
Alvaro Moran committed
June 19, 2025
Neuron backend fix and patch version 3.3.4 (#3273)
David Corvoysier committed
June 18, 2025
chore: prepare release 3.3.3 (#3269)
David Corvoysier committed
May 30, 2025
Prepare for 3.3.2 (#3249)
Daniël de Kok committed
May 22, 2025
Prepare for 3.3.1 (#3238)
Daniël de Kok committed
May 9, 2025
Prepare for 3.3.0 (#3220)
Daniël de Kok committed
May 1, 2025
Pr 2982 ci branch (#3046)
drbh committed
April 8, 2025
3.2.3 (#3151)
Nicolas Patry committed
April 6, 2025
Preparing for release. (#3147)
Nicolas Patry committed
March 12, 2025
Preparing relase 3.2.0 (#3100)
Nicolas Patry committed
Fix tool call3 (#3086)
Nicolas Patry committed
March 5, 2025
Making `tool_calls` a vector. (#3075)
Nicolas Patry committed
March 4, 2025
Preparing for release. (#3060)
Nicolas Patry committed
February 21, 2025
Improve tool call message processing (#3036)
drbh committed
January 31, 2025
Prepare for release 3.1.0 (#2972)
Nicolas Patry committed
December 16, 2024
Fixing CI. (#2846)
Nicolas Patry committed
December 9, 2024
Prep new version (#2810)
Nicolas Patry committed
December 6, 2024
feat: auto max_new_tokens (#2803)
OlivierDehaene committed
November 22, 2024
chore: prepare 2.4.1 release (#2773)
OlivierDehaene committed
November 21, 2024
Remove guideline from API (#2762)
Lucain committed
November 19, 2024
November 15, 2024
November 4, 2024
fix: add chat_tokenize endpoint to api docs (#2710)
drbh committed
October 25, 2024
chore: prepare 2.4.0 release (#2695)
OlivierDehaene committed
October 23, 2024
feat: allow any supported payload on /invocations (#2683)
OlivierDehaene committed
October 15, 2024
Fixing linters. (#2650)
Nicolas Patry committed
October 14, 2024
Small fixes for supported models (#2471)
Omar Sanseviero committed
October 8, 2024
CI (2599): Update ToolType input schema (#2601)
drbh committed
October 3, 2024
New release 2.3.1 (#2604)
Nicolas Patry committed
September 20, 2024
Preparing for release. (#2540)
Nicolas Patry committed
September 19, 2024
Stream options. (#2533)
Nicolas Patry committed
August 29, 2024
feat: add /v1/models endpoint (#2433)
drbh committed
August 27, 2024
Pr 2451 ci branch (#2454)
drbh committed
August 16, 2024
FIxing the CI.
Nicolas Patry committed
Improve the Consuming TGI + Streaming docs. (#2412)
Vaibhav Srivastav committed
August 12, 2024
August 9, 2024
feat: add guideline to chat request and template (#2391)
drbh committed
Using an enum for flash backens (paged/flashdecoding/flashinfer) (#2385)
Nicolas Patry committed
August 8, 2024
Update Quantization docs and minor doc fix. (#2368)
Vaibhav Srivastav committed
July 31, 2024
Rebase TRT-llm (#2331)
Nicolas Patry committed
July 23, 2024
Preparing for release. (#2285)
Nicolas Patry committed
July 19, 2024
fix: adjust default tool choice (#2244)
drbh committed
July 9, 2024
Updating the self check (#2209)
Nicolas Patry committed
Adding sanity check to openapi docs.
Nicolas Patry committed
July 5, 2024
Refactor dead code - Removing all `flash_xxx.py` files. (#2166)
Nicolas Patry committed
July 3, 2024
Fixing missing `object` field for regular completions. (#2175)
Nicolas Patry committed
Revert "Fixing missing `object` field for regular completions."
Nicolas Patry committed
Fixing missing `object` field for regular completions.
Nicolas Patry committed
feat: improve update_docs for openapi schema (#2169)
drbh committed