COMMITS
March 25, 2026
Bump version to 0.3.19 (#2162)
Andrei committed
fix: handle embedding models without KV memory (#2160)
Andrei committed
March 24, 2026
chore: bump version (#2157)
Andrei committed
feat: expose attention_type parameter in Llama.__init__ (#2143)
Victor Biederbeck committed
fix(ci): docker build workflow (#2156)
Andrei committed
fix(ci): cuda wheel workflow (#2155)
Andrei committed
fix(ci): release wheel workflow (#2154)
Andrei committed
March 23, 2026
chore: Bump version (#2153)
Andrei committed
fix: Qwen 3.5 support (#2152)
Andrei committed
ci: add riscv64 wheel builds to release workflow (#2139)
Bruno Verachten committed
March 22, 2026
misc: Add Ruff formatting (#2148)
Andrei committed
fix(ci): Rename `huggingface-cli` to `hf` (#2149)
Andrei committed
August 15, 2025
chore: Bump version
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
August 7, 2025
chore: Bump version
Andrei Betlen committed
fix: rename op_offloat to op_offload in llama.py (#2046)
sergey21000 committed
feat: Add gpt-oss chat format support through strftime_now in chat format by @iamlemec
Andrei Betlen committed
misc: Add Python 3.13 classifier tag
Andrei Betlen committed
misc: Update pypi downloads badge
Andrei Betlen committed
feat: Update llama.cpp
Andrei Betlen committed
July 18, 2025
chore: Bump version
Andrei Betlen committed
July 16, 2025
feat: Update llama.cpp
Andrei Betlen committed
July 15, 2025
chore: Bump version
Andrei Betlen committed
fix: Better chat format for Qwen2.5-VL (#2040)
Alcoft committed
feat: Update llama.cpp
Andrei Betlen committed
July 6, 2025
fix(ci): Fix macos cpu builds
Andrei Betlen committed