60 Commits

Author SHA1 Message Date
Andrei
11e7a55af7 fix: Qwen 3.5 support (#2152)
* fix: handle Qwen 3.5 hybrid prefix reuse

* test: fix Qwen runtime unit mocks

* test: drop Qwen runtime unit tests

* docs: credit Qwen fix contributors in changelog

* docs/tests: update default Qwen model to 3.5 0.8B

* test: rebaseline Qwen 3.5 outputs

* test: stabilize low-level Qwen sampling check

* test: tighten Qwen 3.5 completion prompts
2026-03-22 22:33:31 -07:00
Andrei
9f661ff2cf fix(ci): Fix macos tests, support both Intel and Apple Silicon testing (#2150)
* fix(ci): use supported macos runner label

* fix(ci): add apple silicon macos test coverage

* fix(ci): run standard macos tests on apple silicon

* fix(ci): simplify apple silicon macos install

* fix(ci): disable ggml native on apple silicon runner

* docs: update changelog for macos ci runner fix
2026-03-22 16:10:47 -07:00
Andrei
ca3b00a204 fix(ci): Rename huggingface-cli to hf (#2149)
* Fix model download in test workflow

* Use hf CLI in test workflow

* Use hf CLI name in CI and docs

* Reference PR in changelog
2026-03-22 15:20:48 -07:00
Andrei Betlen
d8cc231fd0 fix(ci): Use default architecture chosen by action 2024-12-06 04:06:07 -05:00
Andrei Betlen
b34f20046b fix(ci): Use python3 2024-12-06 03:59:51 -05:00
Andrei Betlen
1cd3f2cc6a fix(ci): gg 2024-12-06 03:52:54 -05:00
Andrei Betlen
df05096fc5 fix(ci): Install with regular pip 2024-12-06 03:42:33 -05:00
Andrei Betlen
a412ba5539 fix(ci): Update config 2024-12-06 03:38:12 -05:00
Andrei Betlen
9a09fc7848 fix(ci): Debug print python system architecture 2024-12-06 03:19:36 -05:00
Andrei Betlen
f11a781d86 fix(ci): Use macos-13 runner 2024-12-06 03:04:48 -05:00
Andrei Betlen
8988aaf7f6 fix(ci): Use macos-14 runner 2024-12-06 03:02:04 -05:00
Andrei Betlen
72ed7b8855 fix(ci): Explicitly test on arm64 macos runner 2024-12-06 02:55:57 -05:00
Andrei Betlen
9d06e36c00 fix(ci): Explicitly install arm64 python version 2024-12-06 02:49:20 -05:00
dependabot[bot]
1324c0c50c chore(deps): bump actions/cache from 3 to 4 (#1751)
Bumps [actions/cache](https://github.com/actions/cache) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrei <abetlen@gmail.com>
2024-09-20 18:04:59 -04:00
Andrei
f8fcb3ea34 feat: Update sampling API for llama.cpp (#1742)
* Initial sampling API update

* Fix logger

* Update tests

* Update

* Remove seed

* Add sampling chain

* Remove unused test

* Use Qwen2 0.5B for ci tests

* Fix typo

* Fix typo

* Update cache version

* Use real model for tests

* Add huggingface-hub as a test dependency

* Remove RUST_LOG=trace

* Add actual logit processor test
2024-09-18 20:00:19 -04:00
Olivier DEBAUCHE
e529940f45 feat(ci): Speed up CI workflows using uv, add support for CUDA 12.5 wheels
* Update build-wheels-cuda.yaml

* Update build-wheels-cuda.yaml

* revert

* Bump python from 3.8 to 3.9

* Remove python 3.8

* Remove Python 3.7 and 3.8 deprecated

* Bump python from 3.8 to 3.9

* Add python 3.9

* Add python 3.9, remove macos-11 deprecated, add macos-14

* Bump python 3.8 to 3.9

* Add python 3.13

* Add python 3.13

* python 3.13 remove

* remove python 3.13

* remove python 3.8

* Bump macos-13 to macos-14

* Update build-wheels-metal.yaml

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update generate-index-from-release.yaml

Add avx, avx2 and avx512

* Update test.yaml

* Update test-pypi.yaml

* Update publish.yaml

* Update publish-to-test.yaml

* Update build-wheels-cuda.yaml

Cuda with AVX2 by default

* Update build-wheels-cuda.yaml

* remove DEPRECATED 32 bits

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

Upgrade matrix os to latest version

* Update build-wheels-metal.yaml

* Update build-wheels-cuda.yaml

* Update test.yaml

* Update test-pypi.yaml

* Update test.yaml

Add cache: 'pip'

* Update publish-to-test.yaml

* Update build-wheels-metal.yaml

Add cache: 'pip'

* Update build-wheels-cuda.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

remove x86_64

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* revert

* Remove cpu variants

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update publish-to-test.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update publish.yaml

* Update test-pypi.yaml

* Update test.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update publish.yaml

* Update test-pypi.yaml

* Update publish-to-test.yaml

* Update test.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update publish-to-test.yaml

* Update publish.yaml

* Update test-pypi.yaml

* Update test.yaml

* Update test.yaml

* Update build-and-release.yaml

* Update publish-to-test.yaml

* Update build-wheels-metal.yaml

* Update test-pypi.yaml

* Update test.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update build-wheels-metal.yaml

* Update publish.yaml

* Update publish-to-test.yaml

* Update test-pypi.yaml

* Update test.yaml

* Update build-wheels-cuda.yaml

* Update generate-index-from-release.yaml

* Update README.md

* Update README.md

* Update test.yaml

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-09-18 19:22:05 -04:00
Andrei Betlen
09a4f78e05 fix(ci): Update LLAMA_ flags to GGML_ 2024-07-09 00:37:03 -04:00
Andrei Betlen
dc20e8c342 fix: Copy dependencies for windows
fix: :(

fix

fix

fix: Copy runtime dlls on windows

fix: Add explicit copy command for windows dlls

fix

fix

fix: >:(

fix

fix

fix

fix

Update path on windows

check dll dependencies

fix: Update PATH on win32

ci: Update test.yaml
2024-07-02 01:39:03 -04:00
Olivier DEBAUCHE
9e396b3ebd feat: Update workflows and pre-built wheels (#1416)
* Update build-wheels-cuda.yaml

* Update build-wheels-cuda.yaml

* revert

* Bump python from 3.8 to 3.9

* Remove python 3.8

* Remove Python 3.7 and 3.8 deprecated

* Bump python from 3.8 to 3.9

* Add python 3.9

* Add python 3.9, remove macos-11 deprecated, add macos-14

* Bump python 3.8 to 3.9

* Add python 3.13

* Add python 3.13

* python 3.13 remove

* remove python 3.13

* remove python 3.8

* Bump macos-13 to macos-14

* Update build-wheels-metal.yaml

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update generate-index-from-release.yaml

Add avx, avx2 and avx512

* Update test.yaml

* Update test-pypi.yaml

* Update publish.yaml

* Update publish-to-test.yaml

* Update build-wheels-cuda.yaml

Cuda with AVX2 by default

* Update build-wheels-cuda.yaml

* remove DEPRECATED 32 bits

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

Upgrade matrix os to latest version

* Update build-wheels-metal.yaml

* Update build-wheels-cuda.yaml

* Update test.yaml

* Update test-pypi.yaml

* Update test.yaml

Add cache: 'pip'

* Update publish-to-test.yaml

* Update build-wheels-metal.yaml

Add cache: 'pip'

* Update build-wheels-cuda.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

remove x86_64

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-and-release.yaml

* Update build-wheels-metal.yaml

* revert

* Remove cpu variants

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-06-13 10:19:57 -04:00
Olivier DEBAUCHE
03c654a3d9 ci(fix): Workflow actions updates and fix arm64 wheels not included in release (#1392)
* Update test.yaml

Bump actions/checkout@v3 to v4
Bump actions/setup-python@v4 to v5

* Update test-pypi.yaml

Bump actions/setup-python@v4 to v5

* Update build-and-release.yaml

Bump softprops/action-gh-release@v1 to v2
Bump actions/checkout@v3 to v4
Bump actions/setup-python@v3 to v5

* Update publish.yaml

Bump actions/checkout@v3 to v4
Bump actions/setup-python@v4 to v5

* Update publish-to-test.yaml

Bump actions/checkout@v3 to v4
Bump actions/setup-python@v4 to v5

* Update test-pypi.yaml

Add Python 3.12

* Update build-and-release.yaml

* Update build-docker.yaml

Bump docker/setup-qemu-action@v2 to v3
Bump docker/setup-buildx-action@v2 to v3

* Update build-and-release.yaml

* Update build-and-release.yaml
2024-04-29 22:52:23 -04:00
Andrei Betlen
266abfc1a3 fix(ci): Fix metal tests as well 2024-04-25 03:09:46 -04:00
Andrei Betlen
de37420fcf fix(ci): Fix python macos test runners issue 2024-04-25 03:08:32 -04:00
Andrei Betlen
7dbbfdecad fix: submodule kompute is not included in sdist. Closes #1165 2024-02-13 23:53:56 -05:00
Andrei Betlen
cc0fe43849 Disable opencl test 2023-11-14 14:59:08 -05:00
Andrei Betlen
4286830f16 Add python3.12 tests 2023-11-06 09:32:20 -05:00
Andrei Betlen
fa83cc5f9c Update llama.cpp
Fix build examples

Exclude examples directory

Revert cmake changes

Try actions/checkout@v4

Try to update submodules

Revert
2023-11-02 14:28:15 -04:00
Andrei Betlen
ddbd10c442 Fix clblast test 2023-11-02 14:28:15 -04:00
Andrei Betlen
735522272b Fix runner label 2023-11-02 14:28:15 -04:00
Andrei Betlen
952e4cc3ce Fix: use linux image for opencl test 2023-11-01 21:31:02 -04:00
Andrei Betlen
8bf7fa6e5f Add opencl test 2023-11-01 21:18:36 -04:00
Andrei Betlen
446d5f5649 Add metal ci test 2023-11-01 21:15:01 -04:00
Andrei Betlen
fe743b4945 Revert python 3.12 tests 2023-09-12 18:43:43 -04:00
Andrei Betlen
6bddf620e1 Add python 3.12 to tests 2023-09-12 18:41:29 -04:00
Andrei Betlen
dadfd96745 Use compiler to determine best optimizations for platform 2023-09-12 18:21:49 -04:00
Andrei Betlen
4c0787b408 Disable acceleration in macos tests only 2023-09-12 18:05:44 -04:00
Andrei Betlen
04a6bbe30e Revert test changes 2023-09-12 17:46:58 -04:00
Andrei Betlen
9547a351ee Try arm64 python 2023-09-12 17:35:07 -04:00
Andrei Betlen
fa2f1fdf60 Enable accelerations and set python architecture 2023-09-12 17:28:36 -04:00
Andrei Betlen
685a929c6a typo 2023-09-12 17:03:19 -04:00
Andrei Betlen
082c2a23bd disable all acceleration on macos ci builds 2023-09-12 16:59:47 -04:00
Andrei Betlen
b053cf7b50 Fix typo 2023-09-12 16:55:52 -04:00
Andrei Betlen
5458427e4c Disable metal for ci test builds 2023-09-12 16:50:14 -04:00
Andrei Betlen
d2c5afe5a3 Remove prerelease python version 2023-07-18 19:38:51 -04:00
Andrei Betlen
7ce6cdf45b Update supported python versions. 2023-07-18 19:37:52 -04:00
Andrei Betlen
6cb77a20c6 Migrate to scikit-build-core. Closes #489 2023-07-18 18:52:29 -04:00
Andrei Betlen
52753b77f5 Upgrade fastapi to 0.100.0 and pydantic v2 2023-07-07 21:38:46 -04:00
Andrei Betlen
e3542b6627 Revert "Merge pull request #350 from abetlen/migrate-to-scikit-build-core"
This reverts commit fb2c5f7fd9, reversing
changes made to 202ed4464b.
2023-06-09 23:23:16 -04:00
Andrei Betlen
7345456779 Migrate to scikit-build-core 2023-06-08 21:49:42 -04:00
Andrei Betlen
bf3d0dcb2c Fix tests 2023-05-01 15:28:46 -04:00
Andrei Betlen
dbe0ad86c8 Update test dependencies 2023-05-01 14:50:01 -04:00