* workflows/CI: make rustc targets more explicit
For Linux builds, distinguish between 'target' and 'arch', since the two
are not always the same (e.g. the target for ppc64le is actually
powerpc64le-unknown-linux-gnu). This allows more explicit support for
other platforms when needed.
Signed-off-by: Trevor Gamblin <tgamblin@baylibre.com>
* workflows/CI: add riscv64 build
Note that the 'target' and 'arch' values here are different - arch is
riscv64, but the actual rustc target is riscv64gc-unknown-linux-gnu,
hence the previous change.
Signed-off-by: Trevor Gamblin <tgamblin@baylibre.com>
---------
Signed-off-by: Trevor Gamblin <tgamblin@baylibre.com>
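To illustrate the 'target' vs 'arch' split described in the two commits above, here is a minimal sketch, not the actual workflow code. The table name `ARCH_TO_TARGET` and the helper `rustc_target` are hypothetical; the ppc64le and riscv64 entries come from the commit messages, while the x86_64 and aarch64 entries are added here as standard rustc triples for comparison.

```python
# Hypothetical lookup table: CI "arch" label -> rustc target triple.
# Only the ppc64le and riscv64 pairs are taken from the commit messages.
ARCH_TO_TARGET = {
    "x86_64": "x86_64-unknown-linux-gnu",
    "aarch64": "aarch64-unknown-linux-gnu",
    "ppc64le": "powerpc64le-unknown-linux-gnu",  # arch and target prefix differ
    "riscv64": "riscv64gc-unknown-linux-gnu",    # note the extra "gc" in the target
}


def rustc_target(arch: str) -> str:
    """Return the rustc target triple for a Linux arch label (illustrative only)."""
    try:
        return ARCH_TO_TARGET[arch]
    except KeyError:
        raise ValueError(f"no rustc target known for arch {arch!r}") from None
```

Deriving the target as `f"{arch}-unknown-linux-gnu"` would be wrong for ppc64le and riscv64, which is why the workflow keeps the two values separate.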
The documentation source links were pointing to `src/tokenizers/...` which
doesn't exist. The Python source files are located at
`bindings/python/py_src/tokenizers/...`.
Add `version_tag_suffix` parameter to documentation build workflows to
generate correct GitHub source links.
Fixes #1910
* fix ci
* fix stubs
* nit
* exclude
* full fix
* update
* up
* revert
* workflow up
* this?
* up
* add logs; I suspect it's just maturin missing
* maturin not installed but not needed
* update
* check style after running tests since I mess up the .pyi
* nit?
* something that is supposed to work but my env does not allow it, seems to be uv related
* ?
* up
* nits
* let's try
* part of the update for pyo3 0.27
* more pyo3 fixes
* update
* does this help?
* help
* finally
* update stub accordingly
* export more of the submodules
* moooore
* add individual .pyi
* cleanup
* update pyo3 signatures and fix warning
* style
* update
* more updates
* style
* clippy happy
* does this help?
* fix
* fix
* ?
* what?
* add darwin case co
* up?
* update
* clippy and fmt
* this time it works
* remove offending one
* update
* remove shit
* remove more shit that was unwanted
* ?
* simplify a bit
* more verbose?
* more simplification
* fmt
* fix some of the typing in Rust directly to please ty (but also just fix some typing.Any)
* fix script running
* fix, ignore and exclude
* style
* update
* fmt + add it to style?
* cleanup
* Simplify stub.py docstring injection
- Replace complex modifications dict with simple insertions list
- Remove nested process_function_or_method function
- Use bottom-to-top line replacement for cleaner logic
- Remove unused importlib import
* isolate stub generation into separate tools/stub-gen crate
- Move stub_generation.rs to tools/stub-gen/ as standalone crate
- Remove stub-gen feature and pyo3-introspection from main crate
- Auto-detect PYTHONHOME for uv/venv environments
- Update Makefile and README with new instructions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
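The "bottom-to-top line replacement" mentioned in the stub.py simplification above can be sketched as follows. This is a minimal sketch of the general technique, not the code in stub.py. The function name `apply_replacements` and the `(index, new_lines)` tuple layout are assumptions made for illustration.

```python
def apply_replacements(lines, replacements):
    """Replace single lines with blocks of lines, working bottom to top.

    `replacements` is a list of (index, new_lines) pairs (hypothetical layout).
    Applying the highest index first means that growing or shrinking the list
    never shifts the position of a line that still has to be replaced.
    """
    result = list(lines)  # work on a copy, leave the input untouched
    for index, new_lines in sorted(replacements, key=lambda r: r[0], reverse=True):
        result[index:index + 1] = new_lines
    return result


stub = ["def a(): ...", "def b(): ..."]
patched = apply_replacements(
    stub,
    [
        (0, ["def a():", '    """Doc a."""', "    ..."]),
        (1, ["def b():", '    """Doc b."""', "    ..."]),
    ],
)
```

Going top to bottom instead would require adjusting every later index by the number of lines each earlier replacement added, which is the bookkeeping this ordering avoids.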
* Add windows arm64 support to python release workflow
* Run on fork
Updated workflow to include 'arm64-runner' branch and commented out conditions.
* fix typo
* add arm64 python install for all versions
* use python-install option
* clean up fork changes
* Update .github/workflows/python-release.yml
* revert 3.14 addition
Waiting to add in a different PR that adds all 3.14 builds at the same time
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add interpreter install, enable workflow run in fork
* add additional python versions
* Refactor python version setup for x86 windows
* try splitting interpreter into an array
* revert to hard coded list
* try using extra argument
* Fix quotes
* Clean up python install
* revert workflow conditions
* update stub for typing
* up
* add ty type checker
* update stub
* up
* some update
* add owner to stub?
* update
* no print
* uptime funk
* mm
* wtf
* fix
* fix more
* some fixes are manual but come on
* up
* # type: ignore[import]
* reduce the scope of ty for less changes
* ups
* up?
* Add benchmark for deserializing large added vocab
* revert dumb stuff, isolate changes
* try to only normalize once
* small improvement?
* some updates
* nit
* fmt
* normalized string are a fucking waste of time when you just want to add tokens to the vocab man....
* more attempts
* works
* let's fucking go, parity
* update
* hahahhahaha
* revert changes that are not actually even needed
* add a python test!
* use normalizer before come on
* nit
* update to a more concrete use case
* fix build
* style
* reduce sample size
* --allow unmaintained
* clippy happy
* up
* up
* derive impl
* revert unrelated
* fmt
* ignore
* remove stupid file
This adds semver validation to catch breaking changes before release.
The check runs on Ubuntu during CI and compares against the published crate on crates.io.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* Testing ABI3 wheels to reduce number of wheels
* No need for py-clone anymore.
* Upgrade python versions.
* Remove those flakes.
* Promoting new CI + Fixing secret.
* Update docs-check.yml
Bump actions/setup-python to v5
Bump python-version to 3.12 (default on ubuntu-latest)
Switch actions-rs/toolchain to dtolnay/rust-toolchain as the former one is no longer maintained
* Update node-release.yml
Bump actions/setup-python to v5
Switch actions-rs/toolchain to dtolnay/rust-toolchain as the former one is no longer maintained
Bump actions/cache to v4
Bump actions/setup-node to v4
Bump actions/upload-artifact to v4
Bump actions/download-artifact to v4
* Update node.yml
Switch actions-rs/toolchain to dtolnay/rust-toolchain as the former one is no longer maintained
Bump actions/cache to v4
Bump actions/setup-node to v4
* Update python-release-conda.yml
Switch actions-rs/toolchain to dtolnay/rust-toolchain as the former one is no longer maintained
Bump conda-incubator/setup-miniconda to v3
* Update python-release.yml
Bump actions/setup-python to v5
Bump actions/download-artifact to v4
* Update rust-release.yml
Switch actions-rs/toolchain to dtolnay/rust-toolchain as the former one is no longer maintained
Bump actions/cache to v4
* Update stale.yml
Bump actions/stale to v9
* Update python.yml
Bump actions/setup-python to v5
* [BREAKING CHANGE] Ignore added_tokens (both special and not) in the
decoder
Causes issues with `ByteLevel` messing up some `AddedTokens` with some
utf-8 range used in the bytelevel mapping.
This commit tests the extent of the damage of ignoring the decoder for
those tokens.
* Format.
* Installing cargo audit.
* Minor fix.
* Fixing "bug" in node/python.
* Autoformat.
* Clippy.
* Only prefix space when there's no decoder.
* remove enforcement of non special when adding tokens
* mut no longer needed
* add a small test
* nit
* style
* audit
* ignore cargo audit's own vulnerability
* update
* revert
* remove CVE
* Fixing the progressbar.
* Upgrade deps.
* Update cargo audit
* Ssh this action.
* Fixing esaxx by using slower rust version.
* Trying the new esaxx version.
* Publish.
* Get cache again.
* Move to maturin, mimicking the move for `safetensors`.
* Tmp.
* Fix sdist.
* Wat?
* Clippy 1.72
* Remove if.
* Conda sed.
* Fix doc check workflow.
* Moving to maturin AND removing http + openssl mess (smoothing transition
moving to `huggingface_hub`)
* Fix dep
* Black.
* New node bindings.
* Fix docs + node cache ?
* Yarn.
* Working dir.
* Extension module.
* Put back interpreter.
* Remove cache.
* New attempt
* Multi python.
* Remove FromPretrained.
* Remove traces of `fromPretrained`.
* Drop 3.12 for windows?
* Typo.
* Put back the default feature for ignoring links during simple test.
* Fix ?
* x86_64 -> x64.
* Remove warning for windows bindings.
* Exclude aarch.
* Include/exclude.
* Put back workflows in correct states.
* CD backports
follow huggingface/safetensors#317
* fix node bindings?
`cargo check` doesn't work on my local configuration from `tokenizers/bindings/node/native`.
I don't think it will be a problem, but I have difficulty telling.
* backport #315
* safetensors#317 backports