32 Commits

Author SHA1 Message Date
Arthur
d6a4acc0d2 Update serialization (#1891)
* Add benchmark for deserializing large added vocab

* revert dumb stuff, isolate changes

* try to only normalize once

* small improvement?

* some updates

* nit

* fmt

* normalized string are a fucking waste of time when you just want to add tokens to the vocab man....

* more attempts

* works

* let's fucking go, parity

* update

* hahahhahaha

* revert changes that are not actually even needed

* add a python test!

* use normalizer before come on

* nit

* update to a more concrete usecase

* fix build

* style

* reduce sample size

* --allow unmaintained

* clippy happy

* up

* up

* derive impl

* revert unrelated

* fmt

* ignore

* remove stupid file
2025-11-27 23:07:18 +01:00
Haixuan Xavier Tao
007fc767ac Add cargo-semver-checks to Rust CI workflow (#1875)
This adds semver validation to catch breaking changes before release.
The check runs on Ubuntu during CI and compares against the published crate on crates.io.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-16 11:22:48 +02:00
Arthur
01f8bc834c clippy (#1781)
* clippy

* fmtr

* rutc?

* fix onig issue

* up

* decode stream default

* jump a release for cargo audit ...

* more clippy stuff

* clippy?

* proper style

* fmt
2025-05-27 11:30:32 +02:00
Nicolas Patry
4383a25787 Update the release builds following 0.21.1. (#1746)
* Update the release builds following 0.21.1.

* Clippy fix.
2025-03-13 13:01:41 +01:00
tinyboxvk
41e0eaa561 Bump actions/checkout to v4 (#1667)
Signed-off-by: tinyboxvk <tinyboxvk@users.noreply.github.com>
2024-10-29 14:32:07 +01:00
Nicolas Patry
25aee8b88c [BREAKING CHANGE] Ignore added_tokens (both special and not) in the decoder (#1513)
* [BREAKING CHANGE] Ignore added_tokens (both special and not) in the
decoder

Causes issues with `ByteLevel` messing up some `AddedTokens` with some
utf-8 range used in the bytelevel mapping.

This commit tests the extent of the damage of ignoring the decoder for
those tokens.

* Format.

* Installing cargo audit.

* Minor fix.

* Fixing "bug" in node/python.

* Autoformat.

* Clippy.

* Only prefix space when there's no decoder.
2024-05-06 11:49:38 +02:00
Arthur
f2ec3b239b remove enforcement of non special when adding tokens (#1521)
* remove enforcement of non special when adding tokens

* mut no longer needed

* add a small test

* nit

* style

* audit

* ignore cargo audit's own vulnerability

* update

* revert

* remove CVE
2024-04-30 15:53:47 +02:00
Nicolas Patry
aed491df8c Fixing the progressbar. (#1353)
* Fixing the progressbar.

* Upgrade deps.

* Update cargo audit

* Ssh this action.

* Fixing esaxx by using slower rust version.

* Trying the new esaxx version.

* Publish.

* Get cache again.
2023-10-05 15:33:58 +02:00
Funtowicz Morgan
a03330607b Update all GH Actions with dependency on actions/checkout from v[1,2] to v3 to notably improve performance (retrieve only the commit being checked-out) (#1256) 2023-05-22 14:50:00 +02:00
Andrew Kane
67080e163a Include license file in Rust crate (#1115)
* Include license file in Rust crate

* Ignore security warning.

* Also for python.

* Upgrading ubuntu version.

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-11-30 23:17:56 +01:00
Nicolas Patry
bbae829a72 Adding rust audit. (#1099)
* Adding rust audit.

* Update clap version + derive_builder (they clashed).

* Ignoring specific CVE which can be ignored

https://github.com/Azure/iot-identity-service/issues/481

* Updating python lock.

* Revert `derive-builder` update.

* Adding back help msg.
2022-11-09 12:59:36 +01:00
Nicolas Patry
4ef0afbeb6 Update old gh actions, remove deprecated doc building. (#1069) 2022-10-05 17:59:46 +02:00
ropottnik
50ac90d338 testable example docs for training-serialization (#373)
* testable usage docs for training and serialization and reference in README.md

* Generate Readme from testable examples + template

* add up-to-date check for Readme with generated one

* try make pipeline fail by adding something to the lib.rs readme

* remove difference from lib.rs again to make pipeline pass

* fix black version

Co-authored-by: Simon Ertl <simon@Simons-MacBook-Pro.local>
2020-08-31 13:59:34 -04:00
Sebastian Pütz
1f64761480 Cache based on Cargo.lock. 2020-08-03 10:51:57 -04:00
Anthony MOI
d3e1c55fb7 CI - Only support macOS 10.11+ 2020-07-17 12:11:41 -04:00
Anthony MOI
212747f7fd CI - Set MACOSX_DEPLOYMENT_TARGET=10.10 2020-07-17 12:11:41 -04:00
Pierric Cistac
d3fb1d12f4 Try avoid duplicated github actions in PRs 2020-04-01 16:39:51 -04:00
Pierric Cistac
d90593a5e8 Run github actions on pull requests
Try to fix actions not running for pull requests opened by external contributors cc @n1t0
2020-04-01 14:04:14 -04:00
Anthony MOI
025e74c8c3 Merge pull request #197 from huggingface/remove-normalized
Remove NormalizedString from Encoding
2020-03-18 16:52:21 -04:00
Pierric Cistac
0c572097a2 fix cargo cache in ci
see https://github.com/actions/cache/issues/133#issuecomment-599102035
2020-03-18 15:35:35 -04:00
Anthony MOI
002af4bccb Update Github actions for integration tests 2020-03-16 11:38:00 -04:00
Evan Pete Walsh
d1ae0bd576 Cache cargo registry and build target directory in CI (#78)
* try caching cargo registry and build

* trigger build

* try fixing path

* force build

* try different cache key

* try rebuild
2020-01-17 13:30:08 -08:00
Evan Pete Walsh
e3cf6a7b00 refactor benchmarks (#25)
* refactor benchmarks

* fix

* fix CI
2020-01-01 17:07:36 -08:00
Evan Pete Walsh
ebf22198f3 Add benchmark framework and benches for BPE (GPT2) (#4)
* add benchmarks

* fix bench

* refactor BPE benchmarks

* fix

* remove un-needed gitignore

* update Cargo.lock

* fix

* small fix

* improve benchmarks

* move setup to Makefile

* benchmark BPE encode batch

* refactor batch benchmark
2020-01-01 07:35:57 -08:00
epwalsh
4914e6285e add path to manifest 2019-12-13 17:53:32 -05:00
epwalsh
7f42417482 fix yaml 2019-12-13 17:53:32 -05:00
epwalsh
03406d0b54 add rustfmt and clippy to CI pipeline 2019-12-13 17:53:32 -05:00
Morgan Funtowicz
1a52cda912 Fix yaml indent 2019-11-30 13:06:32 -05:00
Morgan Funtowicz
f9ccf62301 Try updating to official rust Github Action to avoid missing rust components. 2019-11-30 13:06:32 -05:00
Morgan Funtowicz
78e7591780 Fix Cargo.toml not found in Rust workflow 2019-11-30 13:06:32 -05:00
Anthony MOI
d1b6b14bd7 Attempt fix workflows 2019-11-29 19:28:49 -05:00
Funtowicz Morgan
5c6834f363 Added GitHub Action workflow for Rust
This allows for automated build & test of the library.
2019-11-26 09:47:48 +00:00