* Add benchmark for deserializing large added vocab
* revert dumb stuff, isolate changes
* try to only normalize once
* small improvement?
* some updates
* nit
* fmt
* normalized strings are a fucking waste of time when you just want to add tokens to the vocab, man...
* more attempts
* works
* let's fucking go, parity
* update
* hahahhahaha
* revert changes that are not actually even needed
* add a python test!
* use normalizer before come on
* nit
* update to a more concrete usecase
* fix build
* style
* reduce sample size
* --allow unmaintained
* clippy happy
* up
* up
* derive impl
* revert unrelated
* fmt
* ignore
* remove stupid file
This adds semver validation to catch breaking changes before release.
The check runs on Ubuntu during CI and compares against the published crate on crates.io.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* [BREAKING CHANGE] Ignore added_tokens (both special and not) in the
decoder
The `ByteLevel` decoder messes up some `AddedTokens` whose characters
fall in the utf-8 range used by the byte-level mapping.
This commit tests the extent of the damage of ignoring the decoder for
those tokens.
* Format.
* Installing cargo audit.
* Minor fix.
* Fixing "bug" in node/python.
* Autoformat.
* Clippy.
* Only prefix space when there's no decoder.
* remove enforcement of non special when adding tokens
* mut no longer needed
* add a small test
* nit
* style
* audit
* ignore cargo audit's own vulnerability
* update
* revert
* remove CVE
* Fixing the progressbar.
* Upgrade deps.
* Update cargo audit
* Ssh this action.
* Fixing esaxx by using slower rust version.
* Trying the new esaxx version.
* Publish.
* Get cache again.
* Include license file in Rust crate
* Ignore security warning.
* Also for python.
* Upgrading ubuntu version.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
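The `ByteLevel` problem described in the breaking-change commit above can be illustrated with a minimal Python sketch. It assumes a GPT-2 style byte-to-character table and a simplified decoder; the helper names (`bytes_to_unicode`, `byte_level_decode`, `decode`) and the `(text, is_added)` piece format are hypothetical and are not the library's API. An added token is stored verbatim, so running it through the byte-level decoder reinterprets its characters as raw bytes.

```python
def bytes_to_unicode():
    """GPT-2 style table mapping each byte (0-255) to a printable character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_to_char = {b: chr(b) for b in printable}
    next_code = 256
    for b in range(256):
        if b not in byte_to_char:
            # Non-printable bytes are shifted to code points from 256 upward.
            byte_to_char[b] = chr(next_code)
            next_code += 1
    return byte_to_char


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def byte_level_encode(text):
    """Map the UTF-8 bytes of `text` to their table characters."""
    return "".join(BYTE_TO_CHAR[b] for b in text.encode("utf-8"))


def byte_level_decode(token_text):
    """Map table characters back to bytes, then decode the bytes as UTF-8."""
    raw = bytearray()
    for ch in token_text:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Assumption: characters outside the table pass through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


def decode(pieces, skip_decoder_for_added):
    """Decode (text, is_added) pieces, optionally bypassing added tokens."""
    out, pending = [], []
    for text, is_added in pieces:
        if is_added and skip_decoder_for_added:
            # Flush regular tokens through the decoder, keep the added token as-is.
            out.append(byte_level_decode("".join(pending)))
            pending = []
            out.append(text)
        else:
            pending.append(text)
    out.append(byte_level_decode("".join(pending)))
    return "".join(out)


# "é" is an added token stored verbatim, not in byte-level form.
pieces = [
    (byte_level_encode("caf"), False),
    ("é", True),
    (byte_level_encode(" au lait"), False),
]
broken = decode(pieces, skip_decoder_for_added=False)
fixed = decode(pieces, skip_decoder_for_added=True)
```

In this sketch, `broken` contains a replacement character where the added token should be, because `é` is read as the lone byte `0xE9`, which is not valid UTF-8 on its own. `fixed` keeps the added token intact by skipping the decoder for it.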
* Adding rust audit.
* Update clap version + derive_builder (they clashed).
* Ignoring specific CVE which can be ignored
https://github.com/Azure/iot-identity-service/issues/481
* Updating python lock.
* Revert `derive-builder` update.
* Adding back help msg.
* add testable usage docs for training and serialization and reference them in README.md
* Generate Readme from testable examples + template
* add up-to-date check for Readme with generated one
* try to make the pipeline fail by adding something to the lib.rs readme
* remove difference from lib.rs again to make pipeline pass
* fix black version
Co-authored-by: Simon Ertl <simon@Simons-MacBook-Pro.local>