Mirror of https://github.com/huggingface/tokenizers.git (synced 2026-03-27 06:01:18 +00:00)
* Add a benchmark for deserializing a large added vocab (see the sketch after this list)
* Revert unrelated changes; isolate the relevant ones
* Try to normalize only once
* Small improvement
* Some updates
* Nit
* fmt
* Avoid normalized-string overhead when all we want is to add tokens to the vocab
* More attempts
* Works
* Reach parity
* Update
* Revert changes that are not actually needed
* Add a Python test
* Apply the normalizer beforehand
* Nit
* Update to a more concrete use case
* Fix build
* Style
* Reduce sample size
* `--allow unmaintained`
* Make clippy happy
* Up
* Up
* Derive impl
* Revert unrelated changes
* fmt
* Ignore
* Remove a stray file
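The benchmark in the first item can be approximated from Python through the `tokenizers` bindings. Below is a minimal sketch, not the PR's actual benchmark: the empty `BPE` model, the 100,000-token count, and the `tok_{i}` naming are arbitrary choices for illustration. It serializes a tokenizer with a large added vocabulary and times how long deserialization takes, which is the path this PR speeds up.

```python
import time

from tokenizers import Tokenizer
from tokenizers.models import BPE

# Build a tokenizer with an empty model and a large added vocabulary.
tokenizer = Tokenizer(BPE())
tokenizer.add_tokens([f"tok_{i}" for i in range(100_000)])

# Serialize to JSON, then time deserialization of the added vocab.
serialized = tokenizer.to_str()

start = time.perf_counter()
Tokenizer.from_str(serialized)
print(f"deserialized in {time.perf_counter() - start:.3f}s")
```

Running this before and after the change is a quick way to check the "parity" claim in the commit list: deserialization time should drop once added tokens are no longer re-normalized one by one.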