Blame: tests/test_processing_common.py - huggingface/transformers




Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`# Copyright 2024 The HuggingFace Inc. team. All rights reserved.`
			`#`
			`# Licensed under the Apache License, Version 2.0 (the "License");`
			`# you may not use this file except in compliance with the License.`
			`# You may obtain a copy of the License at`
			`#`
			`# http://www.apache.org/licenses/LICENSE-2.0`
			`#`
			`# Unless required by applicable law or agreed to in writing, software`
			`# distributed under the License is distributed on an "AS IS" BASIS,`
			`# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.`
			`# See the License for the specific language governing permissions and`
			`# limitations under the License.`


add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`import inspect`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`import json`
[v5] Delete legacy chat template saving (#41648) * delete lagcy chat template saving * fix tests * fix qwen audio 2025-10-22 11:40:55 +02:00			`import os`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`import random`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`import shutil`
processor tests - use dummy videos (#40537) * use dummy videos * failing on main, new model merged had conflicts 2025-09-01 11:04:47 +02:00			`import sys`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`import tempfile`
Separate chat templates into a single file (#33957) * Initial draft * Add .jinja file loading for processors * Add processor saving of naked chat template files * make fixup * Add save-load test for tokenizers * Add save-load test for tokenizers * stash commit * Try popping the file * make fixup * Pop the arg correctly * Pop the arg correctly * Add processor test * Fix processor code * stash commit * Processor clobbers child tokenizer's chat template * Processor clobbers child tokenizer's chat template * make fixup * Split processor/tokenizer files to avoid interactions * fix test * Expand processor tests * Rename arg to "save_raw_chat_template" across all classes * Update processor warning * Move templates to single file * Move templates to single file * Improve testing for processor/tokenizer clashes * Improve testing for processor/tokenizer clashes * Extend saving test * Test file priority correctly * make fixup * Don't pop the chat template file before the slow tokenizer gets a look * Remove breakpoint * make fixup * Fix error 2024-11-26 14:18:04 +00:00			`from pathlib import Path`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
			`import numpy as np`
Prepare processors for VideoLLMs (#36149) * allow processor to preprocess conversation + video metadata * allow callable * add test * fix test * nit: fix * add metadata frames_indices * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * port updates from Orr and add one more test * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * typo * as dataclass * style * docstring + maek sure tests green --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> 2025-02-14 11:34:08 +01:00			`from huggingface_hub import hf_hub_download`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`from parameterized import parameterized`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`from transformers.processing_utils import (`
			`MODALITY_TO_AUTOPROCESSOR_MAPPING,`
			`Unpack,`
			`)`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`from transformers.testing_utils import (`
			`check_json_file_has_correct_format,`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`require_av,`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`require_librosa,`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`require_torch,`
			`require_vision,`
			`)`
processor tests - use dummy videos (#40537) * use dummy videos * failing on main, new model merged had conflicts 2025-09-01 11:04:47 +02:00			`from transformers.utils import is_torch_available, is_vision_available`


fix some ut failures on XPU w/ torch 2.9 (#41941) * fix some ut failures on XPU w/ torch 2.9 Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * fix style Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> 2025-10-30 03:23:57 -07:00			`parent_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))`
			`sys.path.append(os.path.join(parent_dir, "utils"))`
			`from fetch_hub_objects_for_ci import url_to_local_path # noqa: E402`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00

Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`global_rng = random.Random()`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`if is_vision_available():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`    from PIL import Image`

Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`if is_torch_available():`
			`    import torch`

[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`MODALITY_INPUT_DATA = {`
			`"images": [`
Change multimodal data links to HF hub (#40309) change multimodal data links to HF hub 2025-08-22 11:50:04 +02:00			`"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png",`
			`"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png",`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`],`
			`"videos": [`
Change multimodal data links to HF hub (#40309) change multimodal data links to HF hub 2025-08-22 11:50:04 +02:00			`"https://huggingface.co/datasets/raushan-testing-hf/videos-test/resolve/main/Big_Buck_Bunny_720_10s_10MB.mp4",`
[video processors] decode only sampled videos -> less RAM and faster processing (#39600) * draft update two models for now * batch update all VLMs first * update some more image processors * update * fix a few tests * just make CI green for now * fix copies * update once more * update * unskip the test * fix these two * fix torchcodec audio loading * maybe * yay, i fixed torchcodec installation and now can actually test it * fix copies deepseek * make sure the metadata is returrned when users request it * add docs * update * fixup * Update src/transformers/audio_utils.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/glm4v/video_processing_glm4v.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update * what if we set some metadata attr to `None` * fix CI * fix one test * fix 4 channel test * fix glm timestemps * rebase gone wrong * raise warning once * fixup * typo * fix copies * ifx smolvlm test * this is why torch's official benchmark was faster, set threads to `0` * Apply style fixes --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 2025-08-26 11:38:02 +02:00			`"https://huggingface.co/datasets/raushan-testing-hf/videos-test/resolve/main/sample_demo_1.mp4",`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`],`
			`"audio": [`
Change multimodal data links to HF hub (#40309) change multimodal data links to HF hub 2025-08-22 11:50:04 +02:00			`"https://huggingface.co/datasets/raushan-testing-hf/audio-test/resolve/main/glass-breaking-151256.mp3",`
			`"https://huggingface.co/datasets/raushan-testing-hf/audio-test/resolve/main/f2641_0_throatclearing.wav",`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`],`
			`}`

processor tests - use dummy videos (#40537) * use dummy videos * failing on main, new model merged had conflicts 2025-09-01 11:04:47 +02:00			`for modality, urls in MODALITY_INPUT_DATA.items():`
			`    MODALITY_INPUT_DATA[modality] = [url_to_local_path(url) for url in urls]`
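
The loop above replaces each modality's URL list in place while iterating over the same dictionary. A minimal sketch of that pattern follows. `local_path_or_url`, its `cache_dir` parameter, and the `example.com` URLs are hypothetical stand-ins chosen for illustration; they are not the behaviour of `url_to_local_path`, which lives in `utils/fetch_hub_objects_for_ci` and is not shown in this file.

```python
import os
import tempfile


def local_path_or_url(url, cache_dir):
    """Hypothetical stand-in: return the cached copy of `url` if one exists, else the URL itself."""
    candidate = os.path.join(cache_dir, url.split("/")[-1])
    return candidate if os.path.exists(candidate) else url


# Placeholder URLs; only the first one has a cached copy on disk.
cache_dir = tempfile.mkdtemp()
open(os.path.join(cache_dir, "cached.png"), "wb").close()

data = {
    "images": ["https://example.com/cached.png"],
    "audio": ["https://example.com/missing.wav"],
}

# Reassigning values for existing keys while iterating over items() is safe,
# because the set of keys (and hence the dict's size) never changes.
for modality, urls in data.items():
    data[modality] = [local_path_or_url(url, cache_dir) for url in urls]
```

Only values are reassigned here; adding or removing keys inside such a loop would raise a `RuntimeError`.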
Add support for including in-memory videos (not just files/urls) in apply_chat_template (#39494) * added code for handling video object ,as dictionary of frames and metadata, in chat template * added new test where videos are passed as objects (dict of frames, metadata) in the chat template * modified hardcoded video_len check that does not match with increased number of tests cases. * Modify hardcoded video_len check that fails with increased number of tests * update documentation of multi-modal chat templating with extra information about including video object in chat template. * add array handling in load_video() * temporary test video inlcuded * skip testing smolvlm with videos that are list of frames * update documentation & make fixup * Address review comments 2025-08-04 02:49:42 -07:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00
Remove repeated prepare_images in processor tests (#33163) * Remove repeated prepare_images * Address comments - update docstring; explanatory comment 2024-09-09 13:20:27 +01:00			`def prepare_image_inputs():`
			`    """This function prepares a list of PIL images"""`
			`    image_inputs = [np.random.randint(255, size=(3, 30, 400), dtype=np.uint8)]`
			`    image_inputs = [Image.fromarray(np.moveaxis(x, 0, -1)) for x in image_inputs]`
			`    return image_inputs`


Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`# Copied from tests.models.whisper.test_feature_extraction_whisper.floats_list`
			`def floats_list(shape, scale=1.0, rng=None, name=None):`
			`    """Creates a random float32 tensor"""`
			`    if rng is None:`
			`        rng = global_rng`

			`    values = []`
			`    for batch_idx in range(shape[0]):`
			`        values.append([])`
			`        for _ in range(shape[1]):`
			`            values[-1].append(rng.random() * scale)`

			`    return values`
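
As a usage illustration for `floats_list`, here is a minimal standalone sketch (the helper is restated so the snippet runs by itself). It shows that the function returns a nested Python list of floats in `[0, scale)`, and that passing a seeded `random.Random` makes the output reproducible. The seed `0`, the shape `(2, 3)` and the scale `10.0` are arbitrary choices for illustration.

```python
import random

global_rng = random.Random()


def floats_list(shape, scale=1.0, rng=None, name=None):
    """Restated from the listing above so this sketch runs standalone."""
    if rng is None:
        rng = global_rng

    values = []
    for batch_idx in range(shape[0]):
        values.append([])
        for _ in range(shape[1]):
            values[-1].append(rng.random() * scale)

    return values


# Two generators created with the same (arbitrary) seed yield identical nested lists.
first = floats_list((2, 3), scale=10.0, rng=random.Random(0))
second = floats_list((2, 3), scale=10.0, rng=random.Random(0))
```

When `rng` is omitted, the module-level `global_rng` is used, so successive calls share one random stream and are not reproducible unless that generator is seeded.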


add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`@require_torch`
			`@require_vision`
			`class ProcessorTesterMixin:`
			`processor_class = None`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`model_id = (`
			`None # Optional: set this to load from a specific pretrained model instead of creating generic components`
			`)`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`text_input_name = "input_ids"`
			`images_input_name = "pixel_values"`
			`videos_input_name = "pixel_values_videos"`
Add VibeVoice ASR (#43625) * Add vibevoice tokenizer files. * Address style tests. * Revert to expected outputs previously computed on runner. * Enable encoder output test. * Update expected output from runner * Add note on expected outputs * remove code link and better init * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * modular * Same changes to decoder layers. * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * doc nits * Use decoder_depths for decoder! * Doc nits * Nits * Trim feature extraction for tensor only usage. * Start files for ASR * Add cache logic to encoder. * Nit * Revert to previous sampling approach. * Nits * Passing equivalence test * Fix for chat template to use sampling rate other than 16kHz * Better logic for vae sampling? * More standard conversion script. * Revert to sample flag * Nits * Make style * Better modular and cleanup. * update asr docs * Fix GLM docstring * Docs, cleanup, nits. * Nit * Cleaner modular and nits * Nits * Nit * Skip parallelism * Update docs. * Finish integration tests, and nits * Repo checks * doc nits * Doc nits * Remove bad file * Skip testing of encoder. * Shift cache creation to when it's used. * Shift cache creation to where it's used. 
* Updated checkpoint path * Processor nit * Modeling and processing tests. * Nits * Ensure torch compile and nits * Update src/transformers/models/vibevoice_asr/modular_vibevoice_asr.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Refactor vibevoice acoustic tokenizer to have encoder and decoder configs. * Update asr encoder config directly from tokenizer. * Nits and make style happy. * Simplify acoustic tokenizer config. * Make style * Update src/transformers/models/vibevoice_asr/modular_vibevoice_asr.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Renaming and switching away from deprecated approaches. * Better decode, add test, and update docs. * Clearer code paths. * Better pipeline example with exposed post-processing methods * Docstring nit. * Use voxtral cache, cleaner token init, better naming of chunk size. * Add missing docstrings. * Update to official checkpoint. --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> 2026-03-02 12:29:55 +01:00			`audio_input_name_values = "input_values" # raw/normalized audio`
			`audio_input_name = "input_features" # computed features, e.g. Mel spectrogram, STFT`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`# Max-length values used in image-text kwargs tests. Override in subclasses if needed.`
			`image_text_kwargs_max_length = 117`
			`image_text_kwargs_override_max_length = 112`
			`image_unstructured_max_length = 76`

			`# Max-length values used in audio-text kwargs tests. Override in subclasses if needed.`
			`audio_text_kwargs_max_length = 300`
			`audio_processor_tester_max_length = 117`
			`audio_unstructured_max_length = 76`

			`# Max-length values used in video-text kwargs tests. Override in subclasses if needed.`
			`video_text_kwargs_max_length = 167`
			`video_text_kwargs_override_max_length = 162`
			`video_unstructured_max_length = 176`

			`# Max-length value used in chat template tests. Override in subclasses if needed.`
			`chat_template_max_length = 100 # max_length in test_apply_chat_template_*`

			`@classmethod`
			`def setUpClass(cls):`
			`"""`
			`Automatically set up the processor test by creating and saving all required components.`
			`Individual test classes only need to set processor_class and optionally:`
			`- model_id: to load components from a specific pretrained model`
			`- prepare_processor_dict(): to provide custom kwargs for processor initialization`
			`"""`
			`if cls.processor_class is None:`
			`raise ValueError(`
			`f"{cls.__name__} must define 'processor_class' attribute. Example: processor_class = MyProcessor"`
			`)`

			`cls.tmpdirname = tempfile.mkdtemp()`

			`# If model_id is specified, load components from that model`
			`if cls.model_id is not None:`
			`processor = cls._setup_from_pretrained(cls.model_id)`
			`else:`
			`# Otherwise, create generic components`
			`processor = cls._setup_from_components()`

			`# setup test attributes`
			`cls._setup_test_attributes(processor)`
			`processor.save_pretrained(cls.tmpdirname)`

			`@classmethod`
			`def _setup_test_attributes(cls, processor):`
			`# to override in the child class to define class attributes`
			`# such as image_token, video_token, audio_token, etc.`
			`pass`

			`@classmethod`
			`def _setup_from_pretrained(cls, model_id, **kwargs):`
			`"""Load all components from a pretrained model."""`

			`# check if there are any custom components to setup`
			`custom_components = {}`
			`for attribute in cls.processor_class.get_attributes():`
			`if hasattr(cls, f"_setup_{attribute}"):`
			`custom_method = getattr(cls, f"_setup_{attribute}")`
			`custom_components[attribute] = custom_method()`
			`# if there is one custom component, we need to add all the other ones (with from_pretrained)`
			`if custom_components:`
			`for attribute in cls.processor_class.get_attributes():`
			`if attribute not in custom_components:`
			`component_class = cls._get_component_class_from_processor(attribute)`
			`custom_components[attribute] = component_class.from_pretrained(model_id)`

			`kwargs.update(cls.prepare_processor_dict())`
			`processor = cls.processor_class.from_pretrained(model_id, **custom_components, **kwargs)`
			`return processor`

			`@classmethod`
			`def _setup_from_components(cls):`
			`"""Create all required components and build the processor from them (saving happens in setUpClass)."""`
			`# Get all required attributes for this processor`
			`attributes = cls.processor_class.get_attributes()`

			`# Create each component (but don't save them individually)`
			`components = {}`
			`for attribute in attributes:`
			`components[attribute] = cls._setup_component(attribute)`

			`processor_kwargs = cls.prepare_processor_dict()`
			`processor = cls.processor_class(**components, **processor_kwargs)`
			`return processor`

			`@classmethod`
			`def _setup_component(cls, attribute):`
			`"""`
			`Create and return a component.`

			`This method first checks for a custom setup method (_setup_{attribute}).`
			`If not found, it tries to get the component class from the processor's Auto mappings`
			`and instantiate it without arguments.`
			`If that fails, it raises an error telling the user to override the setup method.`

			`Individual test classes should override _setup_{attribute}() for custom component setup.`
			`Custom methods should return the created component.`

			`Returns:`
			`The created component instance.`
			`"""`
			`# Check if there's a custom setup method for this specific attribute`
			`custom_method = getattr(cls, f"_setup_{attribute}", None)`
			`if custom_method is not None:`
			`return custom_method()`

			`# Get the component class from processor's Auto mappings`
			`component_class = cls._get_component_class_from_processor(attribute)`

			`# Build a human-readable component name (e.g. "image processor") for error messages`
			`component_type = attribute.replace("_", " ")`

			`# Try to instantiate the component without arguments`
			`try:`
			`component = component_class()`
			`except Exception as e:`
			`raise TypeError(`
			`f"Failed to instantiate {component_type} ({component_class}) without arguments.\n"`
			`f"Error: {e}\n\n"`
			`f"To fix this, override the setup method in your test class:\n\n"`
			`f"    @classmethod\n"`
			`f"    def _setup_{attribute}(cls):\n"`
			`f"        # Create your custom {component_type}\n"`
			`f"        from transformers import {component_class.__name__}\n"`
			`f"        component = {component_class.__name__}(...)\n"`
			`f"        return component\n"`
			`) from e`

			`return component`
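
The hook lookup described in the `_setup_component` docstring can be sketched in isolation. This is an illustrative sketch only: `ToyTokenizer`, `ToyImageProcessor`, `ToyTester`, and the `default_classes` dictionary are hypothetical stand-ins, whereas the real mixin resolves component classes through the Auto mappings.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer class."""

    def __init__(self, vocab_size=10):
        self.vocab_size = vocab_size


class ToyImageProcessor:
    """Hypothetical component that can be built without arguments."""


class ToyTester:
    # Hypothetical lookup table; the real mixin consults the Auto mappings instead.
    default_classes = {"tokenizer": ToyTokenizer, "image_processor": ToyImageProcessor}

    @classmethod
    def _setup_component(cls, attribute):
        # Prefer a custom `_setup_<attribute>` hook when the test class defines one.
        custom_method = getattr(cls, f"_setup_{attribute}", None)
        if custom_method is not None:
            return custom_method()
        # Otherwise fall back to calling the class with no arguments.
        return cls.default_classes[attribute]()

    @classmethod
    def _setup_tokenizer(cls):
        # Custom hook: build the tokenizer with non-default arguments.
        return ToyTokenizer(vocab_size=99)


tokenizer = ToyTester._setup_component("tokenizer")
image_processor = ToyTester._setup_component("image_processor")
```

Here the tokenizer comes from the custom hook, while the image processor falls back to the no-argument constructor.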

			`@classmethod`
			`def _get_component_class_from_processor(cls, attribute, use_fast: bool = True):`
			`"""`
			`Get the component class for a given attribute from the processor's Auto mappings.`

			`This extracts the model type from the test file name and uses that to look up`
			`the config class, which is then used to find the appropriate component class.`
			`"""`
			`import inspect`
			`import re`

			`from transformers.models.auto.configuration_auto import (`
			`CONFIG_MAPPING,`
			`CONFIG_MAPPING_NAMES,`
			`SPECIAL_MODEL_TYPE_TO_MODULE_NAME,`
			`)`

			`# Extract model_type from the test file name`
			`# Test files are named like test_processing_align.py or test_processor_align.py`
			`test_file = inspect.getfile(cls)`
			`match = re.search(r"test_process(?:ing|or)_(\w+)\.py$", test_file)`
			`if not match:`
			`raise ValueError(`
			`f"Could not extract model type from test file name: {test_file}. "`
			`f"Please override _setup_{attribute}() in your test class."`
			`)`

			`model_type = match.group(1)`
			`if model_type not in CONFIG_MAPPING_NAMES:`
			`# check if the model type is a special model type`
			`for special_model_type, special_module_name in SPECIAL_MODEL_TYPE_TO_MODULE_NAME.items():`
			`if model_type == special_module_name:`
			`model_type = special_model_type`
			`break`

			`# Get the config class for this model type`
			`if model_type not in CONFIG_MAPPING_NAMES:`
			`raise ValueError(`
			`f"Model type '{model_type}' not found in CONFIG_MAPPING_NAMES. "`
			`f"Please override _setup_{attribute}() in your test class."`
			`)`

			`config_class = CONFIG_MAPPING[model_type]`

			`# Now get the component class from the appropriate Auto mapping`
			`if attribute in MODALITY_TO_AUTOPROCESSOR_MAPPING:`
			`mapping_name = attribute`
			`elif "tokenizer" in attribute:`
			`mapping_name = "tokenizer"`
			`else:`
			`raise ValueError(`
			`f"Unknown attribute type: '{attribute}'. "`
			`f"Please override _setup_{attribute}() in your test class to provide custom setup."`
			`)`

			`# Get the appropriate Auto mapping for this component type`
			`if mapping_name == "tokenizer":`
			`from transformers.models.auto.tokenization_auto import TOKENIZER_MAPPING`
use `TokenizersBackend` (#42894) * us `TokenizersBackend` * fixes * pioritize mapping * pioritize mapping * only use mapping for some models * fix fallback * undo debug thing * add case to tokenizersbackend init * add default bos eos token to tok backend * set bos eos * fix more models * mistrla idefics * fix stopping criteria test * fix stopping criteria test * try stopping criteria fix * rebase * update tokenizer model for stopping criteria test * fix tuple mapping for ministral * ignore `tokenizer_class` as it is always wrong * up * try to fix idefics * fix unispeech and maybe other: fallback if conversion was not possible to the saveclass * nits * fixup * TIL that it was ALSO saved in config.json... * arf * fallback to tok config if no config json * people who map to Llama probably don't even want llama either.. * processors to load tokbackend * auto fix order * try diff order * mistral fix for weird chars * reorder * random fix attempt for failing tests that are failing locally so idk how to check these * trying an older commit * fix mistral * map unispeech * try something out * update * nits * trying to be a little bit more restrictive * token type ids for tokenizers should be explicits... let's see which test fail this and we'll add to the specific classes? 
* Nit * idefics 1-2 are actually the only ones that should map to llama force * small fixes * fix layout * fixup * fix some tests * 1 nit * aria fix * style * canine * fixup * very small test * style * update to tokenizersbackend --------- Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-45.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-168-52.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-174-196.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-217.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-167-111.ec2.internal> Co-authored-by: itazap <ita.zaporozhets@huggingface.co> Co-authored-by: Ita Zaporozhets <31893021+itazap@users.noreply.github.com> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-164-75.ec2.internal> Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-160-100.ec2.internal> 2026-01-07 17:49:21 +01:00			`from transformers.utils import is_tokenizers_available`
			`component_class = TOKENIZER_MAPPING.get(config_class, None)`
			`if component_class is None and is_tokenizers_available():`
			`from transformers.tokenization_utils_tokenizers import TokenizersBackend`

			`component_class = TokenizersBackend`
			`elif mapping_name == "image_processor":`
			`from transformers.models.auto.image_processing_auto import IMAGE_PROCESSOR_MAPPING`

			`component_class = IMAGE_PROCESSOR_MAPPING.get(config_class, None)`
			`elif mapping_name == "feature_extractor" or mapping_name == "audio_processor":`
			`from transformers.models.auto.feature_extraction_auto import FEATURE_EXTRACTOR_MAPPING`

			`component_class = FEATURE_EXTRACTOR_MAPPING.get(config_class, None)`
			`elif mapping_name == "video_processor":`
			`from transformers.models.auto.video_processing_auto import VIDEO_PROCESSOR_MAPPING`

			`component_class = VIDEO_PROCESSOR_MAPPING.get(config_class, None)`
			`else:`
			`raise ValueError(f"Unknown mapping for attribute: {attribute}")`

			`if component_class is None:`
			`raise ValueError(`
			`f"Could not find {mapping_name} class for config {config_class.__name__}. "`
			`f"Please override _setup_{attribute}() in your test class."`
			`)`

			`# Handle tuple case (some mappings return tuples of classes)`
			`if isinstance(component_class, tuple):`
			`if use_fast:`
			`component_class = component_class[-1] if component_class[-1] is not None else component_class[0]`
			`else:`
			`component_class = component_class[0] if component_class[0] is not None else component_class[1]`
🚨🚨 Refactor Image Processors to support different backends (#43514) * init refactor * Fix llava * changes after review * update first batch of image processors * refactor part 2 * improve base image processor class, move backends to separate file * refactor to have backends in separate files, with backends now inheriting from BaseImageProcessor * fix docstrings * update some image processors to new refactored standards * refactor more image processors * refactor more image processors * refactor more fast image processors * refactor more image processors * refactor more image processor * improve compatibility with video processors * refactor more image processors * add more image processors, improve compatibility with video processors * support for modular * refactor modular ima proc * refactor more modular image processors * adjustments before merge * fimish image processors refactor * update docs * add fallback to Pil backend for backward compat * fix repo * Fix all processors and image processors tests * fix modular and style * fix docs * fix remote code backward compatibility + super in lists * Update docs and add new model like cli * fix processor tests * relax test tvp (used to be skipped) * fix 4 channels oneformer * Changes after review * Fixes after review * Fix tests * Change imports in modeling tests to minimize integration tests changes * fix wrong import * fix import and missing doc * fix typo PI0 * Fix all integration tests * Fix after review, enforce protected torch/torchvision imports in pil image processors (directly in modular model converter) * Fix style * Fix test modeling depth pro * Fix processing_idefics * Fixes after merge * _rescale_and_normalize -> rescale_and_normalize * fix-repo 2026-03-19 10:33:28 -04:00			`elif isinstance(component_class, dict):`
			`if not use_fast:`
			`component_class = component_class["pil"]`
			`else:`
			`component_class = (`
			`component_class["torchvision"] if "torchvision" in component_class else component_class["pil"]`
			`)`
			`return component_class`
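
Two steps of `_get_component_class_from_processor` can be sketched without importing `transformers`: extracting the model type from the test file name, and choosing one class from a tuple or backend dictionary. The function names `model_type_from_test_file` and `pick_component_class` are hypothetical, the regular expression assumes a plain `ing|or` alternation, and strings stand in for real classes.

```python
import re


def model_type_from_test_file(path):
    """Extract the model type from a name like test_processing_<model>.py."""
    match = re.search(r"test_process(?:ing|or)_(\w+)\.py$", path)
    return match.group(1) if match else None


def pick_component_class(candidates, use_fast=True):
    """Choose one entry from a (slow, fast) tuple or a backend dictionary."""
    if isinstance(candidates, tuple):
        # Assumption: tuples hold two entries, so the last entry is the fast one.
        first, last = candidates[0], candidates[-1]
        if use_fast:
            return last if last is not None else first
        return first if first is not None else last
    if isinstance(candidates, dict):
        # Prefer the torchvision backend only when a fast class is requested.
        if use_fast and "torchvision" in candidates:
            return candidates["torchvision"]
        return candidates["pil"]
    return candidates


model_type = model_type_from_test_file("tests/models/align/test_processing_align.py")
chosen = pick_component_class(("SlowTokenizer", "FastTokenizer"))
```

When the preferred entry is `None`, the tuple branch falls back to the other entry, which mirrors the fallback in the method above.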

			`@classmethod`
			`def tearDownClass(cls):`
			`"""Clean up the temporary directory."""`
			`if hasattr(cls, "tmpdirname"):`
			`shutil.rmtree(cls.tmpdirname, ignore_errors=True)`

[processor] clean up mulitmodal tests (#37362) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot 2025-04-11 13:32:19 +02:00			`@staticmethod`
			`def prepare_processor_dict():`
			`"""Override this method to provide custom kwargs for processor initialization."""`
			`return {}`

			`def get_component(self, attribute, **kwargs):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if attribute not in MODALITY_TO_AUTOPROCESSOR_MAPPING and "tokenizer" in attribute:`
			`auto_processor_class = MODALITY_TO_AUTOPROCESSOR_MAPPING["tokenizer"]`
			`component = auto_processor_class.from_pretrained(self.tmpdirname, subfolder=attribute, **kwargs) # noqa`
			`else:`
			`auto_processor_class = MODALITY_TO_AUTOPROCESSOR_MAPPING[attribute]`
			`component = auto_processor_class.from_pretrained(self.tmpdirname, **kwargs) # noqa`
Uniformize kwargs for image-text-to-text processors (#32544) * uniformize FUYU processor kwargs * Uniformize instructblip processor kwargs * Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2 * Uniformize llava_next processor * Fix save_load test for processor with chat_template only as extra init args * Fix import Unpack * Fix Fuyu Processor import * Fix FuyuProcessor import * Fix FuyuProcessor * Add defaults for specific kwargs kosmos2 * Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs * Add tests processor Udop * remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature * Fix overwrite tests kwargs processors * Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop * Fix processing test fuyu * remove unnecessary pad_token check in instructblip ProcessorTest * Fix BC tests and cleanup * FIx imports fuyu * Uniformize Pix2Struct * Fix wrong name for FuyuProcessorKwargs * Fix slow tests reversed inputs align fuyu llava-next, change udop warning * Fix wrong logging import udop * Add check images text input order * Fix copies * change text pair handling when positional arg * rebase on main, fix imports in test_processing_common * remove optional args and udop uniformization from this PR * fix failing tests * remove unnecessary test, fix processing utils and test processing common * cleanup Unpack * cleanup * fix conflict grounding dino 2024-09-24 21:28:19 -04:00			`if "tokenizer" in attribute and not component.pad_token:`
Modify ProcessorTesterMixin for better generalization (#32637) * Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs * remove crop_size argument in align processor tests to be coherent with base tests * Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino 2024-08-13 11:48:53 -04:00			`component.pad_token = "[TEST_PAD]"`
Uniformize kwargs for Pixtral processor (#33521) * add uniformized pixtral and kwargs * update doc * fix _validate_images_text_input_order * nit 2024-09-17 14:44:27 -04:00			`if component.pad_token_id is None:`
			`component.pad_token_id = 0`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00
			`return component`
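
The branch in `get_component` above can be read as a small dispatch rule: an attribute that is not itself a key of the mapping but contains "tokenizer" is loaded with the tokenizer auto class from a subfolder named after the attribute. Below is a minimal sketch of that rule only. `MAPPING`, `resolve_loader`, and the attribute name `decoder_tokenizer` are hypothetical stand-ins for illustration; the real `MODALITY_TO_AUTOPROCESSOR_MAPPING` holds auto classes, not strings.

```python
# Hypothetical stand-in for MODALITY_TO_AUTOPROCESSOR_MAPPING (assumption: strings instead of auto classes).
MAPPING = {"tokenizer": "AutoTokenizer", "image_processor": "AutoImageProcessor"}


def resolve_loader(attribute, mapping=MAPPING):
    """Return (loader, subfolder) for a processor attribute, mirroring the branch above."""
    if attribute not in mapping and "tokenizer" in attribute:
        # Additional tokenizers fall back to the tokenizer loader and use their own subfolder.
        return mapping["tokenizer"], attribute
    # Known modalities load from the top-level directory (no subfolder).
    return mapping[attribute], None


print(resolve_loader("tokenizer"))
print(resolve_loader("decoder_tokenizer"))
```

An attribute that is neither in the mapping nor tokenizer-like raises `KeyError` in this sketch, as the indexing in the original `else` branch would.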

Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`def prepare_components(self, **kwargs):`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`components = {}`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`for attribute in self.processor_class.get_attributes():`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`component = self.get_component(attribute)`
			`components[attribute] = component`

			`return components`

			`def get_processor(self):`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`processor = self.processor_class.from_pretrained(self.tmpdirname)`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`return processor`

Use \| for Optional and Union typing (#41646) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> 2025-10-16 22:29:54 +08:00			`def prepare_text_inputs(self, batch_size: int \| None = None, modalities: str \| list \| None = None):`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`if isinstance(modalities, str):`
			`modalities = [modalities]`

			`special_token_to_add = ""`
			`if modalities is not None:`
			`for modality in modalities:`
			`special_token_to_add += getattr(self, f"{modality}_token", "")`
[processor] clean up mulitmodal tests (#37362) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot 2025-04-11 13:32:19 +02:00
Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor 2024-10-01 23:52:03 +02:00			`if batch_size is None:`
[processor] clean up mulitmodal tests (#37362) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot 2025-04-11 13:32:19 +02:00			`return f"lower newer {special_token_to_add}"`
Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor 2024-10-01 23:52:03 +02:00
			`if batch_size < 1:`
			`raise ValueError("batch_size must be greater than 0")`

			`if batch_size == 1:`
[processor] clean up mulitmodal tests (#37362) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot 2025-04-11 13:32:19 +02:00			`return [f"lower newer {special_token_to_add}"]`
			`return [f"lower newer {special_token_to_add}", f" {special_token_to_add} upper older longer string"] + [`
			`f"lower newer {special_token_to_add}"`
			`] * (batch_size - 2)`
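
As a rough illustration of what `prepare_text_inputs` returns for different `batch_size` values, here is a standalone sketch of the same branching. The function name `build_text_batch` and the `<image>` token are assumptions made for this example; in the mixin the special tokens come from `{modality}_token` attributes on the test class.

```python
def build_text_batch(batch_size=None, special_token=""):
    """Standalone sketch of the batching logic in prepare_text_inputs."""
    base = f"lower newer {special_token}"
    if batch_size is None:
        # No batch size: return a single string rather than a list.
        return base
    if batch_size < 1:
        raise ValueError("batch_size must be greater than 0")
    if batch_size == 1:
        return [base]
    # The second item is deliberately longer so that padding is exercised.
    longer = f" {special_token} upper older longer string"
    return [base, longer] + [base] * (batch_size - 2)


print(build_text_batch())                # a plain string
print(build_text_batch(3, "<image>"))    # a list of three strings
```

The second element differs in length from the others, so a batch of two or more always contains sequences of unequal length.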
Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor 2024-10-01 23:52:03 +02:00
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`@require_vision`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`def prepare_image_inputs(self, batch_size: int \| None = None, nested: bool = False):`
Remove repeated prepare_images in processor tests (#33163) * Remove repeated prepare_images * Address comments - update docstring; explanatory comment 2024-09-09 13:20:27 +01:00			`"""This function prepares a list of PIL images for testing"""`
Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor 2024-10-01 23:52:03 +02:00			`if batch_size is None:`
			`return prepare_image_inputs()[0]`
			`if batch_size < 1:`
			`raise ValueError("batch_size must be greater than 0")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if nested:`
			`return [prepare_image_inputs()] * batch_size`
Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor 2024-10-01 23:52:03 +02:00			`return prepare_image_inputs() * batch_size`
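
The three return shapes of `prepare_image_inputs` (single item, flat list, nested list) can be sketched without PIL. In this sketch `batch_images` is a hypothetical helper and the strings stand in for images. It assumes, as the `[0]` indexing above suggests, that the module-level `prepare_image_inputs()` returns a list.

```python
def batch_images(images, batch_size=None, nested=False):
    """Sketch of the flat vs. nested batching used by prepare_image_inputs."""
    if batch_size is None:
        # No batch size: return the first image on its own.
        return images[0]
    if batch_size < 1:
        raise ValueError("batch_size must be greater than 0")
    if nested:
        # One inner list per batch item: [[img], [img], ...]
        return [images] * batch_size
    # Flat list: [img, img, ...]
    return images * batch_size


images = ["img0"]  # placeholder for a list holding one PIL image
print(batch_images(images))
print(batch_images(images, 3))
print(batch_images(images, 2, nested=True))
```

Note that list multiplication repeats references: in the nested case every inner list is the same object. That is harmless for read-only test inputs but matters if a test mutates one entry.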
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
Qwen2-VL: clean-up and add more tests (#33354) * clean-up on qwen2-vl and add generation tests * add video tests * Update tests/models/qwen2_vl/test_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix and add better tests * Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update docs and address comments * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * remove size at all --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-09-12 18:24:04 +02:00			`@require_vision`
Use \| for Optional and Union typing (#41646) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> 2025-10-16 22:29:54 +08:00			`def prepare_video_inputs(self, batch_size: int \| None = None):`
Qwen2-VL: clean-up and add more tests (#33354) * clean-up on qwen2-vl and add generation tests * add video tests * Update tests/models/qwen2_vl/test_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix and add better tests * Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update docs and address comments * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * remove size at all --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-09-12 18:24:04 +02:00			`"""This function prepares a list of numpy videos."""`
			`video_input = [np.random.randint(255, size=(3, 30, 400), dtype=np.uint8)] * 8`
Support batch size > 1 image-text inference (#36682) * update make nested image list * fix make flat list of images * update type anno * fix image_processing_smolvlm * use first image * add verbose comment * fix images * rollback * fix ut * Update image_processing_smolvlm.py * Update image_processing_idefics3.py * add tests and fix some processors * fix copies * fix after rebase * make the test cover chat templates * sjip udop, no point in fixing it * fix after rebase * fix a few more tests --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-09-01 20:26:07 +08:00			`video_input = np.array(video_input)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`if batch_size is None:`
			`return video_input`
			`return [video_input] * batch_size`
Qwen2-VL: clean-up and add more tests (#33354) * clean-up on qwen2-vl and add generation tests * add video tests * Update tests/models/qwen2_vl/test_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix and add better tests * Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update docs and address comments * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/qwen2_vl.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * remove size at all --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-09-12 18:24:04 +02:00
Use \| for Optional and Union typing (#41646) Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> 2025-10-16 22:29:54 +08:00			`def prepare_audio_inputs(self, batch_size: int \| None = None):`
[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`"""This function prepares a list of numpy audio arrays."""`
			`raw_speech = floats_list((1, 1000))`
			`raw_speech = [np.asarray(audio) for audio in raw_speech]`
			`if batch_size is None:`
			`return raw_speech`
			`return raw_speech * batch_size`

Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00			`def test_processor_to_json_string(self):`
			`processor = self.get_processor()`
			`obj = json.loads(processor.to_json_string())`
			`for key, value in self.prepare_processor_dict().items():`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`# Chat template is saved as a separate file`
			`if key != "chat_template":`
[processor] clean up mulitmodal tests (#37362) * clkea up mulitmodal processor tests * fixup * fix tests * fix one last test * forgot 2025-04-11 13:32:19 +02:00			`# JSON converts dict keys to str, but some processors convert them back to int at init`
			`if (`
			`isinstance(obj[key], dict)`
			`and isinstance(list(obj[key].keys())[0], str)`
			`and isinstance(list(value.keys())[0], int)`
			`):`
			`obj[key] = {int(k): v for k, v in obj[key].items()}`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`self.assertEqual(obj[key], value)`
			`self.assertEqual(getattr(processor, key, None), value)`
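
The key conversion in `test_processor_to_json_string` exists because JSON object keys are always strings, so a dict with integer keys does not survive a round trip unchanged. A minimal sketch of the effect and of the restore step follows. The `ratios` data and the helper `restore_int_keys` are made up for illustration and are not part of the test suite.

```python
import json


def restore_int_keys(loaded, expected):
    """Convert str keys back to int when the expected dict uses int keys."""
    first_loaded = next(iter(loaded), None)
    first_expected = next(iter(expected), None)
    if isinstance(first_loaded, str) and isinstance(first_expected, int):
        return {int(k): v for k, v in loaded.items()}
    return loaded


expected = {"ratios": {1: "a", 2: "b"}}       # hypothetical processor attribute
loaded = json.loads(json.dumps(expected))      # round trip through a JSON string

print(loaded["ratios"] == expected["ratios"])  # keys are now "1" and "2"
print(restore_int_keys(loaded["ratios"], expected["ratios"]) == expected["ratios"])
```

Unlike the indexing with `[0]` in the test, this sketch uses `next(iter(...), None)` so that an empty dict is passed through instead of raising `IndexError`.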
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00
			`def test_processor_from_and_save_pretrained(self):`
			`processor_first = self.get_processor()`

			`with tempfile.TemporaryDirectory() as tmpdirname:`
Don't save `processor_config.json` if a processor has no extra attribute (#28584) * not save if empty * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2024-01-19 09:59:14 +00:00			`saved_files = processor_first.save_pretrained(tmpdirname)`
			`if len(saved_files) > 0:`
			`check_json_file_has_correct_format(saved_files[0])`
			`processor_second = self.processor_class.from_pretrained(tmpdirname)`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00
Don't save `processor_config.json` if a processor has no extra attribute (#28584) * not save if empty * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2024-01-19 09:59:14 +00:00			`self.assertEqual(processor_second.to_dict(), processor_first.to_dict())`
Save `Processor` (#27761) * save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> 2024-01-18 11:21:45 +01:00
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`for attribute in processor_first.get_attributes():`
Load and save video-processor from separate folder (#33562) * load and save from video-processor folder * Update src/transformers/models/llava_onevision/processing_llava_onevision.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-09-19 09:56:52 +02:00			`attribute_first = getattr(processor_first, attribute)`
			`attribute_second = getattr(processor_second, attribute)`

			`# The tokenizer repr contains the path it was loaded from, so tokenizers are excluded from the repr comparison`
			`if "tokenizer" not in attribute:`
Remove duplicated processor class from config (#42806) * remove duplicated processor class from config * adjust the test cases * check public and private attr, both were used in the past 2025-12-16 14:03:59 +01:00			# We don't store/load `_processor_class` for subprocessors.
			# The `_processor_class` is saved once per config, at general level
			`self.assertFalse(hasattr(attribute_second, "_processor_class"))`
			`self.assertFalse(hasattr(attribute_first, "_processor_class"))`

			`self.assertFalse(hasattr(attribute_second, "processor_class"))`
			`self.assertFalse(hasattr(attribute_first, "processor_class"))`

Load and save video-processor from separate folder (#33562) * load and save from video-processor folder * Update src/transformers/models/llava_onevision/processing_llava_onevision.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-09-19 09:56:52 +02:00			`self.assertEqual(repr(attribute_first), repr(attribute_second))`

[omni modality] support composite processor config (#38142) * dump ugly option to check again tomorrow * tiny update * do not save as nested dict yet! * fix and add tests * fix dia audio tokenizers * rename the flag and fix new model Evolla * fix style * address comments * broken from different PRp * fix saving layoutLM * delete print * delete! 2025-08-28 14:40:27 +02:00			`def test_processor_from_and_save_pretrained_as_nested_dict(self):`
			`processor_first = self.get_processor()`

			`with tempfile.TemporaryDirectory() as tmpdirname:`
🚨 [v5] Toggle the serialization format in processors (#41474) * toggle the serialization * prob this fixes it * fix tests * typo * delete legacy save entirely * remove extra nesting in if * revert test and serialzie a public attr instead of private 2025-10-16 10:19:22 +02:00			`saved_files = processor_first.save_pretrained(tmpdirname)`
[omni modality] support composite processor config (#38142) * dump ugly option to check again tomorrow * tiny update * do not save as nested dict yet! * fix and add tests * fix dia audio tokenizers * rename the flag and fix new model Evolla * fix style * address comments * broken from different PRp * fix saving layoutLM * delete print * delete! 2025-08-28 14:40:27 +02:00			`check_json_file_has_correct_format(saved_files[0])`

			`# Load it back and check if loaded correctly`
			`processor_second = self.processor_class.from_pretrained(tmpdirname)`
			`self.assertEqual(processor_second.to_dict(), processor_first.to_dict())`

			`# Try to load each attribute separately from saved directory`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`for attribute in processor_first.get_attributes():`
			`if attribute not in MODALITY_TO_AUTOPROCESSOR_MAPPING and "tokenizer" in attribute:`
			`auto_processor_class = MODALITY_TO_AUTOPROCESSOR_MAPPING["tokenizer"]`
			`attribute_reloaded = auto_processor_class.from_pretrained(tmpdirname, subfolder=attribute)`
			`else:`
			`auto_processor_class = MODALITY_TO_AUTOPROCESSOR_MAPPING[attribute]`
			`attribute_reloaded = auto_processor_class.from_pretrained(tmpdirname)`
[omni modality] support composite processor config (#38142) * dump ugly option to check again tomorrow * tiny update * do not save as nested dict yet! * fix and add tests * fix dia audio tokenizers * rename the flag and fix new model Evolla * fix style * address comments * broken from different PRp * fix saving layoutLM * delete print * delete! 2025-08-28 14:40:27 +02:00			`attribute_first = getattr(processor_first, attribute)`

			`# The tokenizer repr contains the path it was loaded from, so tokenizers are excluded from the repr comparison`
			`if "tokenizer" not in attribute:`
			`self.assertEqual(repr(attribute_first), repr(attribute_reloaded))`

Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`def test_save_load_pretrained_additional_features(self):`
			`"""`
			`Tests that additional kwargs passed to from_pretrained are correctly applied to components.`
			`"""`
			`attributes = self.processor_class.get_attributes()`

			`if not any(`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`attr in ["tokenizer", "image_processor", "feature_extractor", "audio_processor", "video_processor"]`
			`for attr in attributes`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`):`
			`self.skipTest("Processor has no tokenizer or image_processor to test additional features")`
			`additional_kwargs = {}`

			`has_tokenizer = "tokenizer" in attributes`
			`if has_tokenizer:`
			`additional_kwargs["cls_token"] = "(CLS)"`
			`additional_kwargs["sep_token"] = "(SEP)"`

			`has_image_processor = "image_processor" in attributes`
			`if has_image_processor:`
			`additional_kwargs["do_normalize"] = False`
			`has_video_processor = "video_processor" in attributes`
			`if has_video_processor:`
			`additional_kwargs["do_normalize"] = False`

			`processor_second = self.processor_class.from_pretrained(self.tmpdirname, **additional_kwargs)`
			`if has_tokenizer:`
			`self.assertEqual(processor_second.tokenizer.cls_token, "(CLS)")`
			`self.assertEqual(processor_second.tokenizer.sep_token, "(SEP)")`
			`if has_image_processor:`
			`self.assertEqual(processor_second.image_processor.do_normalize, False)`
			`if has_video_processor:`
			`self.assertEqual(processor_second.video_processor.do_normalize, False)`

			`def test_processor_from_pretrained_vs_from_components(self):`
			`"""`
			`Tests that loading a processor fully with from_pretrained produces the same result as`
			`loading each component individually with from_pretrained and building the processor from them.`
			`"""`
			`# Load processor fully with from_pretrained`
			`processor_full = self.get_processor()`

			`# Load each component individually with from_pretrained`
			`components = {}`
			`for attribute in self.processor_class.get_attributes():`
			`components[attribute] = self.get_component(attribute)`

			`# Build processor from components + prepare_processor_dict() kwargs`
			`processor_kwargs = self.prepare_processor_dict()`
			`processor_from_components = self.processor_class(**components, **processor_kwargs)`

			`self.assertEqual(processor_from_components.to_dict(), processor_full.to_dict())`
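
A minimal sketch of the construction pattern this test relies on: a dict of components and a dict of extra processor kwargs are both expanded into keyword arguments. `ToyProcessor` and its `to_dict` are hypothetical stand-ins for illustration, not the real `ProcessorMixin` API.

```python
class ToyProcessor:
    """Hypothetical processor holding two components and one extra option."""

    def __init__(self, tokenizer, image_processor, chat_template=None):
        self.tokenizer = tokenizer
        self.image_processor = image_processor
        self.chat_template = chat_template

    def to_dict(self):
        # Serialize every constructor argument so two instances can be compared.
        return {
            "tokenizer": self.tokenizer,
            "image_processor": self.image_processor,
            "chat_template": self.chat_template,
        }


components = {"tokenizer": "toy-tokenizer", "image_processor": "toy-image-processor"}
processor_kwargs = {"chat_template": "{{ messages }}"}

# Both dicts are unpacked, so each key is matched to a parameter by name.
from_components = ToyProcessor(**components, **processor_kwargs)
reference = ToyProcessor(
    "toy-tokenizer", "toy-image-processor", chat_template="{{ messages }}"
)
same_config = from_components.to_dict() == reference.to_dict()
```

Passing the two dicts positionally instead would bind the whole `components` dict to the first parameter, which is why the unpacking matters here.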

[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`def test_model_input_names(self):`
			`processor = self.get_processor()`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`text = self.prepare_text_inputs(modalities=["image", "video", "audio"])`
[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`image_input = self.prepare_image_inputs()`
			`video_inputs = self.prepare_video_inputs()`
			`audio_inputs = self.prepare_audio_inputs()`
			`inputs_dict = {"text": text, "images": image_input, "videos": video_inputs, "audio": audio_inputs}`

			`call_signature = inspect.signature(processor.__call__)`
			`input_args = [param.name for param in call_signature.parameters.values()]`
			`inputs_dict = {k: v for k, v in inputs_dict.items() if k in input_args}`

			`inputs = processor(**inputs_dict, return_tensors="pt")`

			`self.assertSetEqual(set(inputs.keys()), set(processor.model_input_names))`
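
The signature-based filtering used in this test can be sketched in isolation. `ToyProcessor` and `filter_by_signature` are hypothetical names introduced only for this example; the sketch assumes the callable declares its modalities as named parameters.

```python
import inspect


class ToyProcessor:
    """Hypothetical processor that accepts only text and images."""

    def __call__(self, text=None, images=None, **kwargs):
        # Return only the modalities that were actually passed.
        candidates = {"text": text, "images": images}
        return {k: v for k, v in candidates.items() if v is not None}


def filter_by_signature(callable_obj, candidates):
    """Keep only the entries whose key is a named parameter of callable_obj."""
    accepted = set(inspect.signature(callable_obj).parameters)
    return {k: v for k, v in candidates.items() if k in accepted}


processor = ToyProcessor()
candidates = {"text": "hello", "images": [0, 1], "videos": [2], "audio": [3]}

# "videos" and "audio" are dropped because __call__ does not name them.
filtered = filter_by_signature(processor.__call__, candidates)
outputs = processor(**filtered)
```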

Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`def test_image_processor_defaults(self):`
			`"""`
			`Tests that image processor is called correctly when passing images to the processor.`
			`This test verifies that processor(images=X) produces the same output as image_processor(X).`
			`"""`
			`# Skip if processor doesn't have image_processor`
			`if "image_processor" not in self.processor_class.get_attributes():`
			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`

			`image_processor = self.get_component("image_processor")`

			`# Get all required components for processor`
			`components = {}`
			`for attribute in self.processor_class.get_attributes():`
			`components[attribute] = self.get_component(attribute)`

			`processor = self.processor_class(**components)`

			`image_input = self.prepare_image_inputs()`

			`input_image_proc = image_processor(image_input, return_tensors="pt")`
			`try:`
			`input_processor = processor(images=image_input, return_tensors="pt")`
			`except Exception:`
			`# The processor does not accept image only input, so we can skip this test`
			`self.skipTest("Processor does not accept image-only input.")`

			`# Verify outputs match`
			`for key in input_image_proc:`
			`torch.testing.assert_close(input_image_proc[key], input_processor[key])`

			`def test_tokenizer_defaults(self):`
			`"""`
			`Tests that tokenizer is called correctly when passing text to the processor.`
			`This test verifies that processor(text=X) produces the same output as tokenizer(X).`
			`"""`
			`# Skip if processor doesn't have tokenizer`
			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`

			`# Get all required components for processor`
			`components = {}`
			`for attribute in self.processor_class.get_attributes():`
			`components[attribute] = self.get_component(attribute)`

			`processor = self.processor_class(**components)`
			`tokenizer = components["tokenizer"]`

			`input_str = ["lower newer"]`

			`# Process with both tokenizer and processor (disable padding to ensure same output)`
			`try:`
			`encoded_processor = processor(text=input_str, padding=False, return_tensors="pt")`
			`except Exception:`
			`# The processor does not accept text only input, so we can skip this test`
			`self.skipTest("Processor does not accept text-only input.")`
			`encoded_tok = tokenizer(input_str, padding=False, return_tensors="pt")`

			`# Verify outputs match (handle processors that might not return token_type_ids)`
			`for key in encoded_tok:`
			`if key in encoded_processor:`
			`self.assertListEqual(encoded_tok[key].tolist(), encoded_processor[key].tolist())`

			`def test_feature_extractor_defaults(self):`
			`"""`
			`Tests that feature extractor is called correctly when passing audio to the processor.`
			`This test verifies that processor(audio=X) produces the same output as feature_extractor(X).`
			`"""`
			`# Skip if processor doesn't have feature_extractor`
			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`and "audio_processor" not in self.processor_class.get_attributes()`
			`):`
			`self.skipTest(f"feature_extractor or audio_processor attribute not present in {self.processor_class}")`

			`if "feature_extractor" in self.processor_class.get_attributes():`
			`feature_extractor = self.get_component("feature_extractor")`
			`else:`
			`feature_extractor = self.get_component("audio_processor")`

			`# Get all required components for processor`
			`components = {}`
			`for attribute in self.processor_class.get_attributes():`
			`components[attribute] = self.get_component(attribute)`

			`processor = self.processor_class(**components)`

			`audio_input = self.prepare_audio_inputs()`

			`# Process with both feature_extractor and processor`
			`input_feat_extract = feature_extractor(audio_input, return_tensors="pt")`
			`try:`
			`input_processor = processor(audio=audio_input, return_tensors="pt")`
			`except Exception:`
			`# The processor does not accept audio only input, so we can skip this test`
			`self.skipTest("Processor does not accept audio-only input.")`

			`# Verify outputs match`
			`for key in input_feat_extract:`
			`torch.testing.assert_close(input_feat_extract[key], input_processor[key])`

			`def test_video_processor_defaults(self):`
			`"""`
			`Tests that video processor is called correctly when passing videos to the processor.`
			`This test verifies that processor(videos=X) produces the same output as video_processor(X).`
			`"""`
			`# Skip if processor doesn't have video_processor`
			`if "video_processor" not in self.processor_class.get_attributes():`
			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`

			`video_processor = self.get_component("video_processor")`

			`# Get all required components for processor`
			`components = {}`
			`for attribute in self.processor_class.get_attributes():`
			`components[attribute] = self.get_component(attribute)`

			`processor = self.processor_class(**components)`

			`video_input = self.prepare_video_inputs()`

			`# Process with both video_processor and processor`
			`input_video_proc = video_processor(video_input, return_tensors="pt")`
			`try:`
			`input_processor = processor(videos=video_input, return_tensors="pt")`
			`except Exception:`
			`# The processor does not accept video only input, so we can skip this test`
			`self.skipTest("Processor does not accept video-only input.")`

			`# Verify outputs match`
			`for key in input_video_proc:`
			`torch.testing.assert_close(input_video_proc[key], input_processor[key])`

			`def test_tokenizer_decode_defaults(self):`
			`"""`
			`Tests that processor.batch_decode() correctly forwards to tokenizer.batch_decode().`
			`"""`
			`# Skip if processor doesn't have tokenizer`
			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`

			`# Get all required components for processor`
			`components = {}`
			`for attribute in self.processor_class.get_attributes():`
			`components[attribute] = self.get_component(attribute)`

			`processor = self.processor_class(**components)`
			`tokenizer = components["tokenizer"]`

			`predicted_ids = [[1, 4, 5, 8, 1, 0, 8], [3, 4, 3, 1, 1, 8, 9]]`

			`# Test batch_decode`
			`decoded_processor = processor.batch_decode(predicted_ids)`
			`decoded_tok = tokenizer.batch_decode(predicted_ids)`

			`self.assertListEqual(decoded_tok, decoded_processor)`

			`def test_processor_with_multiple_inputs(self):`
			`"""`
			`Tests that processor correctly handles multiple modality inputs together.`
			`Verifies that the output contains the expected keys and that an error is raised when no input is provided.`
			`"""`
			`# Skip if processor doesn't have multiple attributes (not multimodal)`
			`attributes = self.processor_class.get_attributes()`
			`if len(attributes) <= 1:`
			`self.skipTest(f"Processor only has {len(attributes)} attribute(s), test requires multimodal processor")`

			`processor = self.get_processor()`

			`# Map attributes to input parameter names, prepare methods, and output key names`
			`attr_to_input_param = {`
			`"tokenizer": ("text", "prepare_text_inputs", "text_input_name"),`
			`"image_processor": ("images", "prepare_image_inputs", "images_input_name"),`
			`"video_processor": ("videos", "prepare_video_inputs", "videos_input_name"),`
			`"feature_extractor": ("audio", "prepare_audio_inputs", "audio_input_name"),`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`"audio_processor": ("audio", "prepare_audio_inputs", "audio_input_name"),`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`}`

			`# Prepare inputs dynamically based on processor attributes`
			`processor_inputs = {}`
			`expected_output_keys = []`

			`for attr in attributes:`
			`if attr in attr_to_input_param:`
			`param_name, prepare_method_name, output_key_attr = attr_to_input_param[attr]`
			`# Call the prepare method`
			`prepare_method = getattr(self, prepare_method_name)`
			`if param_name == "text":`
			`modalities = []`
			`if "image_processor" in attributes:`
			`modalities.append("image")`
			`if "video_processor" in attributes:`
			`modalities.append("video")`
			`if "audio_processor" in attributes or "feature_extractor" in attributes:`
			`modalities.append("audio")`
			`processor_inputs[param_name] = prepare_method(modalities=modalities)`
			`else:`
			`processor_inputs[param_name] = prepare_method()`
			`# Track expected output keys`
			`expected_output_keys.append(getattr(self, output_key_attr))`

			`# Test combined processing`
			`inputs = processor(**processor_inputs, return_tensors="pt")`

			`# Verify output contains all expected keys`
			`for key in expected_output_keys:`
Add VibeVoice ASR (#43625) * Add vibevoice tokenizer files. * Address style tests. * Revert to expected outputs previously computed on runner. * Enable encoder output test. * Update expected output from runner * Add note on expected outputs * remove code link and better init * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * modular * Same changes to decoder layers. * Update src/transformers/models/vibevoice_acoustic_tokenizer/modular_vibevoice_acoustic_tokenizer.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * doc nits * Use decoder_depths for decoder! * Doc nits * Nits * Trim feature extraction for tensor only usage. * Start files for ASR * Add cache logic to encoder. * Nit * Revert to previous sampling approach. * Nits * Passing equivalence test * Fix for chat template to use sampling rate other than 16kHz * Better logic for vae sampling? * More standard conversion script. * Revert to sample flag * Nits * Make style * Better modular and cleanup. * update asr docs * Fix GLM docstring * Docs, cleanup, nits. * Nit * Cleaner modular and nits * Nits * Nit * Skip parallelism * Update docs. * Finish integration tests, and nits * Repo checks * doc nits * Doc nits * Remove bad file * Skip testing of encoder. * Shift cache creation to when it's used. * Shift cache creation to where it's used. 
* Updated checkpoint path * Processor nit * Modeling and processing tests. * Nits * Ensure torch compile and nits * Update src/transformers/models/vibevoice_asr/modular_vibevoice_asr.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Refactor vibevoice acoustic tokenizer to have encoder and decoder configs. * Update asr encoder config directly from tokenizer. * Nits and make style happy. * Simplify acoustic tokenizer config. * Make style * Update src/transformers/models/vibevoice_asr/modular_vibevoice_asr.py Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> * Renaming and switching away from deprecated approaches. * Better decode, add test, and update docs. * Clearer code paths. * Better pipeline example with exposed post-processing methods * Docstring nit. * Use voxtral cache, cleaner token init, better naming of chunk size. * Add missing docstrings. * Update to official checkpoint. --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com> 2026-03-02 12:29:55 +01:00			`if key == self.audio_input_name:`
			`self.assertTrue(`
			`self.audio_input_name_values in inputs or self.audio_input_name in inputs,`
			`f"Expected either '{self.audio_input_name_values}' or '{self.audio_input_name}' in inputs",`
			`)`
			`else:`
			`self.assertIn(key, inputs)`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00
			`# Test that it raises error when no input is passed`
			`with self.assertRaises((TypeError, ValueError)):`
			`processor()`

Support batch size > 1 image-text inference (#36682) * update make nested image list * fix make flat list of images * update type anno * fix image_processing_smolvlm * use first image * add verbose comment * fix images * rollback * fix ut * Update image_processing_smolvlm.py * Update image_processing_idefics3.py * add tests and fix some processors * fix copies * fix after rebase * make the test cover chat templates * sjip udop, no point in fixing it * fix after rebase * fix a few more tests --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-09-01 20:26:07 +08:00			`def test_processor_text_has_no_visual(self):`
			`"""`
			`Tests that multimodal processors can handle a batch in which some samples have`
			`images/videos and others do not. See https://github.com/huggingface/transformers/issues/40263`
			`"""`
			`processor = self.get_processor()`
			`call_signature = inspect.signature(processor.__call__)`
			`input_args = [param.name for param in call_signature.parameters.values() if param.annotation != param.empty]`

			`if not ("text" in input_args and ("images" in input_args and "videos" in input_args)):`
			`self.skipTest(f"{self.processor_class} doesn't support several vision modalities with text.")`

			`# Prepare inputs and filter by input signature. Make sure to use a high batch size, we'll set some`
			`# samples to text-only later`
			`text = self.prepare_text_inputs(batch_size=3, modalities=["image", "video"])`
			`image_inputs = self.prepare_image_inputs(batch_size=3)`
			`video_inputs = self.prepare_video_inputs(batch_size=3)`
			`inputs_dict = {"text": text, "images": image_inputs, "videos": video_inputs}`
			`inputs_dict = {k: v for k, v in inputs_dict.items() if k in input_args}`

			`processing_kwargs = {"return_tensors": "pt", "padding": True}`
			`if "videos" in inputs_dict:`
			`processing_kwargs["do_sample_frames"] = False`

Fix typos in tests and util (#40780) Fix typos Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> 2025-09-10 19:45:40 +08:00			`# First call processor with all inputs and use nested input type, which is the format supported by all multimodal processors`
Support batch size > 1 image-text inference (#36682) * update make nested image list * fix make flat list of images * update type anno * fix image_processing_smolvlm * use first image * add verbose comment * fix images * rollback * fix ut * Update image_processing_smolvlm.py * Update image_processing_idefics3.py * add tests and fix some processors * fix copies * fix after rebase * make the test cover chat templates * sjip udop, no point in fixing it * fix after rebase * fix a few more tests --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-09-01 20:26:07 +08:00			`image_inputs_nested = [[image] if not isinstance(image, list) else image for image in image_inputs]`
			`video_inputs_nested = [[video] for video in video_inputs]`
			`inputs_dict_nested = {"text": text, "images": image_inputs_nested, "videos": video_inputs_nested}`
			`inputs_dict_nested = {k: v for k, v in inputs_dict_nested.items() if k in input_args}`
			`inputs = processor(**inputs_dict_nested, **processing_kwargs)`
			`self.assertTrue(self.text_input_name in inputs)`

			`# Now call with one of the samples with no associated vision input. Let's set the first input to be a plain text`
			``# with no placeholder tokens and no images/videos. The final format would be `images = [[], [image2], [image3]]` ``
			`plain_text = "lower newer"`
			`image_inputs_nested[0] = []`
			`video_inputs_nested[0] = []`
			`text[0] = plain_text`
			`inputs_dict_no_vision = {"text": text, "images": image_inputs_nested, "videos": video_inputs_nested}`
			`inputs_dict_no_vision = {k: v for k, v in inputs_dict_no_vision.items() if k in input_args}`
			`inputs_nested = processor(**inputs_dict_no_vision, **processing_kwargs)`

			`# Check that text samples are same and are expanded with placeholder tokens correctly. First sample`
			`# has no vision input associated, so we skip it and check it has no vision`
			`self.assertListEqual(`
			`inputs[self.text_input_name][1:].tolist(), inputs_nested[self.text_input_name][1:].tolist()`
			`)`

			`# Now test if we can apply chat templates with no vision inputs in one of the samples`
			`# NOTE: we don't skip the test, as we want the checks above to run even if the processor has no chat template`
			`if processor.chat_template is not None:`
			`messages = [`
			`[`
			`{`
			`"role": "user",`
			`"content": [`
			`{"type": "text", "text": "What is the capital of France?"},`
			`],`
			`},`
			`],`
			`[`
			`{`
			`"role": "user",`
			`"content": [`
			`{"type": "text", "text": "What is the capital of France?"},`
			`{`
			`"type": "image",`
Fetch one missing test data (#40703) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2025-09-04 23:05:23 +02:00			`"url": url_to_local_path(`
			`"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png"`
			`),`
Support batch size > 1 image-text inference (#36682) * update make nested image list * fix make flat list of images * update type anno * fix image_processing_smolvlm * use first image * add verbose comment * fix images * rollback * fix ut * Update image_processing_smolvlm.py * Update image_processing_idefics3.py * add tests and fix some processors * fix copies * fix after rebase * make the test cover chat templates * sjip udop, no point in fixing it * fix after rebase * fix a few more tests --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-09-01 20:26:07 +08:00			`},`
			`],`
			`},`
			`],`
			`]`

			`inputs_chat_template = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=False,`
			`tokenize=True,`
			`return_dict=True,`
			`return_tensors="pt",`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={"padding": True},`
Support batch size > 1 image-text inference (#36682) * update make nested image list * fix make flat list of images * update type anno * fix image_processing_smolvlm * use first image * add verbose comment * fix images * rollback * fix ut * Update image_processing_smolvlm.py * Update image_processing_idefics3.py * add tests and fix some processors * fix copies * fix after rebase * make the test cover chat templates * sjip udop, no point in fixing it * fix after rebase * fix a few more tests --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-09-01 20:26:07 +08:00			`)`
			`self.assertTrue(self.text_input_name in inputs_chat_template)`
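
The nested input layout exercised by this test can be illustrated with plain lists. This is a sketch with placeholder strings instead of real images; `nest_per_sample` is a hypothetical helper name, not a function from the library.

```python
def nest_per_sample(items):
    """Wrap each sample's item in a list unless it is already a list."""
    return [item if isinstance(item, list) else [item] for item in items]


# Three samples: one image, two images, one image (strings stand in for images).
images = ["img_a", ["img_b1", "img_b2"], "img_c"]
nested = nest_per_sample(images)

# Mark the first sample as text-only by giving it an empty list of images.
nested[0] = []
```

After these steps `nested` has one inner list per sample, and the empty inner list signals a sample without vision input.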

add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`# These kwargs-related tests ensure that processors are correctly instantiated.`
			`# They need to be applied only if an image_processor exists.`

			`def skip_processor_without_typed_kwargs(self, processor):`
			`# TODO this signature check is to test only uniformized processors.`
			`# Once all are updated, remove it.`
			`is_kwargs_typed_dict = False`
			`call_signature = inspect.signature(processor.__call__)`
			`for param in call_signature.parameters.values():`
			`if param.kind == param.VAR_KEYWORD and param.annotation != param.empty:`
			`is_kwargs_typed_dict = (`
			`hasattr(param.annotation, "__origin__") and param.annotation.__origin__ == Unpack`
			`)`
			`if not is_kwargs_typed_dict:`
			`self.skipTest(f"{self.processor_class} doesn't have typed kwargs.")`
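
A standalone sketch of the same signature check, for readers who want to try it outside the test mixin. `ToyKwargs`, `typed_call`, `untyped_call`, and `has_typed_kwargs` are hypothetical names; the fallback import assumes `typing_extensions` is installed on Python versions older than 3.11.

```python
import inspect
from typing import TypedDict

try:  # Unpack is part of the standard library from Python 3.11
    from typing import Unpack
except ImportError:  # assumption: typing_extensions is available otherwise
    from typing_extensions import Unpack


class ToyKwargs(TypedDict, total=False):
    padding: bool


def typed_call(text=None, **kwargs: Unpack[ToyKwargs]):
    return kwargs


def untyped_call(text=None, **kwargs):
    return kwargs


def has_typed_kwargs(fn):
    """Return True if fn's **kwargs parameter is annotated as Unpack[...]."""
    for param in inspect.signature(fn).parameters.values():
        if param.kind == param.VAR_KEYWORD and param.annotation is not param.empty:
            return getattr(param.annotation, "__origin__", None) is Unpack
    return False


typed_result = has_typed_kwargs(typed_call)
untyped_result = has_typed_kwargs(untyped_call)
```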

			`def test_tokenizer_defaults_preserved_by_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.image_text_kwargs_max_length, padding="max_length"`
			`)`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor = self.processor_class(processor_components, processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`
			`inputs = processor(text=input_str, images=image_input, return_tensors="pt")`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.image_text_kwargs_max_length)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
			`def test_image_processor_defaults_preserved_by_image_kwargs(self):`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`"""`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`We use do_rescale=True, rescale_factor=-1.0 to ensure that image_processor kwargs are preserved in the processor.`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`We then check that the mean of the pixel_values is less than or equal to 0 after processing.`
			`Since the original pixel_values are in [0, 255], this is a good indicator that the rescale_factor is indeed applied.`
			`"""`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
			`processor_components["image_processor"] = self.get_component(`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`"image_processor", do_rescale=True, rescale_factor=-1.0`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.image_text_kwargs_max_length, padding="max_length"`
			`)`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor = self.processor_class(processor_components, processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`

Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`inputs = processor(text=input_str, images=image_input, return_tensors="pt")`
			`self.assertLessEqual(inputs[self.images_input_name][0][0].mean(), 0)`
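The docstring of `test_image_processor_defaults_preserved_by_image_kwargs` rests on a simple arithmetic fact: raw pixel values are non-negative, so multiplying them by `rescale_factor=-1.0` can only give a mean that is less than or equal to 0. A minimal sketch of that reasoning is below. It uses plain Python lists, and the `rescale` helper and sample image are hypothetical stand-ins, not the image processor the test actually runs.

```python
from statistics import fmean


def rescale(pixels, rescale_factor):
    """Toy stand-in for a rescale step: multiply every pixel by rescale_factor."""
    return [[value * rescale_factor for value in row] for row in pixels]


# Hypothetical 2x3 single-channel image with raw values in [0, 255].
image = [[0, 128, 255], [64, 32, 16]]

rescaled = rescale(image, rescale_factor=-1.0)
mean_value = fmean([value for row in rescaled for value in row])

# Every input is >= 0, so every rescaled value is <= 0, and so is the mean.
assert mean_value <= 0
```

If the default were silently dropped and a positive factor such as 1/255 applied instead, the mean of the same image would be positive, which is why the sign of the mean serves as the indicator.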
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
			`def test_kwargs_overrides_default_tokenizer_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
			`processor_components["tokenizer"] = self.get_component("tokenizer", padding="longest")`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor = self.processor_class(processor_components, processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`
Modify ProcessorTesterMixin for better generalization (#32637) * Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs * remove crop_size argument in align processor tests to be coherent with base tests * Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino 2024-08-13 11:48:53 -04:00			`inputs = processor(`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`text=input_str,`
			`images=image_input,`
			`return_tensors="pt",`
			`max_length=self.image_text_kwargs_override_max_length,`
			`padding="max_length",`
Modify ProcessorTesterMixin for better generalization (#32637) * Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs * remove crop_size argument in align processor tests to be coherent with base tests * Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino 2024-08-13 11:48:53 -04:00			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.image_text_kwargs_override_max_length)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
			`def test_kwargs_overrides_default_image_processor_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
			`processor_components["image_processor"] = self.get_component(`
			`"image_processor", do_rescale=True, rescale_factor=1`
			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.image_text_kwargs_max_length, padding="max_length"`
			`)`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`

Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`inputs = processor(`
			`text=input_str, images=image_input, do_rescale=True, rescale_factor=-1.0, return_tensors="pt"`
			`)`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`self.assertLessEqual(inputs[self.images_input_name][0][0].mean(), 0)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
			`def test_unstructured_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`
			`inputs = processor(`
			`text=input_str,`
			`images=image_input,`
			`return_tensors="pt",`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`do_rescale=True,`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`rescale_factor=-1.0,`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`padding="max_length",`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`max_length=self.image_unstructured_max_length,`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`)`

Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`self.assertLessEqual(inputs[self.images_input_name][0][0].mean(), 0)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.image_unstructured_max_length)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00
			`def test_unstructured_kwargs_batched(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=2, modalities="image")`
Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor 2024-10-01 23:52:03 +02:00			`image_input = self.prepare_image_inputs(batch_size=2)`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`inputs = processor(`
			`text=input_str,`
			`images=image_input,`
			`return_tensors="pt",`
Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`do_rescale=True,`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`rescale_factor=-1.0,`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`padding="longest",`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`max_length=self.image_unstructured_max_length,`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`)`

Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`self.assertLessEqual(inputs[self.images_input_name][0][0].mean(), 0)`
			`self.assertTrue(`
			`len(inputs[self.text_input_name][0]) == len(inputs[self.text_input_name][1])`
Add processing tests for phi4 multimodal (#44234) 2026-02-23 17:08:11 -05:00			`and len(inputs[self.text_input_name][1]) < self.image_unstructured_max_length`
Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`)`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00
			`def test_doubly_passed_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) 2025-08-25 14:50:54 +02:00			`input_str = [self.prepare_text_inputs(modalities="image")]`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`
			`with self.assertRaises(ValueError):`
			`_ = processor(`
			`text=input_str,`
			`images=image_input,`
Validate processing kwargs with @strict from huggingface_hub (#40793) 2025-10-08 16:14:09 +02:00			`images_kwargs={"do_rescale": True, "rescale_factor": -1.0},`
Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`do_rescale=True,`
			`return_tensors="pt",`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`)`

Fix multimodal processor get duplicate arguments when receive kwargs for initialization (#39125) * fix processor tokenizer override Signed-off-by: Isotr0py <2037008807@qq.com> * code format Signed-off-by: Isotr0py <2037008807@qq.com> * add regression test Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * check image processor same Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com> 2025-07-02 19:57:15 +08:00			`def test_args_overlap_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
Fix multimodal processor get duplicate arguments when receive kwargs for initialization (#39125) 2025-07-02 19:57:15 +08:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
			`processor_first = self.get_processor()`
			`image_processor = processor_first.image_processor`
			`image_processor.is_override = True`

			`with tempfile.TemporaryDirectory() as tmpdirname:`
			`processor_first.save_pretrained(tmpdirname)`
			`processor_second = self.processor_class.from_pretrained(tmpdirname, image_processor=image_processor)`
			`self.assertTrue(processor_second.image_processor.is_override)`

add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`def test_structured_kwargs_nested(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`

			`# Define the kwargs for each modality`
			`all_kwargs = {`
			`"common_kwargs": {"return_tensors": "pt"},`
Validate processing kwargs with @strict from huggingface_hub (#40793) 2025-10-08 16:14:09 +02:00			`"images_kwargs": {"do_rescale": True, "rescale_factor": -1.0},`
Add processing tests for phi4 multimodal (#44234) 2026-02-23 17:08:11 -05:00			`"text_kwargs": {"padding": "max_length", "max_length": self.image_unstructured_max_length},`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00			`}`

			`inputs = processor(text=input_str, images=image_input, **all_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Add support for args to ProcessorMixin for backward compatibility (#33479) 2024-09-20 11:40:59 -04:00			`self.assertLessEqual(inputs[self.images_input_name][0][0].mean(), 0)`
Add processing tests for phi4 multimodal (#44234) 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.image_unstructured_max_length)`
add initial design for uniform processors + align model (#31197) 2024-06-13 16:27:16 +02:00
			`def test_structured_kwargs_nested_from_dict(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`processor_components = self.prepare_components()`
VLMs: major clean up 🧼 (#34502) only lllava models are modified 2025-01-08 10:35:23 +01:00			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`self.skip_processor_without_typed_kwargs(processor)`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`image_input = self.prepare_image_inputs()`

			`# Define the kwargs for each modality`
			`all_kwargs = {`
			`"common_kwargs": {"return_tensors": "pt"},`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`"images_kwargs": {"do_rescale": True, "rescale_factor": -1.0},`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`"text_kwargs": {"padding": "max_length", "max_length": self.image_unstructured_max_length},`
add initial design for uniform processors + align model (#31197) * add initial design for uniform processors + align model * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * expand VideoInput * fix * fix style * remove defaults values * add comment to indicate documentation on adding kwargs * protect imports * [run-slow]align * fix * remove set() that breaks ordering * test more * removed unused func * [run-slow]align 2024-06-13 16:27:16 +02:00			`}`

			`inputs = processor(text=input_str, images=image_input, **all_kwargs)`
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00			`self.assertLessEqual(inputs[self.images_input_name][0][0].mean(), 0)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.image_unstructured_max_length)`
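The nested-dict call in `test_structured_kwargs_nested_from_dict` groups arguments by modality (`common_kwargs`, `images_kwargs`, `text_kwargs`). Below is a minimal sketch of how such a nested dict could be merged with per-modality defaults. It is not the library's actual merging logic: `merge_nested_kwargs`, `MODALITY_KEYS`, the default values, and the `max_length` of 76 are all hypothetical and chosen only for illustration.

```python
from copy import deepcopy

# Hypothetical set of modality groups, used only for this illustration.
MODALITY_KEYS = ("text_kwargs", "images_kwargs", "audio_kwargs")


def merge_nested_kwargs(defaults, call_kwargs):
    """Merge per-modality defaults with nested kwargs supplied at call time."""
    merged = {key: deepcopy(defaults.get(key, {})) for key in MODALITY_KEYS}
    # Call-time values for a modality override that modality's defaults.
    for key in MODALITY_KEYS:
        merged[key].update(call_kwargs.get(key, {}))
    # Common kwargs (such as return_tensors) are copied into every modality.
    for key in MODALITY_KEYS:
        merged[key].update(call_kwargs.get("common_kwargs", {}))
    return merged


# Assumed defaults; a real processor would define its own.
defaults = {"text_kwargs": {"padding": False}, "images_kwargs": {"do_rescale": True}}
call_kwargs = {
    "common_kwargs": {"return_tensors": "pt"},
    "images_kwargs": {"do_rescale": True, "rescale_factor": -1.0},
    "text_kwargs": {"padding": "max_length", "max_length": 76},  # 76 is arbitrary
}
merged = merge_nested_kwargs(defaults, call_kwargs)
print(merged["text_kwargs"])
print(merged["images_kwargs"])
```

In this sketch the text group ends up with the call-time `padding` and `max_length` plus the shared `return_tensors`. The test's assertion that the image mean is at most 0 presumably relies on the negative `rescale_factor` turning non-negative pixel values non-positive.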
add uniform processors for altclip + chinese_clip (#31198) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * remove try/catch block * deprecate kwarg * format * add copyright + remove unused method * [run-slow]altclip, chinese_clip * clean imports * fix version * clean up deprecation * fix style * add corner case test on kwarg overlap * resume processing - add Unpack as importable * add tmpdirname * fix altclip * fix up * add back crop_size to specific tests * generalize tests to possible video_processor * add back crop_size arg * fixup overlapping kwargs test for qformer_tokenizer * remove copied from * fixup chinese_clip tests values * fixup tests - qformer tokenizers * [run-slow] altclip, chinese_clip * remove prepare_image_inputs 2024-09-19 17:21:54 +02:00
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`# text + audio kwargs testing`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`@require_torch`
			`def test_tokenizer_defaults_preserved_by_kwargs_audio(self):`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`or "audio_processor" not in self.processor_class.get_attributes()`
			`):`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skipTest(f"feature_extractor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.audio_text_kwargs_max_length, padding="max_length"`
			`)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`processor_kwargs = self.prepare_processor_dict()`

Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`processor = self.processor_class(**processor_components, **processor_kwargs)`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skip_processor_without_typed_kwargs(processor)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=3, modalities="audio")`
[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`raw_speech = self.prepare_audio_inputs(batch_size=3)`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`inputs = processor(text=input_str, audio=raw_speech, return_tensors="pt")`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(len(inputs[self.text_input_name][0]), self.audio_text_kwargs_max_length)`
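The audio tests around this point check two behaviours: defaults configured on the tokenizer survive when the call passes no text kwargs, and kwargs passed at call time take precedence when present. Below is a small self-contained sketch of that precedence rule. `ToyTokenizer` is a hypothetical stand-in and does not reflect how real tokenizers encode text or store their defaults.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer that stores call-time defaults."""

    def __init__(self, max_length=8, padding="max_length", pad_id=0):
        self.init_kwargs = {"max_length": max_length, "padding": padding}
        self.pad_id = pad_id

    def __call__(self, text, **kwargs):
        # Call-time kwargs win; anything not passed falls back to init defaults.
        opts = {**self.init_kwargs, **kwargs}
        # Toy encoding: one id per character, truncated to max_length.
        ids = [ord(ch) for ch in text][: opts["max_length"]]
        if opts["padding"] == "max_length":
            ids += [self.pad_id] * (opts["max_length"] - len(ids))
        return ids


tok = ToyTokenizer(max_length=8)
default_ids = tok("hi")                 # no kwargs: init defaults are preserved
override_ids = tok("hi", max_length=4)  # call-time kwarg overrides the default
print(len(default_ids), len(override_ids))
```

The first call is padded to the length set at construction, and the second to the length passed at call time, which is the distinction the two test names draw.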
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00
			`@require_torch`
			`def test_kwargs_overrides_default_tokenizer_kwargs_audio(self):`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`or "audio_processor" not in self.processor_class.get_attributes()`
			`):`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skipTest(f"feature_extractor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.audio_processor_tester_max_length`
			`)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`processor_kwargs = self.prepare_processor_dict()`

Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`processor = self.processor_class(**processor_components, **processor_kwargs)`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skip_processor_without_typed_kwargs(processor)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=3, modalities="audio")`
[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`raw_speech = self.prepare_audio_inputs(batch_size=3)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`inputs = processor(`
			`text=input_str,`
			`audio=raw_speech,`
			`return_tensors="pt",`
			`max_length=self.audio_text_kwargs_max_length,`
			`padding="max_length",`
			`)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(len(inputs[self.text_input_name][0]), self.audio_text_kwargs_max_length)`
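The assertion above checks that a `max_length` passed at call time, together with `padding="max_length"`, wins over the `max_length` the tokenizer component was built with. A minimal sketch of that precedence rule follows. `ToyTokenizer` is a hypothetical stand-in written for illustration only; it is not the transformers tokenizer API, and its word-length "token ids" are an arbitrary choice.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; NOT the transformers API."""

    def __init__(self, max_length=8, pad_id=0):
        # Default set when the component is constructed.
        self.max_length = max_length
        self.pad_id = pad_id

    def __call__(self, texts, max_length=None, padding=None):
        # A call-time max_length overrides the construction-time default.
        limit = self.max_length if max_length is None else max_length
        batch = []
        for text in texts:
            # Fake "token ids": one id per word, truncated to the limit.
            ids = [len(word) for word in text.split()][:limit]
            if padding == "max_length":
                ids = ids + [self.pad_id] * (limit - len(ids))
            batch.append(ids)
        return {"input_ids": batch}


tokenizer = ToyTokenizer(max_length=8)
inputs = tokenizer(
    ["lower newer", "upper older longer string"],
    max_length=5,
    padding="max_length",
)
row_lengths = [len(row) for row in inputs["input_ids"]]
default_lengths = [len(row) for row in tokenizer(["lower newer"], padding="max_length")["input_ids"]]
```

Every row in `row_lengths` has the call-time length, while `default_lengths` falls back to the construction-time value, which mirrors what the test expects from a real processor.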
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00
			`@require_torch`
			`def test_unstructured_kwargs_audio(self):`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`and "audio_processor" not in self.processor_class.get_attributes()`
			`):`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skipTest(f"feature_extractor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`processor_components = self.prepare_components()`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`processor_kwargs = self.prepare_processor_dict()`

Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`processor = self.processor_class(**processor_components, **processor_kwargs)`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=3, modalities="audio")`
[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`raw_speech = self.prepare_audio_inputs(batch_size=3)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`inputs = processor(`
			`text=input_str,`
			`audio=raw_speech,`
			`return_tensors="pt",`
			`max_length=self.audio_text_kwargs_max_length,`
			`padding="max_length",`
			`)`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(len(inputs[self.text_input_name][0]), self.audio_text_kwargs_max_length)`
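The audio tests open with a guard that skips processors lacking an audio component. A small sketch of that guard as a standalone predicate is shown here, assuming the intent is to skip only when neither `feature_extractor` nor `audio_processor` is among the processor's attributes. The helper name `should_skip_audio_test` is hypothetical and does not exist in the test mixin.

```python
def should_skip_audio_test(attributes):
    """Return True when no audio-capable component is present.

    `attributes` is any iterable of attribute names, such as the list
    returned by a processor class's `get_attributes()`.
    Hypothetical helper for illustration; not part of the test suite.
    """
    audio_attributes = ("feature_extractor", "audio_processor")
    present = set(attributes)
    # Skip only if BOTH names are missing; one of them is enough to run.
    return not any(name in present for name in audio_attributes)


skip_vision_only = should_skip_audio_test(["image_processor", "tokenizer"])
skip_feature_extractor = should_skip_audio_test(["feature_extractor", "tokenizer"])
skip_audio_processor = should_skip_audio_test(["image_processor", "audio_processor", "tokenizer"])
```

Joining the two membership checks with `or` instead would skip every processor that exposes only one of the two names, so the audio tests would effectively never run.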
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00
			`@require_torch`
			`def test_doubly_passed_kwargs_audio(self):`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`and "audio_processor" not in self.processor_class.get_attributes()`
			`):`
Uniformize model processors (#31368) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * :broom: * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-10-02 10:41:08 +02:00			`self.skipTest(f"feature_extractor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`processor_components = self.prepare_components()`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`processor_kwargs = self.prepare_processor_dict()`

Simplify and standardize processor tests (#41773) 2025-11-26 12:40:37 -05:00			`processor = self.processor_class(**processor_components, **processor_kwargs)`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=3, modalities="audio")`
[processor] move commonalities to mixin (#40339) * move commonalities to mixin * revert - unrelated * fix copies * fix style * comments 2025-08-22 13:04:43 +02:00			`raw_speech = self.prepare_audio_inputs(batch_size=3)`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00			`with self.assertRaises(ValueError):`
			`_ = processor(`
			`text=input_str,`
			`audio=raw_speech,`
Support `return_tensors` in audio chat templates (#34601) 2025-03-25 11:08:47 +01:00			`text_kwargs={"padding": "max_length"},`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00			`padding="max_length",`
			`)`

			`@require_torch`
			`@require_vision`
			`def test_structured_kwargs_audio_nested(self):`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`and "audio_processor" not in self.processor_class.get_attributes()`
			`):`
			`self.skipTest(f"feature_extractor or audio_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
Add processing tests for phi4 multimodal (#44234) 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.audio_processor_tester_max_length`
			`)`
Support `return_tensors` in audio chat templates (#34601) 2025-03-25 11:08:47 +01:00			`processor_kwargs = self.prepare_processor_dict()`

Simplify and standardize processor tests (#41773) 2025-11-26 12:40:37 -05:00			`processor = self.processor_class(**processor_components, **processor_kwargs)`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=3, modalities="audio")`
[processor] move commonalities to mixin (#40339) 2025-08-22 13:04:43 +02:00			`raw_speech = self.prepare_audio_inputs(batch_size=3)`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00
			`# Define the kwargs for each modality`
			`all_kwargs = {`
			`"common_kwargs": {"return_tensors": "pt"},`
Add processing tests for phi4 multimodal (#44234) 2026-02-23 17:08:11 -05:00			`"text_kwargs": {"padding": "max_length", "max_length": self.audio_unstructured_max_length},`
			`"audio_kwargs": {"padding": "max_length", "max_length": self.audio_text_kwargs_max_length},`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00			`}`

			`inputs = processor(text=input_str, audio=raw_speech, **all_kwargs)`
Add processing tests for phi4 multimodal (#44234) 2026-02-23 17:08:11 -05:00			`self.assertEqual(len(inputs[self.text_input_name][0]), self.audio_unstructured_max_length)`
Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`def test_tokenizer_defaults_preserved_by_kwargs_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_components = self.prepare_components()`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.video_text_kwargs_max_length, padding="max_length"`
			`)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_kwargs = self.prepare_processor_dict()`

			`processor = self.processor_class(processor_components, processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`inputs = processor(text=input_str, videos=video_input, do_sample_frames=False, return_tensors="pt")`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.video_text_kwargs_max_length)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
			`def test_video_processor_defaults_preserved_by_video_kwargs(self):`
			`"""`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`We use do_rescale=True, rescale_factor=-1.0 to ensure that video_processor kwargs are preserved in the processor.`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`We then check that the mean of the pixel_values is less than or equal to 0 after processing.`
			`Since the original pixel_values are in [0, 255], this is a good indicator that the rescale_factor is indeed applied.`
			`"""`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_components = self.prepare_components()`
			`processor_components["video_processor"] = self.get_component(`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`"video_processor", do_rescale=True, rescale_factor=-1.0`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.video_text_kwargs_max_length, padding="max_length"`
			`)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_kwargs = self.prepare_processor_dict()`

			`processor = self.processor_class(processor_components, processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`

Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`inputs = processor(text=input_str, videos=video_input, do_sample_frames=False, return_tensors="pt")`
🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? 
* don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`self.assertLessEqual(inputs[self.videos_input_name][0].mean(), 0)`
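
The `rescale_factor=-1.0` check in the test above relies on simple arithmetic: pixel values in [0, 255] multiplied by a negative factor cannot have a positive mean. The sketch below illustrates that reasoning in plain Python. The `rescale` and `mean` helpers are hypothetical stand-ins written for this illustration and are not part of the Transformers API.

```python
import random


def rescale(pixels, rescale_factor):
    """Multiply every pixel by a scalar factor (hypothetical stand-in for a rescale step)."""
    return [p * rescale_factor for p in pixels]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


random.seed(0)
# Fake "video" pixels drawn from the usual uint8 range [0, 255].
pixels = [random.randint(0, 255) for _ in range(1024)]

# A sentinel factor of -1.0 flips the sign, so the mean cannot be positive.
negated = rescale(pixels, rescale_factor=-1.0)

# A conventional factor such as 1/255 keeps the mean non-negative,
# so a non-positive mean signals that the sentinel factor was applied.
conventional = rescale(pixels, rescale_factor=1 / 255)
```

A mean at or below zero is therefore a cheap signal that the configured factor reached the rescale step, provided no later step (such as normalization) shifts the values back.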
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
			`def test_kwargs_overrides_default_tokenizer_kwargs_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_components = self.prepare_components()`
			`processor_components["tokenizer"] = self.get_component("tokenizer", padding="longest")`
			`processor_kwargs = self.prepare_processor_dict()`

			`processor = self.processor_class(processor_components, processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`
			`inputs = processor(`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`text=input_str,`
			`videos=video_input,`
			`do_sample_frames=False,`
			`return_tensors="pt",`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`max_length=self.video_text_kwargs_override_max_length,`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`padding="max_length",`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.video_text_kwargs_override_max_length)`
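
The override test above expects a `max_length` passed at call time to take precedence over the tokenizer's own default. A minimal sketch of that precedence rule follows, assuming a plain dictionary merge. `merge_kwargs` and `pad_to_max_length` are hypothetical helpers for illustration only and do not reproduce how the library merges kwargs internally.

```python
def merge_kwargs(component_defaults, call_kwargs):
    """Return the defaults updated with call-time kwargs; call-time values win."""
    merged = dict(component_defaults)  # copy so the defaults stay untouched
    merged.update(call_kwargs)
    return merged


def pad_to_max_length(token_ids, max_length, pad_id=0):
    """Pad (or truncate) a list of token ids to exactly max_length entries."""
    return (token_ids + [pad_id] * max_length)[:max_length]


# Defaults a tokenizer component might have been built with (hypothetical values).
tokenizer_defaults = {"padding": "longest", "max_length": 167}

# Kwargs supplied when calling the processor (hypothetical values).
call_kwargs = {"padding": "max_length", "max_length": 162}

effective = merge_kwargs(tokenizer_defaults, call_kwargs)
padded = pad_to_max_length([101, 2023, 102], effective["max_length"])
```

With these illustrative values, the padded sequence length follows the call-time `max_length` rather than the default, which is the property the test asserts on the last dimension of the text input.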
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
			`def test_kwargs_overrides_default_video_processor_kwargs(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
Simplify and standardize processor tests (#41773) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Standardize mgp_str tests * fix after review 2025-11-26 12:40:37 -05:00			`if "tokenizer" not in self.processor_class.get_attributes():`
			`self.skipTest(f"tokenizer attribute not present in {self.processor_class}")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_components = self.prepare_components()`
			`processor_components["video_processor"] = self.get_component(`
			`"video_processor", do_rescale=True, rescale_factor=1`
			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`processor_components["tokenizer"] = self.get_component(`
			`"tokenizer", max_length=self.video_text_kwargs_max_length, padding="max_length"`
			`)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`processor_kwargs = self.prepare_processor_dict()`

			`processor = self.processor_class(processor_components, processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`

Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`inputs = processor(`
			`text=input_str,`
			`videos=video_input,`
			`do_sample_frames=False,`
			`do_rescale=True,`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`rescale_factor=-1.0,`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`return_tensors="pt",`
			`)`
🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? 
* don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`self.assertLessEqual(inputs[self.videos_input_name][0].mean(), 0)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
			`def test_unstructured_kwargs_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(processor_components, processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`
			`inputs = processor(`
			`text=input_str,`
			`videos=video_input,`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`do_sample_frames=False,`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`return_tensors="pt",`
			`do_rescale=True,`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`rescale_factor=-1.0,`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`padding="max_length",`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`max_length=self.video_unstructured_max_length,`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`)`

🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? 
* don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`self.assertLessEqual(inputs[self.videos_input_name][0].mean(), 0)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.video_unstructured_max_length)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
			`def test_unstructured_kwargs_batched_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=2, modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs(batch_size=2)`
			`inputs = processor(`
			`text=input_str,`
			`videos=video_input,`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`do_sample_frames=False,`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`return_tensors="pt",`
			`do_rescale=True,`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`rescale_factor=-1.0,`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`padding="longest",`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`max_length=self.video_unstructured_max_length,`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`)`

🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? 
* don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`self.assertLessEqual(inputs[self.videos_input_name][0].mean(), 0)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.assertTrue(`
			`len(inputs[self.text_input_name][0]) == len(inputs[self.text_input_name][1])`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`and len(inputs[self.text_input_name][1]) < self.video_unstructured_max_length`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`)`

			`def test_doubly_passed_kwargs_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = [self.prepare_text_inputs(modalities="video")]`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`
			`with self.assertRaises(ValueError):`
			`_ = processor(`
			`text=input_str,`
			`videos=video_input,`
Update Glm4V processor and add tests (#39988) * update GLm4V and add tests * Update tests/models/glm4v/test_processor_glm4v.py Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * remove min/max pixels for BC * fix video tests --------- Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> 2025-08-12 13:40:54 +02:00			`do_sample_frames=False,`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`videos_kwargs={"do_rescale": True, "rescale_factor": -1.0},`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`do_rescale=True,`
			`return_tensors="pt",`
			`)`

			`def test_structured_kwargs_nested_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`

			`# Define the kwargs for each modality`
			`all_kwargs = {`
			`"common_kwargs": {"return_tensors": "pt"},`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`"videos_kwargs": {"do_rescale": True, "rescale_factor": -1.0, "do_sample_frames": False},`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`"text_kwargs": {"padding": "max_length", "max_length": self.video_unstructured_max_length},`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`}`

			`inputs = processor(text=input_str, videos=video_input, **all_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`

🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? 
* don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`self.assertLessEqual(inputs[self.videos_input_name][0].mean(), 0)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.video_unstructured_max_length)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
			`def test_structured_kwargs_nested_from_dict_video(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "video_processor" not in self.processor_class.get_attributes():`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`self.skipTest(f"video_processor attribute not present in {self.processor_class}")`
			`processor_components = self.prepare_components()`
			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
			`self.skip_processor_without_typed_kwargs(processor)`
Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="video")`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`video_input = self.prepare_video_inputs()`

			`# Define the kwargs for each modality`
			`all_kwargs = {`
			`"common_kwargs": {"return_tensors": "pt"},`
Validate processing kwargs with @strict from huggingface_hub (#40793) * initial design draft * delete * fix a few tests * fix * fix the rest of tests * common-kwargs * why the runner complains about typing with "\|"? * revert * forgot to delete * update * fix last issues * add more detalis in docs * pin the latest hub release * fix tests for new models * also fast image processor * fix copies * image processing ast validated * fix more tests * typo.and fix copies * bump * style * fix some tests * fix copies * pin rc4 and mark all TypedDict as non-total * delete typed dict adaptor * address comments * delete optionals 2025-10-08 16:14:09 +02:00			`"videos_kwargs": {"do_rescale": True, "rescale_factor": -1.0, "do_sample_frames": False},`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`"text_kwargs": {"padding": "max_length", "max_length": self.video_unstructured_max_length},`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00			`}`

			`inputs = processor(text=input_str, videos=video_input, **all_kwargs)`
🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? 
* don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`self.assertLessEqual(inputs[self.videos_input_name][0].mean(), 0)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(inputs[self.text_input_name].shape[-1], self.video_unstructured_max_length)`
Uniformize LlavaNextVideoProcessor kwargs (#35613) * Uniformize processor kwargs and add tests * add videos_kwargs tests * fix copies * fix llava_next_video chat template tests * remove unnecessary default kwargs 2025-02-18 14:13:51 -05:00
add uniform processors for altclip + chinese_clip (#31198) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * remove try/catch block * deprecate kwarg * format * add copyright + remove unused method * [run-slow]altclip, chinese_clip * clean imports * fix version * clean up deprecation * fix style * add corner case test on kwarg overlap * resume processing - add Unpack as importable * add tmpdirname * fix altclip * fix up * add back crop_size to specific tests * generalize tests to possible video_processor * add back crop_size arg * fixup overlapping kwargs test for qformer_tokenizer * remove copied from * fixup chinese_clip tests values * fixup tests - qformer tokenizers * [run-slow] altclip, chinese_clip * remove prepare_image_inputs 2024-09-19 17:21:54 +02:00			`# TODO: the same test, but for audio + text processors that have strong overlap in kwargs`
			`# TODO (molbap) use the same structure of attribute kwargs for other tests to avoid duplication`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`def test_overlapping_text_image_kwargs_handling(self):`
[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if "image_processor" not in self.processor_class.get_attributes():`
add uniform processors for altclip + chinese_clip (#31198) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * remove try/catch block * deprecate kwarg * format * add copyright + remove unused method * [run-slow]altclip, chinese_clip * clean imports * fix version * clean up deprecation * fix style * add corner case test on kwarg overlap * resume processing - add Unpack as importable * add tmpdirname * fix altclip * fix up * add back crop_size to specific tests * generalize tests to possible video_processor * add back crop_size arg * fixup overlapping kwargs test for qformer_tokenizer * remove copied from * fixup chinese_clip tests values * fixup tests - qformer tokenizers * [run-slow] altclip, chinese_clip * remove prepare_image_inputs 2024-09-19 17:21:54 +02:00			`self.skipTest(f"image_processor attribute not present in {self.processor_class}")`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00
Uniformize kwargs for image-text-to-text processors (#32544) * uniformize FUYU processor kwargs * Uniformize instructblip processor kwargs * Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2 * Uniformize llava_next processor * Fix save_load test for processor with chat_template only as extra init args * Fix import Unpack * Fix Fuyu Processor import * Fix FuyuProcessor import * Fix FuyuProcessor * Add defaults for specific kwargs kosmos2 * Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs * Add tests processor Udop * remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature * Fix overwrite tests kwargs processors * Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop * Fix processing test fuyu * remove unnecessary pad_token check in instructblip ProcessorTest * Fix BC tests and cleanup * FIx imports fuyu * Uniformize Pix2Struct * Fix wrong name for FuyuProcessorKwargs * Fix slow tests reversed inputs align fuyu llava-next, change udop warning * Fix wrong logging import udop * Add check images text input order * Fix copies * change text pair handling when positional arg * rebase on main, fix imports in test_processing_common * remove optional args and udop uniformization from this PR * fix failing tests * remove unnecessary test, fix processing utils and test processing common * cleanup Unpack * cleanup * fix conflict grounding dino 2024-09-24 21:28:19 -04:00			`processor_components = self.prepare_components()`
Add new model LFM2-VL (#40624) * Add LFM2-VL support * add tests * linting, formatting, misc review changes * add siglip2 to auto config and instantiate it in lfm2-vl configuration * decouple image processor from processor * remove torch import from configuration * replace \| with Optional * remove layer truncation from modeling file * fix copies * update everything * fix test case to use tiny model * update the test cases * fix finally the image processor and add slow tests * fixup * typo in docs * fix tests * the doc name uses underscore * address comments from Yoni * delete tests and unsuffling * relative import * do we really handle imports better now? * fix test * slow tests * found a bug in ordering + slow tests * fix copies * dont run compile test --------- Co-authored-by: Anna <anna@liquid.ai> Co-authored-by: Anna Banaszak <48625325+ankke@users.noreply.github.com> 2025-09-18 13:01:58 +02:00			`processor_kwargs = self.prepare_processor_dict()`
			`processor = self.processor_class(**processor_components, **processor_kwargs)`
add uniform processors for altclip + chinese_clip (#31198) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * remove try/catch block * deprecate kwarg * format * add copyright + remove unused method * [run-slow]altclip, chinese_clip * clean imports * fix version * clean up deprecation * fix style * add corner case test on kwarg overlap * resume processing - add Unpack as importable * add tmpdirname * fix altclip * fix up * add back crop_size to specific tests * generalize tests to possible video_processor * add back crop_size arg * fixup overlapping kwargs test for qformer_tokenizer * remove copied from * fixup chinese_clip tests values * fixup tests - qformer tokenizers * [run-slow] altclip, chinese_clip * remove prepare_image_inputs 2024-09-19 17:21:54 +02:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(modalities="image")`
add uniform processors for altclip + chinese_clip (#31198) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * fix mutable default :eyes: * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. 
* Raise if kwargs are specified in two places * fix * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * remove try/catch block * deprecate kwarg * format * add copyright + remove unused method * [run-slow]altclip, chinese_clip * clean imports * fix version * clean up deprecation * fix style * add corner case test on kwarg overlap * resume processing - add Unpack as importable * add tmpdirname * fix altclip * fix up * add back crop_size to specific tests * generalize tests to possible video_processor * add back crop_size arg * fixup overlapping kwargs test for qformer_tokenizer * remove copied from * fixup chinese_clip tests values * fixup tests - qformer tokenizers * [run-slow] altclip, chinese_clip * remove prepare_image_inputs 2024-09-19 17:21:54 +02:00			`image_input = self.prepare_image_inputs()`

			`with self.assertRaises(ValueError):`
			`_ = processor(`
			`text=input_str,`
			`images=image_input,`
			`return_tensors="pt",`
			`padding="max_length",`
			`text_kwargs={"padding": "do_not_pad"},`
			`)`
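The conflict this test expects can be pictured with a minimal, standalone sketch. The helper name `merge_text_kwargs` and its error message are hypothetical and are not the library's implementation; the sketch only assumes that a key supplied both as a flat keyword and inside the nested `text_kwargs` dict is treated as ambiguous and rejected with a `ValueError`.

```python
def merge_text_kwargs(top_level: dict, text_kwargs: dict) -> dict:
    """Merge flat kwargs with a nested ``text_kwargs`` dict, rejecting overlaps.

    Hypothetical helper for illustration only, not the transformers code path.
    """
    overlap = sorted(set(top_level) & set(text_kwargs))
    if overlap:
        raise ValueError(
            f"Keyword argument(s) {overlap} were given both at the top level and inside text_kwargs."
        )
    return {**top_level, **text_kwargs}


# A key given in only one place merges cleanly.
merged = merge_text_kwargs({"padding": "max_length"}, {"max_length": 16})

# The same key in both places is ambiguous, so it is rejected.
try:
    merge_text_kwargs({"padding": "max_length"}, {"padding": "do_not_pad"})
    conflict_detected = False
except ValueError:
    conflict_detected = True
```

Raising instead of silently choosing one value keeps the caller's intent explicit, which is the behaviour the `assertRaises(ValueError)` block above checks for.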
Add support for args to ProcessorMixin for backward compatibility (#33479) * add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin * change size and crop_size in processor kwargs tests to do_rescale and rescale_factor * remove unnecessary llava processor kwargs test overwrite * nit * change data_arg_name to input_name * Remove unnecessary test override * Remove unnecessary tests Paligemma * Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring 2024-09-20 11:40:59 -04:00
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`def test_overlapping_text_audio_kwargs_handling(self):`
			`"""`
			Checks that `padding`, or any other argument shared by the audio feature extractor and the tokenizer,
			`is passed only to the tokenizer and ignored for audio, for backward compatibility (BC)`
			`"""`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`and "audio_processor" not in self.processor_class.get_attributes()`
			`):`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`self.skipTest(f"feature_extractor attribute not present in {self.processor_class}")`

Add Qwen2.5-Omni (#36752) * Add qwen2.5-omni * Remove einops dependency * Add torchdiffeq dependency * Sort init * Add torchdiffeq to extras['diffeq'] * Fix repo consistency * use cached_file * del odeint * renew pytest * format * Remove torchdiffeq * format * fixed batch infer bug * Change positional_embedding to parameter * Change default speaker * Config revision * Use modular & code clean * code clean * decouple padding with model & code cleaning * sort init * fix * fix * Second code review * fix * fix * rename vars to full name + some comments * update pytest * Code clean & fix * fix * style * more clean up * fixup * smaller vision model in tests * fix processor test * deflake a bit the tests (still flaky though) * de-flake tests finally + add generation mixin * final nits i hope * make sure processor tests are complete * replace with Qwen2_5OmniForConditionalGeneration * fix tests after updating ckpt * fix typos when cleaning, also we can't change ckpt * fixup * images and videos kwargs for processor * thinker and talker loadable from hub ckpt * address comments and update tests after rebase * fixup * skip for now * fixup * fixup * remove torch dependency in processors --------- Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con> Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-04-14 18:36:41 +08:00			`processor_components = self.prepare_components()`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`processor_kwargs = self.prepare_processor_dict()`
Add Qwen2.5-Omni (#36752) * Add qwen2.5-omni * Remove einops dependency * Add torchdiffeq dependency * Sort init * Add torchdiffeq to extras['diffeq'] * Fix repo consistency * use cached_file * del odeint * renew pytest * format * Remove torchdiffeq * format * fixed batch infer bug * Change positional_embedding to parameter * Change default speaker * Config revision * Use modular & code clean * code clean * decouple padding with model & code cleaning * sort init * fix * fix * Second code review * fix * fix * rename vars to full name + some comments * update pytest * Code clean & fix * fix * style * more clean up * fixup * smaller vision model in tests * fix processor test * deflake a bit the tests (still flaky though) * de-flake tests finally + add generation mixin * final nits i hope * make sure processor tests are complete * replace with Qwen2_5OmniForConditionalGeneration * fix tests after updating ckpt * fix typos when cleaning, also we can't change ckpt * fixup * images and videos kwargs for processor * thinker and talker loadable from hub ckpt * address comments and update tests after rebase * fixup * skip for now * fixup * fixup * remove torch dependency in processors --------- Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con> Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-04-14 18:36:41 +08:00			`processor = self.processor_class(processor_components, processor_kwargs)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`self.skip_processor_without_typed_kwargs(processor)`

Fix processing tests (#40379) * fix tests * skip failing test in generation as well * grounding dino was overwritten * one more overwritten code * clear comment 2025-08-25 14:50:54 +02:00			`input_str = self.prepare_text_inputs(batch_size=3, modalities="audio")`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00			`audio_lengths = [4000, 8000, 16000, 32000]`
			`raw_speech = [np.asarray(audio)[:length] for audio, length in zip(floats_list((3, 32_000)), audio_lengths)]`

			`# padding=True should not raise an error; it would if the audio processor popped the value, leaving it as None`
			`_ = processor(text=input_str, audio=raw_speech, padding=True, return_tensors="pt")`

Separate chat templates into a single file (#33957) * Initial draft * Add .jinja file loading for processors * Add processor saving of naked chat template files * make fixup * Add save-load test for tokenizers * Add save-load test for tokenizers * stash commit * Try popping the file * make fixup * Pop the arg correctly * Pop the arg correctly * Add processor test * Fix processor code * stash commit * Processor clobbers child tokenizer's chat template * Processor clobbers child tokenizer's chat template * make fixup * Split processor/tokenizer files to avoid interactions * fix test * Expand processor tests * Rename arg to "save_raw_chat_template" across all classes * Update processor warning * Move templates to single file * Move templates to single file * Improve testing for processor/tokenizer clashes * Improve testing for processor/tokenizer clashes * Extend saving test * Test file priority correctly * make fixup * Don't pop the chat template file before the slow tokenizer gets a look * Remove breakpoint * make fixup * Fix error 2024-11-26 14:18:04 +00:00			`def test_chat_template_save_loading(self):`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`processor = self.processor_class.from_pretrained(self.tmpdirname)`
Prepare processors for VideoLLMs (#36149) * allow processor to preprocess conversation + video metadata * allow callable * add test * fix test * nit: fix * add metadata frames_indices * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * port updates from Orr and add one more test * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * typo * as dataclass * style * docstring + maek sure tests green --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> 2025-02-14 11:34:08 +01:00			`signature = inspect.signature(processor.__init__)`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`if "chat_template" not in {*signature.parameters.keys()}:`
			`self.skipTest("Processor doesn't accept chat templates at input")`

Separate chat templates into a single file (#33957) * Initial draft * Add .jinja file loading for processors * Add processor saving of naked chat template files * make fixup * Add save-load test for tokenizers * Add save-load test for tokenizers * stash commit * Try popping the file * make fixup * Pop the arg correctly * Pop the arg correctly * Add processor test * Fix processor code * stash commit * Processor clobbers child tokenizer's chat template * Processor clobbers child tokenizer's chat template * make fixup * Split processor/tokenizer files to avoid interactions * fix test * Expand processor tests * Rename arg to "save_raw_chat_template" across all classes * Update processor warning * Move templates to single file * Move templates to single file * Improve testing for processor/tokenizer clashes * Improve testing for processor/tokenizer clashes * Extend saving test * Test file priority correctly * make fixup * Don't pop the chat template file before the slow tokenizer gets a look * Remove breakpoint * make fixup * Fix error 2024-11-26 14:18:04 +00:00			`processor.chat_template = "test template"`
			`with tempfile.TemporaryDirectory() as tmpdirname:`
[v5] Delete legacy chat template saving (#41648) * delete lagcy chat template saving * fix tests * fix qwen audio 2025-10-22 11:40:55 +02:00			`processor.save_pretrained(tmpdirname)`
			`with open(Path(tmpdirname, "chat_template.json"), "w") as fp:`
			`json.dump({"chat_template": processor.chat_template}, fp)`
			`os.remove(Path(tmpdirname, "chat_template.jinja"))`

Separate chat templates into a single file (#33957) * Initial draft * Add .jinja file loading for processors * Add processor saving of naked chat template files * make fixup * Add save-load test for tokenizers * Add save-load test for tokenizers * stash commit * Try popping the file * make fixup * Pop the arg correctly * Pop the arg correctly * Add processor test * Fix processor code * stash commit * Processor clobbers child tokenizer's chat template * Processor clobbers child tokenizer's chat template * make fixup * Split processor/tokenizer files to avoid interactions * fix test * Expand processor tests * Rename arg to "save_raw_chat_template" across all classes * Update processor warning * Move templates to single file * Move templates to single file * Improve testing for processor/tokenizer clashes * Improve testing for processor/tokenizer clashes * Extend saving test * Test file priority correctly * make fixup * Don't pop the chat template file before the slow tokenizer gets a look * Remove breakpoint * make fixup * Fix error 2024-11-26 14:18:04 +00:00			`reloaded_processor = self.processor_class.from_pretrained(tmpdirname)`
			`self.assertEqual(processor.chat_template, reloaded_processor.chat_template)`

			`with tempfile.TemporaryDirectory() as tmpdirname:`
:rotating_light: :rotating_light: Allow saving and loading multiple "raw" chat template files (#36588) * Add saving in the new format (but no loading yet!) * Add saving in the new format (but no loading yet!) * A new approach to template files! * make fixup * make fixup, set correct dir * Some progress but need to rework for cached_file * Rework loading handling again * Small fixes * Looks like it's working now! * make fixup * Working! * make fixup * make fixup * Add TODO so I don't miss it * Cleaner control flow with one less indent * Copy the new logic to processing_utils as well * Proper support for dicts of templates * make fixup * define the file/dir names in a single place * Update the processor chat template reload test as well * Add processor loading of multiple templates * Flatten correctly to match tokenizers * Better support when files are empty sometimes * Stop creating those empty templates * Revert changes now we don't have empty templates * Revert changes now we don't have empty templates * Don't support separate template files on the legacy path * Rework/simplify loading code * Make sure it's always a chat_template key in chat_template.json * Update processor handling of multiple templates * Add a full save-loading test to the tokenizer tests as well * Correct un-flattening * New test was incorrect * Correct error/offline handling * Better exception handling * More error handling cleanup * Add skips for test failing on main * Reorder to fix errors * make fixup * clarify legacy processor file docs and location * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Rename to _jinja and _legacy * Stop saving multiple templates in the legacy format * 
Cleanup the processing code * Cleanup the processing code more * make fixup * make fixup * correct reformatting * Use correct dir name * Fix import location * Use save_jinja_files instead of save_raw_chat_template_files * Correct the test for saving multiple processor templates * Fix type hint * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Patch llava_onevision test * Update src/transformers/processing_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Refactor chat template saving out into a separate function * Update tests for the new default * Don't do chat template saving logic when chat template isn't there * Ensure save_jinja_files is propagated to tokenizer correctly * Trigger tests * Update more tests to new default * Trigger tests --------- Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> 2025-04-11 16:37:23 +01:00			`processor.save_pretrained(tmpdirname)`
			`self.assertTrue(Path(tmpdirname, "chat_template.jinja").is_file())`
			`self.assertFalse(Path(tmpdirname, "chat_template.json").is_file())`
			`self.assertFalse(Path(tmpdirname, "additional_chat_templates").is_dir())`
			`reloaded_processor = self.processor_class.from_pretrained(tmpdirname)`
			`self.assertEqual(processor.chat_template, reloaded_processor.chat_template)`
			`# When we save as single files, tokenizers and processors share a chat template, which means`
			`# the reloaded tokenizer should get the chat template as well`
			`self.assertEqual(reloaded_processor.chat_template, reloaded_processor.tokenizer.chat_template)`

			`with tempfile.TemporaryDirectory() as tmpdirname:`
			`processor.chat_template = {"default": "a", "secondary": "b"}`
			`processor.save_pretrained(tmpdirname)`
Separate chat templates into a single file (#33957) * Initial draft * Add .jinja file loading for processors * Add processor saving of naked chat template files * make fixup * Add save-load test for tokenizers * Add save-load test for tokenizers * stash commit * Try popping the file * make fixup * Pop the arg correctly * Pop the arg correctly * Add processor test * Fix processor code * stash commit * Processor clobbers child tokenizer's chat template * Processor clobbers child tokenizer's chat template * make fixup * Split processor/tokenizer files to avoid interactions * fix test * Expand processor tests * Rename arg to "save_raw_chat_template" across all classes * Update processor warning * Move templates to single file * Move templates to single file * Improve testing for processor/tokenizer clashes * Improve testing for processor/tokenizer clashes * Extend saving test * Test file priority correctly * make fixup * Don't pop the chat template file before the slow tokenizer gets a look * Remove breakpoint * make fixup * Fix error 2024-11-26 14:18:04 +00:00			`self.assertTrue(Path(tmpdirname, "chat_template.jinja").is_file())`
			`self.assertFalse(Path(tmpdirname, "chat_template.json").is_file())`
:rotating_light: :rotating_light: Allow saving and loading multiple "raw" chat template files (#36588) * Add saving in the new format (but no loading yet!) * Add saving in the new format (but no loading yet!) * A new approach to template files! * make fixup * make fixup, set correct dir * Some progress but need to rework for cached_file * Rework loading handling again * Small fixes * Looks like it's working now! * make fixup * Working! * make fixup * make fixup * Add TODO so I don't miss it * Cleaner control flow with one less indent * Copy the new logic to processing_utils as well * Proper support for dicts of templates * make fixup * define the file/dir names in a single place * Update the processor chat template reload test as well * Add processor loading of multiple templates * Flatten correctly to match tokenizers * Better support when files are empty sometimes * Stop creating those empty templates * Revert changes now we don't have empty templates * Revert changes now we don't have empty templates * Don't support separate template files on the legacy path * Rework/simplify loading code * Make sure it's always a chat_template key in chat_template.json * Update processor handling of multiple templates * Add a full save-loading test to the tokenizer tests as well * Correct un-flattening * New test was incorrect * Correct error/offline handling * Better exception handling * More error handling cleanup * Add skips for test failing on main * Reorder to fix errors * make fixup * clarify legacy processor file docs and location * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/processing_utils.py Co-authored-by: Lucain <lucainp@gmail.com> * Rename to _jinja and _legacy * Stop saving multiple templates in the legacy format * 
Cleanup the processing code * Cleanup the processing code more * make fixup * make fixup * correct reformatting * Use correct dir name * Fix import location * Use save_jinja_files instead of save_raw_chat_template_files * Correct the test for saving multiple processor templates * Fix type hint * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Patch llava_onevision test * Update src/transformers/processing_utils.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * Refactor chat template saving out into a separate function * Update tests for the new default * Don't do chat template saving logic when chat template isn't there * Ensure save_jinja_files is propagated to tokenizer correctly * Trigger tests * Update more tests to new default * Trigger tests --------- Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Julien Chaumond <julien@huggingface.co> 2025-04-11 16:37:23 +01:00			`self.assertTrue(Path(tmpdirname, "additional_chat_templates").is_dir())`
Separate chat templates into a single file (#33957) * Initial draft * Add .jinja file loading for processors * Add processor saving of naked chat template files * make fixup * Add save-load test for tokenizers * Add save-load test for tokenizers * stash commit * Try popping the file * make fixup * Pop the arg correctly * Pop the arg correctly * Add processor test * Fix processor code * stash commit * Processor clobbers child tokenizer's chat template * Processor clobbers child tokenizer's chat template * make fixup * Split processor/tokenizer files to avoid interactions * fix test * Expand processor tests * Rename arg to "save_raw_chat_template" across all classes * Update processor warning * Move templates to single file * Move templates to single file * Improve testing for processor/tokenizer clashes * Improve testing for processor/tokenizer clashes * Extend saving test * Test file priority correctly * make fixup * Don't pop the chat template file before the slow tokenizer gets a look * Remove breakpoint * make fixup * Fix error 2024-11-26 14:18:04 +00:00			`reloaded_processor = self.processor_class.from_pretrained(tmpdirname)`
			`self.assertEqual(processor.chat_template, reloaded_processor.chat_template)`
			`# When we save as single files, tokenizers and processors share a chat template, which means`
			`# the reloaded tokenizer should get the chat template as well`
			`self.assertEqual(reloaded_processor.chat_template, reloaded_processor.tokenizer.chat_template)`
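The on-disk layout asserted by this test can be sketched with the standard library alone. The functions `save_chat_templates` and `load_chat_templates` are hypothetical stand-ins, not the `save_pretrained` / `from_pretrained` implementation. The per-template file name `<name>.jinja` inside `additional_chat_templates` is an assumption, as is the requirement that a dict of templates contains a `"default"` entry.

```python
import tempfile
from pathlib import Path


def save_chat_templates(chat_template, save_dir):
    """Write one template, or a dict of named templates, as raw .jinja files (hypothetical helper)."""
    save_dir = Path(save_dir)
    if isinstance(chat_template, str):
        (save_dir / "chat_template.jinja").write_text(chat_template, encoding="utf-8")
        return
    for name, template in chat_template.items():
        if name == "default":
            target = save_dir / "chat_template.jinja"
        else:
            # Assumed layout: one file per extra template, named after its key.
            extra_dir = save_dir / "additional_chat_templates"
            extra_dir.mkdir(exist_ok=True)
            target = extra_dir / f"{name}.jinja"
        target.write_text(template, encoding="utf-8")


def load_chat_templates(save_dir):
    """Read back what save_chat_templates wrote: a str, or a dict if extra templates exist."""
    save_dir = Path(save_dir)
    default = (save_dir / "chat_template.jinja").read_text(encoding="utf-8")
    extra_dir = save_dir / "additional_chat_templates"
    if not extra_dir.is_dir():
        return default
    templates = {"default": default}
    for path in sorted(extra_dir.glob("*.jinja")):
        templates[path.stem] = path.read_text(encoding="utf-8")
    return templates


# Single template: only chat_template.jinja is written.
with tempfile.TemporaryDirectory() as tmpdirname:
    save_chat_templates("test template", tmpdirname)
    single_roundtrip = load_chat_templates(tmpdirname)
    single_has_extra_dir = Path(tmpdirname, "additional_chat_templates").is_dir()

# Several templates: the default stays in chat_template.jinja, the rest go to the extra directory.
with tempfile.TemporaryDirectory() as tmpdirname:
    save_chat_templates({"default": "a", "secondary": "b"}, tmpdirname)
    multi_roundtrip = load_chat_templates(tmpdirname)
    multi_has_extra_dir = Path(tmpdirname, "additional_chat_templates").is_dir()
```

Both cases round-trip to the value that was saved, mirroring the `assertEqual(processor.chat_template, reloaded_processor.chat_template)` checks above.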
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`@require_torch`
			`def _test_apply_chat_template(`
			`self,`
			`modality: str,`
			`batch_size: int,`
			`return_tensors: str,`
			`input_name: str,`
			`processor_name: str,`
			`input_data: list[str],`
			`):`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`processor = self.get_processor()`
			`if processor.chat_template is None:`
			`self.skipTest("Processor has no chat template")`

[v5] 🚨Refactor subprocessors handling in processors (#41633) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * modifs after review 2025-11-07 12:57:33 -05:00			`if processor_name not in self.processor_class.get_attributes():`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`self.skipTest(f"{processor_name} attribute not present in {self.processor_class}")`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`# some models have only Fast image processor`
			`if getattr(processor, processor_name).__class__.__name__.endswith("Fast"):`
			`return_tensors = "pt"`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`batch_messages = [`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`[`
Fix processor chat template (#40613) fix tests 2025-09-02 10:59:48 +02:00			`{"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},`
			`{"role": "user", "content": [{"type": "text", "text": "Describe this."}]},`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`]`
			`] * batch_size`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`# Test that jinja can be applied`
			`formatted_prompt = processor.apply_chat_template(batch_messages, add_generation_prompt=True, tokenize=False)`
			`self.assertEqual(len(formatted_prompt), batch_size)`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			# Test that tokenizing with template and directly with `self.tokenizer` gives same output
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`formatted_prompt_tokenized = processor.apply_chat_template(`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`batch_messages, add_generation_prompt=True, tokenize=True, return_tensors=return_tensors`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
			`add_special_tokens = True`
			`if processor.tokenizer.bos_token is not None and formatted_prompt[0].startswith(processor.tokenizer.bos_token):`
			`add_special_tokens = False`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`tok_output = processor.tokenizer(`
			`formatted_prompt, return_tensors=return_tensors, add_special_tokens=add_special_tokens`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`expected_output = tok_output.input_ids`
			`self.assertListEqual(expected_output.tolist(), formatted_prompt_tokenized.tolist())`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			# Test that kwargs passed to processor's `__call__` are actually used
			`tokenized_prompt_100 = processor.apply_chat_template(`
			`batch_messages,`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`add_generation_prompt=True,`
			`tokenize=True,`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`return_tensors=return_tensors,`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={`
			`"padding": "max_length",`
			`"truncation": True,`
			`"max_length": self.chat_template_max_length,`
			`},`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`self.assertEqual(len(tokenized_prompt_100[0]), self.chat_template_max_length)`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			# Test that `return_dict=True` returns text related inputs in the dict
			`out_dict_text = processor.apply_chat_template(`
			`batch_messages,`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`add_generation_prompt=True,`
			`tokenize=True,`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`return_dict=True,`
			`return_tensors=return_tensors,`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`self.assertTrue(all(key in out_dict_text for key in ["input_ids", "attention_mask"]))`
			`self.assertEqual(len(out_dict_text["input_ids"]), batch_size)`
			`self.assertEqual(len(out_dict_text["attention_mask"]), batch_size)`

			# Test that with modality URLs and `return_dict=True`, we get modality inputs in the dict
			`for idx, url in enumerate(input_data[:batch_size]):`
Fix processor chat template (#40613) fix tests 2025-09-02 10:59:48 +02:00			`batch_messages[idx][1]["content"] = [batch_messages[idx][1]["content"][0], {"type": modality, "url": url}]`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
			`out_dict = processor.apply_chat_template(`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`batch_messages,`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`return_tensors=return_tensors,`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={"num_frames": 2}, # by default no more than 2 frames, otherwise too slow`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`input_name = getattr(self, input_name)`
			`self.assertIn(input_name, out_dict)`
			`self.assertEqual(len(out_dict["input_ids"]), batch_size)`
			`self.assertEqual(len(out_dict["attention_mask"]), batch_size)`
			`self.assertEqual(len(out_dict[input_name]), batch_size)`

			`return_tensor_to_type = {"pt": torch.Tensor, "np": np.ndarray, None: list}`
			`for k in out_dict:`
			`self.assertIsInstance(out_dict[k], return_tensor_to_type[return_tensors])`

			`# Test continue from final message`
			`assistant_message = {`
			`"role": "assistant",`
			`"content": [{"type": "text", "text": "It is the sound of"}],`
			`}`
			`for idx, url in enumerate(input_data[:batch_size]):`
			`batch_messages[idx] = batch_messages[idx] + [assistant_message]`
			`continue_prompt = processor.apply_chat_template(batch_messages, continue_final_message=True, tokenize=False)`
			`for prompt in continue_prompt:`
			self.assertTrue(prompt.endswith("It is the sound of")) # no `eos` token at the end
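The helper above turns off `add_special_tokens` when the rendered prompt already starts with the tokenizer's BOS token, so the direct tokenizer call does not add a second BOS. A minimal standalone sketch of that guard follows; the function name `needs_special_tokens` and the sample strings are hypothetical and are not part of the test suite.

```python
from typing import Optional


def needs_special_tokens(rendered_prompt: str, bos_token: Optional[str]) -> bool:
    """Decide whether the tokenizer should still add special tokens.

    Hypothetical helper mirroring the check in the test: if the chat template
    already emitted the BOS token at the start of the prompt, adding special
    tokens again would duplicate it.
    """
    if bos_token is not None and rendered_prompt.startswith(bos_token):
        return False
    return True


# Template already rendered BOS: do not add it a second time
print(needs_special_tokens("<s>[INST] What is shown? [/INST]", "<s>"))
# Template did not render BOS: let the tokenizer add it
print(needs_special_tokens("[INST] What is shown? [/INST]", "<s>"))
# Tokenizer defines no BOS token at all
print(needs_special_tokens("[INST] What is shown? [/INST]", None))
```

The sketch only covers the BOS prefix case that the test checks; other special tokens, such as EOS, are not considered.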
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`@require_librosa`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`@parameterized.expand([(1, "np"), (1, "pt"), (2, "np"), (2, "pt")])`
			`def test_apply_chat_template_audio(self, batch_size: int, return_tensors: str):`
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if "feature_extractor" in self.processor_class.get_attributes():`
			`self._test_apply_chat_template(`
			`"audio",`
			`batch_size,`
			`return_tensors,`
			`"audio_input_name",`
			`"feature_extractor",`
			`MODALITY_INPUT_DATA["audio"],`
			`)`
			`else:`
			`self._test_apply_chat_template(`
			`"audio",`
			`batch_size,`
			`return_tensors,`
			`"audio_input_name",`
			`"audio_processor",`
			`MODALITY_INPUT_DATA["audio"],`
			`)`
Support `return_tensors` in audio chat templates (#34601) * add audio chat templates * update * update * nit * green ci * we dont care about the order anymore * clean up after rebase * overriden tests rename * rename shieldgemma also * one more rename * require_read_token * removde images/videos * retrigger CI flaky 2025-03-25 11:08:47 +01:00
processor tests - use dummy videos (#40537) * use dummy videos * failing on main, new model merged had conflicts 2025-09-01 11:04:47 +02:00			`@require_av`
			`@parameterized.expand([(1, "pt")])`
			`def test_apply_chat_template_decoded_video(self, batch_size: int, return_tensors: str):`
			`dummy_preloaded_video = np.array(self.prepare_video_inputs())`
			`input_data = [dummy_preloaded_video]`
			`self._test_apply_chat_template(`
			`"video", batch_size, return_tensors, "videos_input_name", "video_processor", input_data`
			`)`

🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`@require_av`
[video processors] decode only sampled videos -> less RAM and faster processing (#39600) * draft update two models for now * batch update all VLMs first * update some more image processors * update * fix a few tests * just make CI green for now * fix copies * update once more * update * unskip the test * fix these two * fix torchcodec audio loading * maybe * yay, i fixed torchcodec installation and now can actually test it * fix copies deepseek * make sure the metadata is returrned when users request it * add docs * update * fixup * Update src/transformers/audio_utils.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/glm4v/video_processing_glm4v.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update * what if we set some metadata attr to `None` * fix CI * fix one test * fix 4 channel test * fix glm timestemps * rebase gone wrong * raise warning once * fixup * typo * fix copies * ifx smolvlm test * this is why torch's official benchmark was faster, set threads to `0` * Apply style fixes --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 2025-08-26 11:38:02 +02:00			`@parameterized.expand([(1, "pt"), (2, "pt")]) # video processor supports only torchvision`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`def test_apply_chat_template_video(self, batch_size: int, return_tensors: str):`
			`self._test_apply_chat_template(`
			`"video", batch_size, return_tensors, "videos_input_name", "video_processor", MODALITY_INPUT_DATA["videos"]`
			`)`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
🚨[Fast Image Processor] Force Fast Image Processor for Qwen2_VL/2_5_VL + Refactor (#39591) * init * Force qwen2VL image proc to fast * refactor qwen2 vl fast * fix copies * Update after PR review and update tests to use return_tensors="pt" * fix processor tests * add BC for min pixels/max pixels 2025-07-25 11:11:28 -04:00			`@parameterized.expand([(1, "pt"), (2, "pt")]) # fast image processors support only torchvision`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`def test_apply_chat_template_image(self, batch_size: int, return_tensors: str):`
			`self._test_apply_chat_template(`
			`"image", batch_size, return_tensors, "images_input_name", "image_processor", MODALITY_INPUT_DATA["images"]`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`

🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`@require_torch`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`def test_apply_chat_template_video_frame_sampling(self):`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`processor = self.get_processor()`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`if processor.chat_template is None:`
			`self.skipTest("Processor has no chat template")`

			`signature = inspect.signature(processor.__call__)`
			`if "videos" not in signature.parameters or (`
			`signature.parameters.get("videos") is not None`
			`and signature.parameters["videos"].annotation is inspect.Parameter.empty`
			`):`
			`self.skipTest("Processor doesn't accept videos at input")`

			`messages = [`
			`[`
			`{`
			`"role": "user",`
			`"content": [`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`{`
			`"type": "video",`
Load a tiny video to make CI faster (#40684) * load a tiny video to make CI faster * add video in url_to_local_path 2025-09-04 14:49:00 +02:00			`"url": url_to_local_path(`
			`"https://huggingface.co/datasets/raushan-testing-hf/videos-test/resolve/main/tiny_video.mp4"`
			`),`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`},`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`{"type": "text", "text": "What is shown in this video?"},`
			`],`
			`},`
			`]`
			`]`

			`num_frames = 3`
			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`return_tensors="pt",`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={"num_frames": num_frames},`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
			`self.assertIn(self.videos_input_name, out_dict_with_video)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name]), 1)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name][0]), num_frames)`

Replace video_fps with fps in tests (#39898) Signed-off-by: cyy <cyyever@outlook.com> 2025-08-05 18:39:55 +08:00			# Load with `fps` arg
Load a tiny video to make CI faster (#40684) * load a tiny video to make CI faster * add video in url_to_local_path 2025-09-04 14:49:00 +02:00			`fps = 10`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
🔴 Video processors as a separate class (#35206) * initial design * update all video processors * add tests * need to add qwen2-vl (not tested yet) * add qwen2-vl in auto map * fix copies * isort * resolve confilicts kinda * nit: * qwen2-vl is happy now * qwen2-5 happy * other models are happy * fix copies * fix tests * add docs * CI green now? * add more tests * even more changes + tests * doc builder fail * nit * Update src/transformers/models/auto/processing_auto.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * small update * imports correctly * dump, otherwise this is getting unmanagebale T-T * dump * update * another update * update * tests * move * modular * docs * test * another update * init * remove flakiness in tests * fixup * clean up and remove commented lines * docs * skip this one! * last fix after rebasing * run fixup * delete slow files * remove unnecessary tests + clean up a bit * small fixes * fix tests * more updates * docs * fix tests * update * style * fix qwen2-5-vl * fixup * fixup * unflatten batch when preparing * dump, come back soon * add docs and fix some tests * how to guard this with new dummies? * chat templates in qwen * address some comments * remove `Fast` suffix * fixup * oops should be imported from transforms * typo in requires dummies * new model added with video support * fixup once more * last fixup I hope * revert image processor name + comments * oh, this is why fetch test is failing * fix tests * fix more tests * fixup * add new models: internvl, smolvlm * update docs * imprt once * fix failing tests * do we need to guard it here again, why? * new model was added, update it * remove testcase from tester * fix tests * make style * not related CI fail, lets' just fix here * mark flaky for now, filas 15 out of 100 * style * maybe we can do this way? * don't download images in setup class --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> 2025-05-12 11:55:51 +02:00			`return_tensors="pt",`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={"fps": fps},`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`
			`self.assertIn(self.videos_input_name, out_dict_with_video)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name]), 1)`
Load a tiny video to make CI faster (#40684) * load a tiny video to make CI faster * add video in url_to_local_path 2025-09-04 14:49:00 +02:00			`# 3 frames are inferred from input video's length and FPS, so can be hardcoded`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name][0]), 3)`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
Fix typos in tests and util (#40780) Fix typos Signed-off-by: Yuanyuan Chen <cyyever@outlook.com> 2025-09-10 19:45:40 +08:00			# When `do_sample_frames=False`, no sampling is done and the whole video is loaded, even if `fps` is passed
Load a tiny video to make CI faster (#40684) * load a tiny video to make CI faster * add video in url_to_local_path 2025-09-04 14:49:00 +02:00			`fps = 10`
[video processors] support frame sampling within processors (#38105) * apply updates smolVLM (still needs workaround for chat template) * add other models * dump qwen omni for now, come back later * port qwen omni from their impl * wait, all qwens sample videos in same way! * clean up * make smolvlm backwards compatible and fix padding * dix some tests * fox smolvlm tests * more clean up and test fixing * delete unused arg * fix * address comments * style * fix test 2025-06-12 11:34:30 +02:00			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={`
			`"do_sample_frames": False,`
			`"fps": fps,`
			`"return_tensors": "pt",`
			`},`
[video processors] support frame sampling within processors (#38105) * apply updates smolVLM (still needs workaround for chat template) * add other models * dump qwen omni for now, come back later * port qwen omni from their impl * wait, all qwens sample videos in same way! * clean up * make smolvlm backwards compatible and fix padding * dix some tests * fox smolvlm tests * more clean up and test fixing * delete unused arg * fix * address comments * style * fix test 2025-06-12 11:34:30 +02:00			`)`
			`self.assertIn(self.videos_input_name, out_dict_with_video)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name]), 1)`
Load a tiny video to make CI faster (#40684) * load a tiny video to make CI faster * add video in url_to_local_path 2025-09-04 14:49:00 +02:00			`self.assertEqual(len(out_dict_with_video[self.videos_input_name][0]), 11)`
[video processors] support frame sampling within processors (#38105) * apply updates smolVLM (still needs workaround for chat template) * add other models * dump qwen omni for now, come back later * port qwen omni from their impl * wait, all qwens sample videos in same way! * clean up * make smolvlm backwards compatible and fix padding * dix some tests * fox smolvlm tests * more clean up and test fixing * delete unused arg * fix * address comments * style * fix test 2025-06-12 11:34:30 +02:00
Replace video_fps with fps in tests (#39898) Signed-off-by: cyy <cyyever@outlook.com> 2025-08-05 18:39:55 +08:00			# Load with `fps` and `num_frames` args, should raise an error
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`with self.assertRaises(ValueError):`
			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={"fps": fps, "num_frames": num_frames},`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`)`

			`# Load without any arg should load the whole video`
			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
			`)`
			`self.assertTrue(self.videos_input_name in out_dict_with_video)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name]), 1)`
Load a tiny video to make CI faster (#40684) * load a tiny video to make CI faster * add video in url_to_local_path 2025-09-04 14:49:00 +02:00			`self.assertEqual(len(out_dict_with_video[self.videos_input_name][0]), 11)`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00
[video processors] decode only sampled videos -> less RAM and faster processing (#39600) * draft update two models for now * batch update all VLMs first * update some more image processors * update * fix a few tests * just make CI green for now * fix copies * update once more * update * unskip the test * fix these two * fix torchcodec audio loading * maybe * yay, i fixed torchcodec installation and now can actually test it * fix copies deepseek * make sure the metadata is returrned when users request it * add docs * update * fixup * Update src/transformers/audio_utils.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/glm4v/video_processing_glm4v.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update * what if we set some metadata attr to `None` * fix CI * fix one test * fix 4 channel test * fix glm timestemps * rebase gone wrong * raise warning once * fixup * typo * fix copies * ifx smolvlm test * this is why torch's official benchmark was faster, set threads to `0` * Apply style fixes --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 2025-08-26 11:38:02 +02:00			`# Load video as a list of frames (i.e. images).`
			`# NOTE: each frame should have same size because we assume they come from one video`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`messages[0][0]["content"][0] = {`
			`"type": "video",`
			`"url": [`
Final test data cache - inside CI docker images (#40689) * run * build * build * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2025-09-04 15:12:49 +02:00			`url_to_local_path(`
			`"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/australia.jpg"`
			`)`
			`]`
			`* 2,`
Chat template: update for processor (#35953) * update * we need batched nested input to always process correctly * update a bit * fix copies 2025-02-10 09:52:19 +01:00			`}`
			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
			`)`
			`self.assertTrue(self.videos_input_name in out_dict_with_video)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name]), 1)`
			`self.assertEqual(len(out_dict_with_video[self.videos_input_name][0]), 2)`
Prepare processors for VideoLLMs (#36149) * allow processor to preprocess conversation + video metadata * allow callable * add test * fix test * nit: fix * add metadata frames_indices * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * port updates from Orr and add one more test * Update src/transformers/processing_utils.py Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> * typo * as dataclass * style * docstring + maek sure tests green --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com> 2025-02-14 11:34:08 +01:00
[video processors] decode only sampled videos -> less RAM and faster processing (#39600) * draft update two models for now * batch update all VLMs first * update some more image processors * update * fix a few tests * just make CI green for now * fix copies * update once more * update * unskip the test * fix these two * fix torchcodec audio loading * maybe * yay, i fixed torchcodec installation and now can actually test it * fix copies deepseek * make sure the metadata is returrned when users request it * add docs * update * fixup * Update src/transformers/audio_utils.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/glm4v/video_processing_glm4v.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update * what if we set some metadata attr to `None` * fix CI * fix one test * fix 4 channel test * fix glm timestemps * rebase gone wrong * raise warning once * fixup * typo * fix copies * ifx smolvlm test * this is why torch's official benchmark was faster, set threads to `0` * Apply style fixes --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 2025-08-26 11:38:02 +02:00			`# When the inputs are frame URLs/paths we expect that those are already`
			`# sampled and will raise an error if asked to sample again.`
			`with self.assertRaisesRegex(`
			ValueError, "Sampling frames from a list of images is not supported! Set `do_sample_frames=False`"
			`):`
			`out_dict_with_video = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
Allow arbitrary template kwargs in processors (#44881) * . * warn * only common tests * . * . * dont import deprecated typed dicts * print... * merge dicts if both are passed * Revert "merge dicts if both are passed" This reverts commit 2d46cc515b5cdbda81a3e82bca6f15b8c4981a65. 2026-03-24 11:59:58 +01:00			`processor_kwargs={"do_sample_frames": True},`
[video processors] decode only sampled videos -> less RAM and faster processing (#39600) * draft update two models for now * batch update all VLMs first * update some more image processors * update * fix a few tests * just make CI green for now * fix copies * update once more * update * unskip the test * fix these two * fix torchcodec audio loading * maybe * yay, i fixed torchcodec installation and now can actually test it * fix copies deepseek * make sure the metadata is returrned when users request it * add docs * update * fixup * Update src/transformers/audio_utils.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/glm4v/video_processing_glm4v.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update * what if we set some metadata attr to `None` * fix CI * fix one test * fix 4 channel test * fix glm timestemps * rebase gone wrong * raise warning once * fixup * typo * fix copies * ifx smolvlm test * this is why torch's official benchmark was faster, set threads to `0` * Apply style fixes --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 2025-08-26 11:38:02 +02:00			`)`

[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00			`@require_librosa`
			`@require_av`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`def test_chat_template_audio_from_video(self):`
[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00			`processor = self.get_processor()`
			`if processor.chat_template is None:`
			`self.skipTest("Processor has no chat template")`

			`signature = inspect.signature(processor.__call__)`
			`if "videos" not in {*signature.parameters.keys()} or (`
			`signature.parameters.get("videos") is not None`
			`and signature.parameters["videos"].annotation == inspect.Parameter.empty`
			`):`
Fix typos in strings and comments (#37784) * Fix typos in strings and comments * Fix 2025-04-25 20:47:25 +08:00			`self.skipTest(f"{self.processor_class} does not support video inputs")`
[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00
Add processing tests for phi4 multimodal (#44234) * Fix tied weight keys sam2 video * add tests for phi4 processor 2026-02-23 17:08:11 -05:00			`if (`
			`"feature_extractor" not in self.processor_class.get_attributes()`
			`and "audio_processor" not in self.processor_class.get_attributes()`
			`):`
[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00			`self.skipTest(f"feature_extractor attribute not present in {self.processor_class}")`

			`video_file_path = hf_hub_download(`
			`repo_id="raushan-testing-hf/videos-test", filename="sample_demo_1.mp4", repo_type="dataset"`
			`)`
			`messages = [`
			`{`
			`"role": "user",`
			`"content": [`
			`{"type": "video", "path": video_file_path},`
			`{"type": "text", "text": "Which of these animals is making the sound?"},`
			`],`
			`},`
			`{`
			`"role": "assistant",`
			`"content": [{"type": "text", "text": "It is a cow."}],`
			`},`
			`{`
			`"role": "user",`
			`"content": [`
Add Qwen2.5-Omni (#36752) * Add qwen2.5-omni * Remove einops dependency * Add torchdiffeq dependency * Sort init * Add torchdiffeq to extras['diffeq'] * Fix repo consistency * use cached_file * del odeint * renew pytest * format * Remove torchdiffeq * format * fixed batch infer bug * Change positional_embedding to parameter * Change default speaker * Config revision * Use modular & code clean * code clean * decouple padding with model & code cleaning * sort init * fix * fix * Second code review * fix * fix * rename vars to full name + some comments * update pytest * Code clean & fix * fix * style * more clean up * fixup * smaller vision model in tests * fix processor test * deflake a bit the tests (still flaky though) * de-flake tests finally + add generation mixin * final nits i hope * make sure processor tests are complete * replace with Qwen2_5OmniForConditionalGeneration * fix tests after updating ckpt * fix typos when cleaning, also we can't change ckpt * fixup * images and videos kwargs for processor * thinker and talker loadable from hub ckpt * address comments and update tests after rebase * fixup * skip for now * fixup * fixup * remove torch dependency in processors --------- Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con> Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-04-14 18:36:41 +08:00			`{"type": "text", "text": "Tell me all about this animal."},`
[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00			`],`
			`},`
			`]`

			`formatted_prompt = processor.apply_chat_template([messages], add_generation_prompt=True, tokenize=False)`
			`self.assertEqual(len(formatted_prompt), 1) # batch size=1`

			`out_dict = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=True,`
			`tokenize=True,`
			`return_dict=True,`
🚨Default to fast image processors for all models (#41388) * remove attributes and add all missing sub processors to their auto classes * remove all mentions of .attributes * cleanup * fix processor tests * fix modular * remove last attributes * fixup * fixes after merge * fix wrong tokenizer in auto florence2 * fix missing audio_processor + nits * Override __init__ in NewProcessor and change hf-internal-testing-repo (temporarily) * fix auto tokenizer test * add init to markup_lm * update CustomProcessor in custom_processing * remove print * nit * refactor processor tests first part * refactor part 2 * fix test modeling owlv2 * fix test_processing_layoutxlm * Fix owlv2, wav2vec2, markuplm, voxtral issues * part3 * refactor all processor with mixin * add support for loading and saving multiple tokenizer natively * remove exclude_attributes from save_pretrained * get processor from pretrained instead of components in tests * skip tests in colqwen2, pixtral * modifs after review * fix style and copies * Fix after review * add test_processor_from_pretrained_vs_from_components, fix failing tests * fix overflowing_tokens tests * add config for layoutxlm * fix ci * use modular * fic docstring * Fix most tests * Standardize mgp_str tests * fix oneformer processing tests + fix copies * fix after review * fix missing fet_images in fast image processors * fix 01 - to check * fix 02 - to check * fix 03 - to check * fix 03 - to check * fix 03 - to check * fix 04 - to check * fix 05 - to check * fix 06 - sytle * fix 07 - revert * Fix some errors * Improve BatchFeature: stack list and lists of torch tensors (#42750) * stack lists of tensors in BatchFeature, improve error messages, add tests * remove unnecessary stack in fast image processors and video processors * make style * fix tests * fix remaining tests * fix copies * Fix Lfm2_vl im proc test * nit after review --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2026-01-21 11:08:23 -05:00			
			`return_tensors="pt",`
[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00			`load_audio_from_video=True,`
			`)`
			`self.assertTrue(self.audio_input_name in out_dict)`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`self.assertTrue(self.videos_input_name in out_dict)`
[chat templates} support loading audio from video (#36955) * add audio from video * typos * delete print * comments 2025-03-27 14:46:11 +01:00
			`# should always have input_ids and attention_mask`
			`self.assertEqual(len(out_dict["input_ids"]), 1) # batch-size=1`
			`self.assertEqual(len(out_dict["attention_mask"]), 1) # batch-size=1`
Add Qwen2.5-Omni (#36752) * Add qwen2.5-omni * Remove einops dependency * Add torchdiffeq dependency * Sort init * Add torchdiffeq to extras['diffeq'] * Fix repo consistency * use cached_file * del odeint * renew pytest * format * Remove torchdiffeq * format * fixed batch infer bug * Change positional_embedding to parameter * Change default speaker * Config revision * Use modular & code clean * code clean * decouple padding with model & code cleaning * sort init * fix * fix * Second code review * fix * fix * rename vars to full name + some comments * update pytest * Code clean & fix * fix * style * more clean up * fixup * smaller vision model in tests * fix processor test * deflake a bit the tests (still flaky though) * de-flake tests finally + add generation mixin * final nits i hope * make sure processor tests are complete * replace with Qwen2_5OmniForConditionalGeneration * fix tests after updating ckpt * fix typos when cleaning, also we can't change ckpt * fixup * images and videos kwargs for processor * thinker and talker loadable from hub ckpt * address comments and update tests after rebase * fixup * skip for now * fixup * fixup * remove torch dependency in processors --------- Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con> Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com> Co-authored-by: raushan <raushan@huggingface.co> 2025-04-14 18:36:41 +08:00			`self.assertEqual(len(out_dict[self.audio_input_name]), 1) # 1 audio in the conversation`
[chat-template] Unify tests and clean up 🧼 (#37275) * fix tests and some clean up * make one general test for each modality * remove redundant merging of kwargs * edge cases * dont enforce slow when reloading * fix gemma3 tests * has to adapt llama 4 after rebase * remove also from overriden tests * should be green now 2025-04-10 14:42:32 +02:00			`self.assertEqual(len(out_dict[self.videos_input_name]), 1) # 1 video in the conversation`
[chat template] add a testcase for kwargs (#39415) add a testcase 2025-07-16 14:31:35 +05:00
			`def test_chat_template_jinja_kwargs(self):`
			`"""Tests that users can pass any kwargs and they will be used in jinja templates."""`
			`processor = self.get_processor()`
			`if processor.chat_template is None:`
			`self.skipTest("Processor has no chat template")`

			`messages = [`
			`{`
			`"role": "user",`
			`"content": [`
			`{"type": "text", "text": "Which of these animals is making the sound?"},`
			`],`
			`},`
			`{`
			`"role": "assistant",`
			`"content": [{"type": "text", "text": "It is a cow."}],`
			`},`
			`]`

			`dummy_template = (`
			`"{% for message in messages %}"`
			`"{% if add_system_prompt %}"`
			`"{{'You are a helpful assistant.'}}"`
			`"{% endif %}"`
			`"{% if (message['role'] != 'assistant') %}"`
			`"{{'<\|special_start\|>' + message['role'] + '\n' + message['content'][0]['text'] + '<\|special_end\|>' + '\n'}}"`
			`"{% elif (message['role'] == 'assistant')%}"`
			`"{{'<\|special_start\|>' + message['role'] + '\n'}}"`
			`"{{message['content'][0]['text'] + '<\|special_end\|>' + '\n'}}"`
			`"{% endif %}"`
			`"{% endfor %}"`
			`)`

			`formatted_prompt = processor.apply_chat_template(`
			`messages, add_system_prompt=True, tokenize=False, chat_template=dummy_template`
			`)`
			`expected_prompt = "You are a helpful assistant.<\|special_start\|>user\nWhich of these animals is making the sound?<\|special_end\|>\nYou are a helpful assistant.<\|special_start\|>assistant\nIt is a cow.<\|special_end\|>\n"`
			`self.assertEqual(formatted_prompt, expected_prompt)`
[chat template] return assistant mask in processors (#38545) * messed up the git history, squash commits * raise error if slow and refine tests * index was off by one * fix the test 2025-07-18 14:23:20 +02:00
			`@require_torch`
			`def test_apply_chat_template_assistant_mask(self):`
			`processor = self.get_processor()`

			`if processor.chat_template is None:`
			`self.skipTest("Processor has no chat template")`

			`messages = [`
			`[`
			`{`
			`"role": "user",`
			`"content": [`
			`{"type": "text", "text": "What is the capital of France?"},`
			`],`
			`},`
			`{`
			`"role": "assistant",`
			`"content": [`
			`{"type": "text", "text": "The capital of France is Paris."},`
			`],`
			`},`
			`{`
			`"role": "user",`
			`"content": [`
			`{"type": "text", "text": "What about Italy?"},`
			`],`
			`},`
			`{`
			`"role": "assistant",`
			`"content": [`
			`{"type": "text", "text": "The capital of Italy is Rome."},`
			`],`
			`},`
			`]`
			`]`

			`dummy_template = (`
			`"{% for message in messages %}"`
			`"{% if (message['role'] != 'assistant') %}"`
			`"{{'<\|special_start\|>' + message['role'] + '\n' + message['content'][0]['text'] + '<\|special_end\|>' + '\n'}}"`
			`"{% elif (message['role'] == 'assistant')%}"`
			`"{{'<\|special_start\|>' + message['role'] + '\n'}}"`
			`"{% generation %}"`
			`"{{message['content'][0]['text'] + '<\|special_end\|>' + '\n'}}"`
			`"{% endgeneration %}"`
			`"{% endif %}"`
			`"{% endfor %}"`
			`)`

			`inputs = processor.apply_chat_template(`
			`messages,`
			`add_generation_prompt=False,`
			`tokenize=True,`
			`return_dict=True,`
			`return_tensors="pt",`
			`return_assistant_tokens_mask=True,`
			`chat_template=dummy_template,`
			`)`
			`self.assertTrue("assistant_masks" in inputs)`
			`self.assertEqual(len(inputs["assistant_masks"]), len(inputs["input_ids"]))`

			`mask = inputs["assistant_masks"].bool()`
			`assistant_ids = inputs["input_ids"][mask]`

			`assistant_text = (`
			`"The capital of France is Paris.<\|special_end\|>\nThe capital of Italy is Rome.<\|special_end\|>\n"`
			`)`

			`# Some tokenizers add extra spaces which aren't then removed when decoding, so we need to check token ids`
			`# if we can't get identical text outputs`
			`text_is_same = assistant_text == processor.decode(assistant_ids, clean_up_tokenization_spaces=True)`
			`ids_is_same = processor.tokenizer.encode(assistant_text, add_special_tokens=False) == assistant_ids.tolist()`
			`self.assertTrue(text_is_same or ids_is_same)`
Fix `get_number_of_image_tokens` (#43948) * fix * fix tests * this test not needed anymore, teh new one tests better 2026-02-12 17:23:37 +01:00
			`def test_get_num_multimodal_tokens_matches_processor_call(self):`
			`"""Tests that the helper used internally in vLLM works correctly."""`

			`processor = self.get_processor()`

			`if not hasattr(processor, "_get_num_multimodal_tokens"):`
			self.skipTest("Processor doesn't support `_get_num_multimodal_tokens` yet")

			`if processor.tokenizer.pad_token_id is None:`
			`processor.tokenizer.pad_token_id = processor.tokenizer.eos_token_id`

			`image_sizes = [(100, 100), (300, 100), (500, 30), (213, 167)]`
			`image_inputs = []`
			`for h, w in image_sizes:`
			`image_inputs.append(np.random.randint(255, size=(h, w, 3), dtype=np.uint8))`

			`text = [f"This is an image {getattr(self, 'image_token', '')}"] * len(image_inputs)`
			`inputs = processor(`
			`text=text, images=image_inputs, padding=True, return_mm_token_type_ids=True, return_tensors="pt"`
			`)`

			`if "mm_token_type_ids" not in inputs:`
			self.skipTest("Processor doesn't support `mm_token_type_ids`")

			`num_image_tokens_from_call = inputs.mm_token_type_ids.sum(-1).tolist()`
			`num_image_tokens_from_helper = processor._get_num_multimodal_tokens(image_sizes=image_sizes)`
			`self.assertListEqual(num_image_tokens_from_call, num_image_tokens_from_helper["num_image_tokens"])`