Commit Graph

  • 8d0f79effa Release 1.0.26 main v1.0.26 Ross Wightman 2026-03-23 11:09:21 -07:00
  • 6e3fdda395 Implement PRR as a pooling module. Alternative to #2678 Ross Wightman 2026-03-13 13:38:35 -07:00
  • 8b4239c4d5 Add comments for DinoV3 re global pool (class token). Fix #2681 Ross Wightman 2026-03-17 09:02:59 -07:00
  • 52e6d19d9d Change avg_checkpoints.py to use more secure load helper Ross Wightman 2026-03-16 14:06:34 -07:00
  • 7a2f49bd49 Fix FX tracing on resolve_self_attn_mask Ross Wightman 2026-03-16 11:48:53 -07:00
  • 61a26c7707 Improve attention mask handling for vision_transformer and eva and related blocks Ross Wightman 2026-03-16 10:42:35 -07:00
  • 5681186444 Fix FX tracing on resolve_self_attn_mask improve_attn_mask Ross Wightman 2026-03-16 11:48:53 -07:00
  • f59f8e0a3a Improve attention mask handling for vision_transformer and eva and related blocks Ross Wightman 2026-03-16 10:42:35 -07:00
  • 682c8458d0 Implement PRR as a pooling module. Alternative to #2678 prr_pool Ross Wightman 2026-03-13 13:38:35 -07:00
  • 3e8def86c4 Improve 2d and latent attention pool dimension handling. Fix #2682 Ross Wightman 2026-03-13 10:45:36 -07:00
  • f8c695d164 Improve 2d and latent attention pool dimension handling. Fix #2682 attention_pool_dim_handling Ross Wightman 2026-03-13 10:45:36 -07:00
  • a94c10fce1 Update version.py Ross Wightman 2026-03-10 07:40:16 -07:00
  • 0c90043d23 fix: branch Hiera MaskUnitAttention into 4D global path for FlashAttention dispatch Your Name 2026-03-09 17:45:02 +05:30
  • a346c76b5f Further refine weights_only=True, add safe globals for argparse Namespace to avoid failures on timm train checkpoints Ross Wightman 2026-03-09 11:11:17 -07:00
  • 7b439f4ba4 default weights_only=True for load fns Ross Wightman 2026-03-08 21:44:12 -07:00
  • a13b9b84ec Further refine weights_only=True, add safe globals for argparse Namespace to avoid failures on timm train checkpoints weights_only_default_true Ross Wightman 2026-03-09 11:11:17 -07:00
  • 2f750c0962 default weights_only=True for load fns Ross Wightman 2026-03-08 21:44:12 -07:00
  • fa0d31e88c Fix CLS and Reg tokens usage when pos_embed is disabled Sina Hajimiri 2026-03-05 14:42:09 -05:00
  • 85bb3304d2 Enhance SGDP optimizer with caution parameter (#2675) Jinghui Yuan 2026-03-06 02:48:07 +08:00
  • f4f4cfc73f Add timmx model export tool to README Youssef Boulaoaune 2026-02-26 10:39:56 +09:00
  • 2845814ea6 fix: replace bare except clauses with except Exception haosenwang1018 2026-02-25 14:20:02 +00:00
  • 9326ff27f0 Release 1.0.25 v1.0.25 Ross Wightman 2026-02-23 08:45:52 -08:00
  • 75684faa04 Add DTensor compatible NS impl for Muon Ross Wightman 2026-02-18 09:26:59 -08:00
  • 4b727ccb70 Add DTensor compatible NS impl for Muon dtensor_muon Ross Wightman 2026-02-18 09:26:59 -08:00
  • feed46c9ac Change clamp_min_ to clamp_(min=) as former doesn't work with DTensor / FSDP2 without custom registration Ross Wightman 2026-02-15 11:02:50 -08:00
  • 94b799bde7 Change clamp_min_ to clamp_(min=) as former doesn't work with DTensor / FSDP2 without custom registration remove_clamp_min Ross Wightman 2026-02-15 11:02:50 -08:00
  • 56467ad5f2 fix(optim): replace bare except with Exception in Lion optimizer Luka Aladashvili 2026-02-11 13:20:04 +04:00
  • 2490051330 Add a StepGate/StepEmbedding idea to RSViT variants. Add stochastic supervision/recursion loop options. vibe_rsvit Ross Wightman 2026-02-10 13:36:33 -08:00
  • 6f1005af19 Committing what's there. Still a big WIP, in the middle of iterating on config layout and arg mappings (not ideal yet). ssl_tasks Ross Wightman 2026-02-06 12:44:54 -08:00
  • 41754f29d3 Fix #2661 ... don't skip reset_parameters/init when meta device detected as it breaks use of accelerate and similar dispatch override context managers Ross Wightman 2026-02-06 10:44:40 -08:00
  • bdab30b0eb Remove torch.jit calls except for .ignore and .is_scripting to avoid 2.11 deprecation. Fix #2663 Ross Wightman 2026-02-06 10:29:37 -08:00
  • 31d987afb6 Fix #2661 ... don't skip reset_parameters/init when meta device detected as it breaks use of accelerate and similar dispatch override context managers misc_fixes_2026_02 Ross Wightman 2026-02-06 10:44:40 -08:00
  • 7a08ab54fa Remove torch.jit calls except for .ignore and .is_scripting to avoid 2.11 deprecation. Fix #2663 Ross Wightman 2026-02-06 10:29:37 -08:00
  • e8f0ae34b4 Fixup LayerScale use in perceiver & rsvit Ross Wightman 2026-01-30 10:30:56 -08:00
  • 9171d82efc Enhance the numerical stability of the Cautious Optimizer Jinghui Yuan 2026-01-29 14:51:58 +08:00
  • 17b976408d Add the test of cadamp Jinghui Yuan 2026-01-28 13:21:51 +08:00
  • ba85080efd Add cadamp optimizer to optimization factory Jinghui Yuan 2026-01-28 13:03:44 +08:00
  • a3643d20d2 Adding the cautious optimizer and the spherical cautious optimizer Jinghui Yuan 2026-01-28 12:59:41 +08:00
  • 3cbb8f91c0 Fix perceiver tests Ross Wightman 2026-01-28 09:46:27 -08:00
  • 125cd184fc Fix use with soft target loss, and tests for FX / classifier Ross Wightman 2026-01-28 08:34:33 -08:00
  • cfb77f965e Vibe coding some different ideas on vit param sharing/recursion, latent/perceiver structure. Ross Wightman 2026-01-27 10:45:46 -08:00
  • c9e22eb61e More task work, add engine, initial timm.apps ssl/cls scripts, debugging nepa and lejepa training Ross Wightman 2026-01-16 20:25:33 -08:00
  • 9800e14d9c POC work on ssl (NEPA/JEPA/AIM variants), expansion of task framework Ross Wightman 2026-01-14 14:09:50 -08:00
  • 370df82133 Update README.md Ross Wightman 2026-01-21 14:32:44 -08:00
  • 247c545fa1 Fix #2653, no models with weights impacted so just a clean fix, remove buffer as benchmarks w/ pt 2.9 show no difference Ross Wightman 2026-01-20 15:40:27 -08:00
  • 0d6b2cb8a5 Update README.md fix_parallel_scaling_bias Ross Wightman 2026-01-21 14:32:44 -08:00
  • 824e98ff48 Fix #2653, no models with weights impacted so just a clean fix, remove buffer as benchmarks w/ pt 2.9 show no difference Ross Wightman 2026-01-20 15:40:27 -08:00
  • 836dd99075 Fix distilled head dropout using wrong token in PiT forward_head Ofer Hasson 2026-01-18 18:12:57 +02:00
  • 3c6f79df77 Remove mistaken .in_chans attr set in classifier class Ross Wightman 2026-01-09 11:26:06 -08:00
  • 7945d3a014 Refactor distillation tasks to allow creation of DistillationTeacher within Task. Add TokenDistillationTask. Ross Wightman 2026-01-09 10:49:34 -08:00
  • c0e3d24fe6 Add .in_chans as an attribute to all models Ross Wightman 2026-01-09 10:06:27 -08:00
  • f87dfb9b0d Dev version 1.0.25 Ross Wightman 2026-01-09 13:24:25 -08:00
  • 628f47936b Remove mistaken .in_chans attr set in classifier class token_distill_task Ross Wightman 2026-01-09 11:26:06 -08:00
  • 1d2b50c7a4 Refactor distillation tasks to allow creation of DistillationTeacher within Task. Add TokenDistillationTask. Ross Wightman 2026-01-09 10:49:34 -08:00
  • f6673802ea Add .in_chans as an attribute to all models Ross Wightman 2026-01-09 10:06:27 -08:00
  • 90cae8c5ab Add new inference benchmark results v1.0.24 Ross Wightman 2026-01-06 15:23:54 -08:00
  • ed276693a2 Remove ViT B/16 384 augreg2 tag that doesn't exist Ross Wightman 2026-01-06 15:21:25 -08:00
  • 48625b172d Update README.md with release / fix info. Version 1.0.24 Ross Wightman 2026-01-06 09:17:18 -08:00
  • 030985f285 Fix #2644 with full import path Ross Wightman 2026-01-06 07:35:40 -08:00
  • b6432177e3 Version 1.0.23 v1.0.23 Ross Wightman 2026-01-05 13:39:54 -08:00
  • 31a85de79a Update typing in other scheduler classes. Fix spacing in cosine typing. Add some basic scheduler unit tests Ross Wightman 2025-12-31 11:15:20 -08:00
  • ec0e81edcc Update typing in other scheduler classes. Fix spacing in cosine typing. Add some basic scheduler unit tests scheduler_types Ross Wightman 2025-12-31 11:15:20 -08:00
  • 8038db635d Update README.md Ross Wightman 2025-12-30 15:57:13 -08:00
  • e33cd718a5 Add csatv2_21m weights at 512 & 640 img size. Add layer-scale support to csatv2 but not used yet. Ross Wightman 2025-12-29 16:23:57 -08:00
  • 816a45f02e Add sbb vit 'dlittle' weights trained with NAdaMuon Ross Wightman 2025-12-28 14:09:44 -08:00
  • 9d62118f3d refactor(scheduler): add type hints to CosineLRScheduler Yohei Okabayashi 2025-12-30 18:05:19 +09:00
  • 8674fa41ea Add csatv2_21m weights at 512 & 640 img size. Add layer-scale support to csatv2 but not used yet. endof2025_weights Ross Wightman 2025-12-29 16:23:57 -08:00
  • 940edf4285 Add sbb vit 'dlittle' weights trained with NAdaMuon Ross Wightman 2025-12-28 14:09:44 -08:00
  • d4ab5165b4 Add docstrings to layer helper functions and modules (#2634) Murat Raimbekov 2025-12-26 22:17:26 +06:00
  • 3eb784a6dc Rejig init_weights for 'skip' mode, add 'reset' to make it work with meta init Ross Wightman 2025-12-24 10:20:45 -08:00
  • 741d9997ac Fix efficientvit msra check Ross Wightman 2025-12-23 07:19:39 -08:00
  • 6bb1efc49c Use iterator for vit is_meta check, too configurable Ross Wightman 2025-12-22 21:46:14 -08:00
  • 50993861cb Add 'skip' to vit init_weights assert with new sequencing Ross Wightman 2025-12-22 17:47:52 -08:00
  • 2902c0cf29 Use blocks for is_meta check as patch_embed changes Ross Wightman 2025-12-22 15:51:06 -08:00
  • b5ffe5bd99 Further refine non-persistent buffer init and ensure reasonable overlap with weight init, weight init moving in direction of planned changes for overlapping models Ross Wightman 2025-12-22 15:11:14 -08:00
  • d2553fcdff Initial pass of an 'init_non_persistent_buffers' scheme, WIP.. needs more test and probably missed a few things Ross Wightman 2025-12-19 14:09:10 -08:00
  • fdf3b42fc3 Rejig init_weights for 'skip' mode, add 'reset' to make it work with meta init init_non_persistent_buffers Ross Wightman 2025-12-24 10:20:45 -08:00
  • c59c9c1a29 Modify autocasting in fast normalization functions to handle optional weight parameters safely. Mattie Tesfaldet 2025-12-19 13:12:09 -05:00
  • d914bef04c Fix efficientvit msra check Ross Wightman 2025-12-23 07:19:39 -08:00
  • eb64c3baad Use iterator for vit is_meta check, too configurable Ross Wightman 2025-12-22 21:46:14 -08:00
  • b1e5167f68 Add 'skip' to vit init_weights assert with new sequencing Ross Wightman 2025-12-22 17:47:52 -08:00
  • d144ff1071 Use blocks for is_meta check as patch_embed changes Ross Wightman 2025-12-22 15:51:06 -08:00
  • 782c8d9807 Further refine non-persistent buffer init and ensure reasonable overlap with weight init, weight init moving in direction of planned changes for overlapping models Ross Wightman 2025-12-22 15:11:14 -08:00
  • 431c1b66b8 Upgrade GitHub Actions for Node 24 compatibility Salman Muin Kayser Chishti 2025-12-20 23:40:37 +00:00
  • f12ac6aa11 Initial pass of an 'init_non_persistent_buffers' scheme, WIP.. needs more test and probably missed a few things Ross Wightman 2025-12-19 14:09:10 -08:00
  • 4e651dada1 Add HParams sections to hfdocs (#2630) Ross Wightman 2025-12-16 10:33:54 -08:00
  • c1425cad03 Update hparams.mdx hparam_docs Ross Wightman 2025-12-16 10:25:42 -08:00
  • a0fd395c17 Add HParams sections to hfdocs Ross Wightman 2025-12-16 10:19:19 -08:00
  • de61fce2a5 Update README.md Ross Wightman 2025-12-16 08:53:34 -08:00
  • ff95890e6a Update changes.mdx Ross Wightman 2025-12-16 08:45:31 -08:00
  • 962559deeb Update changes.mdx Ross Wightman 2025-12-16 08:44:15 -08:00
  • d7cfda4c78 Update README.md Ross Wightman 2025-12-16 08:43:19 -08:00
  • 86e9d21bc5 Add nadamuon trained dwee / dpwee vit weights. Add comment to muon impl Ross Wightman 2025-12-15 10:19:52 -08:00
  • c263de1872 Update README.md Ross Wightman 2025-12-12 11:34:50 -08:00
  • 7f8fd8a30d Csatv2 contribution (#2627) Ross Wightman 2025-12-12 11:21:42 -08:00
  • 81ec5a1e7d Unfold not needed with stride=kernel_size, removed unnecessary permutes for a speedup csatv2-gusdlf93 Ross Wightman 2025-12-12 10:40:43 -08:00
  • dd57311c53 DCT stats list -> tuple Ross Wightman 2025-12-12 10:21:40 -08:00
  • 7a75178134 Upload csatv2 weights to hub, fix non-contiguous dct weight Ross Wightman 2025-12-12 10:09:34 -08:00
  • 24fd7d2f32 Compact checkpoint_filter_fn a bit, make learnable dct out dim changeable and define another model class to have another arch config tested. Ross Wightman 2025-12-12 09:45:59 -08:00
  • b6eb61aba5 Another round of consistency changes for csatv2, make stage building dynamic for other network shapes, allow drop path option for transformer blocks. Ross Wightman 2025-12-11 15:13:01 -08:00