Commit Graph

  • 2d0ac6f567 Merge pull request #2397 from huggingface/half_prec_trainval Ross Wightman 2025-01-07 11:48:02 -08:00
  • 1969528296 Fix dtype log when default (None) is used w/o AMP [half_prec_trainval] Ross Wightman 2025-01-07 11:47:22 -08:00
  • 92f610c982 Add half-precision (bfloat16, float16) support to train & validate scripts. Should push dtype handling into model factory / pretrained load at some point... Ross Wightman 2025-01-07 10:25:14 -08:00
  • 40c19f3939 Add wandb project name argument and allow change wandb run name Jiao-Long Cao 2025-01-07 16:43:34 +08:00
  • 6f80214e80 Merge pull request #2394 from huggingface/non_reentrant_ckpt Ross Wightman 2025-01-06 14:44:06 -08:00
  • 155f6e7fea Update README, few minor fixups. [non_reentrant_ckpt] Ross Wightman 2025-01-06 13:09:15 -08:00
  • 2b251fb291 Wrap torch checkpoint() fn to default use_reentrant flag to False and allow env var override Ross Wightman 2025-01-06 11:28:39 -08:00
  • 131518c15c Add comments to MLP layers re expected layouts Ross Wightman 2025-01-02 09:41:35 -08:00
  • d23facd697 Merge pull request #2388 from laclouis5/fix-mqa-v2 Ross Wightman 2025-01-02 07:48:35 -08:00
  • 2d5277e858 Merge branch 'main' into fix-mqa-v2 Louis Lac 2025-01-02 00:11:22 +01:00
  • 2d734d9058 Fixed unfused attn2d scale Louis Lac 2025-01-01 12:03:34 +01:00
  • 6171e756d3 Fix MQA V2 scale and out shape Louis Lac 2025-01-01 15:37:28 +01:00
  • 851e0746a9 Update README.md Ross Wightman 2024-12-31 14:12:16 -08:00
  • e846b2cf28 Add 384x384 in12k pretrain and finetune for convnext_nano Ross Wightman 2024-12-31 11:00:44 -08:00
  • 2bd531e033 Add 384x384 in12k pretrain and finetune for convnext_nano [convnext_nano_r384] Ross Wightman 2024-12-31 11:00:44 -08:00
  • dafe866047 Update README.md Ross Wightman 2024-12-31 10:19:43 -08:00
  • 52595a9641 Update README.md Ross Wightman 2024-12-31 10:10:52 -08:00
  • 1245b83924 fix: minor typos in UPGRADING Ruida Zeng 2024-12-31 03:01:51 -06:00
  • 8fd2f48b65 fix: minor typos in README Ruida Zeng 2024-12-31 02:37:35 -06:00
  • b0068ba5d0 Switch hf hub entries for new aimv2 / dfn weights to point to timm locations. Undo forced device for SDR linspace, part of another change. Ross Wightman 2024-12-30 16:59:55 -08:00
  • cc7fd34015 test filter tweaks Ross Wightman 2024-12-30 16:09:31 -08:00
  • 1bf84b35c3 Update tests for aimv2 filtering Ross Wightman 2024-12-30 15:34:03 -08:00
  • b33418713a Add (almost) full set of aimv2 model instances. Switch back to unpacked SwiGLU. Verify correctness. Add DFN L/14 39B weight. Ross Wightman 2024-12-30 14:23:20 -08:00
  • de35fd87f5 Add SimpleNorm to create_norm factory Ross Wightman 2024-12-30 14:22:42 -08:00
  • d5375ca769 Use torch F.rms_norm when possible, select fast vs normal paths appropriately and test with torchscript Ross Wightman 2024-12-29 14:05:07 -08:00
  • 5f12a25114 Add bias arg to Vitamin GeGLU Ross Wightman 2024-12-29 09:01:46 -08:00
  • 5804d92e4b Switch aimv2 to used packed SwiGLU Ross Wightman 2024-12-28 21:05:38 -08:00
  • 15406a939e Fixing RmsNorm to fix #2380 and noticed with aimv2 when comparing outputs. Still some work to do, need to look at AMP / fast mode behaviour, dispatch to torch when possible. Add SimpleNorm for 'LayerNorm w/o centering and bias' Ross Wightman 2024-12-28 21:03:49 -08:00
  • a648a04834 Supporting aimv2 encoders Ross Wightman 2024-12-27 14:01:13 -08:00
  • eb84e4b571 Switch hf hub entries for new aimv2 / dfn weights to point to timm locations. Undo forced device for SDR linspace, part of another change. [aimv2] Ross Wightman 2024-12-30 16:59:55 -08:00
  • 874037e675 test filter tweaks Ross Wightman 2024-12-30 16:09:31 -08:00
  • cb294c83a8 Update tests for aimv2 filtering Ross Wightman 2024-12-30 15:34:03 -08:00
  • 1d6ebeb102 Add (almost) full set of aimv2 model instances. Switch back to unpacked SwiGLU. Verify correctness. Add DFN L/14 39B weight. Ross Wightman 2024-12-30 14:23:20 -08:00
  • a4146b79d1 Add SimpleNorm to create_norm factory Ross Wightman 2024-12-30 14:22:42 -08:00
  • 3a6661ac78 fix broken image link ariG23498 2024-12-30 12:55:38 +05:30
  • 5809c2fe5e Use torch F.rms_norm when possible, select fast vs normal paths appropriately and test with torchscript Ross Wightman 2024-12-29 14:05:07 -08:00
  • e0cacbfd15 Add bias arg to Vitamin GeGLU Ross Wightman 2024-12-29 09:01:46 -08:00
  • 0d87caefff Switch aimv2 to used packed SwiGLU Ross Wightman 2024-12-28 21:05:38 -08:00
  • 04a484a895 Fixing RmsNorm to fix #2380 and noticed with aimv2 when comparing outputs. Still some work to do, need to look at AMP / fast mode behaviour, dispatch to torch when possible. Add SimpleNorm for 'LayerNorm w/o centering and bias' Ross Wightman 2024-12-28 21:03:49 -08:00
  • e752b5d07c Supporting aimv2 encoders Ross Wightman 2024-12-27 14:01:13 -08:00
  • 790decc89b Add more pali(2) weights. Switch rest of models adapting open_clip weights to their own weight instances. Ross Wightman 2024-12-27 12:05:22 -08:00
  • 01cf0f72af Add support for tag, license customization through push_to_hub Ross Wightman 2024-12-27 12:04:04 -08:00
  • b12ecbd614 Move siglip timm weights to own repos Ross Wightman 2024-12-23 17:40:21 -08:00
  • 6fb7aaf37d Switching to timm specific weight instances for open_clip image encoders to facilitate hf-hub: use in timm and new transformers TimmWrapper Ross Wightman 2024-12-23 16:52:08 -08:00
  • 364c567dd2 Merge pull request #2357 from huggingface/more_opt_stuff Ross Wightman 2024-12-27 12:54:02 -08:00
  • 5cf022f228 Add more pali(2) weights. Switch rest of models adapting open_clip weights to their own weight instances. [openclip_weight_move] Ross Wightman 2024-12-27 12:05:22 -08:00
  • 4f4f40baa6 Add support for tag, license customization through push_to_hub Ross Wightman 2024-12-27 12:04:04 -08:00
  • 7533a7f0c2 Move siglip timm weights to own repos Ross Wightman 2024-12-23 17:40:21 -08:00
  • 447147a25b Switching to timm specific weight instances for open_clip image encoders to facilitate hf-hub: use in timm and new transformers TimmWrapper Ross Wightman 2024-12-23 16:52:08 -08:00
  • d285526dc9 Lazy loader for TF, more LAB fiddling [augmentation_update] Ross Wightman 2024-12-23 13:24:11 -08:00
  • a02b1a8e79 Merge pull request #2369 from brianhou0208/fix_reduction Ross Wightman 2024-12-18 16:51:53 -08:00
  • 3fbbd511e6 Testing some LAB stuff Ross Wightman 2024-12-18 16:49:17 -08:00
  • 3b181b78d1 Updating augmentations, esp randaug to support full torch.Tensor pipeline Ross Wightman 2024-12-18 12:24:04 -08:00
  • ab0a70dfff fix feature_info.reduction Ryan 2024-12-18 21:12:40 +08:00
  • ea231079f5 Merge pull request #2361 from huggingface/grodino-dataset_trust_remote Ross Wightman 2024-12-06 12:06:56 -08:00
  • 7573096eb8 Make sure trust_remote code only passed to HF datasets. Improve some docstrings. [grodino-dataset_trust_remote] Ross Wightman 2024-12-06 11:40:04 -08:00
  • 95d903fd87 Merge branch 'main' of github.com:grodino/pytorch-image-models into grodino-dataset_trust_remote Ross Wightman 2024-12-06 11:14:26 -08:00
  • 9eee47de52 Back to dev version Ross Wightman 2024-12-06 10:44:41 -08:00
  • 9383f2880d Add cache_dir example Álvaro Justen (@turicas) 2024-12-05 23:15:54 -03:00
  • d1e9a8622a Rename inception_next_atto pretrained str Ross Wightman 2024-12-06 10:08:03 -08:00
  • 0576175d85 Add inception_next_atto Weihao Yu 2024-12-06 14:22:29 +08:00
  • 9cec2f17cd Merge pull request #2358 from turicas/cache_dir [cache_dir] Ross Wightman 2024-12-06 10:25:29 -08:00
  • 7ab2b938e5 More tweaks to docstrings for hub/builder Ross Wightman 2024-12-06 08:58:02 -08:00
  • dc1bb05e8e Punch cache_dir through model factory / builder / pretrain helpers. Improve some annotations in related code. Ross Wightman 2024-12-04 22:02:40 -08:00
  • e90b68b603 Rename inception_next_atto pretrained str [yuweihao-inception_next_atto] Ross Wightman 2024-12-06 10:08:03 -08:00
  • b09f81c8cb More tweaks to docstrings for hub/builder Ross Wightman 2024-12-06 08:58:02 -08:00
  • d7a7ed7ba9 Add inception_next_atto Weihao Yu 2024-12-06 14:22:29 +08:00
  • a1d219c1c3 Add cache_dir example Álvaro Justen (@turicas) 2024-12-05 23:15:54 -03:00
  • afdf11d9ae Add caution to Adan. Add decouple decay option to LAMB. [more_opt_stuff] Ross Wightman 2024-12-05 13:50:30 -08:00
  • 71849b972a Punch cache_dir through model factory / builder / pretrain helpers. Improve some annotations in related code. Ross Wightman 2024-12-04 22:02:40 -08:00
  • 553ded5c6b Version 1.0.12 [v1.0.12] Ross Wightman 2024-12-03 10:34:38 -08:00
  • 464885e135 See if we can avoid some model / layer pickle issues with the aa attr in ConvNormAct Ross Wightman 2024-12-02 16:55:29 -08:00
  • ceaff7668e See if we can avoid some model / layer pickle issues with the aa attr in ConvNormAct [convnormact_aa_none] Ross Wightman 2024-12-02 16:55:29 -08:00
  • 5fe5f9d488 Add a different mnv4 conv-small weight Ross Wightman 2024-12-02 16:14:37 -08:00
  • 303f7691a1 Add cautious mars, improve test reliability by skipping grad diff for first step Ross Wightman 2024-12-02 09:38:25 -08:00
  • 9fc8bac3d2 Add cautious mars, improve test reliability by skipping grad diff for first step [mars_tweak] Ross Wightman 2024-12-02 09:38:25 -08:00
  • 82e8677690 Make LaProp weight decay match typical PyTorch 'decoupled' behaviour where it's scaled by LR Ross Wightman 2024-11-29 16:44:43 -08:00
  • 886eb77938 Update README, missed small discrep in adafactor min dim update Ross Wightman 2024-11-29 10:57:47 -08:00
  • e3e434bbc4 To be technically correct, need to check the in-place _ ver of op Ross Wightman 2024-11-28 13:46:17 -08:00
  • 7c32d3bd82 Work around _foreach_maximum issue, need scalar other support Ross Wightman 2024-11-28 13:39:44 -08:00
  • 7cf683628f Cautious optimizer impl plus some typing cleanup. Ross Wightman 2024-11-28 12:34:51 -08:00
  • 9b27f84876 To be technically correct, need to check the in-place _ ver of op [cautious_optim] Ross Wightman 2024-11-28 13:46:17 -08:00
  • b0a121bed0 Work around _foreach_maximum issue, need scalar other support Ross Wightman 2024-11-28 13:39:44 -08:00
  • 3086dd03fd Cautious optimizer impl plus some typing cleanup. Ross Wightman 2024-11-28 12:34:51 -08:00
  • aeb1ed7a15 Keep basic optim test LR range closer to before w/ updated code Ross Wightman 2024-11-26 13:40:20 -08:00
  • 7a165fcb62 Remove rogue import, thanks IDE :/ Ross Wightman 2024-11-26 12:20:20 -08:00
  • 73d10ab482 Update tests, need handling for radamw with older PyTorch, need to back-off basic test LR in mars? Ross Wightman 2024-11-26 12:13:21 -08:00
  • 09bc21774e Update optimizers.mdx Ross Wightman 2024-11-26 11:18:30 -08:00
  • 4f64ec4e14 Add guard around 'somewhat' newer torch RAdam / NAdam imports Ross Wightman 2024-11-26 11:10:42 -08:00
  • 0903d98162 Reduce tolerance on model inference 'owl' test, pillow output varies a lot, was failing locally Ross Wightman 2024-11-26 10:55:52 -08:00
  • 1ab02a11a1 Update Adan with newer impl (from original source) that includes multi-tensor fn Ross Wightman 2024-11-26 10:55:20 -08:00
  • a024ab3170 Replace radam & nadam impl with torch.optim ver, rename legacy adamw, nadam, radam impl in timm. Update optim factory & tests. Ross Wightman 2024-11-26 10:54:17 -08:00
  • 7b54eab807 Add MARS and LaProp impl, simplified from originals Ross Wightman 2024-11-26 10:51:53 -08:00
  • e5aea357b1 Update Adopt to include clipping for stability, separate wd so no param decay if update not taken on first step Ross Wightman 2024-11-26 10:42:01 -08:00
  • 1a70036691 Keep basic optim test LR range closer to before w/ updated code [opt_mars_more] Ross Wightman 2024-11-26 13:40:20 -08:00
  • 269bc084fa Remove rogue import, thanks IDE :/ Ross Wightman 2024-11-26 12:20:20 -08:00
  • bc7d2247bf Update tests, need handling for radamw with older PyTorch, need to back-off basic test LR in mars? Ross Wightman 2024-11-26 12:13:21 -08:00
  • 7d3146b97b Update optimizers.mdx Ross Wightman 2024-11-26 11:18:30 -08:00
  • 444c506ce3 Merge pull request #2346 from JohannesTheo/patch-1 Ross Wightman 2024-11-26 11:15:17 -08:00
  • 835a1a60ab Add guard around 'somewhat' newer torch RAdam / NAdam imports Ross Wightman 2024-11-26 11:10:42 -08:00