Commit Graph

  • 1bef748fab [doc][c10d] fixup fsdp tutorial (#1297) Chirag Pandya 2024-11-08 13:26:54 -08:00
  • cb002880d9 [doc][c10d] fixup fsdp tutorial (chirag/fix-fsdp-tutorial) Chirag Pandya 2024-10-31 09:47:51 -07:00
  • 47d0c2eff2 Fix python failing tests (#1299) Chirag Pandya 2024-11-05 13:23:54 -08:00
  • 390a9b7744 Fix python failing tests (chirag/fix-python-tests) Chirag Pandya 2024-11-01 09:47:23 -07:00
  • cdef4d43fb Use log1p(x) instead of log(1+x) (#1286) Sergii Dymchenko 2024-09-19 10:46:15 -07:00
  • a308b4e974 Update DDP tutorial for the correct order of set_device (#1285) Chien-Chin Huang 2024-09-17 00:15:14 -07:00
  • 2e28e7361e Use log1p(x) instead of log(1+x) (sdym/log1p) Sergii Dymchenko 2024-09-16 17:31:42 -07:00
  • 26de419043 Fix AC in T5 example (#1273) Mark Saroufim 2024-06-28 23:35:07 -07:00
  • a38cbfc6f8 docs: added copyright holder to license file (#1266) david-pa 2024-06-11 13:40:34 -05:00
  • 86591de88a Update run_python_examples.sh (#1269) Mark Saroufim 2024-06-08 13:13:14 -07:00
  • 26c99ebbbe Update run_python_examples.sh (msaroufim-patch-1) Mark Saroufim 2024-06-07 23:15:32 -07:00
  • 37a1866d0e Remove the unused import in MNIST example (#1261) lancerts 2024-05-25 22:36:20 -07:00
  • f30df59986 Deploying to gh-pages from @ 102beb09a3 🚀 msaroufim 2024-05-25 05:16:19 +00:00
  • 102beb09a3 Update index.rst Mark Saroufim 2024-05-24 22:15:24 -07:00
  • e8a38cc478 Remove Vision Transformer Example (#1258) Mark Saroufim 2024-05-24 22:14:30 -07:00
  • d8c6c988c7 Deploying to gh-pages from @ cd29c12ac8 🚀 tianyu-l 2024-05-16 17:41:34 +00:00
  • cd29c12ac8 [Tensor Parallel] update examples to simplify embedding + first transformer block Tianyu Liu 2024-05-15 18:42:20 -07:00
  • a1802eb027 Deploying to gh-pages from @ 851c4cf0ef 🚀 msaroufim 2024-05-11 03:16:30 +00:00
  • 851c4cf0ef Fix the MNIST dataset url (#1256) lancerts 2024-05-10 20:15:12 -07:00
  • e48aaeb88b Update run_cpp_examples.sh Mark Saroufim 2024-05-03 15:58:19 -07:00
  • c49554ccf1 Update distributed example tests in run_python_examples.sh (#1250) Sirut Buasai 2024-05-03 13:25:49 -07:00
  • 911816ceba Update main_cpp.yml (#1251) Mark Saroufim 2024-04-30 09:21:06 -07:00
  • 61e266f1b9 Update run_cpp_examples.sh Mark Saroufim 2024-04-29 20:03:15 -07:00
  • 0d2d5071a6 Deploying to gh-pages from @ d22a29f338 🚀 msaroufim 2024-04-16 22:44:29 +00:00
  • d22a29f338 Update doc-build.yml Mark Saroufim 2024-04-16 15:43:37 -07:00
  • 89396d1ed1 chore: remove repetitive words (#1244) STEVEN ADAMS 2024-04-14 13:02:12 +08:00
  • ecd951d952 Update TP examples to align with tutorials (#1243) Wanchao 2024-04-10 23:11:50 -07:00
  • 7df10c2a86 Language translation example added (#1131) (#1240) Noah Schiro 2024-04-02 12:50:04 -04:00
  • 2d725b6ab2 fix minor typo in README.md (#1234) Gang Hyeok Lee (Robin) 2024-02-19 14:49:54 +09:00
  • 83ff2f5402 Minor improvement of the GCN doc (#1231) lancerts 2024-02-12 17:34:18 -08:00
  • 8c246ba77d warning added for single GPU and NCCL (#1226) jaiaid 2024-02-03 02:05:02 -05:00
  • ec8a172616 Revert "Fixes in Imagenet training script" (#1225) Mark Saroufim 2024-01-30 15:28:12 -08:00
  • a848347177 Fixes in Imagenet training script (#1224) jaiaid 2024-01-30 18:10:19 -05:00
  • 76cd9d02a0 Improve readme for cpp/custom-dataset and cpp/dcgan (#1223) lancerts 2024-01-25 14:16:18 -08:00
  • a537659a51 Improve code readability and make number of epochs a command line argument (#1222) lancerts 2024-01-24 21:20:29 -08:00
  • b88d8059e5 Add a CI job for cpp/dcgan (#1221) lancerts 2024-01-23 20:22:58 -08:00
  • d78cff0224 Add CI jobs for cpp/mnist and cpp/regression (#1220) lancerts 2024-01-23 09:23:58 -08:00
  • b2832cc107 Ensure the const-ness of the data member in cpp/custom-dataset (#1215) lancerts 2024-01-22 21:23:05 -08:00
  • 42b3bda039 Include a CI job for cpp/custom-dataset (#1219) lancerts 2024-01-22 20:12:24 -08:00
  • 814f04751f Include the CI for cpp/autograd (#1217) lancerts 2024-01-22 14:54:16 -08:00
  • 97adea1b17 Update the readme and fix bugs in custom-dataset example (#1214) lancerts 2024-01-12 18:25:54 -08:00
  • 5921fc191b Fix args description (#1209) Michael Monashev 2024-01-12 07:40:09 +03:00
  • 3e56db211e Add MPS device (#1197) Jakub Chmura 2024-01-12 05:40:00 +01:00
  • de85c0917c Bugfix in vision transformer - save class token and pos embedding (#1204) ManukyanD 2024-01-12 08:39:33 +04:00
  • 5a3b3336cf update-minimum-cmake-version to version 3.5 (#1206) lancerts 2024-01-11 20:38:31 -08:00
  • bdb948c76c Fix the DCGAN C++ shape warning (#1207) lancerts 2024-01-11 20:38:03 -08:00
  • 30b310a977 [2D] Update 2d example to use get_local_rank (#1203) Iris Z 2023-12-08 13:49:15 -08:00
  • c0b889d5f4 simpler subsequent mask generator (#1198) Alex Shroyer 2023-11-26 20:10:21 -05:00
  • c4dc481e68 [T170073014] Rewrite distributed examples for Tensor Parallel, Sequence Parallel, 2D (FSDP + TP) (#1201) Less Wright 2023-11-22 15:29:22 -08:00
  • e741545fda Deploying to gh-pages from @ f0d6fc9909 🚀 malfet 2023-11-10 04:45:08 +00:00
  • f0d6fc9909 Update CXX compiler from 14 to 17 Nikita Shulga 2023-11-09 20:43:23 -08:00
  • c67bbaba01 [TorchFix] Update deprecated TorchVision pretrained parameters (#1193) Sergii Dymchenko 2023-10-06 09:44:59 -07:00
  • 39dbfb9b8c Update deprecated TorchVision pretrained parameters (sdym/pretrained) Sergii Dymchenko 2023-10-03 12:11:32 -07:00
  • cead596caa Added the --save_model arg for mnist_hogwild example (#1189) Pranav Prajapati 2023-09-04 00:09:49 +05:30
  • 13009eff7a fix the broken link about minGPT-DDP (#1156) zhou fan 2023-08-20 13:29:44 +08:00
  • 001d493285 Update TransformerModel using nn.Transformer module (#1138) Tairen Piao 2023-08-09 04:52:04 +09:00
  • 20955149ad Bugfix in vision transformer example - change lr datatype to float (#1161) Siddharth Singh 2023-08-08 12:50:31 -07:00
  • 1dd0f46029 Use gymnasium and reflect new API (#1152) Nguyen Trung Duc (john) 2023-08-09 02:49:44 +07:00
  • 508743bbef fix a typo in the FSDP example (#1159) Sepehr Sameni 2023-08-08 12:49:21 -07:00
  • 24fc1b9cb2 Deploying to gh-pages from @ 4440841a10 🚀 msaroufim 2023-08-08 15:49:59 +00:00
  • 4440841a10 Added Graph Attention Network example (#1174) Ebrahim Pichka 2023-08-08 11:49:10 -04:00
  • 741de70c4a Fixed default num_classes to 10 for CIFAR10 (#1176) Niyar R Barman 2023-07-28 21:09:29 +05:30
  • 638f174c25 Deploying to gh-pages from @ 92790cdf20 🚀 msaroufim 2023-07-22 01:11:55 +00:00
  • 6fc19c76b9 fix zero_grad default parameter (#1172) Yanhua Huang 2023-07-22 09:11:49 +08:00
  • 0676c07856 Fix import in minGPT example (#1169) Jeongwook Park 2023-07-22 10:11:15 +09:00
  • 92790cdf20 📝🎯 Added README for SNLI Classifier Training (#1173) Kadir Nar 2023-07-22 04:10:40 +03:00
  • 4516511e62 Deploying to gh-pages from @ 7f7c222b35 🚀 msaroufim 2023-06-12 18:23:22 +00:00
  • 7f7c222b35 Graph Convolutional Network (#1163) José Luis Castro García 2023-06-12 12:22:18 -06:00
  • 8c16e96816 Fix 2D example to pass in data parallel pg (#1160) Hugo 2023-06-05 13:55:19 -07:00
  • 673c6a516f Fix 2D example to pass in data parallel pg (fix_example) fduwjj 2023-06-05 20:15:49 +00:00
  • 55c663f9a5 Update imports (#1155) Suraj Subramanian 2023-05-25 11:30:20 -04:00
  • 9474336ef2 Update main.py (subramen-patch-2) Suraj Subramanian 2023-05-25 10:16:50 -04:00
  • 7b7c7084f8 FSDP example (#1019) Hamid Shojanazeri 2023-05-24 12:55:44 -07:00
  • 79ef786ec4 Adds torch.cuda.set_device calls to DDP examples (#1142) Suraj Subramanian 2023-05-15 17:59:18 -04:00
  • b283f4bd7f Deploying to gh-pages from @ 6a64939872 🚀 fduwjj 2023-05-10 21:40:11 +00:00
  • 6a64939872 Add Sequence parallel and 2D parallel examples (#1149) Hugo 2023-05-10 14:38:43 -07:00
  • 27362cef9a fix import (2d_example) fduwjj 2023-05-10 20:09:30 +00:00
  • 8b114dc9b3 Add sq safe guard since it is still not released fduwjj 2023-05-10 18:56:24 +00:00
  • 56d684b5f8 Update test script fduwjj 2023-05-10 18:14:07 +00:00
  • a9c7bb53f8 Split files and extract common logic fduwjj 2023-05-10 17:55:08 +00:00
  • 53cc387b47 Add Sequence parallel and 2D parallel examples fduwjj 2023-05-10 01:29:33 +00:00
  • c9ef23fd0e Fix typo in README.md (#1145) Suraj Subramanian 2023-05-01 22:03:53 -04:00
  • 82f4d65786 fix typo (subramen-patch-1) Suraj Subramanian 2023-05-01 14:25:20 -04:00
  • d7337eda81 Add set_device calls to DDP examples (ddp-tutorial-setdevice) Suraj Subramanian 2023-04-27 17:16:44 -04:00
  • 1a59fcce0f Deploying to gh-pages from @ 33cbdfdad4 🚀 msaroufim 2023-04-26 15:55:09 +00:00
  • 33cbdfdad4 Implemented Vision Transformer in PyTorch (#1141) Niyar R Barman 2023-04-26 21:24:13 +05:30
  • 54f4572509 Typo (#1126) Vivek Patel 2023-03-10 03:08:09 +05:30
  • 7ec911c46c Revert "Change ninp to nhid" (#1124) Mark Saroufim 2023-03-05 13:29:10 -08:00
  • 3d494f7fa4 Deploying to gh-pages from @ 9d17ef8beb 🚀 msaroufim 2023-02-26 23:48:11 +00:00
  • 76f01205d6 Change ninp to nhid (#1109) Avinash Madasu 2023-02-26 18:47:44 -05:00
  • 9d17ef8beb Fixes #1111 : Added Forward-Forward Algorithm (#1114) Vivek Patel 2023-02-27 05:17:16 +05:30
  • a1271e82e9 Removes warnings displayed on running main.py (#1117) Deep Chordia 2023-02-27 05:17:03 +05:30
  • 0252bda5b6 Remove reference to non-existing --model argument (#1110) Sopot Cela 2023-02-22 01:44:31 +00:00
  • e4e8da8467 fix: fixed local device name in multinode example. Fabian Joswig 2023-02-12 19:33:42 +00:00
  • d8456a36d1 Update CONTRIBUTING.md (#1107) Dylan Ngare Gatua 2023-02-01 17:38:44 -08:00
  • 40289773aa Update requirements.txt for minGPT ddp example (#1106) Kurman Karabukaev 2023-01-05 13:41:54 -08:00
  • 244e4eefb1 set type of batch_size argument to int in ddp-tutorial-series (#1104) Conrad Stack 2022-12-20 02:21:54 -06:00
  • 47ac714389 word language model on Jetson NX (#1103) Guiying Li 2022-12-14 09:39:26 +08:00
  • f8401e9e5a Fix exception causes in word_language_model/model.py (#1102) Ram Rachum 2022-12-13 09:43:35 +02:00
  • dceeac6b80 Deploying to gh-pages from @ 63fc2764db 🚀 fduwjj 2022-12-02 23:58:33 +00:00