Commit Graph

  • 9467c10690 [hot-fix] Fix memory leakage bug, support TP+PP (#6258) YeAnbang 2025-04-10 10:52:18 +08:00
  • 964f9a7974 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-10 02:20:40 +00:00
  • eaef783ec3 fix flybird11111 2025-04-10 10:19:38 +08:00
  • 99298c6a6d Merge branch 'upgrade-transformers' of github.com:flybird11111/ColossalAI into upgrade-transformers flybird11111 2025-04-09 18:25:56 +08:00
  • 25c5e420f2 fix flybird11111 2025-04-09 18:24:33 +08:00
  • dce221283d [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-09 09:57:33 +00:00
  • 603e2296c7 fix flybird11111 2025-04-09 17:56:07 +08:00
  • d5a3d1a44e fix flybird11111 2025-04-09 17:29:58 +08:00
  • 0e900ac5cd fix flybird11111 2025-04-09 17:29:08 +08:00
  • 57d7b16a18 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-09 08:34:30 +00:00
  • e92a692c97 Merge branch 'upgrade-transformers' of github.com:flybird11111/ColossalAI into upgrade-transformers flybird11111 2025-04-09 16:33:16 +08:00
  • a4e5ed9990 fix flybird11111 2025-04-09 16:32:10 +08:00
  • 466b61e674 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-09 07:53:50 +00:00
  • c0811d7342 fix flybird11111 2025-04-09 15:52:42 +08:00
  • b38d45ee51 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-09 07:23:03 +00:00
  • 28cf1e2c57 fix flybird11111 2025-04-09 15:20:14 +08:00
  • 397875e640 Update build_on_pr.yml flybird11111 2025-04-09 15:14:17 +08:00
  • ca914147eb Update test_fp16_torch.py flybird11111 2025-04-09 14:01:47 +08:00
  • ed43a4be04 [Distributed RLHF] Integration of PP (#6257) YeAnbang 2025-04-09 13:23:24 +08:00
  • 3491a9f7e3 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-01 07:34:48 +00:00
  • 4b8b67ae23 fix flybird11111 2025-04-01 15:32:11 +08:00
  • 822556a8ca [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-03-31 08:17:16 +00:00
  • 621cb93bb1 fix flybird11111 2025-03-31 16:16:15 +08:00
  • 8c66b7c3e9 fix flybird11111 2025-03-31 15:39:37 +08:00
  • 837a503f50 fix flybird11111 2025-03-31 15:32:51 +08:00
  • 43885a4317 fix flybird11111 2025-03-31 15:17:30 +08:00
  • 6c728df3e3 fix flybird11111 2025-03-31 11:22:59 +08:00
  • 0b81be7f7f add ci machine flybird11111 2025-03-28 18:04:03 +08:00
  • 50153005b4 [feat] add microbatch forwarding (#6251) YeAnbang 2025-03-28 10:24:58 +08:00
  • 40cf89d66e Merge branch 'hpcaitech:main' into upgrade-transformers flybird11111 2025-03-27 18:11:39 +08:00
  • 3ecb5000e3 test for upgrading transformers flybird11111 2025-03-27 18:08:37 +08:00
  • 489f215ad9 Merge pull request #6250 from hpcaitech/grpo-latest-dev YeAnbang 2025-03-21 16:25:35 +08:00
  • 2aa7385c88 update logging YeAnbang 2025-03-21 16:12:07 +08:00
  • d8eaf0d473 simplify vllm preprocessing input ids YeAnbang 2025-03-21 15:03:10 +08:00
  • 0472f44163 fix logprob, add filtering, temperature annealing, lr descent YeAnbang 2025-03-21 10:24:24 +08:00
  • 7ee4452f8c fix vllm YeAnbang 2025-03-19 17:07:20 +08:00
  • 7795d4c50d [Feature] Support Distributed LogProb for GRPO Training (#6247) duanjunwen 2025-03-18 17:47:55 +08:00
  • bc0171d392 fix transformers backend YeAnbang 2025-03-14 18:12:35 +08:00
  • 57b49da5e4 setup update Tong Li 2025-03-13 16:52:15 +08:00
  • 45ac6c6cb2 print results Tong Li 2025-03-13 16:51:22 +08:00
  • 4702d57841 convert to 8 generation Tong Li 2025-03-13 16:49:02 +08:00
  • afddfde2dd fix consumer Tong Li 2025-03-13 14:55:26 +08:00
  • 131eeceb5d fix tp bug Tong Li 2025-03-13 14:52:09 +08:00
  • 704866a240 detach Tong Li 2025-03-11 16:17:02 +08:00
  • 47d6493778 add response length Tong Li 2025-03-11 13:06:09 +08:00
  • abca66e69f fix reward score Tong Li 2025-03-11 10:17:32 +08:00
  • 71a0181fce update reward Tong Li 2025-03-10 14:19:10 +08:00
  • 754b16dfbf update reward fn Tong Li 2025-03-10 14:18:22 +08:00
  • 9d9d51614e update grpo Tong Li 2025-03-10 14:12:04 +08:00
  • 6e096362ef [pre-commit.ci] auto fixes from pre-commit.com hooks feat/ppo pre-commit-ci[bot] 2025-03-07 10:43:01 +00:00
  • c8e13a9403 run pre-commit YeAnbang 2025-03-07 18:40:31 +08:00
  • d31f9e4d0f run pre-commit YeAnbang 2025-03-07 18:30:19 +08:00
  • 6a6634b6e8 add ppo YeAnbang 2025-03-07 18:29:34 +08:00
  • 44d4053fec [HotFix] update load lora model Readme; (#6240) duanjunwen 2025-03-07 14:14:26 +08:00
  • eb6337f07f [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-03-06 08:29:58 +00:00
  • 22cc1558a8 Merge branch 'grpo-latest' of github.com:hpcaitech/ColossalAI into grpo-latest Tong Li 2025-03-06 16:28:47 +08:00
  • 0590f10fb7 update select algo Tong Li 2025-03-06 16:27:13 +08:00
  • 0cc0c843ed add save Tong Li 2025-03-06 16:26:14 +08:00
  • ab5b6d8432 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-03-06 06:30:26 +00:00
  • 0f566cc2d4 add algo selection Tong Li 2025-03-06 14:29:22 +08:00
  • 812f4b7750 update loader Tong Li 2025-03-06 11:44:42 +08:00
  • 7f2ceac5c3 update example Tong Li 2025-03-06 10:54:23 +08:00
  • d03cdea949 update reward fn Tong Li 2025-03-06 10:53:48 +08:00
  • 678f5a9eca update loss Tong Li 2025-03-06 10:53:03 +08:00
  • b96d69055e grpo consumer Tong Li 2025-03-06 10:51:27 +08:00
  • c15225bc52 modify data loader Tong Li 2025-03-06 10:49:44 +08:00
  • 6d676ee0e9 [release] update version (#6236) v0.4.9 Hongxin Liu 2025-03-03 16:15:09 +08:00
  • 56fe130b15 [hotfix] fix lora load (#6231) Hongxin Liu 2025-03-01 19:04:14 +08:00
  • 070907dd7f polish Tong Li 2025-02-28 10:16:42 +08:00
  • f736d747e3 update grpo feat/grpo Tong Li 2025-02-25 18:12:04 +08:00
  • 2bb71c6248 [feature] fit non tensor broadcast (#6218) Hongxin Liu 2025-02-24 14:36:04 +08:00
  • f32861ccc5 [misc] update torch version (#6206) feature/dualpipe Hongxin Liu 2025-02-24 14:35:48 +08:00
  • ffd3878a1e add simple grpo Tong Li 2025-02-23 22:54:26 +08:00
  • 8e6c9a4ab3 add reward related function Tong Li 2025-02-23 11:02:54 +08:00
  • de282dd694 [feature] fit RL style generation (#6213) Hongxin Liu 2025-02-21 17:28:19 +08:00
  • 43c9b5fb44 [chat] add distributed impl (#6210) Hongxin Liu 2025-02-21 15:24:23 +08:00
  • b9e60559b8 Merge pull request #6208 from hpcaitech/grpo_dev YeAnbang 2025-02-20 21:23:16 +08:00
  • 7595c453a5 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-02-20 10:25:18 +00:00
  • 53834b74b9 fix num_train_step update YeAnbang 2025-02-20 18:24:04 +08:00
  • 0171884664 fix inference rebatching bug YeAnbang 2025-02-20 17:25:36 +08:00
  • 9379cbd668 [release] update version (#6195) v0.4.8 Hongxin Liu 2025-02-20 11:36:18 +08:00
  • 24dee8f0b7 [doc] DeepSeek V3/R1 news (#6199) binmakeswell 2025-02-19 15:07:29 +08:00
  • f73ae55394 [application] add lora sft example data (#6198) Hongxin Liu 2025-02-18 20:18:18 +08:00
  • f8b9e88484 [application] Update README (#6196) Tong Li 2025-02-18 20:17:56 +08:00
  • d54642a263 [application] add lora sft example (#6192) Hongxin Liu 2025-02-18 13:06:38 +08:00
  • d20c8ffd97 Add GRPO and Support RLVR for PPO (#6186) YeAnbang 2025-02-18 09:43:36 +08:00
  • ce0ec40811 [checkpointio] fix for async io (#6189) flybird11111 2025-02-14 17:34:13 +08:00
  • 510ff7bec2 Merge branch 'hpcaitech:main' into main flybird11111 2025-02-14 15:24:15 +08:00
  • 5ff5323538 [hotfix] fix zero optim save (#6191) Hongxin Liu 2025-02-14 15:09:50 +08:00
  • 014837e725 [shardformer] support pipeline for deepseek v3 and optimize lora save (#6188) Hongxin Liu 2025-02-14 14:48:54 +08:00
  • 6fc6a059a0 fix for async io flybird11111 2025-02-13 14:06:57 +08:00
  • ec73f1b5e2 [CI] Cleanup Dist Optim tests with shared helper funcs (#6125) Wenxuan Tan 2025-02-11 23:42:34 -06:00
  • 5c09d726a6 [checkpointio] fix checkpoint for 3d (#6187) flybird11111 2025-02-12 11:54:55 +08:00
  • 2b415e5999 [shardformer] support ep for deepseek v3 (#6185) Hongxin Liu 2025-02-11 16:10:25 +08:00
  • 17062c83b9 [hotfix] fix hybrid checkpointio for sp+dp (#6184) flybird11111 2025-02-06 17:21:04 +08:00
  • ca0aa2365d [Issue template] Add checkbox asking for details to reproduce error (#6104) Wenxuan Tan 2025-01-24 00:36:25 -06:00
  • 97e60cbbcb [checkpointio] gather tensor before unpad it if the tensor is both padded and distributed (#6168) feature/dist-ckp-io Lemon Qin 2025-01-21 10:23:15 +08:00
  • 5b094a836b [Inference]Fix example in readme (#6178) Guangyao Zhang 2025-01-08 11:51:50 +08:00
  • ee81366cac [checkpointio] support load-pin overlap (#6177) Hongxin Liu 2025-01-07 16:16:04 +08:00
  • 479067e9bc [release] update version (#6174) v0.4.7 Hongxin Liu 2025-01-03 11:52:23 +08:00