Commit Graph

  • 50070c1e84 move logging to producer YeAnbang 2025-05-14 18:10:57 +08:00
  • 13c2676612 update agpo reward Chen Li 2025-05-14 16:54:52 +08:00
  • aca547623f [feat] Support prompt level dynamic (#6300) Tong Li 2025-05-14 16:40:35 +08:00
  • 5374601741 Merge pull request #6283 from wangbluo/upgrade_falcon Hanks 2025-05-14 15:05:31 +08:00
  • 0e9d628bb7 add the explanation wangbluo 2025-05-14 12:50:07 +08:00
  • b032cf9b16 upgrade_sam wangbluo 2025-05-14 12:45:34 +08:00
  • 89917e247b [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-05-14 04:24:23 +00:00
  • 0dede489d6 Merge branch 'upgrade_transformers' into upgrade_falcon Wang Binluo 2025-05-14 12:23:28 +08:00
  • 47a7dc7142 Support evaluation during training YeAnbang 2025-04-30 18:13:40 +08:00
  • 1ace29b54d Merge pull request #6299 from wangbluo/upgrade_bloom Hanks 2025-05-14 10:19:44 +08:00
  • c28b3c39db Merge pull request #6305 from wangbluo/update_bert Hanks 2025-05-14 10:19:34 +08:00
  • d665d6740a add explantion wangbluo 2025-05-14 10:15:25 +08:00
  • 07349e0014 fix wangbluo 2025-05-14 10:09:35 +08:00
  • a5380d7073 agpo reward Chen Li 2025-05-13 18:36:13 +08:00
  • 2237531137 update_bloom wangbluo 2025-05-13 18:21:57 +08:00
  • b920af427b update pad seq (#6303) Tong Li 2025-05-13 16:51:27 +08:00
  • e08626d740 register agpo Chen Li 2025-05-13 16:08:38 +08:00
  • f118146564 [upgrade]Upgrade qwen2 (#6302) flybird11111 2025-05-13 15:49:53 +08:00
  • 4fbbf4737a fix wangbluo 2025-05-13 14:51:54 +08:00
  • d6f3508910 fix wangbluo 2025-05-13 10:15:48 +08:00
  • b124603c68 fix wangbluo 2025-05-08 18:06:56 +08:00
  • fe94d73f6b fix wangbluo 2025-05-08 18:03:53 +08:00
  • 4eced5cf8a [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-05-08 09:58:04 +00:00
  • cefdfc4125 add explanation wangbluo 2025-05-08 17:46:54 +08:00
  • e78c4560c6 fix wangbluo 2025-05-08 16:22:08 +08:00
  • 06724492ca [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-05-08 08:13:33 +00:00
  • a9bb7cb943 upgrade command wangbluo 2025-05-08 16:06:05 +08:00
  • a4c6e189fa [upgrade] upgrade gpt2 (#6291) flybird11111 2025-05-08 14:10:21 +08:00
  • eb6b5dd62e [fix] revert reward update and evaluation (#6295) YeAnbang 2025-05-07 10:56:47 +08:00
  • 367ae3f233 Revert "Support evaluation during training" grpo-latest-revert YeAnbang 2025-05-07 10:52:08 +08:00
  • 08cad304f2 Revert "reuse comm-group" YeAnbang 2025-05-07 10:51:30 +08:00
  • 4c5879d92b Revert "fix bug" YeAnbang 2025-05-07 10:51:08 +08:00
  • d34115c5c4 Revert "upgrade reward math verification" YeAnbang 2025-05-07 10:50:41 +08:00
  • 2cd70cbcb1 Revert "rewrite reward fn" YeAnbang 2025-05-07 10:49:22 +08:00
  • 5480b811c5 upgrade_bloom wangbluo 2025-05-06 15:58:53 +08:00
  • 16169d1f22 Revert "[feat] Update reward verification" revert-6292-grpo-latest-dev-reward-update Tong Li 2025-05-06 12:59:30 +08:00
  • 4d18e7d772 spot a possible bug grpo-latest-dev-debug YeAnbang 2025-05-05 18:48:42 +08:00
  • 6fff36dd63 fix reward YeAnbang 2025-05-05 15:40:22 +08:00
  • 08787f0b6e upgrade_bert wangbluo 2025-05-05 09:50:07 +08:00
  • d4a6b6c4a7 update evaluation parameters grpo-latest-v3 YeAnbang 2025-05-04 16:41:27 +08:00
  • 2999bd4cc8 fix reward taging bug YeAnbang 2025-05-03 14:34:04 +08:00
  • da867a4d8f small fix YeAnbang 2025-05-03 10:08:12 +08:00
  • 17928ad84f Merge pull request #6292 from hpcaitech/grpo-latest-dev-reward-update YeAnbang 2025-05-03 10:00:32 +08:00
  • dd74f496c0 small fix YeAnbang 2025-05-03 09:55:24 +08:00
  • 7d658402da fix schedualing for multi-node training YeAnbang 2025-05-02 19:45:07 +08:00
  • d06042b434 rewrite reward fn grpo-latest-dev-reward-update YeAnbang 2025-05-01 11:28:05 +08:00
  • a6085ff676 upgrade reward math verification YeAnbang 2025-04-30 22:59:54 +08:00
  • 01640ebd65 fix bug YeAnbang 2025-04-30 22:53:12 +08:00
  • bd61918dcf reuse comm-group YeAnbang 2025-04-30 21:36:11 +08:00
  • 57a88395fe Support evaluation during training YeAnbang 2025-04-30 18:13:40 +08:00
  • 5fd4bcb9d8 [feat] Sync shard model (#6289) Tong Li 2025-04-30 14:47:01 +08:00
  • 87bac841ea Merge pull request #6288 from duanjunwen/support_hybrid_model_sync feat/hybrid_model_sync YeAnbang 2025-04-29 18:22:32 +08:00
  • b2362ed33e Merge branch 'feat/hybrid_model_sync' into support_hybrid_model_sync YeAnbang 2025-04-29 18:22:13 +08:00
  • 93b40e888f Merge branch 'grpo-latest' of https://github.com/hpcaitech/ColossalAI into grpo-dev grpo-dev YeAnbang 2025-04-29 17:07:27 +08:00
  • 2f293248f7 [feat] support hybrid parallel model sync duanjunwen 2025-04-29 17:00:31 +08:00
  • 3381881908 clean code YeAnbang 2025-04-29 16:57:46 +08:00
  • 14f237ce7e [feat] Support boxed math reward (#6284) YeAnbang 2025-04-29 16:46:47 +08:00
  • 8c5dd4131d merge YeAnbang 2025-04-29 16:09:44 +08:00
  • 6c1b3b694f fix pp state dict incomplete issue YeAnbang 2025-04-29 16:06:01 +08:00
  • 885210dc27 fix wangbluo 2025-04-28 18:17:12 +08:00
  • 064be50946 add boxed reward YeAnbang 2025-04-28 18:15:43 +08:00
  • 5d167f2148 fix wangbluo 2025-04-28 18:01:53 +08:00
  • 14c01aec00 support boxed reward YeAnbang 2025-04-28 17:53:20 +08:00
  • 2ca1e3c630 fix pp+tp, fix dataloader (#6280) YeAnbang 2025-04-28 17:10:00 +08:00
  • 263a9cbe7a fixed plugin micro-batch size YeAnbang 2025-04-28 16:18:50 +08:00
  • 0f794f7294 fix pp+tp, fix dataloader YeAnbang 2025-04-28 13:09:41 +08:00
  • 28795f560c fix save issue (#6279) Tong Li 2025-04-27 17:54:06 +08:00
  • 38008858e4 fix checkpoint naming; add num_epoch parameter (#6277) YeAnbang 2025-04-26 14:00:28 +08:00
  • 26d859f68e [feat] Support DAPO (#6263) YeAnbang 2025-04-25 17:39:17 +08:00
  • 8497ecc3e5 Merge pull request #6276 from flybird11111/upgrade-transformers Hanks 2025-04-24 17:30:40 +08:00
  • c6291be1b1 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-24 08:35:01 +00:00
  • 2f615a49fd fix flybird11111 2025-04-24 16:20:42 +08:00
  • d7a9eb0f67 Merge branch 'hpcaitech:main' into upgrade-transformers flybird11111 2025-04-24 16:15:57 +08:00
  • e891501c55 fix flybird11111 2025-04-24 15:44:20 +08:00
  • 686982764c upgrade llama flybird11111 2025-04-24 14:54:15 +08:00
  • 56e4e74140 boxed version feat/boxed-version Tong Li 2025-04-23 17:20:09 +08:00
  • b823c6eec7 [feat] Add final save at the end (#6274) Tong Li 2025-04-23 10:03:46 +08:00
  • 03f4b1dde3 add prompt template (#6273) Tong Li 2025-04-22 10:39:47 +08:00
  • 7bb7e80476 [feat] GRPO with distributed implementation (#6230) feature/ray-rlhf Tong Li 2025-04-21 10:43:49 +08:00
  • 46ed5d856b [ci] update ci (#6254) flybird11111 2025-04-18 16:40:53 +08:00
  • 0c5ed65305 fix flybird11111 2025-04-18 11:33:44 +08:00
  • 52ead00795 fix flybird11111 2025-04-18 11:29:24 +08:00
  • 7af46ab667 fix flybird11111 2025-04-17 17:59:46 +08:00
  • afe07a63ac fiux flybird11111 2025-04-17 17:53:48 +08:00
  • a2e623db78 fix flybird11111 2025-04-17 16:49:48 +08:00
  • 7ecdf9a211 Update README.md (#6268) Yanjia0 2025-04-17 12:07:25 +08:00
  • dc60efe154 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-11 03:22:25 +00:00
  • fd69a821bb fix flybird11111 2025-04-11 11:21:09 +08:00
  • db4c73f643 fix flybird11111 2025-04-11 11:20:35 +08:00
  • 0950b07a32 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-10 09:32:53 +00:00
  • 910433f070 fix flybird11111 2025-04-10 17:28:59 +08:00
  • 21707a77d3 fix flybird11111 2025-04-10 16:39:08 +08:00
  • c37107ce2a Merge branch 'upgrade-transformers' of github.com:flybird11111/ColossalAI into upgrade-transformers flybird11111 2025-04-10 15:42:53 +08:00
  • 914b179435 fix flybird11111 2025-04-10 15:41:54 +08:00
  • 0d09c0e80f [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-10 06:36:21 +00:00
  • 517bedcaf0 Merge branch 'upgrade-transformers' of github.com:flybird11111/ColossalAI into upgrade-transformers flybird11111 2025-04-10 14:34:57 +08:00
  • de4f7a1d25 fix flybird11111 2025-04-10 14:34:39 +08:00
  • 6997862a91 [pre-commit.ci] auto fixes from pre-commit.com hooks pre-commit-ci[bot] 2025-04-10 04:58:49 +00:00
  • 5c56a7fd7b Merge branch 'upgrade-transformers' of github.com:flybird11111/ColossalAI into upgrade-transformers flybird11111 2025-04-10 12:57:46 +08:00
  • e8a3d52381 fix flybird11111 2025-04-10 12:55:02 +08:00