Making large AI models cheaper, faster and more accessible
COMMITS
May 24, 2024
[inference] Fix running time of test_continuous_batching (#5750)
Yuanheng Zhao committed
[Feature] auto-cast optimizers to distributed version (#5746)
Edenzzzz committed
[Inference]Fix readme and example for API server (#5742)
Jianghai committed
May 23, 2024
[inference] release (#5747)
binmakeswell committed
May 22, 2024
[Colossal-Inference] (v0.1.0) Merge pull request #5739 from hpcaitech/feature/colossal-infer
Yuanheng Zhao committed
[NFC] fix requirements (#5744)
Yuanheng Zhao committed
May 21, 2024
[NFC] Fix code factors on inference triton kernels (#5743)
Yuanheng Zhao committed
[ci] Temporary fix for build on pr (#5741)
Yuanheng Zhao committed
Merge pull request #5737 from yuanheng-zhao/inference/sync/main
Yuanheng Zhao committed
May 20, 2024
[sync] Sync feature/colossal-infer with main
Yuanheng Zhao committed
[doc] Update Inference Readme (#5736)
Yuanheng Zhao committed
[Fix/Inference] Add unsupported auto-policy error message (#5730)
Yuanheng Zhao committed
May 19, 2024
[Inference] Fix Inference Generation Config and Sampling (#5710)
Yuanheng Zhao committed
May 17, 2024
[lazy] fix lazy cls init (#5720)
flybird11111 committed
[example] Update Inference Example (#5725)
Yuanheng Zhao committed
May 16, 2024
[misc] Update PyTorch version in docs (#5724)
binmakeswell committed
[Inference] Delete duplicated package (#5723)
傅剑寒 committed
May 15, 2024
[Inference] Fix API server, test and example (#5712)
Jianghai committed
[Colossal-LLaMA] Fix sft issue for llama2 (#5719)
Tong Li committed
May 14, 2024
[Fix] Llama3 Load/Omit CheckpointIO Temporarily (#5717)
Runyu Lu committed
[ci] Fix example tests (#5714)
Yuanheng Zhao committed
[Inference] Delete duplicated copy_vector (#5716)
傅剑寒 committed
[Feature] Distributed optimizers: Lamb, Galore, CAME and Adafactor (#5694)
Edenzzzz committed
add paged-attetionv2: support seq length split across thread block (#5707)
Steve Luo committed
[Feat]Inference RPC Server Support (#5705)
Runyu Lu committed
May 13, 2024
[hotfix] fix inference typo (#5438)
hugo-syn committed
[misc] Update PyTorch version in docs (#5711)
Edenzzzz committed