Commits: extensions/pybind/inference/inference.cpp - hpcaitech/ColossalAI - Morph

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

COMMITS

/ extensions/pybind/inference/inference.cpp

main

May 14, 2024

add paged-attetionv2: support seq length split across thread block (#5707)

Steve Luo committed 1y ago

May 10, 2024

[Inference/Feat] Add convert_fp8 op for fp8 test in the future (#5706)

傅剑寒 committed 1y ago

April 30, 2024

[Inference/Kernel] refactor kvcache manager and rotary_embedding and kvcache_memcpy oper… (#5663)

Steve Luo committed 1y ago

April 25, 2024

[Inference/Kernel] Optimize paged attention: Refactor key cache layout (#5643)

Steve Luo committed 1y ago

April 24, 2024

[Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613)

傅剑寒 committed 1y ago