Large Language Model Text Generation Inference
TAGS
20 tags
v3.3.7
fix(num_devices): fix num_shard/num device auto compute when NVIDIA_VISIBLE_DEVICES == "all" or "void" (#3346)

* fix(num_devices): fix num_shard/num devices auto compute when NVIDIA_VISIBLE_DEVICES == "all"
  the computed num_shards was always 1 in this case, no matter what
* fix(num_devices): make TGI shard auto compute compliant with nvidia-container-toolkit in cdi mode
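
The commit above concerns deriving a shard count from NVIDIA_VISIBLE_DEVICES. The following is a minimal sketch of that kind of logic, not the actual TGI launcher code: the function name `infer_num_shards` and the `count_all_devices` callback are hypothetical, and treating "all", "void", and an unset variable as "enumerate the GPUs instead of parsing the value" is an assumption based on the commit message.

```python
from typing import Callable, Optional


def infer_num_shards(
    visible_devices: Optional[str],
    count_all_devices: Callable[[], int],
) -> int:
    """Hypothetical helper: guess a shard count from NVIDIA_VISIBLE_DEVICES.

    `count_all_devices` stands in for whatever mechanism enumerates the GPUs
    actually present (an assumption; the real launcher may do this differently).
    """
    value = (visible_devices or "").strip()

    # "all", "void", or an unset/empty variable do not list devices explicitly,
    # so splitting on commas would wrongly yield a count of 1. Fall back to
    # enumerating the devices instead.
    if value in ("", "all", "void"):
        return count_all_devices()

    # Otherwise treat the value as a comma-separated list of indices or UUIDs.
    return len([device for device in value.split(",") if device.strip()])


if __name__ == "__main__":
    print(infer_num_shards("0,1,2", lambda: 8))  # prints 3
    print(infer_num_shards("all", lambda: 8))    # prints 8
```

Passing the enumeration step in as a callback keeps the parsing logic testable without a GPU; in this sketch a naive comma split of "all" is exactly the case that would always report one shard.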
v3.3.6
Patch version 3.3.6 (#3329)

* chore: prepare version 3.3.6
* fix(benchmark): clear up progress_gauge fn signature
  Otherwise there is a compiler error.