DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
fix: remove unnecessary shell=True in ROCm GPU architecture detection (#7915)
### Summary

`get_rocm_gpu_arch()` and `get_rocm_wavefront_size()` in `op_builder/builder.py` use `subprocess.check_output()` with `shell=True` to pipe `rocminfo` output through `grep`. This is unnecessary and inconsistent with the rest of the codebase, which uses list-based subprocess calls after the CVE-2024-43497 fix.

While the commands currently use fixed paths (not user-controlled), removing `shell=True` eliminates a potential attack surface and aligns with the hardening approach taken for CVE-2024-43497.

### Root Cause

```python
# Before (shell=True)
rocm_gpu_arch_cmd = str(rocm_info) + " | grep -o -m 1 'gfx.*'"
result = subprocess.check_output(rocm_gpu_arch_cmd, shell=True)

rocm_wavefront_size_cmd = str(rocm_info) + " | grep -Eo -m1 'Wavefront Size:[[:space:]]+[0-9]+' | grep -Eo '[0-9]+'"
result = subprocess.check_output(rocm_wavefront_size_cmd, shell=True)
```

### Changes

- `op_builder/builder.py`: Replace shell pipeline with list-based `subprocess.check_output()` and Python `re.search()` on the full output of `rocminfo` (no `shell=True`), also catch `FileNotFoundError` for robustness

### Testing

- Verified `shell=True` is fully removed from `op_builder/builder.py` (0 occurrences)
- Python syntax validation: PASS
- Both functions use list-based `subprocess.check_output([str(rocm_info)])`: confirmed
- Non-ROCm systems return expected defaults (empty string / "32")

---------

Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
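
The list-based approach described in the changes can be sketched roughly as follows. This is an illustration only, not the code merged into `op_builder/builder.py`: the helper names `read_rocminfo`, `parse_gpu_arch`, and `parse_wavefront_size` are hypothetical, and `SAMPLE` is an abbreviated, made-up stand-in for real `rocminfo` output. The regular expressions are assumed translations of the original `grep` patterns.

```python
import re
import subprocess
from pathlib import Path


def read_rocminfo(rocm_info: Path) -> str:
    """Run rocminfo directly (no shell) and return its stdout, or "" on failure."""
    try:
        # A list argument is executed as-is; no shell ever parses the command.
        raw = subprocess.check_output([str(rocm_info)])
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Non-ROCm systems: the binary is missing or exits with an error.
        return ""
    return raw.decode("utf-8", errors="replace")


def parse_gpu_arch(output: str) -> str:
    """Rough equivalent of: grep -o -m 1 'gfx.*' (first match, to end of line)."""
    match = re.search(r"gfx.*", output)
    return match.group(0).strip() if match else ""


def parse_wavefront_size(output: str, default: str = "32") -> str:
    """Rough equivalent of the two chained grep calls for the wavefront size."""
    match = re.search(r"Wavefront Size:\s+([0-9]+)", output)
    return match.group(1) if match else default


# Hypothetical, shortened rocminfo-style text used only for illustration.
SAMPLE = """
  Name:                    gfx90a
  Wavefront Size:          64(0x40)
"""
```

Separating the subprocess call from the parsing keeps the regular expressions testable on machines without ROCm: `parse_gpu_arch(read_rocminfo(path))` yields an empty string and `parse_wavefront_size(read_rocminfo(path))` yields `"32"` when `rocminfo` is unavailable, matching the defaults listed under Testing.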
instantraaamen committed
5ce3abbf2ae3d9c6a2f7b10b88ffe3f6455f85be
Parent: f2bb1ec
Committed by GitHub <noreply@github.com>
on 3/24/2026, 1:48:20 PM