With Intel Extension for PyTorch (IPEX) retiring, the XPU device is supported
natively by PyTorch 2.8+, and the dependency on IPEX is no longer needed.
This PR removes the IPEX dependency, adapts to the builder protocol in PyTorch
for XPU, and updates documentation and tests accordingly.
Note that after this update, DeepSpeed will no longer work with older
PyTorch+IPEX setups on XPU devices. We suggest users upgrade to the latest
PyTorch to get the latest XPU features.
This PR also removes InferenceBuilder; the kernels needed by
InferenceBuilder were supported through Intel Extension for PyTorch.
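As a sketch (not part of this PR's code), checking for a native XPU device on PyTorch 2.8+ no longer requires importing IPEX first; the helper below is hypothetical and guarded so it also runs where torch is absent:

```python
import importlib.util

def xpu_available() -> bool:
    # With PyTorch 2.8+, XPU support is native: no
    # `import intel_extension_for_pytorch` is needed before querying
    # the device. Guarded so this sketch also runs without torch.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return hasattr(torch, "xpu") and torch.xpu.is_available()

print(xpu_available())
```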
---------
Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
Co-authored-by: Olatunji Ruwase <tjruwase@gmail.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
Evoformer tests fail with the following error; we ignore them in the full test
for now.
```
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
```
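For reference, the fix the error message suggests looks like this minimal, stdlib-only sketch (unrelated to DeepSpeed's harness): use the `spawn` start method so the child process does not inherit an initialized CUDA context from a forked parent:

```python
import multiprocessing as mp

def worker(q):
    # A spawned child starts with a fresh interpreter, so initializing
    # CUDA here would be safe; a forked child would inherit the parent's
    # already-initialized CUDA state and raise the RuntimeError above.
    q.put("ok")

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q,))
    p.start()
    print(q.get())
    p.join()
```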
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
The full unit test workflow has been disabled for a while. This PR
migrates the full test suite to our AWS test infra.
To make the tests pass, we need to merge these PRs:
- #7786
- #7788
- #7789
- #7790
- #7793
- #7794
In addition to merging those PRs, this PR makes the following changes
to the full test workflow and test harness:
- Ignore flags for some known issues:
  - NVMe: requires an actual NVMe device; our CI currently doesn't have
    NVMe storage configured.
  - GDS: requires special kernel drivers and NVIDIA Magnum IO to enable
    direct GPU-to-storage transfers; CI instances don't have this
    configured.
  - ZenFlow: (1) Stage 3 bugs: the ZenFlow + ZeRO Stage 3 implementation
    has pre-existing bugs that cause internal pytest errors and worker
    crashes. (2) CUDA/fork incompatibility: test_zf_torch_adam.py uses
    torch.optim.AdamW, which performs CUDA graph capture checks that fail
    in forked processes (the `--forked` flag; we can simply move it to
    the sequential tests).
- `/mnt/aio` mount for async I/O tests
- CUTLASS installation for Evoformer tests
- Add `DS_DISABLE_REUSE_DIST_ENV` to the test harness to prevent worker
cleanup hangs
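As an illustrative sketch (only the environment variable name comes from this PR; the helper below is hypothetical), the guard might be read like:

```python
import os

def reuse_dist_env_enabled() -> bool:
    # When DS_DISABLE_REUSE_DIST_ENV=1, the harness tears down and
    # recreates the distributed environment for each test instead of
    # reusing it, which avoids hangs during worker cleanup at the cost
    # of some setup time.
    return os.environ.get("DS_DISABLE_REUSE_DIST_ENV", "0") != "1"
```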
Once we merge this PR, we will be able to run the full test manually or
at scheduled times.
---------
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
The V100 tests are not needed anymore, but their CI cron jobs still get
spun up even though the jobs are disabled; this PR prevents that.
The next step will be to remove the YAML files we no longer use or have
already ported.
The new CI workflows using AWS are not triggered when the path filters
don't match. However, PRs then keep "waiting for status to be reported"
because the workflows are marked as "required."
This PR always launches the workflow but skips the tests when the filter
doesn't match.
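One common way to implement this (a sketch; the actual workflow files, paths, and filters differ) is to always run the workflow and gate the test step on a path-filter output:

```yaml
# Sketch only: job names, paths, and the filter action are illustrative.
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
    steps:
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - 'deepspeed/**'
              - 'tests/**'
  unit-tests:
    needs: changes
    runs-on: ubuntu-latest
    steps:
      - name: Run tests (skipped when no relevant paths changed)
        if: needs.changes.outputs.src == 'true'
        run: pytest tests/unit
```

Gating at the step rather than the trigger level means the job always completes and reports a status, so a "required" check is no longer left pending when the filter doesn't match.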
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
This PR migrates CI workflows for unit tests to AWS. v1 tests use 4xL40S
and accelerate tests use 1xL40S.
@sfc-gh-truwase This looks to be working now. We could disable the Modal
tests after this PR is merged, or keep both for a while just in case.
---------
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
The SSL certificate of Intel's wheel server has expired. To unblock PRs,
trust `pytorch-extension.intel.com`.
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
Currently only one Modal CI job can run across all PRs, which is not
workable: all running jobs get cancelled when a new PR arrives or an
existing PR is updated. This PR fixes the concurrency-group dependency so
that group concurrency works across PRs and valuable resources are not
wasted.
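The usual fix (sketched here; the real group key may differ) is to scope the concurrency group to the ref, so a new run only cancels earlier runs of the same PR, not runs from other PRs:

```yaml
concurrency:
  # One group per workflow+branch/PR ref: a push to PR A no longer
  # cancels a running job for PR B.
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```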
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
Co-authored-by: Stas Bekman <stas.bekman@snowflake.com>
1. `modal-accelerate` now needs `uv` installed explicitly since the
image changed to the 2025 one.
2. Moved the accelerate repo cloning into the job, since the original way
was incorrect: it cached some accelerate version and never updated it.
3. Annotated how to actually test the CI when changing the workflow,
since `pull_request_target` will not run the updated .py/.yaml files.
---------
Signed-off-by: Stas Bekman <stas@stason.org>
The newly released NCCL finally started to use fp32 accumulation for
reduction ops!
* Floating point summation is always done in fp32 accumulators (with the
exception of fp8 on NVLS, where it uses fp16 inside the switch). Thus,
the accuracy with fp8 and fp16 data types should be much improved.
72d2432094
So we should change the fp32-comms default for SP to the same dtype as
the inputs when `nccl>=2.27.3`; the user can still override the default.
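The proposed default selection can be sketched as follows (the function and names are hypothetical, not DeepSpeed's actual API):

```python
def default_sp_comm_dtype(input_dtype: str, nccl_version: tuple) -> str:
    # With nccl>=2.27.3, floating point reductions accumulate in fp32
    # internally, so communicating in the input dtype no longer costs
    # accuracy; older NCCL keeps the safe fp32-comms default.
    if nccl_version >= (2, 27, 3):
        return input_dtype
    return "fp32"
```

For example, `default_sp_comm_dtype("bf16", (2, 27, 3))` would pick `"bf16"`, while `(2, 26, 5)` would fall back to `"fp32"`.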
---------
Signed-off-by: Stas Bekman <stas@stason.org>
Co-authored-by: Olatunji Ruwase <tjruwase@gmail.com>
pytest 8.4.0 seems to break a number of our tests. Rather than pinning
in each workflow individually, we should just pin it in the requirements
file until we resolve the issue.
---------
Co-authored-by: Olatunji Ruwase <tjruwase@gmail.com>
This is the DeepSpeed counterpart of
https://github.com/snowflakedb/ArcticTraining/pull/45, as the new
feature(s) require changes on both sides.
For PR reviewers:
Readiness status:
- [x] Code
- [x] Tests
- [ ] Docs - working on it
Features:
- [x] add support for delaying grad addition via
`param.ds_grad_is_ready` flag (used when performing tiled compute in an
autograd function)
- [x] add light sp-only mpu version (Jeff Rasley)
- [x] improved debug
- [x] added `all_gather_object` to `dist`
- [x] `UlyssesSPAttentionHF` (port of UlyssesAttention from
Megatron-Deepspeed plus modern MHA-variations)
- [x] `UlyssesSPDataLoaderAdapter` - DL adapter to shard the normal DL
batches to be used by `UlyssesSPAttentionHF`
- [x] `SequenceTiledCompute` - generic autograd function to perform
compute after tiling on the sequence dimension
- [x] `TiledMLP` - a specific autograd function to perform tiled MLP
(it's much easier to understand before trying to grok
`SequenceTiledCompute`)
- [x] added a differentiable `_DimZeroAllToAll` (Samyam Rajbhandari)
- [x] torch-dist-check now allows `torch.distributed.nn` (which is
needed since deepspeed's dist is not up to date with
`torch.distributed.nn`)
---------
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
Signed-off-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas.bekman@snowflake.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
These days fp16 is barely ever used, so we should be testing bf16
instead of fp16 where possible.
I had to fix a bunch of tests to adapt to this change, and fixed a few
bugs along the way.
---------
Signed-off-by: Stas Bekman <stas.bekman@snowflake.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
Co-authored-by: Stas Bekman <stas.bekman@snowflake.com>
This PR introduces *DeepCompile*, a new feature that efficiently
integrates compiler optimizations with other DeepSpeed features.
DeepCompile utilizes torch's dynamo to capture the computation graph and
modifies it to incorporate DeepSpeed’s optimizations seamlessly.
Currently, DeepCompile supports ZeRO-1 and ZeRO-3, with enhancements
such as proactive prefetching and selective unsharding to improve
performance.
(More details will be added later.)
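Conceptually (a minimal sketch assuming the standard `torch.compile` custom-backend hook; the real DeepCompile integration is far more involved), dynamo hands the backend the captured FX graph, which can be rewritten before a callable is returned:

```python
def deepcompile_like_backend(gm, example_inputs):
    # `gm` is the torch.fx.GraphModule dynamo captured. A real backend
    # in the spirit of DeepCompile would rewrite gm.graph here, e.g.
    # inserting proactive prefetch and selective unsharding around
    # parameter uses, before returning the (modified) forward callable.
    return gm.forward

# Usage (requires torch):
#   torch.compile(model, backend=deepcompile_like_backend)
```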
---------
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: zafarsadiq <zafarsadiq120@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Enhancing CI/nightly coverage for the Gaudi2 device.
Tests added:
- test_autotp_training.py
- test_ulysses.py
- test_linear::TestLoRALinear and test_linear::TestBasicLinear
- test_ctx::TestEngine
These provide coverage for model parallelism and the linear feature.
The tests are stable: 10/10 runs pass.
Adding the new tests is expected to increase CI time by 3-4 minutes and
nightly job time by 15 minutes.
Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai>
Unpin transformers version for all workflows except
`nv-torch-latest-v100`, which still has a tolerance issue with some
quantization tests.
Signed-off-by: Logan Adams <loadams@microsoft.com>
These jobs haven't been run in a long time and were originally used when
compatibility with torch <2 was more important.
Signed-off-by: Logan Adams <loadams@microsoft.com>
The latest transformers release causes failures in the cpu-torch-latest
test, so we pin it for now to unblock other PRs.
---------
Signed-off-by: Logan Adams <loadams@microsoft.com>
- Update existing workflows that use cu121 to cu124. Note, this means
that where we download the latest torch, we will now get torch 2.6
rather than the latest torch (2.5) provided with CUDA 12.1.
- Note, nv-nightly is currently failing in master due to unrelated
errors, so it can be ignored in this PR (nv-nightly was tested locally,
where it passes with both 12.1 and 12.4).
---------
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Signed-off-by: Omar Elayan <oelayan@habana.ai>
Co-authored-by: Fabien Dupont <fabiendupont@fabiendupont.fr>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Liangliang Ma <1906710196@qq.com>
Co-authored-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: Omar Elayan <142979319+oelayan7@users.noreply.github.com>