mirror of
https://github.com/openai/codex.git
synced 2026-04-05 12:57:13 +00:00
# Why this PR exists This PR is trying to fix a coverage gap in the Windows Bazel Rust test lane. Before this change, the Windows `bazel test //...` job was nominally part of PR CI, but a non-trivial set of `//codex-rs/...` Rust test targets did not actually contribute test signal on Windows. In particular, targets such as `//codex-rs/core:core-unit-tests`, `//codex-rs/core:core-all-test`, and `//codex-rs/login:login-unit-tests` were incompatible during Bazel analysis on the Windows gnullvm platform, so they never reached test execution there. That is why the Cargo-powered Windows CI job could surface Windows-only failures that the Bazel-powered job did not report: Cargo was executing those tests, while Bazel was silently dropping them from the runnable target set. The main goal of this PR is to make the Windows Bazel test lane execute those Rust test targets instead of skipping them during analysis, while still preserving `windows-gnullvm` as the target configuration for the code under test. In other words: use an MSVC host/exec toolchain where Bazel helper binaries and build scripts need it, but continue compiling the actual crate targets with the Windows gnullvm cfgs that our current Bazel matrix is supposed to exercise. # Important scope note This branch intentionally removes the non-resource-loading `.rs` test and production-code changes from the earlier `codex/windows-bazel-rust-test-coverage` branch. The only Rust source changes kept here are runfiles/resource-loading fixes in TUI tests: - `codex-rs/tui/src/chatwidget/tests.rs` - `codex-rs/tui/tests/manager_dependency_regression.rs` That is deliberate. Since the corresponding tests already pass under Cargo, this PR is meant to test whether Bazel infrastructure/toolchain fixes alone are enough to get a healthy Windows Bazel test signal, without changing test behavior for Windows timing, shell output, or SQLite file-locking. # How this PR changes the Windows Bazel setup ## 1. Split Windows host/exec and target concerns in the Bazel test lane The core change is that the Windows Bazel test job now opts into an MSVC host platform for Bazel execution-time tools, but only for `bazel test`, not for the Bazel clippy build. Files: - `.github/workflows/bazel.yml` - `.github/scripts/run-bazel-ci.sh` - `MODULE.bazel` What changed: - `run-bazel-ci.sh` now accepts `--windows-msvc-host-platform`. - When that flag is present on Windows, the wrapper appends `--host_platform=//:local_windows_msvc` unless the caller already provided an explicit `--host_platform`. - `bazel.yml` passes that wrapper flag only for the Windows `bazel test //...` job. - The Bazel clippy job intentionally does **not** pass that flag, so clippy stays on the default Windows gnullvm host/exec path and continues linting against the target cfgs we care about. - `run-bazel-ci.sh` also now forwards `CODEX_JS_REPL_NODE_PATH` on Windows and normalizes the `node` executable path with `cygpath -w`, so tests that need Node resolve the runner's Node installation correctly under the Windows Bazel test environment. Why this helps: - The original incompatibility chain was mostly on the **exec/tool** side of the graph, not in the Rust test code itself. Moving host tools to MSVC lets Bazel resolve helper binaries and generators that were not viable on the gnullvm exec platform. - Keeping the target platform on gnullvm preserves cfg coverage for the crates under test, which is important because some Windows behavior differs between `msvc` and `gnullvm`. ## 2. Teach the repo's Bazel Rust macro about Windows link flags and integration-test knobs Files: - `defs.bzl` - `codex-rs/core/BUILD.bazel` - `codex-rs/otel/BUILD.bazel` - `codex-rs/tui/BUILD.bazel` What changed: - Replaced the old gnullvm-only linker flag block with `WINDOWS_RUSTC_LINK_FLAGS`, which now handles both Windows ABIs: - gnullvm gets `-C link-arg=-Wl,--stack,8388608` - MSVC gets `-C link-arg=/STACK:8388608`, `-C link-arg=/NODEFAULTLIB:libucrt.lib`, and `-C link-arg=ucrt.lib` - Threaded those Windows link flags into generated `rust_binary`, unit-test binaries, and integration-test binaries. - Extended `codex_rust_crate(...)` with: - `integration_test_args` - `integration_test_timeout` - Used those new knobs to: - mark `//codex-rs/core:core-all-test` as a long-running integration test - serialize `//codex-rs/otel:otel-all-test` with `--test-threads=1` - Added `src/**/*.rs` to `codex-rs/tui` test runfiles, because one regression test scans source files at runtime and Bazel does not expose source-tree directories unless they are declared as data. Why this helps: - Once host-side MSVC tools are available, we still need the generated Rust test binaries to link correctly on Windows. The MSVC-side stack/UCRT flags make those binaries behave more like their Cargo-built equivalents. - The integration-test macro knobs avoid hardcoding one-off test behavior in ad hoc BUILD rules and make the generated test targets more expressive where Bazel and Cargo have different runtime defaults. ## 3. Patch `rules_rs` / `rules_rust` so Windows MSVC exec-side Rust and build scripts are actually usable Files: - `MODULE.bazel` - `patches/rules_rs_windows_exec_linker.patch` - `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch` - `patches/rules_rust_windows_build_script_runner_paths.patch` - `patches/rules_rust_windows_exec_msvc_build_script_env.patch` - `patches/rules_rust_windows_msvc_direct_link_args.patch` - `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch` - `patches/BUILD.bazel` What these patches do: - `rules_rs_windows_exec_linker.patch` - Adds a `rust-lld` filegroup for Windows Rust toolchain repos, symlinked to `lld-link.exe` from `PATH`. - Marks Windows toolchains as using a direct linker driver. - Supplies Windows stdlib link flags for both gnullvm and MSVC. - `rules_rust_windows_bootstrap_process_wrapper_linker.patch` - For Windows MSVC Rust targets, prefers the Rust toolchain linker over an inherited C++ linker path like `clang++`. - This specifically avoids the broken mixed-mode command line where rustc emits MSVC-style `/NOLOGO` / `/LIBPATH:` / `/OUT:` arguments but Bazel still invokes `clang++.exe`. - `rules_rust_windows_build_script_runner_paths.patch` - Normalizes forward-slash execroot-relative paths into Windows path separators before joining them on Windows. - Uses short Windows paths for `RUSTC`, `OUT_DIR`, and the build-script working directory to avoid path-length and quoting issues in third-party build scripts. - Exposes `RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER=1` to build scripts so crate-local patches can detect "this is running under Bazel's build-script runner". - Fixes the Windows runfiles cleanup filter so generated files with retained suffixes are actually retained. - `rules_rust_windows_exec_msvc_build_script_env.patch` - For exec-side Windows MSVC build scripts, stops force-injecting Bazel's `CC`, `CXX`, `LD`, `CFLAGS`, and `CXXFLAGS` when that would send GNU-flavored tool paths/flags into MSVC-oriented Cargo build scripts. - Rewrites or strips GNU-only `--sysroot`, MinGW include/library paths, stack-protector, and `_FORTIFY_SOURCE` flags on the MSVC exec path. - The practical effect is that build scripts can fall back to the Visual Studio toolchain environment already exported by CI instead of crashing inside Bazel's hermetic `clang.exe` setup. - `rules_rust_windows_msvc_direct_link_args.patch` - When using a direct linker on Windows, stops forwarding GNU driver flags such as `-L...` and `--sysroot=...` that `lld-link.exe` does not understand. - Passes non-`.lib` native artifacts as explicit `-Clink-arg=<path>` entries when needed. - Filters C++ runtime libraries to `.lib` artifacts on the Windows direct-driver path. - `rules_rust_windows_process_wrapper_skip_temp_outputs.patch` - Excludes transient `*.tmp*` and `*.rcgu.o` files from process-wrapper dependency search-path consolidation, so unstable compiler outputs do not get treated as real link search-path inputs. Why this helps: - The host-platform split alone was not enough. Once Bazel started analyzing/running previously incompatible Rust tests on Windows, the next failures were in toolchain plumbing: - MSVC-targeted Rust tests were being linked through `clang++` with MSVC-style arguments. - Cargo build scripts running under Bazel's Windows MSVC exec platform were handed Unix/GNU-flavored path and flag shapes. - Some generated paths were too long or had path-separator forms that third-party Windows build scripts did not tolerate. - These patches make that mixed Bazel/Cargo/Rust/MSVC path workable enough for the test lane to actually build and run the affected crates. ## 4. Patch third-party crate build scripts that were not robust under Bazel's Windows MSVC build-script path Files: - `MODULE.bazel` - `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch` - `patches/ring_windows_msvc_include_dirs.patch` - `patches/zstd-sys_windows_msvc_include_dirs.patch` What changed: - `aws-lc-sys` - Detects Bazel's Windows MSVC build-script runner via `RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER` or a `bazel-out` manifest-dir path. - Uses `clang-cl` for Bazel Windows MSVC builds when no explicit `CC`/`CXX` is set. - Allows prebuilt NASM on the Bazel Windows MSVC path even when `nasm` is not available directly in the runner environment. - Avoids canonicalizing `CARGO_MANIFEST_DIR` in the Bazel Windows MSVC case, because that path may point into Bazel output/runfiles state where preserving the given path is more reliable than forcing a local filesystem canonicalization. - `ring` - Under the Bazel Windows MSVC build-script runner, copies the pregenerated source tree into `OUT_DIR` and uses that as the generated-source root. - Adds include paths needed by MSVC compilation for Fiat/curve25519/P-256 generated headers. - Rewrites a few relative includes in C sources so the added include directories are sufficient. - `zstd-sys` - Adds MSVC-only include directories for `compress`, `decompress`, and feature-gated dictionary/legacy/seekable sources. - Skips `-fvisibility=hidden` on MSVC targets, where that GCC/Clang-style flag is not the right mechanism. Why this helps: - After the `rules_rust` plumbing started running build scripts on the Windows MSVC exec path, some third-party crates still failed for crate-local reasons: wrong compiler choice, missing include directories, build-script assumptions about manifest paths, or Unix-only C compiler flags. - These crate patches address those crate-local assumptions so the larger toolchain change can actually reach first-party Rust test execution. ## 5. Keep the only `.rs` test changes to Bazel/Cargo runfiles parity Files: - `codex-rs/tui/src/chatwidget/tests.rs` - `codex-rs/tui/tests/manager_dependency_regression.rs` What changed: - Instead of asking `find_resource!` for a directory runfile like `src/chatwidget/snapshots` or `src`, these tests now resolve one known file runfile first and then walk to its parent directory. Why this helps: - Bazel runfiles are more reliable for explicitly declared files than for source-tree directories that happen to exist in a Cargo checkout. - This keeps the tests working under both Cargo and Bazel without changing their actual assertions. # What we tried before landing on this shape, and why those attempts did not work ## Attempt 1: Force `--host_platform=//:local_windows_msvc` for all Windows Bazel jobs This did make the previously incompatible test targets show up during analysis, but it also pushed the Bazel clippy job and some unrelated build actions onto the MSVC exec path. Why that was bad: - Windows clippy started running third-party Cargo build scripts with Bazel's MSVC exec settings and crashed in crates such as `tree-sitter` and `libsqlite3-sys`. - That was a regression in a job that was previously giving useful gnullvm-targeted lint signal. What this PR does instead: - The wrapper flag is opt-in, and `bazel.yml` uses it only for the Windows `bazel test` lane. - The clippy lane stays on the default Windows gnullvm host/exec configuration. ## Attempt 2: Broaden the `rules_rust` linker override to all Windows Rust actions This fixed the MSVC test-lane failure where normal `rust_test` targets were linked through `clang++` with MSVC-style arguments, but it broke the default gnullvm path. Why that was bad: - `@@rules_rs++rules_rust+rules_rust//util/process_wrapper:process_wrapper` on the gnullvm exec platform started linking with `lld-link.exe` and then failed to resolve MinGW-style libraries such as `-lkernel32`, `-luser32`, and `-lmingw32`. What this PR does instead: - The linker override is restricted to Windows MSVC targets only. - The gnullvm path keeps its original linker behavior, while MSVC uses the direct Windows linker. ## Attempt 3: Keep everything on pure Windows gnullvm and patch the V8 / Python incompatibility chain instead This would have preserved a single Windows ABI everywhere, but it is a much larger project than this PR. Why that was not the practical first step: - The original incompatibility chain ran through exec-side generators and helper tools, not only through crate code. - `third_party/v8` is already special-cased on Windows gnullvm because `rusty_v8` only publishes Windows prebuilts under MSVC names. - Fixing that path likely means deeper changes in V8/rules_python/rules_rust toolchain resolution and generator execution, not just one local CI flag. What this PR does instead: - Keep gnullvm for the target cfgs we want to exercise. - Move only the Windows test lane's host/exec platform to MSVC, then patch the build-script/linker boundary enough for that split configuration to work. ## Attempt 4: Validate compatibility with `bazel test --nobuild ...` This turned out to be a misleading local validation command. Why: - `bazel test --nobuild ...` can successfully analyze targets and then still exit 1 with "Couldn't start the build. Unable to run tests" because there are no runnable test actions after `--nobuild`. Better local check: ```powershell bazel build --nobuild --keep_going --host_platform=//:local_windows_msvc //codex-rs/login:login-unit-tests //codex-rs/core:core-unit-tests //codex-rs/core:core-all-test ``` # Which patches probably deserve upstream follow-up My rough take is that the `rules_rs` / `rules_rust` patches are the highest-value upstream candidates, because they are fixing generic Windows host/exec + MSVC direct-linker behavior rather than Codex-specific test logic. Strong upstream candidates: - `patches/rules_rs_windows_exec_linker.patch` - `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch` - `patches/rules_rust_windows_build_script_runner_paths.patch` - `patches/rules_rust_windows_exec_msvc_build_script_env.patch` - `patches/rules_rust_windows_msvc_direct_link_args.patch` - `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch` Why these seem upstreamable: - They address general-purpose problems in the Windows MSVC exec path: - missing direct-linker exposure for Rust toolchains - wrong linker selection when rustc emits MSVC-style args - Windows path normalization/short-path issues in the build-script runner - forwarding GNU-flavored CC/link flags into MSVC Cargo build scripts - unstable temp outputs polluting process-wrapper search-path state Potentially upstreamable crate patches, but likely with more care: - `patches/zstd-sys_windows_msvc_include_dirs.patch` - `patches/ring_windows_msvc_include_dirs.patch` - `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch` Notes on those: - The `zstd-sys` and `ring` include-path fixes look fairly generic for MSVC/Bazel build-script environments and may be straightforward to propose upstream after we confirm CI stability. - The `aws-lc-sys` patch is useful, but it includes a Bazel-specific environment probe and CI-specific compiler fallback behavior. That probably needs a cleaner upstream-facing shape before sending it out, so upstream maintainers are not forced to adopt Codex's exact CI assumptions. Probably not worth upstreaming as-is: - The repo-local Starlark/test target changes in `defs.bzl`, `codex-rs/*/BUILD.bazel`, and `.github/scripts/run-bazel-ci.sh` are mostly Codex-specific policy and CI wiring, not generic rules changes. # Validation notes for reviewers On this branch, I ran the following local checks after dropping the non-resource-loading Rust edits: ```powershell cargo test -p codex-tui just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc -- fix -p codex-tui python .\tools\argument-comment-lint\run-prebuilt-linter.py -p codex-tui just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc fmt ``` One local caveat: - `just argument-comment-lint` still fails on this Windows machine for an unrelated Bazel toolchain-resolution issue in `//codex-rs/exec:exec-all-test`, so I used the direct prebuilt linter for `codex-tui` as the local fallback. # Expected reviewer takeaway If this PR goes green, the important conclusion is that the Windows Bazel test coverage gap was primarily a Bazel host/exec toolchain problem, not a need to make the Rust tests themselves Windows-specific. That would be a strong signal that the deleted non-resource-loading Rust test edits from the earlier branch should stay out, and that future work should focus on upstreaming the generic `rules_rs` / `rules_rust` Windows fixes and reducing the crate-local patch surface.