Blame: lib/std/process.zig - ziglang/zig


do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const std = @import("std.zig");`
migrate from `std.Target.current` to `@import("builtin").target` closes #9388 closes #9321 2021-10-04 23:47:27 -07:00			`const builtin = @import("builtin");`
tests passing on linux 2019-05-26 23:35:26 -04:00			`const fs = std.fs;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const mem = std.mem;`
clean up references to os 2019-05-26 13:17:34 -04:00			`const math = std.math;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const Allocator = mem.Allocator;`
			`const assert = std.debug.assert;`
			`const testing = std.testing;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const native_os = builtin.os.tag;`
			`const posix = std.posix;`
			`const windows = std.os.windows;`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const unicode = std.unicode;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`pub const Child = @import("process/Child.zig");`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`pub const abort = posix.abort;`
			`pub const exit = posix.exit;`
			`pub const changeCurDir = posix.chdir;`
Fix chdirC compile error 2025-01-17 00:22:13 -06:00			`pub const changeCurDirZ = posix.chdirZ;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00
			`pub const GetCwdError = posix.GetCwdError;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
tests passing on linux 2019-05-26 23:35:26 -04:00			/// The result is a slice of `out_buffer`, from index `0`.
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// On Windows, the result is encoded as [WTF-8](https://wtf-8.codeberg.page/).`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`/// On other platforms, the result is an opaque sequence of bytes with no particular encoding.`
std.process: Actually use explicit GetCwdError/GetCwdAllocError sets Also fix GetCwdAllocError to include only the set of possible errors. 2025-11-19 04:09:54 -08:00			`pub fn getCwd(out_buffer: []u8) GetCwdError![]u8 {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return posix.getcwd(out_buffer);`
tests passing on linux 2019-05-26 23:35:26 -04:00			`}`

std.process: Actually use explicit GetCwdError/GetCwdAllocError sets Also fix GetCwdAllocError to include only the set of possible errors. 2025-11-19 04:09:54 -08:00			`// Same as GetCwdError, minus error.NameTooLong + Allocator.Error`
			`pub const GetCwdAllocError = Allocator.Error || error{CurrentWorkingDirectoryUnlinked} || posix.UnexpectedError;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00
tests passing on linux 2019-05-26 23:35:26 -04:00			`/// Caller must free the returned memory.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// On Windows, the result is encoded as [WTF-8](https://wtf-8.codeberg.page/).`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`/// On other platforms, the result is an opaque sequence of bytes with no particular encoding.`
std.process: Actually use explicit GetCwdError/GetCwdAllocError sets Also fix GetCwdAllocError to include only the set of possible errors. 2025-11-19 04:09:54 -08:00			`pub fn getCwdAlloc(allocator: Allocator) GetCwdAllocError![]u8 {`
std: Convert deprecated aliases to compile errors and fix usages Deprecated aliases that are now compile errors: - `std.fs.MAX_PATH_BYTES` (renamed to `std.fs.max_path_bytes`) - `std.mem.tokenize` (split into `tokenizeAny`, `tokenizeSequence`, `tokenizeScalar`) - `std.mem.split` (split into `splitSequence`, `splitAny`, `splitScalar`) - `std.mem.splitBackwards` (split into `splitBackwardsSequence`, `splitBackwardsAny`, `splitBackwardsScalar`) - `std.unicode` + `utf16leToUtf8Alloc`, `utf16leToUtf8AllocZ`, `utf16leToUtf8`, `fmtUtf16le` (all renamed to have capitalized `Le`) + `utf8ToUtf16LeWithNull` (renamed to `utf8ToUtf16LeAllocZ`) - `std.zig.CrossTarget` (moved to `std.Target.Query`) Deprecated `lib/std/std.zig` decls were deleted instead of made a `@compileError` because the `refAllDecls` in the test block would trigger the `@compileError`. The deleted top-level `std` namespaces are: - `std.rand` (renamed to `std.Random`) - `std.TailQueue` (renamed to `std.DoublyLinkedList`) - `std.ChildProcess` (renamed/moved to `std.process.Child`) This is not exhaustive. Deprecated aliases that I didn't touch: + `std.io.` + `std.Build.` + `std.builtin.Mode` + `std.zig.c_translation.CIntLiteralRadix` + anything in `src/` 2024-05-02 20:20:41 -07:00			`// The use of max_path_bytes here is just a heuristic: most paths will fit`
In getCwdAlloc, geometrically allocate larger buffers to find an appropriate size. 2020-03-28 00:12:40 -05:00			`// in stack_buf, avoiding an extra allocation in the common case.`
std: Convert deprecated aliases to compile errors and fix usages Deprecated aliases that are now compile errors: - `std.fs.MAX_PATH_BYTES` (renamed to `std.fs.max_path_bytes`) - `std.mem.tokenize` (split into `tokenizeAny`, `tokenizeSequence`, `tokenizeScalar`) - `std.mem.split` (split into `splitSequence`, `splitAny`, `splitScalar`) - `std.mem.splitBackwards` (split into `splitBackwardsSequence`, `splitBackwardsAny`, `splitBackwardsScalar`) - `std.unicode` + `utf16leToUtf8Alloc`, `utf16leToUtf8AllocZ`, `utf16leToUtf8`, `fmtUtf16le` (all renamed to have capitalized `Le`) + `utf8ToUtf16LeWithNull` (renamed to `utf8ToUtf16LeAllocZ`) - `std.zig.CrossTarget` (moved to `std.Target.Query`) Deprecated `lib/std/std.zig` decls were deleted instead of made a `@compileError` because the `refAllDecls` in the test block would trigger the `@compileError`. The deleted top-level `std` namespaces are: - `std.rand` (renamed to `std.Random`) - `std.TailQueue` (renamed to `std.DoublyLinkedList`) - `std.ChildProcess` (renamed/moved to `std.process.Child`) This is not exhaustive. Deprecated aliases that I didn't touch: + `std.io.` + `std.Build.` + `std.builtin.Mode` + `std.zig.c_translation.CIntLiteralRadix` + anything in `src/` 2024-05-02 20:20:41 -07:00			`var stack_buf: [fs.max_path_bytes]u8 = undefined;`
In getCwdAlloc, geometrically allocate larger buffers to find an appropriate size. 2020-03-28 00:12:40 -05:00			`var heap_buf: ?[]u8 = null;`
			`defer if (heap_buf) |buf| allocator.free(buf);`

			`var current_buf: []u8 = &stack_buf;`
			`while (true) {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (posix.getcwd(current_buf)) |slice| {`
std.mem.dupe is deprecated, move all references in std Replaced all occurences of std.mem.dupe in stdlib with Allocator.dupe/std.mem.dupeZ -> Allocator.dupeZ 2020-07-04 13:44:28 +02:00			`return allocator.dupe(u8, slice);`
cleanups 2020-05-29 16:39:47 -04:00			`} else |err| switch (err) {`
In getCwdAlloc, geometrically allocate larger buffers to find an appropriate size. 2020-03-28 00:12:40 -05:00			`error.NameTooLong => {`
			`// The path is too long to fit in stack_buf. Allocate geometrically`
			`// increasing buffers until we find one that works`
			`const new_capacity = current_buf.len * 2;`
			`if (heap_buf) |buf| allocator.free(buf);`
			`current_buf = try allocator.alloc(u8, new_capacity);`
			`heap_buf = current_buf;`
			`},`
cleanups 2020-05-29 16:39:47 -04:00			`else => |e| return e,`
In getCwdAlloc, geometrically allocate larger buffers to find an appropriate size. 2020-03-28 00:12:40 -05:00			`}`
			`}`
tests passing on linux 2019-05-26 23:35:26 -04:00			`}`
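The retry loop in `getCwdAlloc` can be read in isolation as a "stack buffer first, then doubling heap buffers" pattern. The following is a minimal sketch of that pattern, not the standard library's code: `fill` is a hypothetical stand-in for `posix.getcwd`, `fetchAlloc` is an illustrative name, and the stack buffer is deliberately tiny so the growth path is exercised. The sketch also clears `heap_buf` before reallocating so that the deferred free cannot run on an already freed buffer if the allocation fails; that is a choice made for this sketch.

```zig
const std = @import("std");

/// Hypothetical producer standing in for `posix.getcwd`: copies `src` into
/// `buf`, or reports error.NameTooLong when `src` does not fit.
fn fill(buf: []u8, src: []const u8) error{NameTooLong}![]u8 {
    if (src.len > buf.len) return error.NameTooLong;
    @memcpy(buf[0..src.len], src);
    return buf[0..src.len];
}

/// Illustrative only: try a small stack buffer, then heap buffers that
/// double in size until `fill` succeeds. Caller frees the result.
fn fetchAlloc(allocator: std.mem.Allocator, src: []const u8) ![]u8 {
    var stack_buf: [8]u8 = undefined; // deliberately tiny for the demo
    var heap_buf: ?[]u8 = null;
    defer if (heap_buf) |buf| allocator.free(buf);

    var current: []u8 = &stack_buf;
    while (true) {
        if (fill(current, src)) |slice| {
            // Return an exactly sized copy; the scratch buffer is discarded.
            return allocator.dupe(u8, slice);
        } else |err| switch (err) {
            error.NameTooLong => {
                const new_capacity = current.len * 2;
                if (heap_buf) |buf| allocator.free(buf);
                heap_buf = null; // nothing left for the defer if alloc fails
                current = try allocator.alloc(u8, new_capacity);
                heap_buf = current;
            },
        }
    }
}
```

Doubling keeps the number of retries logarithmic in the final length, at the cost of one extra copy into the exactly sized result.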

CLI: finish updating module API usage Finish the work started in 4c4fb839972f66f55aa44fc0aca5f80b0608c731. Now the compiler compiles again. Wire up dependency tree fetching code in the CLI for `zig build`. Everything is hooked up except for `createDependenciesModule` is not yet implemented. 2023-10-06 21:29:08 -07:00			`test getCwdAlloc {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .wasi) return error.SkipZigTest;`
Add/fix missing WASI functionality to pass libstd tests This rather large commit adds/fixes missing WASI functionality in `libstd` needed to pass the `libstd` tests. As such, now by default tests targeting `wasm32-wasi` target are enabled in `test/tests.zig` module. However, they can be disabled by passing the `-Dskip-wasi=true` flag when invoking the `zig build test` command. When the flag is set to `false`, i.e., when WASI tests are included, `wasmtime` with `--dir=.` is used as the default testing command. Since the majority of `libstd` tests were relying on `fs.cwd()` call to get current working directory handle wrapped in `Dir` struct, in order to make the tests WASI-friendly, `fs.cwd()` call was replaced with `testing.getTestDir()` function which resolved to either `fs.cwd()` for non-WASI targets, or tries to fetch the preopen list from the WASI runtime and extract a preopen for '.' path. The summary of changes introduced by this commit: * implement `Dir.makeDir` and `Dir.openDir` targeting WASI * implement `Dir.deleteFile` and `Dir.deleteDir` targeting WASI * fix `os.close` and map errors in `unlinkat` * move WASI-specific `mkdirat` and `unlinkat` from `std.fs.wasi` to `std.os` module * implement `lseek_{SET, CUR, END}` targeting WASI * implement `futimens` targeting WASI * implement `ftruncate` targeting WASI * implement `readv`, `writev`, `pread{v}`, `pwrite{v}` targeting WASI * make sure ANSI escape codes are _not_ used in stderr or stdout in WASI, as WASI always sanitizes stderr, and sanitizes stdout if fd is a TTY * fix specifying WASI rights when opening/creating files/dirs * tweak `AtomicFile` to be WASI-compatible * implement `os.renameatWasi` for WASI-compliant `os.renameat` function * implement sleep() targeting WASI * fix `process.getEnvMap` targeting WASI 2020-05-05 17:23:49 +02:00
Switch a bunch of FBA to use testing.allocator 2020-01-31 19:06:50 -06:00			`const cwd = try getCwdAlloc(testing.allocator);`
			`testing.allocator.free(cwd);`
tests passing on linux 2019-05-26 23:35:26 -04:00			`}`
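A short usage sketch of the two entry points above, assuming this file is reachable as `std.process` in the Zig version at hand. `cwdMatches` is a hypothetical helper introduced only for illustration: it reads the working directory once into a caller-provided buffer and once into heap memory, then compares the two.

```zig
const std = @import("std");

/// Hypothetical helper: true when `getCwd` and `getCwdAlloc` report the
/// same current working directory.
fn cwdMatches(allocator: std.mem.Allocator) !bool {
    // Fixed buffer variant: the result is a slice of `buf`, from index 0.
    var buf: [std.fs.max_path_bytes]u8 = undefined;
    const from_buf = try std.process.getCwd(&buf);

    // Allocating variant: the caller owns and must free the returned memory.
    const from_heap = try std.process.getCwdAlloc(allocator);
    defer allocator.free(from_heap);

    return std.mem.eql(u8, from_buf, from_heap);
}
```

The fixed buffer variant can fail with `error.NameTooLong`, while the allocating variant handles that case internally by retrying with larger buffers.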

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub const EnvMap = struct {`
			`hash_map: HashMap,`

			`const HashMap = std.HashMap(`
			`[]const u8,`
			`[]const u8,`
			`EnvNameHashContext,`
			`std.hash_map.default_max_load_percentage,`
			`);`

some fixes to the EnvMap HashContext 2022-02-04 23:42:10 -07:00			`pub const Size = HashMap.Size;`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub const EnvNameHashContext = struct {`
add unicode support 2022-02-04 22:36:24 -07:00			`fn upcase(c: u21) u21 {`
			`if (c <= std.math.maxInt(u16))`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return windows.ntdll.RtlUpcaseUnicodeChar(@as(u16, @intCast(c)));`
add unicode support 2022-02-04 22:36:24 -07:00			`return c;`
			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn hash(self: @This(), s: []const u8) u64 {`
			`_ = self;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
some fixes to the EnvMap HashContext 2022-02-04 23:42:10 -07:00			`var h = std.hash.Wyhash.init(0);`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`var it = unicode.Wtf8View.initUnchecked(s).iterator();`
add unicode support 2022-02-04 22:36:24 -07:00			`while (it.nextCodepoint()) |cp| {`
			`const cp_upper = upcase(cp);`
			`h.update(&[_]u8{`
all: migrate code to new cast builtin syntax Most of this migration was performed automatically with `zig fmt`. There were a few exceptions which I had to manually fix: * `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten * `@truncate`'s fixup is incorrect for vectors * Test cases are not formatted, and their error locations change 2023-06-22 18:46:56 +01:00			`@as(u8, @intCast((cp_upper >> 16) & 0xff)),`
			`@as(u8, @intCast((cp_upper >> 8) & 0xff)),`
			`@as(u8, @intCast((cp_upper >> 0) & 0xff)),`
add unicode support 2022-02-04 22:36:24 -07:00			`});`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`}`
			`return h.final();`
			`}`
			`return std.hash_map.hashString(s);`
			`}`
add unicode support 2022-02-04 22:36:24 -07:00
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn eql(self: @This(), a: []const u8, b: []const u8) bool {`
			`_ = self;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`var it_a = unicode.Wtf8View.initUnchecked(a).iterator();`
			`var it_b = unicode.Wtf8View.initUnchecked(b).iterator();`
add unicode support 2022-02-04 22:36:24 -07:00			`while (true) {`
			`const c_a = it_a.nextCodepoint() orelse break;`
			`const c_b = it_b.nextCodepoint() orelse return false;`
			`if (upcase(c_a) != upcase(c_b))`
			`return false;`
			`}`
incorporate review changes from squeek 2022-02-06 23:30:06 -07:00			`return if (it_b.nextCodepoint()) |_| false else true;`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`}`
			`return std.hash_map.eqlString(a, b);`
			`}`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`};`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Create an EnvMap backed by a specific allocator.`
			`/// That allocator will be used for both backing allocations`
			`/// and string deduplication.`
			`pub fn init(allocator: Allocator) EnvMap {`
			`return EnvMap{ .hash_map = HashMap.init(allocator) };`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Free the backing storage of the map, as well as all`
			`/// of the stored keys and values.`
			`pub fn deinit(self: *EnvMap) void {`
			`var it = self.hash_map.iterator();`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`while (it.next()) |entry| {`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`self.free(entry.key_ptr.*);`
			`self.free(entry.value_ptr.*);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`self.hash_map.deinit();`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			/// Same as `put` but the key and value become owned by the EnvMap rather
			`/// than being copied.`
			/// If `putMove` fails, the ownership of key and value does not transfer.
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows `key` must be a valid [WTF-8](https://wtf-8.codeberg.page/) string.
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn putMove(self: *EnvMap, key: []u8, value: []u8) !void {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`assert(unicode.wtf8ValidateSlice(key));`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`const get_or_put = try self.hash_map.getOrPut(key);`
			`if (get_or_put.found_existing) {`
			`self.free(get_or_put.key_ptr.*);`
			`self.free(get_or_put.value_ptr.*);`
			`get_or_put.key_ptr.* = key;`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`get_or_put.value_ptr.* = value;`
			`}`

			/// `key` and `value` are copied into the EnvMap.
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows `key` must be a valid [WTF-8](https://wtf-8.codeberg.page/) string.
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn put(self: *EnvMap, key: []const u8, value: []const u8) !void {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`assert(unicode.wtf8ValidateSlice(key));`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`const value_copy = try self.copy(value);`
			`errdefer self.free(value_copy);`
			`const get_or_put = try self.hash_map.getOrPut(key);`
			`if (get_or_put.found_existing) {`
			`self.free(get_or_put.value_ptr.*);`
			`} else {`
			`get_or_put.key_ptr.* = self.copy(key) catch |err| {`
			`_ = self.hash_map.remove(key);`
			`return err;`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`};`
			`}`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`get_or_put.value_ptr.* = value_copy;`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Find the address of the value associated with a key.`
			`/// The returned pointer is invalidated if the map resizes.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows `key` must be a valid [WTF-8](https://wtf-8.codeberg.page/) string.
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn getPtr(self: EnvMap, key: []const u8) ?*[]const u8 {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`assert(unicode.wtf8ValidateSlice(key));`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`return self.hash_map.getPtr(key);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Return the map's copy of the value associated with`
			`/// a key. The returned string is invalidated if this`
			`/// key is removed from the map.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows `key` must be a valid [WTF-8](https://wtf-8.codeberg.page/) string.
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn get(self: EnvMap, key: []const u8) ?[]const u8 {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`assert(unicode.wtf8ValidateSlice(key));`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`return self.hash_map.get(key);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Removes the item from the map and frees its key and value.`
			`/// This invalidates the value returned by get() for this key.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows `key` must be a valid [WTF-8](https://wtf-8.codeberg.page/) string.
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`pub fn remove(self: *EnvMap, key: []const u8) void {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`assert(unicode.wtf8ValidateSlice(key));`
remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`const kv = self.hash_map.fetchRemove(key) orelse return;`
			`self.free(kv.key);`
			`self.free(kv.value);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Returns the number of KV pairs stored in the map.`
			`pub fn count(self: EnvMap) HashMap.Size {`
			`return self.hash_map.count();`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`/// Returns an iterator over entries in the map.`
			`pub fn iterator(self: *const EnvMap) HashMap.Iterator {`
			`return self.hash_map.iterator();`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

std.Build: don't force all children to inherit color option The build runner was previously forcing child processes to have their stderr colorization match the build runner by setting `CLICOLOR_FORCE` or `NO_COLOR`. This is a nice idea in some cases---for instance a simple `Run` step which we just expect to exit with code 0 and whose stderr is not being programmatically inspected---but is a bad idea in others, for instance if there is a check on stderr or if stderr is captured, in which case forcing color on the child could cause checks to fail. Instead, this commit adds a field to `std.Build.Step.Run` which specifies a behavior for the build runner to employ in terms of assigning the `CLICOLOR_FORCE` and `NO_COLOR` environment variables. The default behavior is to set `CLICOLOR_FORCE` if the build runner's output is colorized and the step's stderr is not captured, and to set `NO_COLOR` otherwise. Alternatively, colors can be always enabled, always disabled, always match the build runner, or the environment variables can be left untouched so they can be manually controlled through `env_map`. Notably, this fixes a failure when running `zig build test-cli` in a TTY (or with colors explicitly enabled). GitHub CI hadn't caught this because it does not request color, but Codeberg CI now does, and we were seeing a failure in the `zig init` test because the actual output had color escape codes in it due to 6d280dc. 2025-11-13 09:46:57 +00:00			/// Returns a full copy of `em` allocated with `gpa`, which is not necessarily
			/// the same allocator used to allocate `em`.
			`pub fn clone(em: *const EnvMap, gpa: Allocator) Allocator.Error!EnvMap {`
			`var new: EnvMap = .init(gpa);`
			`errdefer new.deinit();`
			`// Since we need to dupe the keys and values, the only way for error handling to not be a`
			`// nightmare is to add keys to an empty map one-by-one. This could be avoided if this`
			`// abstraction were a bit less... OOP-esque.`
			`try new.hash_map.ensureUnusedCapacity(em.hash_map.count());`
			`var it = em.hash_map.iterator();`
			`while (it.next()) |entry| {`
			`try new.put(entry.key_ptr.*, entry.value_ptr.*);`
			`}`
			`return new;`
			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`fn free(self: EnvMap, value: []const u8) void {`
			`self.hash_map.allocator.free(value);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

remove extra storage from EnvMap on windows 2022-02-04 12:08:38 -07:00			`fn copy(self: EnvMap, value: []const u8) ![]u8 {`
			`return self.hash_map.allocator.dupe(u8, value);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`
			`};`

std: promote tests to doctests Now these show up as "example usage" in generated documentation. 2024-03-13 15:56:09 -07:00			`test EnvMap {`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`var env = EnvMap.init(testing.allocator);`
			`defer env.deinit();`

			`try env.put("SOMETHING_NEW", "hello");`
			`try testing.expectEqualStrings("hello", env.get("SOMETHING_NEW").?);`
			`try testing.expectEqual(@as(EnvMap.Size, 1), env.count());`

			`// overwrite`
			`try env.put("SOMETHING_NEW", "something");`
			`try testing.expectEqualStrings("something", env.get("SOMETHING_NEW").?);`
			`try testing.expectEqual(@as(EnvMap.Size, 1), env.count());`

			`// a new longer name to test the Windows-specific conversion buffer`
			`try env.put("SOMETHING_NEW_AND_LONGER", "1");`
			`try testing.expectEqualStrings("1", env.get("SOMETHING_NEW_AND_LONGER").?);`
			`try testing.expectEqual(@as(EnvMap.Size, 2), env.count());`

			`// case insensitivity on Windows only`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`try testing.expectEqualStrings("1", env.get("something_New_aNd_LONGER").?);`
			`} else {`
			`try testing.expect(null == env.get("something_New_aNd_LONGER"));`
			`}`

			`var it = env.iterator();`
			`var count: EnvMap.Size = 0;`
			`while (it.next()) |entry| {`
some fixes to the EnvMap HashContext 2022-02-04 23:42:10 -07:00			`const is_an_expected_name = std.mem.eql(u8, "SOMETHING_NEW", entry.key_ptr.*) or std.mem.eql(u8, "SOMETHING_NEW_AND_LONGER", entry.key_ptr.*);`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`try testing.expect(is_an_expected_name);`
			`count += 1;`
			`}`
			`try testing.expectEqual(@as(EnvMap.Size, 2), count);`

			`env.remove("SOMETHING_NEW");`
			`try testing.expect(env.get("SOMETHING_NEW") == null);`

			`try testing.expectEqual(@as(EnvMap.Size, 1), env.count());`
incorporate review changes from squeek 2022-02-06 23:30:06 -07:00
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`// test Unicode case-insensitivity on Windows`
incorporate review changes from squeek 2022-02-06 23:30:06 -07:00			`try env.put("КИРиллИЦА", "something else");`
			`try testing.expectEqualStrings("something else", env.get("кириллица").?);`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00
			`// and WTF-8 that's not valid UTF-8`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const wtf8_with_surrogate_pair = try unicode.wtf16LeToWtf8Alloc(testing.allocator, &[_]u16{`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`std.mem.nativeToLittle(u16, 0xD83D), // unpaired high surrogate`
			`});`
			`defer testing.allocator.free(wtf8_with_surrogate_pair);`

			`try env.put(wtf8_with_surrogate_pair, wtf8_with_surrogate_pair);`
			`try testing.expectEqualSlices(u8, wtf8_with_surrogate_pair, env.get(wtf8_with_surrogate_pair).?);`
incorporate review changes from squeek 2022-02-06 23:30:06 -07:00			`}`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`}`

Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`pub const GetEnvMapError = error{`
			`OutOfMemory,`
			/// WASI-only. `environ_sizes_get` or `environ_get`
			`/// failed for an unexpected reason.`
			`Unexpected,`
			`};`

Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`/// Returns a snapshot of the environment variables of the current process.`
doc: fix typo in getEnvMap 2024-02-06 21:54:20 -05:00			`/// Any modifications to the resulting EnvMap will not be reflected in the environment, and`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`/// likewise, any future modifications to the environment will not be reflected in the EnvMap.`
			/// Caller owns resulting `EnvMap` and should call its `deinit` fn when done.
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`pub fn getEnvMap(allocator: Allocator) GetEnvMapError!EnvMap {`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`var result = EnvMap.init(allocator);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`errdefer result.deinit();`

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
			`const ptr = windows.peb().ProcessParameters.Environment;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
			`var i: usize = 0;`
improve handling of environment variables on Windows std.os.getenv and std.os.getenvZ have nice compile errors when not linking libc and using Windows. std.os.getenvW is provided as a Windows-only API that does not require an allocator. It uses the Process Environment Block. std.process.getEnvVarOwned is improved to be a simple wrapper on top of std.os.getenvW. std.process.getEnvMap is improved to use the Process Environment Block rather than calling GetEnvironmentVariableW. std.zig.system.NativePaths uses process.getEnvVarOwned instead of std.os.getenvZ, which works on Windows as well as POSIX. 2020-02-22 17:35:36 -05:00			`while (ptr[i] != 0) {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const key_start = i;`

Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`// There are some special environment variables that start with =,`
			`// so we need a special case to not treat = as a key/value separator`
			`// if it's the first character.`
			`// https://devblogs.microsoft.com/oldnewthing/20100506-00/?p=14133`
			`if (ptr[key_start] == '=') i += 1;`

do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`while (ptr[i] != 0 and ptr[i] != '=') : (i += 1) {}`
			`const key_w = ptr[key_start..i];`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const key = try unicode.wtf16LeToWtf8Alloc(allocator, key_w);`
reverse some of the now unneeded changes from squeek 2022-02-04 23:35:22 -07:00			`errdefer allocator.free(key);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
			`if (ptr[i] == '=') i += 1;`

			`const value_start = i;`
			`while (ptr[i] != 0) : (i += 1) {}`
			`const value_w = ptr[value_start..i];`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const value = try unicode.wtf16LeToWtf8Alloc(allocator, value_w);`
reverse some of the now unneeded changes from squeek 2022-02-04 23:35:22 -07:00			`errdefer allocator.free(value);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`i += 1; // skip over null byte`

reverse some of the now unneeded changes from squeek 2022-02-04 23:35:22 -07:00			`try result.putMove(key, value);`
			`}`
improve handling of environment variables on Windows std.os.getenv and std.os.getenvZ have nice compile errors when not linking libc and using Windows. std.os.getenvW is provided as a Windows-only API that does not require an allocator. It uses the Process Environment Block. std.process.getEnvVarOwned is improved to be a simple wrapper on top of std.os.getenvW. std.process.getEnvMap is improved to use the Process Environment Block rather than calling GetEnvironmentVariableW. std.zig.system.NativePaths uses process.getEnvVarOwned instead of std.os.getenvZ, which works on Windows as well as POSIX. 2020-02-22 17:35:36 -05:00			`return result;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`} else if (native_os == .wasi and !builtin.link_libc) {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`var environ_count: usize = undefined;`
			`var environ_buf_size: usize = undefined;`

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const environ_sizes_get_ret = std.os.wasi.environ_sizes_get(&environ_count, &environ_buf_size);`
std: [breaking] move errno to become an nonexhaustive enum The primary purpose of this change is to eliminate one usage of `usingnamespace` in the standard library - specifically the usage for errno values in `std.os.linux`. This is accomplished by truncating the `E` prefix from error values, and making errno a proper enum. A similar strategy can be used to eliminate some other `usingnamespace` sites in the std lib. 2021-08-23 17:06:56 -07:00			`if (environ_sizes_get_ret != .SUCCESS) {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return posix.unexpectedErrno(environ_sizes_get_ret);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

wasm: avoids allocating zero length buffers for args or env I was testing this with wazero, which defaults to not propagate any env variables. This ensures we don't try to allocate zero length buffers when there are no results from either function. Signed-off-by: Adrian Cole <adrian@tetrate.io> 2023-01-19 13:50:23 +08:00			`if (environ_count == 0) {`
			`return result;`
			`}`

lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const environ = try allocator.alloc([*:0]u8, environ_count);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`defer allocator.free(environ);`
lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const environ_buf = try allocator.alloc(u8, environ_buf_size);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`defer allocator.free(environ_buf);`

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const environ_get_ret = std.os.wasi.environ_get(environ.ptr, environ_buf.ptr);`
std: [breaking] move errno to become an nonexhaustive enum The primary purpose of this change is to eliminate one usage of `usingnamespace` in the standard library - specifically the usage for errno values in `std.os.linux`. This is accomplished by truncating the `E` prefix from error values, and making errno a proper enum. A similar strategy can be used to eliminate some other `usingnamespace` sites in the std lib. 2021-08-23 17:06:56 -07:00			`if (environ_get_ret != .SUCCESS) {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return posix.unexpectedErrno(environ_get_ret);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

			`for (environ) |env| {`
std lib API deprecations for the upcoming 0.9.0 release See #3811 2021-11-30 00:13:07 -07:00			`const pair = mem.sliceTo(env, 0);`
Update all std.mem.split calls to their appropriate function Everywhere that can now use `splitScalar` should get a nice little performance boost. 2023-05-04 18:15:50 -07:00			`var parts = mem.splitScalar(u8, pair, '=');`
std.mem: add `first` method to `SplitIterator` and `SplitBackwardsIterator` 2022-07-25 21:04:30 +02:00			`const key = parts.first();`
Fix bug in WASI envmap handling. 2022-12-29 17:38:19 -08:00			`const value = parts.rest();`
Breaking hash map changes for 0.8.0 - hash/eql functions moved into a Context object - Context functions pass an explicit context - Adapted functions pass specialized keys and contexts - new getPtr() function returns a pointer to value - remove functions renamed to fetchRemove - new remove functions return bool - removeAssertDiscard deleted, use assert(remove(...)) instead - Keys and values are stored in separate arrays - Entry is now {K, V}, the new KV is {K, V} - BufSet/BufMap functions renamed to match other set/map types - fixed iterating-while-modifying bug in src/link/C.zig 2021-06-03 15:39:26 -05:00			`try result.put(key, value);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
			`return result;`
update std lib to integrate with libc for environ closes #3511 2020-02-22 15:59:13 -05:00			`} else if (builtin.link_libc) {`
			`var ptr = std.c.environ;`
Sema: validate deref operator type and value 2022-06-30 17:22:16 +03:00			`while (ptr[0]) |line| : (ptr += 1) {`
update std lib to integrate with libc for environ closes #3511 2020-02-22 15:59:13 -05:00			`var line_i: usize = 0;`
			`while (line[line_i] != 0 and line[line_i] != '=') : (line_i += 1) {}`
			`const key = line[0..line_i];`

			`var end_i: usize = line_i;`
			`while (line[end_i] != 0) : (end_i += 1) {}`
			`const value = line[line_i + 1 .. end_i];`

Breaking hash map changes for 0.8.0 - hash/eql functions moved into a Context object - Context functions pass an explicit context - Adapted functions pass specialized keys and contexts - new getPtr() function returns a pointer to value - remove functions renamed to fetchRemove - new remove functions return bool - removeAssertDiscard deleted, use assert(remove(...)) instead - Keys and values are stored in separate arrays - Entry is now {K, V}, the new KV is {K, V} - BufSet/BufMap functions renamed to match other set/map types - fixed iterating-while-modifying bug in src/link/C.zig 2021-06-03 15:39:26 -05:00			`try result.put(key, value);`
update std lib to integrate with libc for environ closes #3511 2020-02-22 15:59:13 -05:00			`}`
			`return result;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`} else {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`for (std.os.environ) |line| {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`var line_i: usize = 0;`
update std lib to integrate with libc for environ closes #3511 2020-02-22 15:59:13 -05:00			`while (line[line_i] != 0 and line[line_i] != '=') : (line_i += 1) {}`
			`const key = line[0..line_i];`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
			`var end_i: usize = line_i;`
update std lib to integrate with libc for environ closes #3511 2020-02-22 15:59:13 -05:00			`while (line[end_i] != 0) : (end_i += 1) {}`
			`const value = line[line_i + 1 .. end_i];`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
Breaking hash map changes for 0.8.0 - hash/eql functions moved into a Context object - Context functions pass an explicit context - Adapted functions pass specialized keys and contexts - new getPtr() function returns a pointer to value - remove functions renamed to fetchRemove - new remove functions return bool - removeAssertDiscard deleted, use assert(remove(...)) instead - Keys and values are stored in separate arrays - Entry is now {K, V}, the new KV is {K, V} - BufSet/BufMap functions renamed to match other set/map types - fixed iterating-while-modifying bug in src/link/C.zig 2021-06-03 15:39:26 -05:00			`try result.put(key, value);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
			`return result;`
			`}`
			`}`
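
The Windows branch of `getEnvMap` above walks a block of NUL-terminated `KEY=VALUE` entries and treats a leading `=` as part of the key. The following is a minimal sketch of that scanning step only, under stated assumptions: it works on a plain `[]const u8` instead of the WTF-16 environment block, and `Pair` and `nextPair` are hypothetical names that do not exist in `std.process`.

```zig
const std = @import("std");

// Hypothetical result type for this sketch; not part of std.process.
const Pair = struct { key: []const u8, value: []const u8 };

/// Hypothetical helper: reads one KEY=VALUE entry starting at `i.*` and
/// advances `i.*` past the entry's NUL terminator. Returns null once the
/// empty entry that ends the block is reached.
fn nextPair(block: []const u8, i: *usize) ?Pair {
    if (i.* >= block.len or block[i.*] == 0) return null;

    const key_start = i.*;
    // A '=' in the first position belongs to the key (for example "=C:"),
    // so step over it before searching for the separator.
    if (block[i.*] == '=') i.* += 1;
    while (i.* < block.len and block[i.*] != 0 and block[i.*] != '=') : (i.* += 1) {}
    const key = block[key_start..i.*];

    // Skip the separator if there is one.
    if (i.* < block.len and block[i.*] == '=') i.* += 1;

    const value_start = i.*;
    while (i.* < block.len and block[i.*] != 0) : (i.* += 1) {}
    const value = block[value_start..i.*];

    // Skip the NUL that terminates this entry.
    if (i.* < block.len) i.* += 1;

    return Pair{ .key = key, .value = value };
}

test "nextPair keeps a leading '=' in the key" {
    const block: []const u8 = "=C:=C:\\dir\x00PATH=/bin\x00\x00";
    var i: usize = 0;

    const first = nextPair(block, &i).?;
    try std.testing.expectEqualStrings("=C:", first.key);
    try std.testing.expectEqualStrings("C:\\dir", first.value);

    const second = nextPair(block, &i).?;
    try std.testing.expectEqualStrings("PATH", second.key);
    try std.testing.expectEqualStrings("/bin", second.value);

    try std.testing.expect(nextPair(block, &i) == null);
}
```

Unlike the original loop, the sketch bounds-checks every index, because it takes a slice rather than a sentinel-terminated pointer.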

std: promote tests to doctests Now these show up as "example usage" in generated documentation. 2024-03-13 15:56:09 -07:00			`test getEnvMap {`
Add `process.EnvMap`, a platform-independent environment variable map EnvMap provides the same API as the previously used BufMap (besides `putMove` and `getPtr`), so usage sites of `getEnvMap` can usually remain unchanged. For non-Windows, EnvMap is a wrapper around BufMap. On Windows, it uses a new EnvMapWindows to handle some Windows-specific behavior: - Lookups use Unicode-aware case insensitivity (but `get` cannot return an error because EnvMapWindows has an internal buffer to use for lookup conversions) - Canonical names are returned when iterating the EnvMap Fixes #10561, closes #4603 2022-01-16 20:11:08 -08:00			`var env = try getEnvMap(testing.allocator);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`defer env.deinit();`
			`}`
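
The doc comment on `getEnvMap` says the result is a snapshot: changes to the map do not reach the process environment. A small sketch of what that means in practice follows. It assumes that the variable name `EXAMPLE_ONLY_IN_SNAPSHOT` (made up for this example) is not set in the environment the test runs in.

```zig
const std = @import("std");

test "changes to the EnvMap snapshot stay in the snapshot" {
    var env = try std.process.getEnvMap(std.testing.allocator);
    defer env.deinit();

    // Assumption: this made-up variable is not set in the real environment.
    try env.put("EXAMPLE_ONLY_IN_SNAPSHOT", "1");
    try std.testing.expectEqualStrings("1", env.get("EXAMPLE_ONLY_IN_SNAPSHOT").?);

    // The process environment itself was not modified by `put`.
    try std.testing.expectError(
        error.EnvironmentVariableNotFound,
        std.process.getEnvVarOwned(std.testing.allocator, "EXAMPLE_ONLY_IN_SNAPSHOT"),
    );
}
```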

			`pub const GetEnvVarOwnedError = error{`
			`OutOfMemory,`
			`EnvironmentVariableNotFound,`

Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`/// On Windows, environment variable keys provided by the user must be valid WTF-8.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// https://wtf-8.codeberg.page/`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`InvalidWtf8,`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`};`

			`/// Caller must free returned memory.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows, if `key` is not valid [WTF-8](https://wtf-8.codeberg.page/),
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			/// then `error.InvalidWtf8` is returned.
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// On Windows, the value is encoded as [WTF-8](https://wtf-8.codeberg.page/).`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`/// On other platforms, the value is an opaque sequence of bytes with no particular encoding.`
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn getEnvVarOwned(allocator: Allocator, key: []const u8) GetEnvVarOwnedError![]u8 {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
improve handling of environment variables on Windows std.os.getenv and std.os.getenvZ have nice compile errors when not linking libc and using Windows. std.os.getenvW is provided as a Windows-only API that does not require an allocator. It uses the Process Environment Block. std.process.getEnvVarOwned is improved to be a simple wrapper on top of std.os.getenvW. std.process.getEnvMap is improved to use the Process Environment Block rather than calling GetEnvironmentVariableW. std.zig.system.NativePaths uses process.getEnvVarOwned instead of std.os.getenvZ, which works on Windows as well as POSIX. 2020-02-22 17:35:36 -05:00			`const result_w = blk: {`
Use stack fallback allocator to usually avoid extra heap allocation in getEnvVarOwned 2024-02-09 19:28:57 -08:00			`var stack_alloc = std.heap.stackFallback(256 * @sizeOf(u16), allocator);`
			`const stack_allocator = stack_alloc.get();`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const key_w = try unicode.wtf8ToWtf16LeAllocZ(stack_allocator, key);`
Use stack fallback allocator to usually avoid extra heap allocation in getEnvVarOwned 2024-02-09 19:28:57 -08:00			`defer stack_allocator.free(key_w);`
improve handling of environment variables on Windows std.os.getenv and std.os.getenvZ have nice compile errors when not linking libc and using Windows. std.os.getenvW is provided as a Windows-only API that does not require an allocator. It uses the Process Environment Block. std.process.getEnvVarOwned is improved to be a simple wrapper on top of std.os.getenvW. std.process.getEnvMap is improved to use the Process Environment Block rather than calling GetEnvironmentVariableW. std.zig.system.NativePaths uses process.getEnvVarOwned instead of std.os.getenvZ, which works on Windows as well as POSIX. 2020-02-22 17:35:36 -05:00
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`break :blk getenvW(key_w) orelse return error.EnvironmentVariableNotFound;`
improve handling of environment variables on Windows std.os.getenv and std.os.getenvZ have nice compile errors when not linking libc and using Windows. std.os.getenvW is provided as a Windows-only API that does not require an allocator. It uses the Process Environment Block. std.process.getEnvVarOwned is improved to be a simple wrapper on top of std.os.getenvW. std.process.getEnvMap is improved to use the Process Environment Block rather than calling GetEnvironmentVariableW. std.zig.system.NativePaths uses process.getEnvVarOwned instead of std.os.getenvZ, which works on Windows as well as POSIX. 2020-02-22 17:35:36 -05:00			`};`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`// wtf16LeToWtf8Alloc can only fail with OutOfMemory`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`return unicode.wtf16LeToWtf8Alloc(allocator, result_w);`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`} else if (native_os == .wasi and !builtin.link_libc) {`
Implement some more environment functions for WASI. 2023-01-06 08:40:16 -08:00			`var envmap = getEnvMap(allocator) catch return error.OutOfMemory;`
			`defer envmap.deinit();`
			`const val = envmap.get(key) orelse return error.EnvironmentVariableNotFound;`
			`return allocator.dupe(u8, val);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`} else {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const result = posix.getenv(key) orelse return error.EnvironmentVariableNotFound;`
std.mem.dupe is deprecated, move all references in std Replaced all occurences of std.mem.dupe in stdlib with Allocator.dupe/std.mem.dupeZ -> Allocator.dupeZ 2020-07-04 13:44:28 +02:00			`return allocator.dupe(u8, result);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
			`}`
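
Because `getEnvVarOwned` reports a missing variable as `error.EnvironmentVariableNotFound`, callers often want a fallback value. The helper below is one possible sketch. `envOrDefault` is a hypothetical name, not part of `std.process`, and the variable name in the test is assumed to be unset.

```zig
const std = @import("std");

/// Hypothetical helper: returns an owned copy of the variable's value, or an
/// owned copy of `default` when the variable is not set. The caller frees the
/// result in both cases.
fn envOrDefault(allocator: std.mem.Allocator, key: []const u8, default: []const u8) ![]u8 {
    return std.process.getEnvVarOwned(allocator, key) catch |err| switch (err) {
        error.EnvironmentVariableNotFound => try allocator.dupe(u8, default),
        // OutOfMemory and InvalidWtf8 are passed on unchanged.
        else => |e| return e,
    };
}

test "envOrDefault falls back when the variable is unset" {
    // Assumption: this made-up variable is not set in the real environment.
    const value = try envOrDefault(std.testing.allocator, "EXAMPLE_SURELY_UNSET_VARIABLE", "fallback");
    defer std.testing.allocator.free(value);
    try std.testing.expectEqualStrings("fallback", value);
}
```

Duplicating the default keeps ownership uniform, so the caller can always free the returned slice.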

std.process: Allow WTF-8 in env var functions with comptime-known keys 2025-03-17 17:24:00 -07:00			/// On Windows, `key` must be valid WTF-8.
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`pub fn hasEnvVarConstant(comptime key: []const u8) bool {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
std.process: Allow WTF-8 in env var functions with comptime-known keys 2025-03-17 17:24:00 -07:00			`const key_w = comptime unicode.wtf8ToWtf16LeStringLiteral(key);`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return getenvW(key_w) != null;`
			`} else if (native_os == .wasi and !builtin.link_libc) {`
Implement some more environment functions for WASI. 2023-01-06 08:40:16 -08:00			`@compileError("hasEnvVarConstant is not supported for WASI without libc");`
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`} else {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return posix.getenv(key) != null;`
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`}`
			`}`

std.process: Allow WTF-8 in env var functions with comptime-known keys 2025-03-17 17:24:00 -07:00			/// On Windows, `key` must be valid WTF-8.
std.process: adding hasNonEmptyEnvVar() and using for NO_COLOR 2025-02-04 21:19:02 -08:00			`pub fn hasNonEmptyEnvVarConstant(comptime key: []const u8) bool {`
			`if (native_os == .windows) {`
std.process: Allow WTF-8 in env var functions with comptime-known keys 2025-03-17 17:24:00 -07:00			`const key_w = comptime unicode.wtf8ToWtf16LeStringLiteral(key);`
std.process: adding hasNonEmptyEnvVar() and using for NO_COLOR 2025-02-04 21:19:02 -08:00			`const value = getenvW(key_w) orelse return false;`
			`return value.len != 0;`
			`} else if (native_os == .wasi and !builtin.link_libc) {`
			`@compileError("hasNonEmptyEnvVarConstant is not supported for WASI without libc");`
			`} else {`
			`const value = posix.getenv(key) orelse return false;`
			`return value.len != 0;`
			`}`
			`}`

std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`pub const ParseEnvVarIntError = std.fmt.ParseIntError || error{EnvironmentVariableNotFound};`

			`/// Parses an environment variable as an integer.`
			`///`
			`/// Since the key is comptime-known, no allocation is needed.`
			`///`
std.process: Allow WTF-8 in env var functions with comptime-known keys 2025-03-17 17:24:00 -07:00			/// On Windows, `key` must be valid WTF-8.
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`pub fn parseEnvVarInt(comptime key: []const u8, comptime I: type, base: u8) ParseEnvVarIntError!I {`
			`if (native_os == .windows) {`
std.process: Allow WTF-8 in env var functions with comptime-known keys 2025-03-17 17:24:00 -07:00			`const key_w = comptime std.unicode.wtf8ToWtf16LeStringLiteral(key);`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`const text = getenvW(key_w) orelse return error.EnvironmentVariableNotFound;`
Progress: fix compile errors on windows Works for `zig build-exe`, IPC still not implemented yet. 2024-05-26 07:07:44 -04:00			`return std.fmt.parseIntWithGenericCharacter(I, u16, text, base);`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`} else if (native_os == .wasi and !builtin.link_libc) {`
			`@compileError("parseEnvVarInt is not supported for WASI without libc");`
			`} else {`
			`const text = posix.getenv(key) orelse return error.EnvironmentVariableNotFound;`
			`return std.fmt.parseInt(I, text, base);`
			`}`
			`}`
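A minimal usage sketch for `parseEnvVarInt`, assuming the signature shown above. The helper name `jobCount` and the variable name `MYAPP_JOBS` are hypothetical and chosen only for illustration. The call compiles on targets other than WASI without libc, where the function is a compile error.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): reads a worker count from the
/// illustrative environment variable `MYAPP_JOBS`, falling back to
/// `default` when the variable is unset or does not parse as a base-10 u32.
fn jobCount(default: u32) u32 {
    return std.process.parseEnvVarInt("MYAPP_JOBS", u32, 10) catch |err| switch (err) {
        // The variable is not present in the environment.
        error.EnvironmentVariableNotFound => default,
        // The variable is present but is not a valid u32.
        error.Overflow, error.InvalidCharacter => default,
    };
}
```

Because the key is comptime-known, no allocator is involved. The explicit `switch` is only there to show the three members of `ParseEnvVarIntError`; `catch default` would behave the same way.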

Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`pub const HasEnvVarError = error{`
			`OutOfMemory,`

			`/// On Windows, environment variable keys provided by the user must be valid WTF-8.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// https://wtf-8.codeberg.page/`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`InvalidWtf8,`
			`};`

replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows, if `key` is not valid [WTF-8](https://wtf-8.codeberg.page/),
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			/// then `error.InvalidWtf8` is returned.
			`pub fn hasEnvVar(allocator: Allocator, key: []const u8) HasEnvVarError!bool {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`var stack_alloc = std.heap.stackFallback(256 * @sizeOf(u16), allocator);`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`const stack_allocator = stack_alloc.get();`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const key_w = try unicode.wtf8ToWtf16LeAllocZ(stack_allocator, key);`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`defer stack_allocator.free(key_w);`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return getenvW(key_w) != null;`
			`} else if (native_os == .wasi and !builtin.link_libc) {`
Implement some more environment functions for WASI. 2023-01-06 08:40:16 -08:00			`var envmap = getEnvMap(allocator) catch return error.OutOfMemory;`
			`defer envmap.deinit();`
			`return envmap.getPtr(key) != null;`
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`} else {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return posix.getenv(key) != null;`
			`}`
			`}`

replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			/// On Windows, if `key` is not valid [WTF-8](https://wtf-8.codeberg.page/),
std.process: adding hasNonEmptyEnvVar() and using for NO_COLOR 2025-02-04 21:19:02 -08:00			/// then `error.InvalidWtf8` is returned.
			`pub fn hasNonEmptyEnvVar(allocator: Allocator, key: []const u8) HasEnvVarError!bool {`
			`if (native_os == .windows) {`
			`var stack_alloc = std.heap.stackFallback(256 * @sizeOf(u16), allocator);`
			`const stack_allocator = stack_alloc.get();`
			`const key_w = try unicode.wtf8ToWtf16LeAllocZ(stack_allocator, key);`
			`defer stack_allocator.free(key_w);`
			`const value = getenvW(key_w) orelse return false;`
			`return value.len != 0;`
			`} else if (native_os == .wasi and !builtin.link_libc) {`
			`var envmap = getEnvMap(allocator) catch return error.OutOfMemory;`
			`defer envmap.deinit();`
			`const value = envmap.getPtr(key) orelse return false;`
			`return value.len != 0;`
			`} else {`
			`const value = posix.getenv(key) orelse return false;`
			`return value.len != 0;`
			`}`
			`}`
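The difference between the comptime-key and runtime-key variants can be sketched as follows, assuming the signatures shown above. `colorDisabled` and `isNonEmpty` are hypothetical helper names, and the `NO_COLOR` check mirrors the use case named in the commit messages rather than any specific call site.

```zig
const std = @import("std");

/// Hypothetical helper: true when the NO_COLOR variable is present and
/// non-empty. The key is a literal, so the comptime variant applies and
/// no allocator is needed (not available on WASI without libc).
fn colorDisabled() bool {
    return std.process.hasNonEmptyEnvVarConstant("NO_COLOR");
}

/// Hypothetical helper: same check for a key known only at runtime.
/// The allocator is used on Windows for the WTF-8 to WTF-16 key conversion
/// and on WASI without libc for building the environment map.
/// Allocation failure and invalid WTF-8 keys are both treated as "not set".
fn isNonEmpty(gpa: std.mem.Allocator, key: []const u8) bool {
    return std.process.hasNonEmptyEnvVar(gpa, key) catch false;
}
```

Collapsing the error cases to `false` is a design choice for this sketch; callers that need to distinguish `error.InvalidWtf8` from an unset variable should propagate the error instead.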

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`/// Windows-only. Get an environment variable with a null-terminated, WTF-16 encoded name.`
process.getenvW: Document that returned memory points to the PEB 2025-11-16 04:07:48 -08:00			`/// The returned slice points to memory in the PEB.`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`///`
			`/// This function performs a Unicode-aware case-insensitive lookup using RtlEqualUnicodeString.`
			`///`
			`/// See also:`
			/// * `std.posix.getenv`
			/// * `getEnvMap`
			/// * `getEnvVarOwned`
			/// * `hasEnvVarConstant`
			/// * `hasEnvVar`
			`pub fn getenvW(key: [*:0]const u16) ?[:0]const u16 {`
			`if (native_os != .windows) {`
			`@compileError("Windows-only");`
			`}`
			`const key_slice = mem.sliceTo(key, 0);`
getenvW: Take advantage of sliceTo/indexOfScalarPos optimizations Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall. Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW. 2025-03-16 17:37:31 -07:00			`// '=' anywhere but the start makes this an invalid environment variable name`
			`if (key_slice.len > 0 and std.mem.indexOfScalar(u16, key_slice[1..], '=') != null) {`
			`return null;`
			`}`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const ptr = windows.peb().ProcessParameters.Environment;`
			`var i: usize = 0;`
			`while (ptr[i] != 0) {`
getenvW: Take advantage of sliceTo/indexOfScalarPos optimizations Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall. Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW. 2025-03-16 17:37:31 -07:00			`const key_value = mem.sliceTo(ptr[i..], 0);`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00
			`// There are some special environment variables that start with =,`
			`// so we need a special case to not treat = as a key/value separator`
			`// if it's the first character.`
			`// https://devblogs.microsoft.com/oldnewthing/20100506-00/?p=14133`
getenvW: Take advantage of sliceTo/indexOfScalarPos optimizations Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall. Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW. 2025-03-16 17:37:31 -07:00			`const equal_search_start: usize = if (key_value[0] == '=') 1 else 0;`
			`const equal_index = std.mem.indexOfScalarPos(u16, key_value, equal_search_start, '=') orelse {`
			`// This is enforced by CreateProcess.`
			`// If violated, CreateProcess will fail with INVALID_PARAMETER.`
			`unreachable; // must contain a =`
			`};`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00
getenvW: Take advantage of sliceTo/indexOfScalarPos optimizations Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall. Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW. 2025-03-16 17:37:31 -07:00			`const this_key = key_value[0..equal_index];`
windows.eqlIgnoreCaseWTF16 -> eqlIgnoreCaseWtf16 Consistent with naming of other, similar functions 2025-11-16 03:45:38 -08:00			`if (windows.eqlIgnoreCaseWtf16(key_slice, this_key)) {`
getenvW: Take advantage of sliceTo/indexOfScalarPos optimizations Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall. Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW. 2025-03-16 17:37:31 -07:00			`return key_value[equal_index + 1 ..];`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`}`

getenvW: Take advantage of sliceTo/indexOfScalarPos optimizations Both sliceTo and indexOfScalarPos use SIMD when available to speed up the search. On my x86_64 machine, this leads to getenvW being around 2-3x faster overall. Additionally, any future improvements to sliceTo/indexOfScalarPos will benefit getenvW. 2025-03-16 17:37:31 -07:00			`// skip past the NUL terminator`
			`i += key_value.len + 1;`
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`}`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return null;`
Add support for NO_COLOR 2021-06-21 13:47:38 -05:00			`}`
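The `=` handling inside the loop above can be isolated into a small sketch. `Entry` and `splitEntry` are hypothetical names that are not part of `std.process`, and the sketch works on `u8` rather than WTF-16 `u16` so that it runs on any target. Unlike `getenvW`, which treats a missing `=` as unreachable, the sketch returns `null` for malformed input.

```zig
const std = @import("std");

const Entry = struct { key: []const u8, value: []const u8 };

/// Hypothetical helper: splits one `KEY=VALUE` environment entry.
/// A leading '=' belongs to the key (Windows uses names such as `=C:`),
/// so the search for the separator starts at index 1 in that case.
fn splitEntry(entry: []const u8) ?Entry {
    if (entry.len == 0) return null;
    const search_start: usize = if (entry[0] == '=') 1 else 0;
    const eq = std.mem.indexOfScalarPos(u8, entry, search_start, '=') orelse return null;
    return .{ .key = entry[0..eq], .value = entry[eq + 1 ..] };
}

test splitEntry {
    const plain = splitEntry("PATH=C:\\bin").?;
    try std.testing.expectEqualStrings("PATH", plain.key);
    try std.testing.expectEqualStrings("C:\\bin", plain.value);

    // Special variable whose name starts with '='.
    const special = splitEntry("=C:=C:\\work").?;
    try std.testing.expectEqualStrings("=C:", special.key);
    try std.testing.expectEqualStrings("C:\\work", special.value);
}
```

The same rule explains the early return near the top of `getenvW`: a lookup key that contains `=` anywhere after its first character can never match an entry name, so the function returns `null` without scanning the environment block.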

Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`test getEnvVarOwned {`
			`try testing.expectError(`
			`error.EnvironmentVariableNotFound,`
			`getEnvVarOwned(std.testing.allocator, "BADENV"),`
			`);`
			`}`

			`test hasEnvVarConstant {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .wasi and !builtin.link_libc) return error.SkipZigTest;`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00
			`try testing.expect(!hasEnvVarConstant("BADENV"));`
			`}`

			`test hasEnvVar {`
			`const has_env = try hasEnvVar(std.testing.allocator, "BADENV");`
			`try testing.expect(!has_env);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

			`pub const ArgIteratorPosix = struct {`
			`index: usize,`
			`count: usize,`

Full response file (.rsp) support I hit the "quotes in an RSP file" issue when trying to compile gRPC using "zig cc". As a fun exercise, I decided to see if I could fix it myself. I'm fully open to this code being flat-out rejected. Or I can take feedback to fix it up. This modifies (and renames) _ArgIteratorWindows_ in process.zig such that it works with arbitrary strings (or the contents of an RSP file). In main.zig, this new _ArgIteratorGeneral_ is used to address the "TODO" listed in _ClangArgIterator_. This change closes #4833. Pros:* - It has the nice attribute of handling "RSP file" arguments in the same way it handles "cmd_line" arguments. - High Performance, minimal allocations - Fixed bug in previous _ArgIteratorWindows_, where final trailing backslashes in a command line were entirely dropped - Added a test case for the above bug - Harmonized the _ArgIteratorXxxx._initWithAllocator()_ and _next()_ interface across Windows/Posix/Wasi (Moved Windows errors to _initWithAllocator()_ rather than _next()_) - Likely perf benefit on Windows by doing _utf16leToUtf8AllocZ()_ only once for the entire cmd_line Cons: - Breaking Change in std library on Windows: Call _ArgIterator.initWithAllocator()_ instead of _ArgIterator.init()_ - PhaseMage is new with contributions to Zig, might need a lot of hand-holding - PhaseMage is a Windows person, non-Windows stuff will need to be double-checked Testing Done: - Wrote a few new test cases in process.zig - zig.exe build test -Dskip-release (no new failures seen) - zig cc now builds gRPC without error 2022-01-30 11:27:52 -08:00			`pub const InitError = error{};`

do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`pub fn init() ArgIteratorPosix {`
			`return ArgIteratorPosix{`
			`.index = 0,`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`.count = std.os.argv.len,`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`};`
			`}`

Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`pub fn next(self: *ArgIteratorPosix) ?[:0]const u8 {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`if (self.index == self.count) return null;`

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const s = std.os.argv[self.index];`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`self.index += 1;`
std lib API deprecations for the upcoming 0.9.0 release See #3811 2021-11-30 00:13:07 -07:00			`return mem.sliceTo(s, 0);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

			`pub fn skip(self: *ArgIteratorPosix) bool {`
			`if (self.index == self.count) return false;`

			`self.index += 1;`
			`return true;`
			`}`
			`};`
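A minimal usage sketch for `ArgIteratorPosix`, assuming a POSIX target where `std.os.argv` is populated by the start code. `countUserArgs` is a hypothetical helper name used only for illustration.

```zig
const std = @import("std");

/// Hypothetical helper: prints every argument after the program name
/// and returns how many there were.
fn countUserArgs() usize {
    var it = std.process.ArgIteratorPosix.init();
    // skip() returns false when there is nothing left to skip.
    _ = it.skip();
    var n: usize = 0;
    while (it.next()) |arg| {
        std.debug.print("arg: {s}\n", .{arg});
        n += 1;
    }
    return n;
}

pub fn main() void {
    std.debug.print("{d} argument(s)\n", .{countUserArgs()});
}
```

No allocator and no `deinit` call are needed here because the iterator only indexes into `std.os.argv`; the returned slices stay valid for the lifetime of the process.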

Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`pub const ArgIteratorWasi = struct {`
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`allocator: Allocator,`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`index: usize,`
Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`args: [][:0]u8,`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`pub const InitError = error{OutOfMemory} || posix.UnexpectedError;`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00
			`/// You must call deinit to free the internal buffer of the`
			`/// iterator after you are done.`
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn init(allocator: Allocator) InitError!ArgIteratorWasi {`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`const fetched_args = try ArgIteratorWasi.internalInit(allocator);`
			`return ArgIteratorWasi{`
			`.allocator = allocator,`
			`.index = 0,`
			`.args = fetched_args,`
			`};`
			`}`

std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`fn internalInit(allocator: Allocator) InitError![][:0]u8 {`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`var count: usize = undefined;`
			`var buf_size: usize = undefined;`

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`switch (std.os.wasi.args_sizes_get(&count, &buf_size)) {`
std: [breaking] move errno to become an nonexhaustive enum The primary purpose of this change is to eliminate one usage of `usingnamespace` in the standard library - specifically the usage for errno values in `std.os.linux`. This is accomplished by truncating the `E` prefix from error values, and making errno a proper enum. A similar strategy can be used to eliminate some other `usingnamespace` sites in the std lib. 2021-08-23 17:06:56 -07:00			`.SUCCESS => {},`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`else => |err| return posix.unexpectedErrno(err),`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`}`

wasm: avoids allocating zero length buffers for args or env I was testing this with wazero, which defaults to not propagate any env variables. This ensures we don't try to allocate zero length buffers when there are no results from either function. Signed-off-by: Adrian Cole <adrian@tetrate.io> 2023-01-19 13:50:23 +08:00			`if (count == 0) {`
			`return &[_][:0]u8{};`
			`}`

lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const argv = try allocator.alloc([*:0]u8, count);`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`defer allocator.free(argv);`

lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const argv_buf = try allocator.alloc(u8, buf_size);`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`switch (std.os.wasi.args_get(argv.ptr, argv_buf.ptr)) {`
std: [breaking] move errno to become an nonexhaustive enum The primary purpose of this change is to eliminate one usage of `usingnamespace` in the standard library - specifically the usage for errno values in `std.os.linux`. This is accomplished by truncating the `E` prefix from error values, and making errno a proper enum. A similar strategy can be used to eliminate some other `usingnamespace` sites in the std lib. 2021-08-23 17:06:56 -07:00			`.SUCCESS => {},`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`else => |err| return posix.unexpectedErrno(err),`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`}`

Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`var result_args = try allocator.alloc([:0]u8, count);`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`var i: usize = 0;`
			`while (i < count) : (i += 1) {`
std lib API deprecations for the upcoming 0.9.0 release See #3811 2021-11-30 00:13:07 -07:00			`result_args[i] = mem.sliceTo(argv[i], 0);`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`}`

			`return result_args;`
			`}`

Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`pub fn next(self: *ArgIteratorWasi) ?[:0]const u8 {`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`if (self.index == self.args.len) return null;`

			`const arg = self.args[self.index];`
			`self.index += 1;`
			`return arg;`
			`}`

			`pub fn skip(self: *ArgIteratorWasi) bool {`
			`if (self.index == self.args.len) return false;`

			`self.index += 1;`
			`return true;`
			`}`

			`/// Call to free the internal buffer of the iterator.`
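			`pub fn deinit(self: *ArgIteratorWasi) void {`
			`// internalInit allocates nothing when there are no arguments; also avoids underflow in the index below.`
			`if (self.args.len == 0) return;`
			`const last_item = self.args[self.args.len - 1];`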
all: zig fmt and rename "@XToY" to "@YFromX" Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me> 2023-06-15 13:14:16 +06:00			`const last_byte_addr = @intFromPtr(last_item.ptr) + last_item.len + 1; // null terminated`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`const first_item_ptr = self.args[0].ptr;`
all: zig fmt and rename "@XToY" to "@YFromX" Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me> 2023-06-15 13:14:16 +06:00			`const len = last_byte_addr - @intFromPtr(first_item_ptr);`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`self.allocator.free(first_item_ptr[0..len]);`
			`self.allocator.free(self.args);`
			`}`
			`};`

Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`/// Iterator that implements the Windows command-line parsing algorithm.`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. https://github.com/ziglang/zig/pull/18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (https://github.com/rust-lang/rust/pull/87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file. 2024-04-15 02:09:19 -07:00			`/// The implementation is intended to be compatible with the post-2008 C runtime,`
			/// but is not intended to be compatible with `CommandLineToArgvW` since
			/// `CommandLineToArgvW` uses the pre-2008 parsing rules.
Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`///`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. https://github.com/ziglang/zig/pull/18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (https://github.com/rust-lang/rust/pull/87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file. 2024-04-15 02:09:19 -07:00			`/// This iterator faithfully implements the parsing behavior observed from the C runtime with`
Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`/// one exception: if the command-line string is empty, the iterator will immediately complete`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. https://github.com/ziglang/zig/pull/18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (https://github.com/rust-lang/rust/pull/87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file. 2024-04-15 02:09:19 -07:00			`/// without returning any arguments (whereas the C runtime will return a single argument`
Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`/// representing the name of the current executable).`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`///`
			`/// The essential parts of the algorithm are described in Microsoft's documentation:`
			`///`
			`/// - https://learn.microsoft.com/en-us/cpp/cpp/main-function-command-line-args?view=msvc-170#parsing-c-command-line-arguments`
			`///`
			`/// David Deley explains some additional undocumented quirks in great detail:`
			`///`
			`/// - https://daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULES`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`pub const ArgIteratorWindows = struct {`
			`allocator: Allocator,`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly Before this commit, the WTF-16 command line string would be converted to WTF-8 in `init`, and then a second buffer of the WTF-8 size + 1 would be allocated to store the parsed arguments. The converted WTF-8 command line would then be parsed and the relevant bytes would be copied into the argument buffer before being returned. After this commit, only the WTF-8 size of the WTF-16 string is calculated (without conversion) which is then used to allocate the buffer for the parsed arguments. Parsing is then done on the WTF-16 slice directly, with the arguments being converted to WTF-8 on-the-fly. This has a few (minor) benefits: - Cuts the amount of memory allocated by ArgIteratorWindows in half (or better) - Makes the total amount of memory allocated by ArgIteratorWindows predictable, since, before, the upfront `wtf16LeToWtf8Alloc` call could end up allocating more-memory-than-necessary temporarily due to its internal use of an ArrayList. Now, the amount of memory allocated is always exactly `calcWtf8Len(cmd_line) + 1`. 2024-07-12 00:38:10 -07:00			`/// Encoded as WTF-16 LE.`
ArgIteratorWindows.init: Take `[]const u16` slice instead of multi-item pointer Now that we use the PEB to get the precise length of the command line string, there's no need for a multi-item pointer/sliceTo call. This provides a minor speedup: Benchmark 1 (153 runs): benchargv-before.exe measurement mean ± σ min … max outliers delta wall_time 32.7ms ± 429us 32.1ms … 36.9ms 1 ( 1%) 0% peak_rss 6.49MB ± 5.62KB 6.46MB … 6.49MB 14 ( 9%) 0% Benchmark 2 (157 runs): benchargv-after.exe measurement mean ± σ min … max outliers delta wall_time 31.9ms ± 236us 31.4ms … 32.7ms 4 ( 3%) ⚡- 2.4% ± 0.2% peak_rss 6.49MB ± 4.77KB 6.46MB … 6.49MB 14 ( 9%) + 0.0% ± 0.0% 2024-07-13 17:29:18 -07:00			`cmd_line: []const u16,`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`index: usize = 0,`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			`/// Owned by the iterator. Long enough to hold contiguous NUL-terminated slices`
			`/// of each argument encoded as WTF-8.`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`buffer: []u8,`
			`start: usize = 0,`
			`end: usize = 0,`

Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`pub const InitError = error{OutOfMemory};`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths 2024-02-13 16:56:50 -08:00			/// `cmd_line_w` must be a WTF-16 LE encoded string.
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`///`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			/// The iterator stores and uses `cmd_line_w`, so its memory must be valid for
			`/// at least as long as the returned ArgIteratorWindows.`
ArgIteratorWindows.init: Take `[]const u16` slice instead of multi-item pointer 2024-07-13 17:29:18 -07:00			`pub fn init(allocator: Allocator, cmd_line_w: []const u16) InitError!ArgIteratorWindows {`
			`const wtf8_len = unicode.calcWtf8Len(cmd_line_w);`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00
			`// This buffer must be large enough to contain contiguous NUL-terminated slices`
ArgIteratorWindows: Clarify buffer length comment 2024-07-13 15:03:55 -07:00			`// of each argument.`
			`// - During parsing, the length of a parsed argument will always be less than`
			`// or equal to its unparsed length`
			`// - The first argument needs one extra byte of space allocated for its NUL`
			`// terminator, but for each subsequent argument the necessary whitespace`
			`// between arguments guarantees room for their NUL terminator(s).`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			`const buffer = try allocator.alloc(u8, wtf8_len + 1);`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`errdefer allocator.free(buffer);`

			`return .{`
			`.allocator = allocator,`
ArgIteratorWindows.init: Take `[]const u16` slice instead of multi-item pointer 2024-07-13 17:29:18 -07:00			`.cmd_line = cmd_line_w,`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`.buffer = buffer,`
			`};`
			`}`

			/// Returns the next argument and advances the iterator. Returns `null` if at the end of the
			`/// command-line string. The iterator owns the returned slice.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// The result is encoded as [WTF-8](https://wtf-8.codeberg.page/).`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`pub fn next(self: *ArgIteratorWindows) ?[:0]const u8 {`
			`return self.nextWithStrategy(next_strategy);`
			`}`

			/// Skips the next argument and advances the iterator. Returns `true` if an argument was
			/// skipped, `false` if at the end of the command-line string.
			`pub fn skip(self: *ArgIteratorWindows) bool {`
			`return self.nextWithStrategy(skip_strategy);`
			`}`

			`const next_strategy = struct {`
			`const T = ?[:0]const u8;`

			`const eof = null;`

ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes Previously, to ensure args were encoded as well-formed WTF-8 (i.e. no encoded surrogate pairs), the code unit would be encoded and then the last 6 emitted bytes would be checked to see if they were a surrogate pair, and this was done for any emitted code unit (although this was not necessary, it should have only been done when emitting a low surrogate). After this commit, we still want to ensure well-formed WTF-8, but, to do so, the last emitted code point is stored, meaning we can just directly check that the last code unit is a high surrogate and the current code unit is a low surrogate to determine if we have a surrogate pair. This provides some performance benefit over and above a "use the same strategy as before but only check when we're emitting a low surrogate" implementation: Benchmark 1 (111 runs): benchargv-master.exe measurement mean ± σ min … max outliers delta wall_time 45.2ms ± 532us 44.5ms … 49.4ms 2 ( 2%) 0% peak_rss 6.49MB ± 3.94KB 6.46MB … 6.49MB 10 ( 9%) 0% Benchmark 2 (154 runs): benchargv-storelast.exe measurement mean ± σ min … max outliers delta wall_time 32.6ms ± 293us 32.2ms … 34.2ms 8 ( 5%) ⚡- 27.8% ± 0.2% peak_rss 6.49MB ± 5.15KB 6.46MB … 6.49MB 15 (10%) - 0.0% ± 0.0% Benchmark 3 (131 runs): benchargv-onlylow.exe measurement mean ± σ min … max outliers delta wall_time 38.4ms ± 257us 37.9ms … 39.6ms 5 ( 4%) ⚡- 15.1% ± 0.2% peak_rss 6.49MB ± 5.70KB 6.46MB … 6.49MB 9 ( 7%) - 0.0% ± 0.0% 2024-07-13 16:54:00 -07:00			/// Returns '\' if any backslashes are emitted, otherwise returns `last_emitted_code_unit`.
			`fn emitBackslashes(self: *ArgIteratorWindows, count: usize, last_emitted_code_unit: ?u16) ?u16 {`
			`for (0..count) \|_\| {`
			`self.buffer[self.end] = '\\';`
			`self.end += 1;`
			`}`
			`return if (count != 0) '\\' else last_emitted_code_unit;`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`}`

ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			/// If `last_emitted_code_unit` and `code_unit` form a surrogate pair, then
			`/// the previously emitted high surrogate is overwritten by the codepoint encoded`
			/// by the surrogate pair, and `null` is returned.
			/// Otherwise, `code_unit` is emitted and returned.
			`fn emitCharacter(self: *ArgIteratorWindows, code_unit: u16, last_emitted_code_unit: ?u16) ?u16 {`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			`// Because we are emitting WTF-8, we need to`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`// check to see if we've emitted two consecutive surrogate`
			`// codepoints that form a valid surrogate pair in order`
			`// to ensure that we're always emitting well-formed WTF-8`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`// (https://wtf-8.codeberg.page/#concatenating).`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`//`
			`// If we do have a valid surrogate pair, we need to emit`
			`// the UTF-8 sequence for the codepoint that they encode`
			`// instead of the WTF-8 encoding for the two surrogates`
			`// separately.`
			`//`
			`// This is relevant when dealing with a WTF-16 encoded`
			`// command line like this:`
			`// "<0xD801>"<0xDC37>`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			`// which would get parsed and converted to WTF-8 as:`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`// <0xED><0xA0><0x81><0xED><0xB0><0xB7>`
			`// but instead, we need to recognize the surrogate pair`
			`// and emit the codepoint it encodes, which in this`
			`// example is U+10437 (𐐷), which is encoded in UTF-8 as:`
			`// <0xF0><0x90><0x90><0xB7>`
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`if (last_emitted_code_unit != null and`
			`std.unicode.utf16IsLowSurrogate(code_unit) and`
			`std.unicode.utf16IsHighSurrogate(last_emitted_code_unit.?))`
			`{`
			`const codepoint = std.unicode.utf16DecodeSurrogatePair(&.{ last_emitted_code_unit.?, code_unit }) catch unreachable;`

			`// Unpaired surrogate is 3 bytes long`
			`const dest = self.buffer[self.end - 3 ..];`
			`const len = unicode.utf8Encode(codepoint, dest) catch unreachable;`
			`// All codepoints that require a surrogate pair (> U+FFFF) are encoded as 4 bytes`
			`assert(len == 4);`
			`self.end += 1;`
			`return null;`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`}`
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00
			`const wtf8_len = std.unicode.wtf8Encode(code_unit, self.buffer[self.end..]) catch unreachable;`
			`self.end += wtf8_len;`
			`return code_unit;`
Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`}`

			`fn yieldArg(self: *ArgIteratorWindows) [:0]const u8 {`
			`self.buffer[self.end] = 0;`
			`const arg = self.buffer[self.start..self.end :0];`
			`self.end += 1;`
			`self.start = self.end;`
			`return arg;`
			`}`
			`};`

			`const skip_strategy = struct {`
			`const T = bool;`

			`const eof = false;`

ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`fn emitBackslashes(_: *ArgIteratorWindows, _: usize, last_emitted_code_unit: ?u16) ?u16 {`
			`return last_emitted_code_unit;`
			`}`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`fn emitCharacter(_: *ArgIteratorWindows, _: u16, last_emitted_code_unit: ?u16) ?u16 {`
			`return last_emitted_code_unit;`
			`}`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00
			`fn yieldArg(_: *ArgIteratorWindows) bool {`
			`return true;`
			`}`
			`};`

			`fn nextWithStrategy(self: *ArgIteratorWindows, comptime strategy: type) strategy.T {`
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`var last_emitted_code_unit: ?u16 = null;`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`// The first argument (the executable name) uses different parsing rules.`
			`if (self.index == 0) {`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`if (self.cmd_line.len == 0 or self.cmd_line[0] == 0) {`
			`// Immediately complete the iterator.`
			`// The C runtime would return the name of the current executable here.`
			`return strategy.eof;`
			`}`

			`var inside_quotes = false;`
			`while (true) : (self.index += 1) {`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly Before this commit, the WTF-16 command line string would be converted to WTF-8 in `init`, and then a second buffer of the WTF-8 size + 1 would be allocated to store the parsed arguments. The converted WTF-8 command line would then be parsed and the relevant bytes would be copied into the argument buffer before being returned. After this commit, only the WTF-8 size of the WTF-16 string is calculated (without conversion) which is then used to allocate the buffer for the parsed arguments. Parsing is then done on the WTF-16 slice directly, with the arguments being converted to WTF-8 on-the-fly. This has a few (minor) benefits: - Cuts the amount of memory allocated by ArgIteratorWindows in half (or better) - Makes the total amount of memory allocated by ArgIteratorWindows predictable, since, before, the upfront `wtf16LeToWtf8Alloc` call could end up allocating more-memory-than-necessary temporarily due to its internal use of an ArrayList. Now, the amount of memory allocated is always exactly `calcWtf8Len(cmd_line) + 1`. 2024-07-12 00:38:10 -07:00			`const char = if (self.index != self.cmd_line.len)`
			`mem.littleToNative(u16, self.cmd_line[self.index])`
			`else`
			`0;`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`switch (char) {`
			`0 => {`
			`return strategy.yieldArg(self);`
			`},`
			`'"' => {`
			`inside_quotes = !inside_quotes;`
			`},`
			`' ', '\t' => {`
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`if (inside_quotes) {`
			`last_emitted_code_unit = strategy.emitCharacter(self, char, last_emitted_code_unit);`
			`} else {`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`self.index += 1;`
			`return strategy.yieldArg(self);`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`}`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`},`
			`else => {`
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`last_emitted_code_unit = strategy.emitCharacter(self, char, last_emitted_code_unit);`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`},`
			`}`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`}`
			`}`

			`// Skip spaces and tabs. The iterator completes if we reach the end of the string here.`
			`while (true) : (self.index += 1) {`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			`const char = if (self.index != self.cmd_line.len)`
			`mem.littleToNative(u16, self.cmd_line[self.index])`
			`else`
			`0;`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`switch (char) {`
			`0 => return strategy.eof,`
			`' ', '\t' => continue,`
			`else => break,`
			`}`
			`}`

			`// Parsing rules for subsequent arguments:`
			`//`
			`// - The end of the string always terminates the current argument.`
			`// - When not in 'inside_quotes' mode, a space or tab terminates the current argument.`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`// - 2n backslashes followed by a quote emit n backslashes (note: n can be zero).`
			`// If in 'inside_quotes' and the quote is immediately followed by a second quote,`
			`// one quote is emitted and the other is skipped, otherwise, the quote is skipped`
			`// and 'inside_quotes' is toggled.`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`// - 2n + 1 backslashes followed by a quote emit n backslashes followed by a quote.`
			`// - n backslashes not followed by a quote emit n backslashes.`
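			`//`
			`// Illustrative examples (added for clarity, not part of the upstream source;`
			`// hand-derived from the rules above, raw command-line text on the left):`
			`//   "a b"       yields  a b`
			`//   a\"b        yields  a"b`
			`//   a\\"b c"    yields  a\b c`
			`//   a\\b        yields  a\\b`
			`//   "foo""bar"  yields  foo"bar`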
			`var backslash_count: usize = 0;`
			`var inside_quotes = false;`
			`while (true) : (self.index += 1) {`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly 2024-07-12 00:38:10 -07:00			`const char = if (self.index != self.cmd_line.len)`
			`mem.littleToNative(u16, self.cmd_line[self.index])`
			`else`
			`0;`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`switch (char) {`
			`0 => {`
ArgIteratorWindows: Store last emitted code unit instead of checking the last 6 emitted bytes 2024-07-13 16:54:00 -07:00			`last_emitted_code_unit = strategy.emitBackslashes(self, backslash_count, last_emitted_code_unit);`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`return strategy.yieldArg(self);`
			`},`
			`' ', '\t' => {`
			`last_emitted_code_unit = strategy.emitBackslashes(self, backslash_count, last_emitted_code_unit);`
			`backslash_count = 0;`
			`if (inside_quotes) {`
			`last_emitted_code_unit = strategy.emitCharacter(self, char, last_emitted_code_unit);`
			`} else return strategy.yieldArg(self);`
			`},`
			`'"' => {`
			`const char_is_escaped_quote = backslash_count % 2 != 0;`
			`last_emitted_code_unit = strategy.emitBackslashes(self, backslash_count / 2, last_emitted_code_unit);`
			`backslash_count = 0;`
			`if (char_is_escaped_quote) {`
			`last_emitted_code_unit = strategy.emitCharacter(self, '"', last_emitted_code_unit);`
			`} else {`
			`if (inside_quotes and`
			`self.index + 1 != self.cmd_line.len and`
ArgIteratorWindows: Reduce allocated memory by parsing the WTF-16 string directly Before this commit, the WTF-16 command line string would be converted to WTF-8 in `init`, and then a second buffer of the WTF-8 size + 1 would be allocated to store the parsed arguments. The converted WTF-8 command line would then be parsed and the relevant bytes would be copied into the argument buffer before being returned. After this commit, only the WTF-8 size of the WTF-16 string is calculated (without conversion) which is then used to allocate the buffer for the parsed arguments. Parsing is then done on the WTF-16 slice directly, with the arguments being converted to WTF-8 on-the-fly. This has a few (minor) benefits: - Cuts the amount of memory allocated by ArgIteratorWindows in half (or better) - Makes the total amount of memory allocated by ArgIteratorWindows predictable, since, before, the upfront `wtf16LeToWtf8Alloc` call could end up allocating more-memory-than-necessary temporarily due to its internal use of an ArrayList. Now, the amount of memory allocated is always exactly `calcWtf8Len(cmd_line) + 1`. 2024-07-12 00:38:10 -07:00			`mem.littleToNative(u16, self.cmd_line[self.index + 1]) == '"')`
			`{`
			`last_emitted_code_unit = strategy.emitCharacter(self, '"', last_emitted_code_unit);`
			`self.index += 1;`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. https://github.com/ziglang/zig/pull/18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (https://github.com/rust-lang/rust/pull/87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file. 2024-04-15 02:09:19 -07:00			`} else {`
			`inside_quotes = !inside_quotes;`
			`}`
			`}`
			`},`
			`'\\' => {`
			`backslash_count += 1;`
			`},`
			`else => {`
			`last_emitted_code_unit = strategy.emitBackslashes(self, backslash_count, last_emitted_code_unit);`
			`backslash_count = 0;`
			`last_emitted_code_unit = strategy.emitCharacter(self, char, last_emitted_code_unit);`
			`},`
			`}`
			`}`
			`}`

			`/// Frees the iterator's copy of the command-line string and all previously returned`
			`/// argument slices.`
			`pub fn deinit(self: *ArgIteratorWindows) void {`
			`self.allocator.free(self.buffer);`
			`}`
			`};`
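
The `'"'` branch above halves a run of backslashes that precedes a double quote and uses the parity of the run to decide whether the quote is literal. A minimal sketch of that rule on its own follows; `QuoteRun` and `resolveBackslashesBeforeQuote` are hypothetical names introduced for illustration and are not part of `std.process`.

```zig
const std = @import("std");

/// Hypothetical result type (illustration only, not in std.process).
const QuoteRun = struct {
    /// Number of literal backslashes to emit for the run.
    literal_backslashes: usize,
    /// True when the quote itself is emitted as a literal `"`.
    quote_is_escaped: bool,
};

/// Mirrors `backslash_count / 2` and `backslash_count % 2 != 0` from the
/// `'"'` branch: 2n backslashes yield n backslashes and an unescaped quote,
/// 2n + 1 backslashes yield n backslashes and a literal quote.
fn resolveBackslashesBeforeQuote(backslash_count: usize) QuoteRun {
    return .{
        .literal_backslashes = backslash_count / 2,
        .quote_is_escaped = backslash_count % 2 != 0,
    };
}

test "backslash runs before a double quote" {
    // Two backslashes then a quote: one backslash, quote is not escaped.
    const even = resolveBackslashesBeforeQuote(2);
    try std.testing.expectEqual(@as(usize, 1), even.literal_backslashes);
    try std.testing.expect(!even.quote_is_escaped);

    // Three backslashes then a quote: one backslash, then a literal quote.
    const odd = resolveBackslashesBeforeQuote(3);
    try std.testing.expectEqual(@as(usize, 1), odd.literal_backslashes);
    try std.testing.expect(odd.quote_is_escaped);
}
```

Backslashes that are not followed by a quote are emitted unchanged, which is why the other branches pass the full `backslash_count` to `emitBackslashes`.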

Full response file (.rsp) support I hit the "quotes in an RSP file" issue when trying to compile gRPC using "zig cc". As a fun exercise, I decided to see if I could fix it myself. I'm fully open to this code being flat-out rejected. Or I can take feedback to fix it up. This modifies (and renames) _ArgIteratorWindows_ in process.zig such that it works with arbitrary strings (or the contents of an RSP file). In main.zig, this new _ArgIteratorGeneral_ is used to address the "TODO" listed in _ClangArgIterator_. This change closes #4833. Pros:* - It has the nice attribute of handling "RSP file" arguments in the same way it handles "cmd_line" arguments. - High Performance, minimal allocations - Fixed bug in previous _ArgIteratorWindows_, where final trailing backslashes in a command line were entirely dropped - Added a test case for the above bug - Harmonized the _ArgIteratorXxxx._initWithAllocator()_ and _next()_ interface across Windows/Posix/Wasi (Moved Windows errors to _initWithAllocator()_ rather than _next()_) - Likely perf benefit on Windows by doing _utf16leToUtf8AllocZ()_ only once for the entire cmd_line Cons: - Breaking Change in std library on Windows: Call _ArgIterator.initWithAllocator()_ instead of _ArgIterator.init()_ - PhaseMage is new with contributions to Zig, might need a lot of hand-holding - PhaseMage is a Windows person, non-Windows stuff will need to be double-checked Testing Done: - Wrote a few new test cases in process.zig - zig.exe build test -Dskip-release (no new failures seen) - zig cc now builds gRPC without error 2022-01-30 11:27:52 -08:00			/// Optional parameters for `ArgIteratorGeneral`
			`pub const ArgIteratorGeneralOptions = struct {`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`comments: bool = false,`
			`single_quotes: bool = false,`
			`};`

			`/// A general Iterator to parse a string into a set of arguments`
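
The options struct above is passed at comptime to select the iterator's behavior. The following is a rough usage sketch, not taken from this file: it assumes the type is reachable as `std.process.ArgIteratorGeneral` and that it exposes `next()` and `deinit()` methods, neither of which is shown in this section. The comment in the input is newline-terminated, since the comment-skipping loop scans for `'\n'`.

```zig
const std = @import("std");

test "ArgIteratorGeneral with comments enabled (sketch)" {
    // Assumption: ArgIteratorGeneral is exported from std.process.
    const Iter = std.process.ArgIteratorGeneral(.{ .comments = true });

    // `init` does not take ownership of the string; it must outlive `it`.
    var it = try Iter.init(
        std.testing.allocator,
        "-O2 \"hello world\" # ignored comment\n-lfoo",
    );
    // Assumption: deinit() frees the internal buffer allocated by init().
    defer it.deinit();

    // Assumption: next() returns the next argument, or null at the end.
    try std.testing.expectEqualStrings("-O2", it.next().?);
    try std.testing.expectEqualStrings("hello world", it.next().?);
    try std.testing.expectEqualStrings("-lfoo", it.next().?);
    try std.testing.expect(it.next() == null);
}
```

With `comments` left at its default of `false`, the `#` would instead start an ordinary argument.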
			`pub fn ArgIteratorGeneral(comptime options: ArgIteratorGeneralOptions) type {`
			`return struct {`
			`allocator: Allocator,`
			`index: usize = 0,`
			`cmd_line: []const u8,`

			`/// Should the cmd_line field be freed (using the allocator) on deinit()?`
			`free_cmd_line_on_deinit: bool,`

			`/// buffer MUST be long enough to hold the cmd_line plus a null terminator.`
			`/// buffer will be freed (using the allocator) on deinit()`
			`buffer: []u8,`
			`start: usize = 0,`
			`end: usize = 0,`

			`pub const Self = @This();`

			`pub const InitError = error{OutOfMemory};`

			`/// cmd_line_utf8 MUST remain valid and constant while using this instance`
			`pub fn init(allocator: Allocator, cmd_line_utf8: []const u8) InitError!Self {`
lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const buffer = try allocator.alloc(u8, cmd_line_utf8.len + 1);`
			`errdefer allocator.free(buffer);`

			`return Self{`
			`.allocator = allocator,`
			`.cmd_line = cmd_line_utf8,`
			`.free_cmd_line_on_deinit = false,`
			`.buffer = buffer,`
			`};`
			`}`

			`/// cmd_line_utf8 will be freed (with the allocator) on deinit()`
			`pub fn initTakeOwnership(allocator: Allocator, cmd_line_utf8: []const u8) InitError!Self {`
			`const buffer = try allocator.alloc(u8, cmd_line_utf8.len + 1);`
			`errdefer allocator.free(buffer);`

			`return Self{`
			`.allocator = allocator,`
			`.cmd_line = cmd_line_utf8,`
			`.free_cmd_line_on_deinit = true,`
			`.buffer = buffer,`
			`};`
			`}`

			`// Skips over whitespace in the cmd_line.`
			`// Returns false if the terminating sentinel is reached, true otherwise.`
			`// Also skips over comments (if supported).`
			`fn skipWhitespace(self: *Self) bool {`
			`while (true) : (self.index += 1) {`
			`const character = if (self.index != self.cmd_line.len) self.cmd_line[self.index] else 0;`
			`switch (character) {`
			`0 => return false,`
			`' ', '\t', '\r', '\n' => continue,`
			`'#' => {`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`if (options.comments) {`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`while (true) : (self.index += 1) {`
			`switch (self.cmd_line[self.index]) {`
			`'\n' => break,`
			`0 => return false,`
			`else => continue,`
			`}`
			`}`
			`continue;`
			`} else {`
			`break;`
			`}`
			`},`
			`else => break,`
			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`return true;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`pub fn skip(self: *Self) bool {`
			`if (!self.skipWhitespace()) {`
			`return false;`
			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`var backslash_count: usize = 0;`
			`var in_quote = false;`
			`while (true) : (self.index += 1) {`
			`const character = if (self.index != self.cmd_line.len) self.cmd_line[self.index] else 0;`
			`switch (character) {`
			`0 => return true,`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`'"', '\'' => {`
			`if (!options.single_quotes and character == '\'') {`
			`backslash_count = 0;`
			`continue;`
			`}`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`const quote_is_real = backslash_count % 2 == 0;`
			`if (quote_is_real) {`
			`in_quote = !in_quote;`
			`}`
			`},`
			`'\\' => {`
			`backslash_count += 1;`
			`},`
			`' ', '\t', '\r', '\n' => {`
			`if (!in_quote) {`
			`return true;`
			`}`
			`backslash_count = 0;`
			`},`
			`else => {`
			`backslash_count = 0;`
			`continue;`
			`},`
			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
			`}`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`/// Returns a slice of the internal buffer that contains the next argument.`
			`/// Returns null when it reaches the end.`
			`pub fn next(self: *Self) ?[:0]const u8 {`
			`if (!self.skipWhitespace()) {`
			`return null;`
			`}`

			`var backslash_count: usize = 0;`
			`var in_quote = false;`
			`while (true) : (self.index += 1) {`
			`const character = if (self.index != self.cmd_line.len) self.cmd_line[self.index] else 0;`
			`switch (character) {`
			`0 => {`
			`self.emitBackslashes(backslash_count);`
			`self.buffer[self.end] = 0;`
lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const token = self.buffer[self.start..self.end :0];`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`self.end += 1;`
			`self.start = self.end;`
			`return token;`
			`},`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`'"', '\'' => {`
			`if (!options.single_quotes and character == '\'') {`
			`self.emitBackslashes(backslash_count);`
			`backslash_count = 0;`
			`self.emitCharacter(character);`
			`continue;`
			`}`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`const quote_is_real = backslash_count % 2 == 0;`
			`self.emitBackslashes(backslash_count / 2);`
			`backslash_count = 0;`

			`if (quote_is_real) {`
			`in_quote = !in_quote;`
			`} else {`
			`self.emitCharacter(character);`
			`}`
			`},`
			`'\\' => {`
			`backslash_count += 1;`
			`},`
			`' ', '\t', '\r', '\n' => {`
			`self.emitBackslashes(backslash_count);`
			`backslash_count = 0;`
			`if (in_quote) {`
			`self.emitCharacter(character);`
			`} else {`
			`self.buffer[self.end] = 0;`
lib: correct unnecessary uses of 'var' 2023-11-10 05:27:17 +00:00			`const token = self.buffer[self.start..self.end :0];`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`self.end += 1;`
			`self.start = self.end;`
			`return token;`
			`}`
			`},`
			`else => {`
			`self.emitBackslashes(backslash_count);`
			`backslash_count = 0;`
			`self.emitCharacter(character);`
			`},`
			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
			`}`
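The quote branch of `next` above halves the run of backslashes that precedes a quote character and uses the parity of that run to decide whether the quote toggles quoting or is emitted literally. A minimal sketch of that rule in isolation follows; `QuoteOutcome` and `resolveQuote` are hypothetical names introduced only for illustration and are not part of `std.process`.

```zig
const std = @import("std");

/// Hypothetical helper type (not in std.process): the effect of a quote
/// character that follows `backslash_count` consecutive backslashes.
const QuoteOutcome = struct {
    /// Number of literal backslashes written to the output buffer.
    emitted_backslashes: usize,
    /// True when the quote opens or closes a quoted region;
    /// false when the quote itself is emitted as a literal character.
    toggles_quote: bool,
};

/// Mirrors `backslash_count / 2` and `backslash_count % 2 == 0` from `next`.
fn resolveQuote(backslash_count: usize) QuoteOutcome {
    return .{
        .emitted_backslashes = backslash_count / 2,
        .toggles_quote = backslash_count % 2 == 0,
    };
}

test "backslash run before a quote" {
    // No backslashes: the quote toggles quoting and nothing is emitted.
    const none = resolveQuote(0);
    try std.testing.expectEqual(@as(usize, 0), none.emitted_backslashes);
    try std.testing.expect(none.toggles_quote);

    // Three backslashes: one literal backslash, then a literal quote.
    const three = resolveQuote(3);
    try std.testing.expectEqual(@as(usize, 1), three.emitted_backslashes);
    try std.testing.expect(!three.toggles_quote);
}
```

Backslashes that are not followed by a quote take the other branches of the switch and are emitted unchanged, which is why a trailing `C:\dir\` keeps its final backslash.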

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`fn emitBackslashes(self: *Self, emit_count: usize) void {`
			`var i: usize = 0;`
			`while (i < emit_count) : (i += 1) {`
			`self.emitCharacter('\\');`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
			`}`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`fn emitCharacter(self: *Self, char: u8) void {`
			`self.buffer[self.end] = char;`
			`self.end += 1;`
			`}`
Switch to using unicode when parsing the command line on windows (#7241) * Switch to using unicode when parsing the command line on windows * Apply changes by LemonBoy and hopefully fix tests on MIPs Co-authored-by: LemonBoy <LemonBoy@users.noreply.github.com> * Fix up next and skip * Move comment to more relevant place Co-authored-by: LemonBoy <LemonBoy@users.noreply.github.com> 2020-11-30 10:47:01 -08:00
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`/// Call to free the internal buffer of the iterator.`
			`pub fn deinit(self: *Self) void {`
			`self.allocator.free(self.buffer);`

			`if (self.free_cmd_line_on_deinit) {`
			`self.allocator.free(self.cmd_line);`
			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`};`
			`}`
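As a usage sketch, the iterator defined above can tokenize the contents of a response file held in memory. This assumes the type is exposed as `std.process.ArgIteratorGeneral`, that its options struct gives `comments` and `single_quotes` default values, and that `init` takes an allocator and a UTF-8 slice; none of these appear in the lines shown here, so check them against your Zig version.

```zig
const std = @import("std");

test "tokenize a response-file style string" {
    const allocator = std.testing.allocator;

    // Assumption: option fields not listed here fall back to their defaults.
    const Iterator = std.process.ArgIteratorGeneral(.{ .comments = true });

    // One comment line, one plain flag, one quoted argument containing
    // spaces, and one path that ends in a backslash.
    const input = "# compiler flags\n-O2 \"-I include dir\" C:\\src\\\n";

    // Assumption: init(allocator, utf8_slice) copies the input into the iterator.
    var it = try Iterator.init(allocator, input);
    defer it.deinit();

    try std.testing.expectEqualStrings("-O2", it.next().?);
    try std.testing.expectEqualStrings("-I include dir", it.next().?);
    // The trailing backslash is kept because no quote follows it.
    try std.testing.expectEqualStrings("C:\\src\\", it.next().?);
    try std.testing.expect(it.next() == null);
}
```

Each slice returned by `next` points into the iterator's internal buffer, so copy any argument that must outlive the call to `deinit`.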
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`/// Cross-platform command line argument iterator.`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`pub const ArgIterator = struct {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`const InnerType = switch (native_os) {`
Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`.windows => ArgIteratorWindows,`
WASI,libc: enable tests. Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io> 2021-07-27 08:59:34 +09:00			`.wasi => if (builtin.link_libc) ArgIteratorPosix else ArgIteratorWasi,`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00			`else => ArgIteratorPosix,`
			`};`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
			`inner: InnerType,`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`/// Initialize the args iterator. Consider using initWithAllocator() instead`
			`/// for cross-platform compatibility.`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`pub fn init() ArgIterator {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .wasi) {`
Make ArgIterator.init() a compile error in WASI Given that the previous design would require the use of a default allocator to have `ArgIterator.init()` work in WASI, and since in Zig we're trying to avoid default allocators, I've changed the design slightly in that now `init()` is a compile error in WASI, and instead in its message it points to `initWithAllocator(*mem.Allocator)`. The latter by virtue of requiring an allocator as an argument can safely be used in WASI as well as on other OSes (where the allocator argument is simply unused). When using `initWithAllocator` it is then natural to remember to call `deinit()` after being done with the iterator. Also, to make use of this, I've also added `argsWithAllocator` function which is equivalent to `args` minus the requirement of supplying an allocator and being fallible. Finally, I've also modified the WASI only test `process.ArgWasiIterator` to test all OSes. 2020-05-29 08:40:32 +02:00			`@compileError("In WASI, use initWithAllocator instead.");`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`@compileError("In Windows, use initWithAllocator instead.");`
			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
			`return ArgIterator{ .inner = InnerType.init() };`
			`}`

Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`pub const InitError = InnerType.InitError;`
Add ArgIteratorWasi and integrate it with ArgIterator This commit pulls WASI specific implementation of args extraction from the runtime from `process.argsAlloc` and `process.argsFree` into a new iterator struct `process.ArgIteratorWasi`. It also integrates the struct with platform-independent `process.ArgIterator`. 2020-05-20 19:42:15 +02:00
Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			/// You must deinitialize the iterator's internal buffers by calling `deinit` when done.
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn initWithAllocator(allocator: Allocator) InitError!ArgIterator {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .wasi and !builtin.link_libc) {`
Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			`return ArgIterator{ .inner = try InnerType.init(allocator) };`
			`}`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
Replace GetCommandLineW with PEB access, delete GetCommandLine bindings 2024-07-13 17:25:06 -07:00			`const cmd_line = std.os.windows.peb().ProcessParameters.CommandLine;`
ArgIteratorWindows.init: Take `[]const u16` slice instead of multi-item pointer Now that we use the PEB to get the precise length of the command line string, there's no need for a multi-item pointer/sliceTo call. This provides a minor speedup: Benchmark 1 (153 runs): benchargv-before.exe measurement mean ± σ min … max outliers delta wall_time 32.7ms ± 429us 32.1ms … 36.9ms 1 ( 1%) 0% peak_rss 6.49MB ± 5.62KB 6.46MB … 6.49MB 14 ( 9%) 0% Benchmark 2 (157 runs): benchargv-after.exe measurement mean ± σ min … max outliers delta wall_time 31.9ms ± 236us 31.4ms … 32.7ms 4 ( 3%) ⚡- 2.4% ± 0.2% peak_rss 6.49MB ± 4.77KB 6.46MB … 6.49MB 14 ( 9%) + 0.0% ± 0.0% 2024-07-13 17:29:18 -07:00			`const cmd_line_w = cmd_line.Buffer.?[0 .. cmd_line.Length / 2];`
			`return ArgIterator{ .inner = try InnerType.init(allocator, cmd_line_w) };`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`return ArgIterator{ .inner = InnerType.init() };`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`/// Get the next argument. Returns 'null' if we are at the end.`
			`/// Returned slice is pointing to the iterator's internal buffer.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// On Windows, the result is encoded as [WTF-8](https://wtf-8.codeberg.page/).`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior. WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8. Closes #18694 Closes #1774 Closes #2565 2024-02-13 16:56:50 -08:00			`/// On other platforms, the result is an opaque sequence of bytes with no particular encoding.`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`pub fn next(self: *ArgIterator) ?([:0]const u8) {`
Add ArgIteratorWasi and integrate it with ArgIterator 2020-05-20 19:42:15 +02:00			`return self.inner.next();`
			`}`

do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`/// Parse past 1 argument without capturing it.`
			/// Returns `true` if an argument was skipped, `false` if we are at the end.
			`pub fn skip(self: *ArgIterator) bool {`
			`return self.inner.skip();`
			`}`
Add ArgIteratorWasi and integrate it with ArgIterator 2020-05-20 19:42:15 +02:00
Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			`/// Call this to free the iterator's internal buffer if the iterator`
			/// was created with the `initWithAllocator` function.
			`pub fn deinit(self: *ArgIterator) void {`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`// Unless we're targeting WASI or Windows, this is a no-op.`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .wasi and !builtin.link_libc) {`
Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			`self.inner.deinit();`
			`}`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`if (native_os == .windows) {`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`self.inner.deinit();`
			`}`
Add ArgIteratorWasi and integrate it with ArgIterator 2020-05-20 19:42:15 +02:00			`}`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`};`

process: add args definition comment To improve understandability of its purpose. 2023-04-06 22:57:25 +02:00			`/// Holds the command-line arguments, with the program name as the first entry.`
			`/// Use argsWithAllocator() for cross-platform code.`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`pub fn args() ArgIterator {`
			`return ArgIterator.init();`
			`}`

Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			/// You must deinitialize the iterator's internal buffers by calling `deinit` when done.
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn argsWithAllocator(allocator: Allocator) ArgIterator.InitError!ArgIterator {`
Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			`return ArgIterator.initWithAllocator(allocator);`
			`}`
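
A minimal usage sketch for the iterator API above, assuming a Zig version in which `std.process.argsWithAllocator` has the signature shown here. `countUserArgs` is a hypothetical helper introduced only for illustration; it is not part of `std`.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): prints and counts the
/// arguments that follow the program name.
fn countUserArgs(allocator: std.mem.Allocator) !usize {
    var it = try std.process.argsWithAllocator(allocator);
    // Frees the iterator's internal buffers where it owns any
    // (Windows and WASI, per the comment in `deinit`).
    defer it.deinit();

    // The program name is the first entry; skip it.
    _ = it.skip();

    var count: usize = 0;
    while (it.next()) |arg| {
        // `arg` points into the iterator's internal buffer, so it must
        // not be used after `deinit` runs.
        std.debug.print("arg: {s}\n", .{arg});
        count += 1;
    }
    return count;
}

pub fn main() !void {
    const n = try countUserArgs(std.heap.page_allocator);
    std.debug.print("{d} argument(s)\n", .{n});
}
```

Using `argsWithAllocator` rather than `args` keeps the same code compiling on Windows and WASI, where `ArgIterator.init()` is a compile error.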

do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`/// Caller must call argsFree on result.`
replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 17:18:34 +00:00			`/// On Windows, the result is encoded as [WTF-8](https://wtf-8.codeberg.page/).`
Fix handling of Windows (WTF-16) and WASI (UTF-8) paths 2024-02-13 16:56:50 -08:00			`/// On other platforms, the result is an opaque sequence of bytes with no particular encoding.`
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn argsAlloc(allocator: Allocator) ![][:0]u8 {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`// TODO refactor to only make 1 allocation.`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`var it = try argsWithAllocator(allocator);`
Make ArgIterator.init() a compile error in WASI 2020-05-29 08:40:32 +02:00			`defer it.deinit();`
Add ArgIteratorWasi and integrate it with ArgIterator 2020-05-20 19:42:15 +02:00
std.ArrayList: make unmanaged the default 2025-07-31 21:54:07 -07:00			`var contents = std.array_list.Managed(u8).init(allocator);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`defer contents.deinit();`

std.ArrayList: make unmanaged the default 2025-07-31 21:54:07 -07:00			`var slice_list = std.array_list.Managed(usize).init(allocator);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`defer slice_list.deinit();`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`while (it.next()) |arg| {`
Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`try contents.appendSlice(arg[0 .. arg.len + 1]);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`try slice_list.append(arg.len);`
			`}`

remove deprecated uses of ArrayList.span 2020-11-06 18:54:08 +00:00			`const contents_slice = contents.items;`
			`const slice_sizes = slice_list.items;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const slice_list_bytes = try math.mul(usize, @sizeOf([]u8), slice_sizes.len);`
fix argsAlloc buffer size The buffer `buf` contains N (= `slice_sizes.len`) slices followed by the N null-terminated arguments. The N null-terminated arguments are stored in the `contents` array list. Thus, `buf` size should be: @sizeOf([]u8) * slice_sizes.len + contents_slice.len Instead of: @sizeOf([]u8) * slice_sizes.len + contents_slice.len + slice_sizes.len This bug was found thanks to the gpa allocator which checks if freed size matches allocated sizes for large allocations. 2022-01-28 10:40:03 +01:00			`const total_bytes = try math.add(usize, slice_list_bytes, contents_slice.len);`
std: eradicate u29 and embrace std.mem.Alignment 2025-04-11 17:55:25 -07:00			`const buf = try allocator.alignedAlloc(u8, .of([]u8), total_bytes);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`errdefer allocator.free(buf);`

Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`const result_slice_list = mem.bytesAsSlice([:0]u8, buf[0..slice_list_bytes]);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const result_contents = buf[slice_list_bytes..];`
update codebase to use `@memset` and `@memcpy` 2023-04-26 13:57:08 -07:00			`@memcpy(result_contents[0..contents_slice.len], contents_slice);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
			`var contents_index: usize = 0;`
update std lib and compiler sources to new for loop syntax 2023-02-18 09:02:57 -07:00			`for (slice_sizes, 0..) |len, i| {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const new_index = contents_index + len;`
Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`result_slice_list[i] = result_contents[contents_index..new_index :0];`
			`contents_index = new_index + 1;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

			`return result_slice_list;`
			`}`
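
The buffer layout described in the "fix argsAlloc buffer size" note (N slice headers followed by the N zero-terminated arguments) can be checked with a little arithmetic. The sketch below is illustrative only: `argsAllocSize` is a hypothetical helper, not part of `std`, and it mirrors the size computed by `argsAlloc`.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): the number of bytes that a
/// single-buffer layout of "N slice headers + N zero-terminated
/// strings" occupies for the given arguments.
fn argsAllocSize(args: []const []const u8) usize {
    // The slice headers come first, one per argument.
    var total: usize = @sizeOf([]u8) * args.len;
    // Then each argument's bytes plus its 0 terminator.
    for (args) |arg| total += arg.len + 1;
    return total;
}

test "argsAlloc buffer size" {
    const args = [_][]const u8{ "zig", "build", "test" };
    // 3 headers, then (3 + 1) + (5 + 1) + (4 + 1) = 15 content bytes.
    try std.testing.expectEqual(
        @as(usize, 3 * @sizeOf([]u8) + 15),
        argsAllocSize(&args),
    );
}
```

This is the same total that `argsFree` reconstructs from the returned slices in order to free the buffer with its original length.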

std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn argsFree(allocator: Allocator, args_alloc: []const [:0]u8) void {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`var total_bytes: usize = 0;`
			`for (args_alloc) |arg| {`
Make argsAlloc/ArgIterator return zero-sentinel strings (#6720) 2020-10-22 17:52:48 -04:00			`total_bytes += @sizeOf([]u8) + arg.len + 1;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
all: migrate code to new cast builtin syntax Most of this migration was performed automatically with `zig fmt`. There were a few exceptions which I had to manually fix: * `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten * `@truncate`'s fixup is incorrect for vectors * Test cases are not formatted, and their error locations change 2023-06-22 18:46:56 +01:00			`const unaligned_allocated_buf = @as([*]const u8, @ptrCast(args_alloc.ptr))[0..total_bytes];`
			`const aligned_allocated_buf: []align(@alignOf([]u8)) const u8 = @alignCast(unaligned_allocated_buf);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`return allocator.free(aligned_allocated_buf);`
			`}`
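
A short sketch of how `argsAlloc` and `argsFree` pair up in calling code, assuming the signatures shown above. The choice of `std.heap.page_allocator` is only for brevity; any `Allocator` should work, as long as the same one is passed to both calls.

```zig
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // All arguments are returned in one allocation owned by the caller.
    const argv = try std.process.argsAlloc(allocator);
    // Must be freed with `argsFree` and the same allocator.
    defer std.process.argsFree(allocator, argv);

    // argv[0] is the program name; the rest are user arguments.
    for (argv, 0..) |arg, i| {
        std.debug.print("argv[{d}] = {s}\n", .{ i, arg });
    }
}
```

Unlike the iterator, the returned slices stay valid until `argsFree` is called, which is convenient when arguments need to be indexed or revisited.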

std: promote tests to doctests Now these show up as "example usage" in generated documentation. 2024-03-13 15:56:09 -07:00			`test ArgIteratorWindows {`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`const t = testArgIteratorWindows;`

			`try t(`
			`\\"C:\Program Files\zig\zig.exe" run .\src\main.zig -target x86_64-windows-gnu -O ReleaseSafe -- --emoji=🗿 --eval="new Regex(\"Dwayne \\\"The Rock\\\" Johnson\")"`
			`, &.{`
			`\\C:\Program Files\zig\zig.exe`
			`,`
			`\\run`
			`,`
			`\\.\src\main.zig`
			`,`
			`\\-target`
			`,`
			`\\x86_64-windows-gnu`
			`,`
			`\\-O`
			`,`
			`\\ReleaseSafe`
			`,`
			`\\--`
			`,`
			`\\--emoji=🗿`
			`,`
			`\\--eval=new Regex("Dwayne \"The Rock\" Johnson")`
			`,`
			`});`

			`// Empty`
			`try t("", &.{});`

			`// Separators`
			`try t("aa bb cc", &.{ "aa", "bb", "cc" });`
			`try t("aa\tbb\tcc", &.{ "aa", "bb", "cc" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. https://github.com/ziglang/zig/pull/18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (https://github.com/rust-lang/rust/pull/87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file. 2024-04-15 02:09:19 -07:00			`try t("aa\nbb\ncc", &.{"aa\nbb\ncc"});`
			`try t("aa\r\nbb\r\ncc", &.{"aa\r\nbb\r\ncc"});`
			`try t("aa\rbb\rcc", &.{"aa\rbb\rcc"});`
			`try t("aa\x07bb\x07cc", &.{"aa\x07bb\x07cc"});`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t("aa\x7Fbb\x7Fcc", &.{"aa\x7Fbb\x7Fcc"});`
			`try t("aa🦎bb🦎cc", &.{"aa🦎bb🦎cc"});`

			`// Leading/trailing whitespace`
			`try t(" ", &.{""});`
			`try t(" aa bb ", &.{ "", "aa", "bb" });`
			`try t("\t\t", &.{""});`
			`try t("\t\taa\t\tbb\t\t", &.{ "", "aa", "bb" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t("\n\n", &.{"\n\n"});`
			`try t("\n\naa\n\nbb\n\n", &.{"\n\naa\n\nbb\n\n"});`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00
			`// Executable name with quotes/backslashes`
			`try t("\"aa bb\tcc\ndd\"", &.{"aa bb\tcc\ndd"});`
			`try t("\"", &.{""});`
			`try t("\"\"", &.{""});`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t("\"\"\"", &.{""});`
			`try t("\"\"\"\"", &.{""});`
			`try t("\"\"\"\"\"", &.{""});`
			`try t("aa\"bb\"cc\"dd", &.{"aabbccdd"});`
			`try t("aa\"bb cc\"dd", &.{"aabb ccdd"});`
			`try t("\"aa\\\"bb\"", &.{"aa\\bb"});`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t("\"aa\\\\\"", &.{"aa\\\\"});`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t("aa\\\"bb", &.{"aa\\bb"});`
			`try t("aa\\\\\"bb", &.{"aa\\\\bb"});`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00
			`// Arguments with quotes/backslashes`
			`try t(". \"aa bb\tcc\ndd\"", &.{ ".", "aa bb\tcc\ndd" });`
			`try t(". aa\" \"bb\"\t\"cc\"\n\"dd\"", &.{ ".", "aa bb\tcc\ndd" });`
			`try t(". ", &.{"."});`
			`try t(". \"", &.{ ".", "" });`
			`try t(". \"\"", &.{ ".", "" });`
			`try t(". \"\"\"", &.{ ".", "\"" });`
			`try t(". \"\"\"\"", &.{ ".", "\"" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t(". \"\"\"\"\"", &.{ ".", "\"\"" });`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t(". \"\"\"\"\"\"", &.{ ".", "\"\"" });`
			`try t(". \" \"", &.{ ".", " " });`
			`try t(". \" \"\"", &.{ ".", " \"" });`
			`try t(". \" \"\"\"", &.{ ".", " \"" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t(". \" \"\"\"\"", &.{ ".", " \"\"" });`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t(". \" \"\"\"\"\"", &.{ ".", " \"\"" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t(". \" \"\"\"\"\"\"", &.{ ".", " \"\"\"" });`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t(". \\\"", &.{ ".", "\"" });`
			`try t(". \\\"\"", &.{ ".", "\"" });`
			`try t(". \\\"\"\"", &.{ ".", "\"" });`
			`try t(". \\\"\"\"\"", &.{ ".", "\"\"" });`
			`try t(". \\\"\"\"\"\"", &.{ ".", "\"\"" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t(". \\\"\"\"\"\"\"", &.{ ".", "\"\"\"" });`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t(". \" \\\"", &.{ ".", " \"" });`
			`try t(". \" \\\"\"", &.{ ".", " \"" });`
			`try t(". \" \\\"\"\"", &.{ ".", " \"\"" });`
			`try t(". \" \\\"\"\"\"", &.{ ".", " \"\"" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00			`try t(". \" \\\"\"\"\"\"", &.{ ".", " \"\"\"" });`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`try t(". \" \\\"\"\"\"\"\"", &.{ ".", " \"\"\"" });`
			`try t(". aa\\bb\\\\cc\\\\\\dd", &.{ ".", "aa\\bb\\\\cc\\\\\\dd" });`
			`try t(". \\\\\\\"aa bb\"", &.{ ".", "\\\"aa", "bb" });`
			`try t(". \\\\\\\\\"aa bb\"", &.{ ".", "\\\\aa bb" });`
ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW 2024-04-15 02:09:19 -07:00
			`// From https://learn.microsoft.com/en-us/cpp/cpp/main-function-command-line-args#results-of-parsing-command-lines`
			`try t(`
			`\\foo.exe "abc" d e`
			`, &.{ "foo.exe", "abc", "d", "e" });`
			`try t(`
			`\\foo.exe a\\b d"e f"g h`
			`, &.{ "foo.exe", "a\\\\b", "de fg", "h" });`
			`try t(`
			`\\foo.exe a\\\"b c d`
			`, &.{ "foo.exe", "a\\\"b", "c", "d" });`
			`try t(`
			`\\foo.exe a\\\\"b c" d e`
			`, &.{ "foo.exe", "a\\\\b c", "d", "e" });`
			`try t(`
			`\\foo.exe a"b"" c d`
			`, &.{ "foo.exe", "ab\" c d" });`

			`// From https://daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULESEX`
			`try t("foo.exe CallMeIshmael", &.{ "foo.exe", "CallMeIshmael" });`
			`try t("foo.exe \"Call Me Ishmael\"", &.{ "foo.exe", "Call Me Ishmael" });`
			`try t("foo.exe Cal\"l Me I\"shmael", &.{ "foo.exe", "Call Me Ishmael" });`
			`try t("foo.exe CallMe\\\"Ishmael", &.{ "foo.exe", "CallMe\"Ishmael" });`
			`try t("foo.exe \"CallMe\\\"Ishmael\"", &.{ "foo.exe", "CallMe\"Ishmael" });`
			`try t("foo.exe \"Call Me Ishmael\\\\\"", &.{ "foo.exe", "Call Me Ishmael\\" });`
			`try t("foo.exe \"CallMe\\\\\\\"Ishmael\"", &.{ "foo.exe", "CallMe\\\"Ishmael" });`
			`try t("foo.exe a\\\\\\b", &.{ "foo.exe", "a\\\\\\b" });`
			`try t("foo.exe \"a\\\\\\b\"", &.{ "foo.exe", "a\\\\\\b" });`

			`// Surrogate pair encoding of 𐐷 separated by quotes.`
			`// Encoded as WTF-16:`
			`// "<0xD801>"<0xDC37>`
			`// Encoded as WTF-8:`
			`// "<0xED><0xA0><0x81>"<0xED><0xB0><0xB7>`
			`// During parsing, the quotes drop out and the surrogate pair`
			`// should end up encoded as its normal UTF-8 representation.`
			`try t("foo.exe \"\xed\xa0\x81\"\xed\xb0\xb7", &.{ "foo.exe", "𐐷" });`
Update `ArgIterator` on Windows to follow standard Windows parsing rules 2023-12-18 22:55:46 +01:00			`}`

			`fn testArgIteratorWindows(cmd_line: []const u8, expected_args: []const []const u8) !void {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const cmd_line_w = try unicode.wtf8ToWtf16LeAllocZ(testing.allocator, cmd_line);`
Update `ArgIterator` on Windows to follow standard Windows parsing rules This adds `ArgIteratorWindows`, which faithfully replicates the quoting and escaping behavior observed in `CommandLineToArgvW` and should make Zig applications play better with processes that abuse these quirks. 2023-12-18 22:55:46 +01:00			`defer testing.allocator.free(cmd_line_w);`

			`// next`
			`{`
			`var it = try ArgIteratorWindows.init(testing.allocator, cmd_line_w);`
			`defer it.deinit();`

			`for (expected_args) |expected| {`
			`if (it.next()) |actual| {`
			`try testing.expectEqualStrings(expected, actual);`
			`} else {`
			`return error.TestUnexpectedResult;`
			`}`
			`}`
			`try testing.expect(it.next() == null);`
			`}`

			`// skip`
			`{`
			`var it = try ArgIteratorWindows.init(testing.allocator, cmd_line_w);`
			`defer it.deinit();`

			`for (0..expected_args.len) |_| {`
			`try testing.expect(it.skip());`
			`}`
			`try testing.expect(!it.skip());`
			`}`
			`}`

Full response file (.rsp) support I hit the "quotes in an RSP file" issue when trying to compile gRPC using "zig cc". As a fun exercise, I decided to see if I could fix it myself. I'm fully open to this code being flat-out rejected. Or I can take feedback to fix it up. This modifies (and renames) _ArgIteratorWindows_ in process.zig such that it works with arbitrary strings (or the contents of an RSP file). In main.zig, this new _ArgIteratorGeneral_ is used to address the "TODO" listed in _ClangArgIterator_. This change closes #4833. Pros:* - It has the nice attribute of handling "RSP file" arguments in the same way it handles "cmd_line" arguments. - High Performance, minimal allocations - Fixed bug in previous _ArgIteratorWindows_, where final trailing backslashes in a command line were entirely dropped - Added a test case for the above bug - Harmonized the _ArgIteratorXxxx._initWithAllocator()_ and _next()_ interface across Windows/Posix/Wasi (Moved Windows errors to _initWithAllocator()_ rather than _next()_) - Likely perf benefit on Windows by doing _utf16leToUtf8AllocZ()_ only once for the entire cmd_line Cons: - Breaking Change in std library on Windows: Call _ArgIterator.initWithAllocator()_ instead of _ArgIterator.init()_ - PhaseMage is new with contributions to Zig, might need a lot of hand-holding - PhaseMage is a Windows person, non-Windows stuff will need to be double-checked Testing Done: - Wrote a few new test cases in process.zig - zig.exe build test -Dskip-release (no new failures seen) - zig cc now builds gRPC without error 2022-01-30 11:27:52 -08:00			`test "general arg parsing" {`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`try testGeneralCmdLine("a b\tc d", &.{ "a", "b", "c", "d" });`
			`try testGeneralCmdLine("\"abc\" d e", &.{ "abc", "d", "e" });`
			`try testGeneralCmdLine("a\\\\\\b d\"e f\"g h", &.{ "a\\\\\\b", "de fg", "h" });`
			`try testGeneralCmdLine("a\\\\\\\"b c d", &.{ "a\\\"b", "c", "d" });`
			`try testGeneralCmdLine("a\\\\\\\\\"b c\" d e", &.{ "a\\\\b c", "d", "e" });`
			`try testGeneralCmdLine("a b\tc \"d f", &.{ "a", "b", "c", "d f" });`
			`try testGeneralCmdLine("j k l\\", &.{ "j", "k", "l\\" });`
			`try testGeneralCmdLine("\"\" x y z\\\\", &.{ "", "x", "y", "z\\\\" });`

			`try testGeneralCmdLine("\".\\..\\zig-cache\\build\" \"bin\\zig.exe\" \".\\..\" \".\\..\\zig-cache\" \"--help\"", &.{`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`".\\..\\zig-cache\\build",`
			`"bin\\zig.exe",`
			`".\\..",`
			`".\\..\\zig-cache",`
			`"--help",`
			`});`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00
			`try testGeneralCmdLine(`
			`\\ 'foo' "bar"`
			`, &.{ "'foo'", "bar" });`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`

Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`fn testGeneralCmdLine(input_cmd_line: []const u8, expected_args: []const []const u8) !void {`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`var it = try ArgIteratorGeneral(.{}).init(std.testing.allocator, input_cmd_line);`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`defer it.deinit();`
			`for (expected_args) \|expected_arg\| {`
			`const arg = it.next().?;`
			`try testing.expectEqualStrings(expected_arg, arg);`
			`}`
			`try testing.expect(it.next() == null);`
			`}`

			`test "response file arg parsing" {`
			`try testResponseFileCmdLine(`
			`\\a b`
			`\\c d\`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`, &.{ "a", "b", "c", "d\\" });`
			`try testResponseFileCmdLine("a b c d\\", &.{ "a", "b", "c", "d\\" });`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00
			`try testResponseFileCmdLine(`
			`\\j`
			`\\ k l # this is a comment \\ \\\ \\\\ "none" "\\" "\\\"`
			`\\ "m" #another comment`
			`\\`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`, &.{ "j", "k", "l", "m" });`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00
			`try testResponseFileCmdLine(`
			`\\ "" q ""`
			`\\ "r s # t" "u\" v" #another comment`
			`\\`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`, &.{ "", "q", "", "r s # t", "u\" v" });`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00
			`try testResponseFileCmdLine(`
			`\\ -l"advapi32" a# b#c d#`
			`\\e\\\`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`, &.{ "-ladvapi32", "a#", "b#c", "d#", "e\\\\\\" });`

			`try testResponseFileCmdLine(`
			`\\ 'foo' "bar"`
			`, &.{ "foo", "bar" });`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`}`

			`fn testResponseFileCmdLine(input_cmd_line: []const u8, expected_args: []const []const u8) !void {`
std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-04 19:55:32 +02:00			`var it = try ArgIteratorGeneral(.{ .comments = true, .single_quotes = true })`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`.init(std.testing.allocator, input_cmd_line);`
			`defer it.deinit();`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`for (expected_args) \|expected_arg\| {`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`const arg = it.next().?;`
std: update usage of std.testing 2021-05-04 20:47:26 +03:00			`try testing.expectEqualStrings(expected_arg, arg);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
Full response file (.rsp) support 2022-01-30 11:27:52 -08:00			`try testing.expect(it.next() == null);`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
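
The response-file tests above can be condensed into a standalone usage sketch. It assumes the public name `std.process.ArgIteratorGeneral` and the `.comments` / `.single_quotes` options exactly as `testResponseFileCmdLine` uses them; the input string is made up for illustration.

```zig
const std = @import("std");

test "tokenize response-file text (illustrative)" {
    // Same options as testResponseFileCmdLine: `#` starts a comment and
    // single quotes delimit an argument.
    const Iter = std.process.ArgIteratorGeneral(.{ .comments = true, .single_quotes = true });

    // Hypothetical response-file contents, not taken from the real test suite.
    var it = try Iter.init(std.testing.allocator, "'foo' \"bar\" # ignored\nbaz");
    defer it.deinit();

    try std.testing.expectEqualStrings("foo", it.next().?);
    try std.testing.expectEqualStrings("bar", it.next().?);
    try std.testing.expectEqualStrings("baz", it.next().?);
    try std.testing.expect(it.next() == null);
}
```

With the default options (`.{}`), as in `testGeneralCmdLine`, the single quotes would be kept as part of the argument instead.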

			`pub const UserInfo = struct {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`uid: posix.uid_t,`
			`gid: posix.gid_t,`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`};`

			`/// POSIX function which gets the uid and gid for a username.`
			`pub fn getUserInfo(name: []const u8) !UserInfo {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return switch (native_os) {`
Add illumos OS tag - Adds `illumos` to the `Target.Os.Tag` enum. A new function, `isSolarish` has been added that returns true if the tag is either Solaris or Illumos. This matches the naming convention found in Rust's `libc` crate[1]. - Add the tag wherever `.solaris` is being checked against. - Check for the C pre-processor macro `__illumos__` in CMake to set the proper target tuple. Illumos distros patch their compilers to have this in the "built-in" set (verified with `echo \| cc -dM -E -`). Alternatively you could check the output of `uname -o`. Right now, both Solaris and Illumos import from `c/solaris.zig`. In the future it may be worth putting the shared ABI bits in a base file, and mixing that in with specific `c/solaris.zig`/`c/illumos.zig` files. [1]: https://github.com/rust-lang/libc/tree/6e02a329a2a27f6887ea86952f389ca11e06448c/src/unix/solarish 2023-10-01 23:09:14 +11:00			`.linux,`
represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct OS target. Earlier, when DriverKit support was added to LLVM, it was represented a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in the target triple is beyond me. But this isn't the first time they've ignored established target triple norms (see: armv7k and aarch64_32) and it probably won't be the last. While doing this, I also audited all Darwin OS prongs throughout the codebase and made sure they cover all the tags. 2025-11-13 18:05:46 +01:00			`.driverkit,`
			`.ios,`
			`.maccatalyst,`
Add illumos OS tag 2023-10-01 23:09:14 +11:00			`.macos,`
			`.tvos,`
represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-13 18:05:46 +01:00			`.visionos,`
			`.watchos,`
Add illumos OS tag 2023-10-01 23:09:14 +11:00			`.freebsd,`
			`.netbsd,`
			`.openbsd,`
			`.haiku,`
			`.illumos,`
std: Add support for SerenityOS in various places Not nearly the entire downstream patchset but these are completely uncontroversial and known to work. 2025-03-10 22:48:21 +00:00			`.serenity,`
Add illumos OS tag 2023-10-01 23:09:14 +11:00			`=> posixGetUserInfo(name),`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`else => @compileError("Unsupported OS"),`
			`};`
			`}`
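
A caller-side sketch of `getUserInfo`, assuming a target listed in the switch above (any other OS hits the `@compileError`). The user name `"root"` is only an example.

```zig
const std = @import("std");

pub fn main() void {
    // "root" is an example name; any user present in /etc/passwd works.
    const info = std.process.getUserInfo("root") catch |err| {
        std.debug.print("lookup failed: {s}\n", .{@errorName(err)});
        return;
    };
    std.debug.print("uid={d} gid={d}\n", .{ info.uid, info.gid });
}
```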

			`/// TODO this reads /etc/passwd. But sometimes the user/id mapping is in something else`
			/// like NIS, AD, etc. See `man nss` or look at an strace for `id myuser`.
			`pub fn posixGetUserInfo(name: []const u8) !UserInfo {`
std: fix bitrot in process.posixGetUserInfo() 2020-09-10 13:36:34 +02:00			`const file = try std.fs.openFileAbsolute("/etc/passwd", .{});`
			`defer file.close();`
std.Io: delete GenericReader and delete deprecated alias std.io 2025-08-27 21:20:18 -07:00			`var buffer: [4096]u8 = undefined;`
			`var file_reader = file.reader(&buffer);`
			`return posixGetUserInfoPasswdStream(name, &file_reader.interface) catch \|err\| switch (err) {`
			`error.ReadFailed => return file_reader.err.?,`
			`error.EndOfStream => return error.UserNotFound,`
			`error.CorruptPasswordFile => return error.CorruptPasswordFile,`
			`};`
			`}`
std: fix bitrot in process.posixGetUserInfo() 2020-09-10 13:36:34 +02:00
std.Io: delete GenericReader and delete deprecated alias std.io 2025-08-27 21:20:18 -07:00			`fn posixGetUserInfoPasswdStream(name: []const u8, reader: *std.Io.Reader) !UserInfo {`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`const State = enum {`
std.Io: delete GenericReader and delete deprecated alias std.io 2025-08-27 21:20:18 -07:00			`start,`
			`wait_for_next_line,`
			`skip_password,`
			`read_user_id,`
			`read_group_id,`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`};`

			`var name_index: usize = 0;`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`var uid: posix.uid_t = 0;`
			`var gid: posix.gid_t = 0;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00
std.Io: delete GenericReader and delete deprecated alias std.io 2025-08-27 21:20:18 -07:00			`sw: switch (State.start) {`
			`.start => switch (try reader.takeByte()) {`
			`':' => {`
			`if (name_index == name.len) {`
			`continue :sw .skip_password;`
			`} else {`
			`continue :sw .wait_for_next_line;`
			`}`
			`},`
			`'\n' => return error.CorruptPasswordFile,`
			`else => \|byte\| {`
			`if (name_index == name.len or name[name_index] != byte) {`
			`continue :sw .wait_for_next_line;`
			`}`
			`name_index += 1;`
			`continue :sw .start;`
			`},`
			`},`
			`.wait_for_next_line => switch (try reader.takeByte()) {`
			`'\n' => {`
			`name_index = 0;`
			`continue :sw .start;`
			`},`
			`else => continue :sw .wait_for_next_line,`
			`},`
			`.skip_password => switch (try reader.takeByte()) {`
			`'\n' => return error.CorruptPasswordFile,`
			`':' => {`
			`continue :sw .read_user_id;`
			`},`
			`else => continue :sw .skip_password,`
			`},`
			`.read_user_id => switch (try reader.takeByte()) {`
			`':' => {`
			`continue :sw .read_group_id;`
			`},`
			`'\n' => return error.CorruptPasswordFile,`
			`else => \|byte\| {`
			`const digit = switch (byte) {`
			`'0'...'9' => byte - '0',`
			`else => return error.CorruptPasswordFile,`
			`};`
			`{`
			`const ov = @mulWithOverflow(uid, 10);`
			`if (ov[1] != 0) return error.CorruptPasswordFile;`
			`uid = ov[0];`
			`}`
			`{`
			`const ov = @addWithOverflow(uid, digit);`
			`if (ov[1] != 0) return error.CorruptPasswordFile;`
			`uid = ov[0];`
			`}`
			`continue :sw .read_user_id;`
			`},`
			`},`
			`.read_group_id => switch (try reader.takeByte()) {`
			`'\n', ':' => return .{`
			`.uid = uid,`
			`.gid = gid,`
			`},`
			`else => \|byte\| {`
			`const digit = switch (byte) {`
			`'0'...'9' => byte - '0',`
			`else => return error.CorruptPasswordFile,`
			`};`
			`{`
			`const ov = @mulWithOverflow(gid, 10);`
			`if (ov[1] != 0) return error.CorruptPasswordFile;`
			`gid = ov[0];`
			`}`
			`{`
			`const ov = @addWithOverflow(gid, digit);`
			`if (ov[1] != 0) return error.CorruptPasswordFile;`
			`gid = ov[0];`
			`}`
			`continue :sw .read_group_id;`
			`},`
			`},`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
std.Io: delete GenericReader and delete deprecated alias std.io 2025-08-27 21:20:18 -07:00			`comptime unreachable;`
do Jay's suggestion with posix/os API naming & layout 2019-05-24 18:27:18 -04:00			`}`
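
For readers who want the field layout without the state machine, here is a simplified sketch of the same lookup over an in-memory buffer. It is not the upstream implementation: `Ids` and `lookupIds` are hypothetical names, ids are assumed to fit in `u32`, and it swaps the byte-at-a-time parser for `std.mem.splitScalar` and `std.fmt.parseInt`. As a result it rejects empty id fields and accepts some inputs the state machine would reject, such as a sign prefix.

```zig
const std = @import("std");

// Hypothetical stand-in for UserInfo; u32 ids are an assumption.
const Ids = struct { uid: u32, gid: u32 };

/// Looks up `name` in passwd-formatted text: name:password:uid:gid:...
fn lookupIds(passwd: []const u8, name: []const u8) !Ids {
    var lines = std.mem.splitScalar(u8, passwd, '\n');
    while (lines.next()) |line| {
        var fields = std.mem.splitScalar(u8, line, ':');
        const user = fields.next() orelse continue;
        if (!std.mem.eql(u8, user, name)) continue;

        // Skip the password field, then read the two numeric fields.
        _ = fields.next() orelse return error.CorruptPasswordFile;
        const uid_text = fields.next() orelse return error.CorruptPasswordFile;
        const gid_text = fields.next() orelse return error.CorruptPasswordFile;
        return .{
            .uid = std.fmt.parseInt(u32, uid_text, 10) catch return error.CorruptPasswordFile,
            .gid = std.fmt.parseInt(u32, gid_text, 10) catch return error.CorruptPasswordFile,
        };
    }
    return error.UserNotFound;
}

test "lookupIds finds the matching entry" {
    const passwd = "root:x:0:0:root:/root:/bin/sh\nalice:x:1000:1001::/home/alice:/bin/sh\n";
    const ids = try lookupIds(passwd, "alice");
    try std.testing.expectEqual(@as(u32, 1000), ids.uid);
    try std.testing.expectEqual(@as(u32, 1001), ids.gid);
    try std.testing.expectError(error.UserNotFound, lookupIds(passwd, "bob"));
}
```

The streaming version above avoids holding the whole file in memory, which this sketch does not attempt.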
clean up references to os 2019-05-26 13:17:34 -04:00
			`pub fn getBaseAddress() usize {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`switch (native_os) {`
clean up references to os 2019-05-26 13:17:34 -04:00			`.linux => {`
posix: reduce the number of assumptions made by `dl_iterate_phdr` Not yet fully compatible with the new linker, but still progress. Closes #25786 2025-11-08 23:03:10 -05:00			`const phdrs = std.posix.getSelfPhdrs();`
			`var base: usize = 0;`
			`for (phdrs) \|phdr\| switch (phdr.type) {`
			`.LOAD => return base + phdr.vaddr,`
			`.PHDR => base = @intFromPtr(phdrs.ptr) - phdr.vaddr,`
			`else => {},`
			`} else unreachable;`
clean up references to os 2019-05-26 13:17:34 -04:00			`},`
represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-13 18:05:46 +01:00			`.driverkit, .ios, .maccatalyst, .macos, .tvos, .visionos, .watchos => {`
all: zig fmt and rename "@XToY" to "@YFromX" Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me> 2023-06-15 13:14:16 +06:00			`return @intFromPtr(&std.c._mh_execute_header);`
clean up references to os 2019-05-26 13:17:34 -04:00			`},`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`.windows => return @intFromPtr(windows.kernel32.GetModuleHandleW(null)),`
clean up references to os 2019-05-26 13:17:34 -04:00			`else => @compileError("Unsupported OS"),`
			`}`
			`}`
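
A minimal caller-side sketch of `getBaseAddress`, assuming one of the targets handled above (Linux, the Darwin family, or Windows); other targets hit the `@compileError`.

```zig
const std = @import("std");

pub fn main() void {
    // Prints the address reported for the running executable image.
    std.debug.print("base address: 0x{x}\n", .{std.process.getBaseAddress()});
}
```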
self-host dynamic linker detection 2020-02-17 15:23:59 -05:00
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			/// Tells whether calling the `execv` or `execve` functions will be a compile error.
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`pub const can_execv = switch (native_os) {`
Avoid depending on child process execution when not supported by host OS In accordance with the requesting issue (#10750): - `zig test` skips any tests that it cannot spawn, returning success - `zig run` and `zig build` exit with failure, reporting the command the cannot be run - `zig clang`, `zig ar`, etc. already punt directly to the appropriate clang/lld main(), even before this change - Native `libc` Detection is not supported Additionally, `exec()` and related Builder functions error at run-time, reporting the command that cannot be run 2022-02-03 15:27:01 -07:00			`.windows, .haiku, .wasi => false,`
			`else => true,`
			`};`

std: Convert deprecated aliases to compile errors and fix usages Deprecated aliases that are now compile errors: - `std.fs.MAX_PATH_BYTES` (renamed to `std.fs.max_path_bytes`) - `std.mem.tokenize` (split into `tokenizeAny`, `tokenizeSequence`, `tokenizeScalar`) - `std.mem.split` (split into `splitSequence`, `splitAny`, `splitScalar`) - `std.mem.splitBackwards` (split into `splitBackwardsSequence`, `splitBackwardsAny`, `splitBackwardsScalar`) - `std.unicode` + `utf16leToUtf8Alloc`, `utf16leToUtf8AllocZ`, `utf16leToUtf8`, `fmtUtf16le` (all renamed to have capitalized `Le`) + `utf8ToUtf16LeWithNull` (renamed to `utf8ToUtf16LeAllocZ`) - `std.zig.CrossTarget` (moved to `std.Target.Query`) Deprecated `lib/std/std.zig` decls were deleted instead of made a `@compileError` because the `refAllDecls` in the test block would trigger the `@compileError`. The deleted top-level `std` namespaces are: - `std.rand` (renamed to `std.Random`) - `std.TailQueue` (renamed to `std.DoublyLinkedList`) - `std.ChildProcess` (renamed/moved to `std.process.Child`) This is not exhaustive. Deprecated aliases that I didn't touch: + `std.io.` + `std.Build.` + `std.builtin.Mode` + `std.zig.c_translation.CIntLiteralRadix` + anything in `src/` 2024-05-02 20:20:41 -07:00			`/// Tells whether spawning child processes is supported (e.g. via Child)`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`pub const can_spawn = switch (native_os) {`
represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-13 18:05:46 +01:00			`.wasi, .ios, .tvos, .visionos, .watchos => false,`
avoid usage of execv on Haiku 2021-05-22 00:56:30 -05:00			`else => true,`
			`};`
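
Both constants are known at compile time, so callers can branch on them without triggering the compile errors they guard against. A small sketch; `describeSupport` is a hypothetical helper, not part of `std.process`.

```zig
const std = @import("std");

/// Hypothetical helper: summarizes what the compilation target supports.
fn describeSupport() []const u8 {
    if (std.process.can_spawn and std.process.can_execv) return "spawn and execv";
    if (std.process.can_spawn) return "spawn only";
    if (std.process.can_execv) return "execv only";
    return "neither";
}

pub fn main() void {
    std.debug.print("process support: {s}\n", .{describeSupport()});
}
```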
std: do not call malloc() between fork() and execv() 2020-12-26 13:50:26 -07:00
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`pub const ExecvError = std.posix.ExecveError \|\| error{OutOfMemory};`
std: do not call malloc() between fork() and execv() 2020-12-26 13:50:26 -07:00
			`/// Replaces the current process image with the executed process.`
			`/// This function must allocate memory to add a null terminating byte to the path and each arg.`
			`/// It must also convert to KEY=VALUE\0 format for environment variables, and include null`
			`/// pointers after the args and after the environment variables.`
			/// `argv[0]` is the executable path.
			`/// This function also uses the PATH environment variable to get the full path to the executable.`
			`/// Due to the heap-allocation, it is illegal to call this function in a fork() child.`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			/// For that use case, use the `std.posix` functions directly.
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`pub fn execv(allocator: Allocator, argv: []const []const u8) ExecvError {`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`return execve(allocator, argv, null);`
			`}`

			`/// Replaces the current process image with the executed process.`
			`/// This function must allocate memory to add a null terminating byte to the path and each arg.`
			`/// It must also convert to KEY=VALUE\0 format for environment variables, and include null`
			`/// pointers after the args and after the environment variables.`
			/// `argv[0]` is the executable path.
			`/// This function also uses the PATH environment variable to get the full path to the executable.`
			`/// Due to the heap-allocation, it is illegal to call this function in a fork() child.`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			/// For that use case, use the `std.posix` functions directly.
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`pub fn execve(`
std: clean up imports in a couple files 2022-11-10 14:00:55 -07:00			`allocator: Allocator,`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`argv: []const []const u8,`
Update usages of `process.getEnvMap` and change BufMap -> EnvMap where applicable # Conflicts: # lib/std/build/RunStep.zig 2022-02-06 23:52:08 -07:00			`env_map: ?*const EnvMap,`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`) ExecvError {`
			`if (!can_execv) @compileError("The target OS does not support execv");`

			`var arena_allocator = std.heap.ArenaAllocator.init(allocator);`
			`defer arena_allocator.deinit();`
allocgate: renamed getAllocator function to allocator 2021-10-29 02:08:41 +01:00			`const arena = arena_allocator.allocator();`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00
process: add more missing const 2023-06-01 21:03:33 -04:00			`const argv_buf = try arena.allocSentinel(?[*:0]const u8, argv.len, null);`
update std lib and compiler sources to new for loop syntax 2023-02-18 09:02:57 -07:00			`for (argv, 0..) |arg, i| argv_buf[i] = (try arena.dupeZ(u8, arg)).ptr;`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00
			`const envp = m: {`
			`if (env_map) |m| {`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const envp_buf = try createNullDelimitedEnvMap(arena, m);`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`break :m envp_buf.ptr;`
migrate from `std.Target.current` to `@import("builtin").target` closes #9388 closes #9321 2021-10-04 23:47:27 -07:00			`} else if (builtin.link_libc) {`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`break :m std.c.environ;`
migrate from `std.Target.current` to `@import("builtin").target` closes #9388 closes #9321 2021-10-04 23:47:27 -07:00			`} else if (builtin.output_mode == .Exe) {`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`// Then we have Zig start code and this works.`
			// TODO type-safety for null-termination of `os.environ`.
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`break :m @as([:null]const ?[:0]const u8, @ptrCast(std.os.environ.ptr));`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`} else {`
			`// TODO come up with a solution for this.`
			`@compileError("missing std lib enhancement: std.process.execv implementation has no way to collect the environment variables to forward to the child process");`
			`}`
			`};`

extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`return posix.execvpeZ_expandArg0(.no_expand, argv_buf.ptr[0].?, argv_buf.ptr, envp);`
std: do not call malloc() between fork() and execv() We were violating the POSIX standard which resulted in a deadlock on musl v1.1.24 on aarch64 alpine linux, uncovered with the new ThreadPool usage in the stage2 compiler. std.os execv functions that accept an Allocator parameter are removed because they are footguns. The POSIX standard does not allow calls to malloc() between fork() and execv() and since it is common to both (1) call execv() after fork() and (2) use std.heap.c_allocator, Programmers are encouraged to go through the `std.process` API instead, causing some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments. 2020-12-26 13:50:26 -07:00			`}`
add std.process.totalSystemMemory 2023-03-06 00:19:32 -07:00
			`pub const TotalSystemMemoryError = error{`
			`UnknownTotalSystemMemory,`
			`};`

std.process: return u64 in totalSystemMemory 2024-01-21 22:16:22 -08:00			`/// Returns the total system memory, in bytes as a u64.`
			`/// We return a u64 instead of usize due to PAE on ARM`
			`/// and Linux's /proc/meminfo reporting more memory when`
			`/// using QEMU user mode emulation.`
			`pub fn totalSystemMemory() TotalSystemMemoryError!u64 {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`switch (native_os) {`
add std.process.totalSystemMemory 2023-03-06 00:19:32 -07:00			`.linux => {`
std: add os.linux.sysinfo(), use it for process.totalSystemMemory() Co-authored-by: Alex Rønne Petersen <alex@alexrp.com> 2025-04-11 13:28:22 -04:00			`var info: std.os.linux.Sysinfo = undefined;`
			`const result: usize = std.os.linux.sysinfo(&info);`
system specific errno 2025-11-14 23:16:55 +01:00			`if (std.os.linux.errno(result) != .SUCCESS) {`
std: add os.linux.sysinfo(), use it for process.totalSystemMemory() Co-authored-by: Alex Rønne Petersen <alex@alexrp.com> 2025-04-11 13:28:22 -04:00			`return error.UnknownTotalSystemMemory;`
			`}`
process.totalSystemMemory: Avoid overflow on Linux when totalram is a 32-bit usize Fixes #25038 2025-08-27 22:16:21 -07:00			`// Promote to u64 to avoid overflow on systems where info.totalram is a 32-bit usize`
			`return @as(u64, info.totalram) * info.mem_unit;`
add std.process.totalSystemMemory 2023-03-06 00:19:32 -07:00			`},`
Revert "std.process: further totalSystemMemory portage" This reverts commit 5c70d7bc723a8e0e47018d3606285005c280ddb8. 2023-07-31 11:20:21 -07:00			`.freebsd => {`
process: totalSystemMemory freebsd portage 2023-04-21 23:23:14 +01:00			`var physmem: c_ulong = undefined;`
			`var len: usize = @sizeOf(c_ulong);`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`posix.sysctlbynameZ("hw.physmem", &physmem, &len, null, 0) catch |err| switch (err) {`
drop NameTooLong from sysctlbynameZ error set (#24909) 2025-08-21 11:36:57 +01:00			`error.UnknownName => unreachable,`
std.process.totalSystemMemory: return correct error type on FreeBSD 2023-07-31 22:10:42 -07:00			`else => return error.UnknownTotalSystemMemory,`
process: totalSystemMemory freebsd portage 2023-04-21 23:23:14 +01:00			`};`
add macOS handling for totalSystemMemory (#24903) * add macos handling for totalSystemMemory * fix return type cast for .freebsd in totalSystemMemory * add handling for the whole Darwin family in totalSystemMemory 2025-08-25 20:25:53 +01:00			`return @as(u64, @intCast(physmem));`
			`},`
			`// whole Darwin family`
represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct OS target. Earlier, when DriverKit support was added to LLVM, it was represented a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in the target triple is beyond me. But this isn't the first time they've ignored established target triple norms (see: armv7k and aarch64_32) and it probably won't be the last. While doing this, I also audited all Darwin OS prongs throughout the codebase and made sure they cover all the tags. 2025-11-13 18:05:46 +01:00			`.driverkit, .ios, .maccatalyst, .macos, .tvos, .visionos, .watchos => {`
add macOS handling for totalSystemMemory (#24903) * add macos handling for totalSystemMemory * fix return type cast for .freebsd in totalSystemMemory * add handling for the whole Darwin family in totalSystemMemory 2025-08-25 20:25:53 +01:00			`// "hw.memsize" returns uint64_t`
			`var physmem: u64 = undefined;`
			`var len: usize = @sizeOf(u64);`
			`posix.sysctlbynameZ("hw.memsize", &physmem, &len, null, 0) catch |err| switch (err) {`
			`error.PermissionDenied => unreachable, // only when setting values,`
			`error.SystemResources => unreachable, // memory already on the stack`
			`error.UnknownName => unreachable, // constant, known good value`
			`else => return error.UnknownTotalSystemMemory,`
			`};`
			`return physmem;`
process: totalSystemMemory freebsd portage 2023-04-21 23:23:14 +01:00			`},`
openbsd: fix std.c.getdents and debitrot - fix getdents return type usize → c_int - special-case process.zig to use sysctl instead of sysctlbyname - use struct/field pattern for sysctl HW_* constants 2023-06-15 14:48:20 -04:00			`.openbsd => {`
			`const mib: [2]c_int = [_]c_int{`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`posix.CTL.HW,`
			`posix.HW.PHYSMEM64,`
openbsd: fix std.c.getdents and debitrot - fix getdents return type usize → c_int - special-case process.zig to use sysctl instead of sysctlbyname - use struct/field pattern for sysctl HW_* constants 2023-06-15 14:48:20 -04:00			`};`
			`var physmem: i64 = undefined;`
			`var len: usize = @sizeOf(@TypeOf(physmem));`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`posix.sysctl(&mib, &physmem, &len, null, 0) catch |err| switch (err) {`
openbsd: fix std.c.getdents and debitrot - fix getdents return type usize → c_int - special-case process.zig to use sysctl instead of sysctlbyname - use struct/field pattern for sysctl HW_* constants 2023-06-15 14:48:20 -04:00			`error.NameTooLong => unreachable, // constant, known good value`
			`error.PermissionDenied => unreachable, // only when setting values,`
			`error.SystemResources => unreachable, // memory already on the stack`
			`error.UnknownName => unreachable, // constant, known good value`
			`else => return error.UnknownTotalSystemMemory,`
			`};`
			`assert(physmem >= 0);`
std.process: return u64 in totalSystemMemory 2024-01-21 22:16:22 -08:00			`return @as(u64, @bitCast(physmem));`
openbsd: fix std.c.getdents and debitrot - fix getdents return type usize → c_int - special-case process.zig to use sysctl instead of sysctlbyname - use struct/field pattern for sysctl HW_* constants 2023-06-15 14:48:20 -04:00			`},`
add std.process.totalSystemMemory 2023-03-06 00:19:32 -07:00			`.windows => {`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`var sbi: windows.SYSTEM_BASIC_INFORMATION = undefined;`
			`const rc = windows.ntdll.NtQuerySystemInformation(`
windows: replace GetPhysicallyInstalledSystemMemory with ntdll. `GetPhysicallyInstalledSystemMemory` uses SMBios to grab the physical memory size which can lead to unecessary allocation and inacurate representation of the total memory. Using `System_Basic_Information` help to retrieve the physical memory which is not reserved for the kernel/tables. This aligns better with the linux side as `/proc/meminfo` does the same thing. 2023-04-12 18:22:07 -05:00			`.SystemBasicInformation,`
			`&sbi,`
extract std.posix from std.os closes #5019 2024-03-18 22:39:59 -07:00			`@sizeOf(windows.SYSTEM_BASIC_INFORMATION),`
windows: replace GetPhysicallyInstalledSystemMemory with ntdll. `GetPhysicallyInstalledSystemMemory` uses SMBios to grab the physical memory size which can lead to unecessary allocation and inacurate representation of the total memory. Using `System_Basic_Information` help to retrieve the physical memory which is not reserved for the kernel/tables. This aligns better with the linux side as `/proc/meminfo` does the same thing. 2023-04-12 18:22:07 -05:00			`null,`
			`);`
			`if (rc != .SUCCESS) {`
Fix crash on some Windows machines 2023-04-04 18:08:02 +02:00			`return error.UnknownTotalSystemMemory;`
windows: replace GetPhysicallyInstalledSystemMemory with ntdll. `GetPhysicallyInstalledSystemMemory` uses SMBios to grab the physical memory size which can lead to unecessary allocation and inacurate representation of the total memory. Using `System_Basic_Information` help to retrieve the physical memory which is not reserved for the kernel/tables. This aligns better with the linux side as `/proc/meminfo` does the same thing. 2023-04-12 18:22:07 -05:00			`}`
std.process: return u64 in totalSystemMemory 2024-01-21 22:16:22 -08:00			`return @as(u64, sbi.NumberOfPhysicalPages) * sbi.PageSize;`
add std.process.totalSystemMemory 2023-03-06 00:19:32 -07:00			`},`
			`else => return error.UnknownTotalSystemMemory,`
			`}`
			`}`
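The Linux prong widens `info.totalram` to `u64` before multiplying by `info.mem_unit`, because on a 32-bit target the product of two 32-bit values can exceed the 32-bit range. The following is a minimal sketch of that idea only; `scaleToBytes` is a hypothetical helper for illustration, not part of `std.process`, and the `u32` parameter types are assumed to stand in for a 32-bit `usize`.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std.process): converts a unit count
/// into bytes, widening to u64 *before* the multiplication.
fn scaleToBytes(total_units: u32, unit_size: u32) u64 {
    // The u32 operand on the right is widened to match the u64 on the left.
    return @as(u64, total_units) * unit_size;
}

test "widening before the multiply avoids 32-bit overflow" {
    // 2_000_000 units of 4096 bytes is roughly 8.2 GB, beyond the u32 range,
    // so a checked 32-bit multiply reports overflow.
    try std.testing.expectError(error.Overflow, std.math.mul(u32, 2_000_000, 4096));
    // The widened multiply yields the full value.
    try std.testing.expectEqual(@as(u64, 8_192_000_000), scaleToBytes(2_000_000, 4096));
}
```

Casting only the result would be too late, since the overflow would already have happened in the narrower type.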

add std.process.cleanExit 2023-03-12 00:34:11 -07:00			`/// Indicate that we are now terminating with a successful exit code.`
			`/// In debug builds, this is a no-op, so that the calling code's`
			`/// cleanup mechanisms are tested and so that external tools that`
			`/// check for resource leaks can be accurate. In release builds, this`
			`/// calls exit(0), and does not return.`
			`pub fn cleanExit() void {`
			`if (builtin.mode == .Debug) {`
			`return;`
			`} else {`
std.process.cleanExit: lock stderr before exiting This makes it so that any other threads which are writing to stderr have a chance to finish before the process terminates. It also clears the terminal in case any progress has been written to stderr, while still accomplishing the goal of not waiting until the update thread exits. 2024-05-27 10:49:26 -07:00			`std.debug.lockStdErr();`
add std.process.cleanExit 2023-03-12 00:34:11 -07:00			`exit(0);`
			`}`
			`}`
introduce std.process.raiseFileDescriptorLimit 2024-05-03 18:10:33 -07:00
			`/// Raise the open file descriptor limit.`
			`///`
			`/// On some systems, this raises the limit before seeing ProcessFdQuotaExceeded`
			`/// errors. On other systems, this does nothing.`
			`pub fn raiseFileDescriptorLimit() void {`
std.c reorganization It is now composed of these main sections: * Declarations that are shared among all operating systems. * Declarations that have the same name, but different type signatures depending on the operating system. Often multiple operating systems share the same type signatures however. * Declarations that are specific to a single operating system. - These are imported one per line so you can see where they come from, protected by a comptime block to prevent accessing the wrong one. Closes #19352 by changing the convention to making types `void` and functions `{}`, so that it becomes possible to update `@hasDecl` sites to use `@TypeOf(f) != void` or `T != void`. Happily, this ended up removing some duplicate logic and update some bitrotted feature detection checks. A handful of types have been modified to gain namespacing and type safety. This is a breaking change. Oh, and the last usage of `usingnamespace` site is eliminated. 2024-07-18 23:35:19 -07:00			`const have_rlimit = posix.rlimit_resource != void;`
introduce std.process.raiseFileDescriptorLimit 2024-05-03 18:10:33 -07:00			`if (!have_rlimit) return;`

			`var lim = posix.getrlimit(.NOFILE) catch return; // Oh well; we tried.`
			`if (native_os.isDarwin()) {`
			// On Darwin, `NOFILE` is bounded by a hardcoded value `OPEN_MAX`.
			`// According to the man pages for setrlimit():`
			`// setrlimit() now returns with errno set to EINVAL in places that historically succeeded.`
			`// It no longer accepts "rlim_cur = RLIM.INFINITY" for RLIM.NOFILE.`
			`// Use "rlim_cur = min(OPEN_MAX, rlim_max)".`
			`lim.max = @min(std.c.OPEN_MAX, lim.max);`
			`}`
			`if (lim.cur == lim.max) return;`

			`// Do a binary search for the limit.`
			`var min: posix.rlim_t = lim.cur;`
			`var max: posix.rlim_t = 1 << 20;`
			`// But if there's a defined upper bound, don't search, just set it.`
			`if (lim.max != posix.RLIM.INFINITY) {`
			`min = lim.max;`
			`max = lim.max;`
			`}`

			`while (true) {`
			`lim.cur = min + @divTrunc(max - min, 2); // on freebsd rlim_t is signed`
			`if (posix.setrlimit(.NOFILE, lim)) |_| {`
			`min = lim.cur;`
			`} else |_| {`
			`max = lim.cur;`
			`}`
			`if (min + 1 >= max) break;`
			`}`
			`}`

			`test raiseFileDescriptorLimit {`
			`raiseFileDescriptorLimit();`
			`}`
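The loop in `raiseFileDescriptorLimit` is a bisection between a value known to be accepted (`min`) and an upper bound (`max`). The sketch below isolates that loop so it can be run without touching real resource limits. `trySet` and `searchLimit` are hypothetical names, and `trySet` is an assumed stand-in for `posix.setrlimit` that simply accepts any value up to `hard_cap`.

```zig
const std = @import("std");

/// Hypothetical stand-in for `posix.setrlimit`: accepts values up to `hard_cap`.
fn trySet(value: u64, hard_cap: u64) bool {
    return value <= hard_cap;
}

/// Same loop shape as in `raiseFileDescriptorLimit`: probe the midpoint,
/// raise `min` on success, lower `max` on failure, stop when they meet.
fn searchLimit(start: u64, upper: u64, hard_cap: u64) u64 {
    var min = start;
    var max = upper;
    while (true) {
        const cur = min + (max - min) / 2;
        if (trySet(cur, hard_cap)) {
            min = cur;
        } else {
            max = cur;
        }
        if (min + 1 >= max) break;
    }
    return min;
}

test "bisection converges on the largest accepted value" {
    try std.testing.expectEqual(@as(u64, 4096), searchLimit(256, 1 << 20, 4096));
}
```

When a finite hard limit is known, the original sets `min` and `max` to the same value, so the loop body runs exactly once and simply attempts that limit.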
std: restructure child process namespace 2024-05-23 11:25:41 -07:00
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`pub const CreateEnvironOptions = struct {`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			/// `null` means to leave the `ZIG_PROGRESS` environment variable unmodified.
			`/// If non-null, negative means to remove the environment variable, and >= 0`
			`/// means to provide it with the given integer.`
			`zig_progress_fd: ?i32 = null,`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`};`

std: fix typos (#20560) 2024-07-10 00:25:42 +03:00			`/// Creates a null-delimited environment variable block in the format`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`/// expected by POSIX, from a hash map plus options.`
			`pub fn createEnvironFromMap(`
			`arena: Allocator,`
			`map: *const EnvMap,`
			`options: CreateEnvironOptions,`
			`) Allocator.Error![:null]?[*:0]u8 {`
			`const ZigProgressAction = enum { nothing, edit, delete, add };`
			`const zig_progress_action: ZigProgressAction = a: {`
			`const fd = options.zig_progress_fd orelse break :a .nothing;`
			`const contains = map.get("ZIG_PROGRESS") != null;`
			`if (fd >= 0) {`
			`break :a if (contains) .edit else .add;`
			`} else {`
			`if (contains) break :a .delete;`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`}`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`break :a .nothing;`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`};`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00
std.process: fix compilation on 32-bit targets 2024-05-26 12:08:15 -07:00			`const envp_count: usize = c: {`
			`var count: usize = map.count();`
			`switch (zig_progress_action) {`
			`.add => count += 1,`
			`.delete => count -= 1,`
			`.nothing, .edit => {},`
			`}`
			`break :c count;`
			`};`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`const envp_buf = try arena.allocSentinel(?[*:0]u8, envp_count, null);`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`var i: usize = 0;`

std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`if (zig_progress_action == .add) {`
std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string 2025-06-27 20:05:22 -07:00			`envp_buf[i] = try std.fmt.allocPrintSentinel(arena, "ZIG_PROGRESS={d}", .{options.zig_progress_fd.?}, 0);`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`i += 1;`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`}`

std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`{`
			`var it = map.iterator();`
			`while (it.next()) |pair| {`
			`if (mem.eql(u8, pair.key_ptr.*, "ZIG_PROGRESS")) switch (zig_progress_action) {`
			`.add => unreachable,`
			`.delete => continue,`
			`.edit => {`
std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string 2025-06-27 20:05:22 -07:00			`envp_buf[i] = try std.fmt.allocPrintSentinel(arena, "{s}={d}", .{`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`pair.key_ptr.*, options.zig_progress_fd.?,`
std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string 2025-06-27 20:05:22 -07:00			`}, 0);`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`i += 1;`
			`continue;`
			`},`
			`.nothing => {},`
			`};`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00
std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string 2025-06-27 20:05:22 -07:00			`envp_buf[i] = try std.fmt.allocPrintSentinel(arena, "{s}={s}", .{ pair.key_ptr.*, pair.value_ptr.* }, 0);`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`i += 1;`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`}`
			`}`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00
			`assert(i == envp_count);`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`return envp_buf;`
			`}`

std: fix typos (#20560) 2024-07-10 00:25:42 +03:00			`/// Creates a null-delimited environment variable block in the format`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`/// expected by POSIX, from an existing environment block plus options.`
			`pub fn createEnvironFromExisting(`
			`arena: Allocator,`
			`existing: [*:null]const ?[*:0]const u8,`
			`options: CreateEnvironOptions,`
			`) Allocator.Error![:null]?[*:0]u8 {`
			`const existing_count, const contains_zig_progress = c: {`
			`var count: usize = 0;`
			`var contains = false;`
			`while (existing[count]) |line| : (count += 1) {`
			`contains = contains or mem.eql(u8, mem.sliceTo(line, '='), "ZIG_PROGRESS");`
			`}`
			`break :c .{ count, contains };`
			`};`
			`const ZigProgressAction = enum { nothing, edit, delete, add };`
			`const zig_progress_action: ZigProgressAction = a: {`
			`const fd = options.zig_progress_fd orelse break :a .nothing;`
			`if (fd >= 0) {`
			`break :a if (contains_zig_progress) .edit else .add;`
			`} else {`
			`if (contains_zig_progress) break :a .delete;`
			`}`
			`break :a .nothing;`
			`};`

std.process: fix compilation on 32-bit targets 2024-05-26 12:08:15 -07:00			`const envp_count: usize = c: {`
			`var count: usize = existing_count;`
			`switch (zig_progress_action) {`
			`.add => count += 1,`
			`.delete => count -= 1,`
			`.nothing, .edit => {},`
			`}`
			`break :c count;`
			`};`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00
			`const envp_buf = try arena.allocSentinel(?[*:0]u8, envp_count, null);`
			`var i: usize = 0;`
			`var existing_index: usize = 0;`

			`if (zig_progress_action == .add) {`
std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string 2025-06-27 20:05:22 -07:00			`envp_buf[i] = try std.fmt.allocPrintSentinel(arena, "ZIG_PROGRESS={d}", .{options.zig_progress_fd.?}, 0);`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`i += 1;`
			`}`

			`while (existing[existing_index]) |line| : (existing_index += 1) {`
			`if (mem.eql(u8, mem.sliceTo(line, '='), "ZIG_PROGRESS")) switch (zig_progress_action) {`
			`.add => unreachable,`
			`.delete => continue,`
			`.edit => {`
std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string 2025-06-27 20:05:22 -07:00			`envp_buf[i] = try std.fmt.allocPrintSentinel(arena, "ZIG_PROGRESS={d}", .{options.zig_progress_fd.?}, 0);`
std.process.Child: fix ZIG_PROGRESS env var handling and properly dup2 the file descriptor to make it handle the case when other files are already open 2024-05-23 20:22:58 -07:00			`i += 1;`
			`continue;`
			`},`
			`.nothing => {},`
			`};`
			`envp_buf[i] = try arena.dupeZ(u8, mem.span(line));`
			`i += 1;`
			`}`

			`assert(i == envp_count);`
			`return envp_buf;`
			`}`
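
The choice between `.nothing`, `.edit`, `.delete`, and `.add` above depends only on two inputs: the optional file descriptor from the options and whether `ZIG_PROGRESS` is already present. A minimal standalone sketch of that decision follows. The helper names `chooseAction` and `adjustedCount` are hypothetical and not part of `std.process`, and the `?i32` parameter type is an assumption about `CreateEnvironOptions.zig_progress_fd`.

```zig
const std = @import("std");

// Mirrors the enum declared inside createEnvironFromExisting.
const ZigProgressAction = enum { nothing, edit, delete, add };

// Hypothetical helper: same decision as the `a:` block in the listing.
// `zig_progress_fd` is assumed to be `?i32`.
fn chooseAction(zig_progress_fd: ?i32, contains_zig_progress: bool) ZigProgressAction {
    // No fd requested: leave the environment untouched.
    const fd = zig_progress_fd orelse return .nothing;
    // A non-negative fd must be advertised, either by rewriting the
    // existing entry or by adding a new one.
    if (fd >= 0) return if (contains_zig_progress) .edit else .add;
    // A negative fd means the variable must not be passed on.
    return if (contains_zig_progress) .delete else .nothing;
}

// Hypothetical helper: number of entries in the resulting block.
fn adjustedCount(existing_count: usize, action: ZigProgressAction) usize {
    return switch (action) {
        .add => existing_count + 1,
        // Only reachable when ZIG_PROGRESS was found, so the count is >= 1.
        .delete => existing_count - 1,
        .nothing, .edit => existing_count,
    };
}
```

Computing the final count before allocating is what lets the function allocate `envp_buf` once and then assert `i == envp_count` at the end.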

			`pub fn createNullDelimitedEnvMap(arena: mem.Allocator, env_map: *const EnvMap) Allocator.Error![:null]?[*:0]u8 {`
			`return createEnvironFromMap(arena, env_map, .{});`
std.Progress: child process sends updates via IPC 2024-05-23 14:10:03 -07:00			`}`

std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`test createNullDelimitedEnvMap {`
			`const allocator = testing.allocator;`
			`var envmap = EnvMap.init(allocator);`
			`defer envmap.deinit();`

			`try envmap.put("HOME", "/home/ifreund");`
			`try envmap.put("WAYLAND_DISPLAY", "wayland-1");`
			`try envmap.put("DISPLAY", ":1");`
			`try envmap.put("DEBUGINFOD_URLS", " ");`
			`try envmap.put("XCURSOR_SIZE", "24");`

			`var arena = std.heap.ArenaAllocator.init(allocator);`
			`defer arena.deinit();`
			`const environ = try createNullDelimitedEnvMap(arena.allocator(), &envmap);`

			`try testing.expectEqual(@as(usize, 5), environ.len);`

			`inline for (.{`
			`"HOME=/home/ifreund",`
			`"WAYLAND_DISPLAY=wayland-1",`
			`"DISPLAY=:1",`
			`"DEBUGINFOD_URLS= ",`
			`"XCURSOR_SIZE=24",`
			`}) |target| {`
			`for (environ) |variable| {`
			`if (mem.eql(u8, mem.span(variable orelse continue), target)) break;`
			`} else {`
			`try testing.expect(false); // Environment variable not found`
			`}`
			`}`
			`}`
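
The test above scans the block for whole `KEY=value` strings. As a complementary sketch, the following shows how a consumer might look up a single value in a block of this shape. `findValue` is a hypothetical helper, not part of `std.process`; it assumes each entry is a NUL-terminated `KEY=value` string.

```zig
const std = @import("std");

// Hypothetical helper: returns the value for `key` in a null-delimited
// environment block, or null if the key is absent.
fn findValue(environ: [:null]const ?[*:0]const u8, key: []const u8) ?[]const u8 {
    for (environ) |entry| {
        // Convert the NUL-terminated pointer into a slice with a known length.
        const line = std.mem.span(entry orelse continue);
        // Everything before the first '=' is the variable name.
        const name = std.mem.sliceTo(line, '=');
        // Skip malformed entries that contain no '=' at all.
        if (name.len == line.len) continue;
        if (std.mem.eql(u8, name, key)) return line[name.len + 1 ..];
    }
    return null;
}
```

Comparing only the part before `=` is the same technique `createEnvironFromExisting` uses to detect `ZIG_PROGRESS`.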

			`/// Caller must free result.`
			`pub fn createWindowsEnvBlock(allocator: mem.Allocator, env_map: *const EnvMap) ![]u16 {`
			`// count bytes needed`
			`const max_chars_needed = x: {`
createWindowsEnvBlock: Reduce NUL terminator count to only what's required This code previously added 4 NUL code units, but that was likely due to a misinterpretation of this part of the CreateProcess documentation: > A Unicode environment block is terminated by four zero bytes: two for the last string, two more to terminate the block. (four zero bytes means two zero code units) Additionally, the second zero code unit is only actually needed when the environment is empty due to a quirk of the CreateProcess implementation. In the case of a non-empty environment, there always ends up being two trailing NUL code units since one will come after the last environment variable in the block. 2025-03-17 17:53:12 -07:00			`// Only need 2 trailing NUL code units for an empty environment`
			`var max_chars_needed: usize = if (env_map.count() == 0) 2 else 1;`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`var it = env_map.iterator();`
			`while (it.next()) |pair| {`
			`// +1 for '='`
			`// +1 for null byte`
			`max_chars_needed += pair.key_ptr.len + pair.value_ptr.len + 2;`
			`}`
			`break :x max_chars_needed;`
			`};`
			`const result = try allocator.alloc(u16, max_chars_needed);`
			`errdefer allocator.free(result);`

			`var it = env_map.iterator();`
			`var i: usize = 0;`
			`while (it.next()) |pair| {`
			`i += try unicode.wtf8ToWtf16Le(result[i..], pair.key_ptr.*);`
			`result[i] = '=';`
			`i += 1;`
			`i += try unicode.wtf8ToWtf16Le(result[i..], pair.value_ptr.*);`
			`result[i] = 0;`
			`i += 1;`
			`}`
			`result[i] = 0;`
			`i += 1;`
createWindowsEnvBlock: Reduce NUL terminator count to only what's required This code previously added 4 NUL code units, but that was likely due to a misinterpretation of this part of the CreateProcess documentation: > A Unicode environment block is terminated by four zero bytes: two for the last string, two more to terminate the block. (four zero bytes means two zero code units) Additionally, the second zero code unit is only actually needed when the environment is empty due to a quirk of the CreateProcess implementation. In the case of a non-empty environment, there always ends up being two trailing NUL code units since one will come after the last environment variable in the block. 2025-03-17 17:53:12 -07:00			`// An empty environment is a special case that requires a redundant`
			`// NUL terminator. CreateProcess will read the second code unit even`
			`// though theoretically the first should be enough to recognize that the`
			`// environment is empty (see https://nullprogram.com/blog/2023/08/23/)`
			`if (env_map.count() == 0) {`
			`result[i] = 0;`
			`i += 1;`
			`}`
std: restructure child process namespace 2024-05-23 11:25:41 -07:00			`return try allocator.realloc(result, i);`
			`}`
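
The size computed at the top of `createWindowsEnvBlock` is an upper bound rather than an exact length: a WTF-8 sequence never encodes to more UTF-16 code units than it has bytes, which is why the function can allocate by byte count and shrink with `realloc` at the end. A minimal sketch of that arithmetic follows. `envBlockUpperBound` is a hypothetical helper that takes key/value pairs as plain arrays instead of an `EnvMap`.

```zig
const std = @import("std");

// Hypothetical helper: upper bound, in u16 code units, for a Windows
// environment block built from the given key/value pairs.
fn envBlockUpperBound(pairs: []const [2][]const u8) usize {
    // One NUL terminates the block; an empty block needs a second one.
    var total: usize = if (pairs.len == 0) 2 else 1;
    for (pairs) |pair| {
        // Key bytes + value bytes, +1 for '=', +1 for the entry's NUL.
        total += pair[0].len + pair[1].len + 2;
    }
    return total;
}
```

For ASCII-only input the bound is exact, so for example `A=1` and `PATH=C:\` need 4 + 9 + 1 = 14 code units.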
move std.zig.fatal to std.process.fatal 2024-07-19 17:38:15 -07:00
			`/// Logs an error and then terminates the process with exit code 1.`
			`pub fn fatal(comptime format: []const u8, format_arguments: anytype) noreturn {`
			`std.log.err(format, format_arguments);`
			`exit(1);`
			`}`