const std = @import("std.zig");
const tokenizer = @import("zig/tokenizer.zig");
const fmt = @import("zig/fmt.zig");
const assert = std.debug.assert;
pub const ErrorBundle = @import("zig/ErrorBundle.zig");
pub const Server = @import("zig/Server.zig");
pub const Client = @import("zig/Client.zig");
pub const Token = tokenizer.Token;
pub const Tokenizer = tokenizer.Tokenizer;
pub const fmtId = fmt.fmtId;
pub const fmtEscapes = fmt.fmtEscapes;
pub const isValidId = fmt.isValidId;
pub const string_literal = @import("zig/string_literal.zig");
pub const number_literal = @import("zig/number_literal.zig");
pub const primitives = @import("zig/primitives.zig");
pub const Ast = @import("zig/Ast.zig");
pub const system = @import("zig/system.zig");
pub const CrossTarget = @import("zig/CrossTarget.zig");
// Character literal parsing
pub const ParsedCharLiteral = string_literal.ParsedCharLiteral;
pub const parseCharLiteral = string_literal.parseCharLiteral;
pub const parseNumberLiteral = number_literal.parseNumberLiteral;
// Files needed by translate-c.
pub const c_builtins = @import("zig/c_builtins.zig");
pub const c_translation = @import("zig/c_translation.zig");
pub const SrcHash = [16]u8;
pub fn hashSrc(src: []const u8) SrcHash {
    var out: SrcHash = undefined;
    std.crypto.hash.Blake3.hash(src, &out, .{});
    return out;
}
pub fn srcHashEql(a: SrcHash, b: SrcHash) bool {
    return @as(u128, @bitCast(a)) == @as(u128, @bitCast(b));
}
pub fn hashName(parent_hash: SrcHash, sep: []const u8, name: []const u8) SrcHash {
    var out: SrcHash = undefined;
    var hasher = std.crypto.hash.Blake3.init(.{});
    hasher.update(&parent_hash);
    hasher.update(sep);
    hasher.update(name);
    hasher.final(&out);
    return out;
}
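
// Illustrative sketch, not part of the upstream file: a small test showing how
// the `SrcHash` helpers above are expected to fit together. The source text,
// the "." separator, and the names `parent`, `a`, and `b` are assumptions
// chosen only for this example.
test "SrcHash helpers (sketch)" {
    const src = "pub fn main() void {}";

    // Hashing the same bytes twice yields digests that compare equal.
    try std.testing.expect(srcHashEql(hashSrc(src), hashSrc(src)));

    // `hashName` mixes the parent hash, the separator, and the name, so two
    // different names under the same parent should produce different digests.
    const parent = hashSrc(src);
    const a = hashName(parent, ".", "main");
    const b = hashName(parent, ".", "other");
    try std.testing.expect(!srcHashEql(a, b));
}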
pub const Loc = struct {
    line: usize,
    column: usize,
    /// Does not include the trailing newline.
    source_line: []const u8,

    pub fn eql(a: Loc, b: Loc) bool {
        return a.line == b.line and a.column == b.column and std.mem.eql(u8, a.source_line, b.source_line);
    }
};
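/// Returns the zero-based line and column of `byte_offset` within `source`,
/// together with the full text of the line that contains it.
/// The column counts bytes, not Unicode code points.
/// `byte_offset` must not exceed `source.len`.
pub fn findLineColumn(source: []const u8, byte_offset: usize) Loc {
    var line: usize = 0;
    var column: usize = 0;
    var line_start: usize = 0;
    var i: usize = 0;
    // Scan up to the offset, tracking the current line and where it starts.
    while (i < byte_offset) : (i += 1) {
        switch (source[i]) {
            '\n' => {
                line += 1;
                column = 0;
                line_start = i + 1;
            },
            else => {
                column += 1;
            },
        }
    }
    // Continue to the end of the current line so the whole line can be returned.
    while (i < source.len and source[i] != '\n') {
        i += 1;
    }
    return .{
        .line = line,
        .column = column,
        .source_line = source[line_start..i],
    };
}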
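
// Illustrative sketch, not part of the upstream file: a test showing that
// `findLineColumn` reports zero-based line and column values and returns the
// containing line without its trailing newline. The sample source text and
// the offset are assumptions chosen only for this example.
test "findLineColumn (sketch)" {
    const source = "const a = 1;\nconst b = 2;\n";

    // The first line is 12 bytes long, so the newline is at index 12 and the
    // second line starts at index 13. Byte offset 19 is the `b` identifier.
    const loc = findLineColumn(source, 19);

    try std.testing.expectEqual(@as(usize, 1), loc.line);
    try std.testing.expectEqual(@as(usize, 6), loc.column);
    try std.testing.expectEqualStrings("const b = 2;", loc.source_line);
}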
/// Returns the number of newlines between byte offsets `start` and `end` in
/// `source`. The result is negative when `end` comes before `start`.
pub fn lineDelta(source: []const u8, start: usize, end: usize) isize {
    var line: isize = 0;
    if (end >= start) {
        for (source[start..end]) |byte| switch (byte) {
            '\n' => line += 1,
            else => continue,
        };
    } else {
        for (source[end..start]) |byte| switch (byte) {
            '\n' => line -= 1,
            else => continue,
        };
    }
    return line;
}
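
// Illustrative sketch, not part of the upstream file: a test showing the sign
// convention of `lineDelta`. The sample source text is an assumption chosen
// only for this example.
test "lineDelta (sketch)" {
    const source = "a\nb\nc\n";

    // Offsets 0..4 cover "a\nb\n", which contains two newlines.
    try std.testing.expectEqual(@as(isize, 2), lineDelta(source, 0, 4));

    // Swapping the offsets walks backwards, so the delta is negative.
    try std.testing.expectEqual(@as(isize, -2), lineDelta(source, 4, 0));

    // An empty range crosses no newlines.
    try std.testing.expectEqual(@as(isize, 0), lineDelta(source, 2, 2));
}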
pub const BinNameOptions = struct {
    root_name: []const u8,
    target: std.Target,
    output_mode: std.builtin.OutputMode,
    link_mode: ?std.builtin.LinkMode = null,
    version: ?std.SemanticVersion = null,
};
/// Returns the standard file system basename of a binary generated by the Zig compiler.
pub fn binNameAlloc(allocator: std.mem.Allocator, options: BinNameOptions) error{OutOfMemory}![]u8 {
    const root_name = options.root_name;
    const target = options.target;
    switch (target.ofmt) {
        .coff => switch (options.output_mode) {
            .Exe => return std.fmt.allocPrint(allocator, "{s}{s}", .{ root_name, target.exeFileExt() }),
            .Lib => {
                const suffix = switch (options.link_mode orelse .Static) {
                    .Static => ".lib",
                    .Dynamic => ".dll",
                };
                return std.fmt.allocPrint(allocator, "{s}{s}", .{ root_name, suffix });
            },
            .Obj => return std.fmt.allocPrint(allocator, "{s}.obj", .{root_name}),
        },
        .elf => switch (options.output_mode) {
            .Exe => return allocator.dupe(u8, root_name),
            .Lib => {
                switch (options.link_mode orelse .Static) {
                    .Static => return std.fmt.allocPrint(allocator, "{s}{s}.a", .{
                        target.libPrefix(), root_name,
                    }),
                    .Dynamic => {
                        if (options.version) |ver| {
                            return std.fmt.allocPrint(allocator, "{s}{s}.so.{d}.{d}.{d}", .{
                                target.libPrefix(), root_name, ver.major, ver.minor, ver.patch,
                            });
                        } else {
                            return std.fmt.allocPrint(allocator, "{s}{s}.so", .{
                                target.libPrefix(), root_name,
                            });
                        }
                    },
                }
            },
            .Obj => return std.fmt.allocPrint(allocator, "{s}.o", .{root_name}),
        },
        .macho => switch (options.output_mode) {
            .Exe => return allocator.dupe(u8, root_name),
            .Lib => {
                switch (options.link_mode orelse .Static) {
                    .Static => return std.fmt.allocPrint(allocator, "{s}{s}.a", .{
                        target.libPrefix(), root_name,
                    }),
                    .Dynamic => {
                        if (options.version) |ver| {
                            return std.fmt.allocPrint(allocator, "{s}{s}.{d}.{d}.{d}.dylib", .{
                                target.libPrefix(), root_name, ver.major, ver.minor, ver.patch,
                            });
                        } else {
                            return std.fmt.allocPrint(allocator, "{s}{s}.dylib", .{
                                target.libPrefix(), root_name,
                            });
                        }
                    },
                }
            },
            .Obj => return std.fmt.allocPrint(allocator, "{s}.o", .{root_name}),
        },
        .wasm => switch (options.output_mode) {
            .Exe => return std.fmt.allocPrint(allocator, "{s}{s}", .{ root_name, target.exeFileExt() }),
            .Lib => {
                switch (options.link_mode orelse .Static) {
                    .Static => return std.fmt.allocPrint(allocator, "{s}{s}.a", .{
                        target.libPrefix(), root_name,
                    }),
                    .Dynamic => return std.fmt.allocPrint(allocator, "{s}.wasm", .{root_name}),
                }
            },
            .Obj => return std.fmt.allocPrint(allocator, "{s}.o", .{root_name}),
        },
        .c => return std.fmt.allocPrint(allocator, "{s}.c", .{root_name}),
        .spirv => return std.fmt.allocPrint(allocator, "{s}.spv", .{root_name}),
        .hex => return std.fmt.allocPrint(allocator, "{s}.ihex", .{root_name}),
        .raw => return std.fmt.allocPrint(allocator, "{s}.bin", .{root_name}),
        .plan9 => switch (options.output_mode) {
            .Exe => return allocator.dupe(u8, root_name),
            .Obj => return std.fmt.allocPrint(allocator, "{s}{s}", .{
                root_name, target.ofmt.fileExt(target.cpu.arch),
            }),
            .Lib => return std.fmt.allocPrint(allocator, "{s}{s}.a", .{
                target.libPrefix(), root_name,
            }),
        },
        .nvptx => return std.fmt.allocPrint(allocator, "{s}.ptx", .{root_name}),
        .dxcontainer => return std.fmt.allocPrint(allocator, "{s}.dxil", .{root_name}),
    }
}
test {
    @import("std").testing.refAllDecls(@This());
}