stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
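The fixed-size encoding described above can be sketched roughly as follows. This is a minimal illustration, not the contents of zir.zig: the three tags, the field names inside `Data`, and the `extra` layout assumed for `call` are all hypothetical.

```zig
const std = @import("std");

/// Hypothetical, heavily reduced tag set; the real `zir.Inst.Tag` is much larger.
const Tag = enum(u8) {
    /// `data.int`: a 64-bit integer literal stored inline.
    int,
    /// `data.str`: `start` and `len` reference `string_bytes`.
    str,
    /// `data.pl`: too large for 8 bytes, so `payload_index` points into `extra`.
    call,
};

/// Every instruction gets exactly 8 bytes of data; the tag says which field is valid.
const Data = extern union {
    int: u64,
    str: extern struct { start: u32, len: u32 },
    pl: extern struct { payload_index: u32, unused: u32 },
};

comptime {
    // The whole point of the layout: data never grows past 8 bytes.
    std.debug.assert(@sizeOf(Data) == 8);
}

/// Stand-in for a "finished" set of instructions stored as parallel arrays.
const Code = struct {
    tags: []const Tag,
    data: []const Data,
    extra: []const u32,
    string_bytes: []const u8,

    fn stringAt(code: Code, inst: usize) []const u8 {
        std.debug.assert(code.tags[inst] == .str);
        const s = code.data[inst].str;
        return code.string_bytes[s.start .. s.start + s.len];
    }

    /// Assumed `extra` layout for `call`: callee, args_len, then the args.
    fn callArgs(code: Code, inst: usize) []const u32 {
        std.debug.assert(code.tags[inst] == .call);
        const i = code.data[inst].pl.payload_index;
        const args_len = code.extra[i + 1];
        return code.extra[i + 2 .. i + 2 + args_len];
    }
};
```

Small instructions are read straight out of `data`; only the larger ones pay for an indirection through `extra`.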
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
it also meant other things. This was error-prone, made us do
unnecessary work, and stored unnecessary bytes in memory.
Now there are more types involved in source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have fewer
than 1 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
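As a rough illustration of the lazy approach, the sketch below stores only a small tagged value per location and computes the byte offset on demand. The two variants, the `FileInfo` struct, and the `resolve` function are hypothetical stand-ins, not the actual `LazySrcLoc` definition.

```zig
const std = @import("std");

/// Hypothetical reduced form of a lazily resolved source location.
const LazySrcLoc = union(enum) {
    /// Already an absolute byte offset into the file.
    byte_abs: u32,
    /// A token index, relative to the first token of the owning declaration.
    token_offset: u32,
};

/// Stand-in for the per-file data needed to resolve a location.
const FileInfo = struct {
    /// Byte offset at which each token starts.
    token_starts: []const u32,
};

/// Only called when a compile error actually needs a byte offset.
fn resolve(lazy: LazySrcLoc, file: FileInfo, decl_first_token: u32) u32 {
    return switch (lazy) {
        .byte_abs => |off| off,
        .token_offset => |tok| file.token_starts[decl_first_token + tok],
    };
}
```

When no compile error occurs, `resolve` is never called and the token table is never consulted.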
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
//! Compilation of all Zig source code is represented by one `Module`.
//! Each `Compilation` has exactly one or zero `Module`, depending on whether
//! there is or is not any zig source code, respectively.

2020-09-13 19:17:58 -07:00
const std = @import("std");
const mem = std.mem;
const Allocator = std.mem.Allocator;
const ArrayListUnmanaged = std.ArrayListUnmanaged;
const assert = std.debug.assert;
const log = std.log.scoped(.module);
const BigIntConst = std.math.big.int.Const;
const BigIntMutable = std.math.big.int.Mutable;
const Target = std.Target;
2021-08-30 19:22:04 -07:00
const Ast = std.zig.Ast;
2021-03-15 23:38:38 -07:00

const Module = @This();
const Compilation = @import("Compilation.zig");
2021-04-25 00:02:58 -07:00
const Cache = @import("Cache.zig");
2021-03-15 23:38:38 -07:00
const Value = @import("value.zig").Value;
const Type = @import("type.zig").Type;
const TypedValue = @import("TypedValue.zig");
2020-09-13 19:17:58 -07:00
const Package = @import("Package.zig");
const link = @import("link.zig");
2021-07-08 20:42:47 -07:00
const Air = @import("Air.zig");
2021-04-13 12:38:06 -07:00
const Zir = @import("Zir.zig");
2020-09-13 19:17:58 -07:00
const trace = @import("tracy.zig").trace;
2021-03-28 19:08:42 +02:00
const AstGen = @import("AstGen.zig");
2021-03-16 00:03:47 -07:00
const Sema = @import("Sema.zig");
2021-01-08 23:16:50 +01:00
const target_util = @import("target.zig");
2021-07-03 11:47:58 -07:00
const build_options = @import("build_options");
2020-09-13 19:17:58 -07:00

/// General-purpose allocator. Used for both temporary and long-term storage.
gpa: *Allocator,
comp: *Compilation,

/// Where our incremental compilation metadata serialization will go.
zig_cache_artifact_directory: Compilation.Directory,
2021-07-23 22:23:03 -07:00
/// Pointer to externally managed resource.
2020-09-13 19:17:58 -07:00
root_pkg: *Package,
2021-07-23 22:23:03 -07:00
/// Normally, `main_pkg` and `root_pkg` are the same. The exception is `zig test`, in which
/// `root_pkg` is the test runner, and `main_pkg` is the user's source file which has the tests.
main_pkg: *Package,
2021-04-25 10:43:07 -07:00

/// Used by AstGen worker to load and store ZIR cache.
global_zir_cache: Compilation.Directory,
/// Used by AstGen worker to load and store ZIR cache.
local_zir_cache: Compilation.Directory,
2021-05-14 17:41:22 -07:00
/// It's rare for a decl to be exported, so we save memory by having a sparse
/// map of Decl pointers to details about them being exported.
/// The Export memory is owned by the `export_owners` table; the slice itself
/// is owned by this table. The slice is guaranteed to not be empty.
2020-09-13 19:17:58 -07:00
decl_exports: std.AutoArrayHashMapUnmanaged(*Decl, []*Export) = .{},
/// This models the Decls that perform exports, so that `decl_exports` can be updated when a Decl
/// is modified. Note that the key of this table is not the Decl being exported, but the Decl that
/// is performing the export of another Decl.
/// This table owns the Export memory.
export_owners: std.AutoArrayHashMapUnmanaged(*Decl, []*Export) = .{},
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the look out for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
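The update check described in the commit message above could look roughly like this. It is a minimal sketch that compares cached stat data against fresh stat data; the `Stat` and `TrackedFile` types and both functions are hypothetical, the fresh values are assumed to be supplied by the caller, and the source hash comparison is left out.

```zig
const std = @import("std");

/// Hypothetical subset of the per-file data the commit message mentions.
const Stat = struct {
    mtime: i128,
    inode: u64,
    size: u64,
};

const TrackedFile = struct {
    path: []const u8,
    stat: Stat,
};

fn isOutdated(old: Stat, new: Stat) bool {
    return old.mtime != new.mtime or old.inode != new.inode or old.size != new.size;
}

/// `fresh[i]` is assumed to be the result of stat-ing `files[i].path` just now.
/// Returns the prefix of `out` filled with the indexes of outdated files.
fn findOutdated(files: []const TrackedFile, fresh: []const Stat, out: []usize) []usize {
    std.debug.assert(files.len == fresh.len and out.len >= files.len);
    var n: usize = 0;
    var i: usize = 0;
    while (i < files.len) : (i += 1) {
        if (isOutdated(files[i].stat, fresh[i])) {
            out[n] = i;
            n += 1;
        }
    }
    return out[0..n];
}
```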
/// The set of all the files in the Module. We keep track of this in order to iterate
/// over it and check which source files have been modified on the file system when
/// an update is requested, as well as to cache `@import` results.
/// Keys are fully resolved file paths. This table owns the keys and values.
import_table: std.StringArrayHashMapUnmanaged(*Scope.File) = .{},

2021-08-05 16:37:21 -07:00
/// The set of all the generic function instantiations. This is used so that when a generic
/// function is called twice with the same comptime parameter arguments, both calls dispatch
/// to the same function.
monomorphed_funcs: MonomorphedFuncsSet = .{},

2021-08-21 20:42:45 -07:00
/// The set of all comptime function calls that have been cached so that future calls
/// with the same parameters will get the same return value.
memoized_calls: MemoizedCallSet = .{},

2020-09-13 19:17:58 -07:00
/// We optimize memory usage for a compilation with no compile errors by storing the
/// error messages and mapping outside of `Decl`.
/// The ErrorMsg memory is owned by the decl, using Module's general purpose allocator.
/// Note that a Decl can succeed but the Fn it represents can fail. In this case,
/// a Decl can have a failed_decls entry but have analysis status of success.
2021-01-16 22:51:01 -07:00
failed_decls: std.AutoArrayHashMapUnmanaged(*Decl, *ErrorMsg) = .{},
/// Keep track of one `@compileLog` callsite per owner Decl.
2021-05-14 23:10:38 -07:00
/// The value is the AST node index offset from the Decl.
compile_log_decls: std.AutoArrayHashMapUnmanaged(*Decl, i32) = .{},
2020-09-13 19:17:58 -07:00
/// Using a map here for consistency with the other fields here.
2021-04-07 20:36:01 -07:00
/// The ErrorMsg memory is owned by the `Scope.File`, using Module's general purpose allocator.
2021-04-16 14:44:02 -07:00
failed_files: std.AutoArrayHashMapUnmanaged(*Scope.File, ?*ErrorMsg) = .{},
2020-09-13 19:17:58 -07:00
/// Using a map here for consistency with the other fields here.
/// The ErrorMsg memory is owned by the `Export`, using Module's general purpose allocator.
2021-01-16 22:51:01 -07:00
failed_exports: std.AutoArrayHashMapUnmanaged(*Export, *ErrorMsg) = .{},
2020-09-13 19:17:58 -07:00
stage2: type declarations ZIR encode AnonNameStrategy
which can be either parent, func, or anon. Here's the enum reproduced in
the commit message for convenience:
```zig
pub const NameStrategy = enum(u2) {
/// Use the same name as the parent declaration name.
/// e.g. `const Foo = struct {...};`.
parent,
/// Use the name of the currently executing comptime function call,
/// with the current parameters. e.g. `ArrayList(i32)`.
func,
/// Create an anonymous name for this declaration.
/// Like this: "ParentDeclName_struct_69"
anon,
};
```
With this information in the ZIR, a future commit can improve the
names of structs, unions, enums, and opaques.
In order to accomplish this, the following ZIR instruction forms were
removed and replaced with Extended op codes:
* struct_decl
* struct_decl_packed
* struct_decl_extern
* union_decl
* union_decl_packed
* union_decl_extern
* enum_decl
* enum_decl_nonexhaustive
By being extended opcodes, one more u32 is needed, however we more than
make up for it by repurposing the 16 "small" bits to provide shorter
encodings for when decls_len == 0, fields_len == 0, a source node is not
provided, etc. There tends to be no downside, and in fact sometimes
upsides, to using an extended op code when there is a need for flag
bits, which is the case for all three of these. Likewise, the container
layout can be encoded in these bits rather than into the opcode.
The following 4 ZIR instructions were added, netting a total of 4 freed
up ZIR enum tags for future use:
* opaque_decl_anon
* opaque_decl_func
* error_set_decl_anon
* error_set_decl_func
This is so that opaques and error sets can have the same name hint as
structs, enums, and unions.
`std.builtin.ContainerLayout` gets an explicit integer tag type so that
it can be used inside packed structs.
This commit also makes `Module.Namespace` use a separate set for
anonymous decls, thus allowing anonymous decls to share the same
`Decl.name` as their owner `Decl` objects.
2021-05-10 21:34:43 -07:00
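To show how the three strategies from the commit message above might turn into names, here is a minimal sketch. The `nameFor` function and its parameters are hypothetical, and the anonymous form hard-codes `struct` purely to mirror the "ParentDeclName_struct_69" example; the commit itself only records the strategy in ZIR and leaves naming to a future commit.

```zig
const std = @import("std");

/// Same variants as the enum quoted in the commit message.
const NameStrategy = enum(u2) { parent, func, anon };

/// Hypothetical helper: formats a type name into `buf` according to `strategy`.
/// `call_text` stands in for the rendered comptime call, e.g. "ArrayList(i32)".
fn nameFor(
    buf: []u8,
    strategy: NameStrategy,
    parent_name: []const u8,
    call_text: []const u8,
    anon_index: usize,
) ![]const u8 {
    return switch (strategy) {
        .parent => std.fmt.bufPrint(buf, "{s}", .{parent_name}),
        .func => std.fmt.bufPrint(buf, "{s}", .{call_text}),
        .anon => std.fmt.bufPrint(buf, "{s}_struct_{d}", .{ parent_name, anon_index }),
    };
}
```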
next_anon_name_index: usize = 0,

2020-09-13 19:17:58 -07:00
/// Candidates for deletion. After a semantic analysis update completes, this list
/// contains Decls that need to be deleted if they end up having no references to them.
2021-04-07 19:38:00 -07:00
deletion_set: std.AutoArrayHashMapUnmanaged(*Decl, void) = .{},
2020-09-13 19:17:58 -07:00

/// Error tags and their values, tag names are duped with mod.gpa.
2021-03-28 19:38:19 -07:00
/// Corresponds with `error_name_list`.
global_error_set: std.StringHashMapUnmanaged(ErrorInt) = .{},
2020-09-13 19:17:58 -07:00

2021-03-28 19:38:19 -07:00
/// ErrorInt -> []const u8 for fast lookups for @intToError at comptime
/// Corresponds with `global_error_set`.
2021-03-26 17:54:41 -04:00
error_name_list: ArrayListUnmanaged([]const u8) = .{},

2020-09-13 19:17:58 -07:00
/// Incrementing integer used to compare against the corresponding Decl
/// field to determine whether a Decl's status applies to an ongoing update, or a
/// previous analysis.
generation: u32 = 0,

2020-09-28 15:42:09 -07:00
stage1_flags: packed struct {
    have_winmain: bool = false,
    have_wwinmain: bool = false,
    have_winmain_crt_startup: bool = false,
    have_wwinmain_crt_startup: bool = false,
    have_dllmain_crt_startup: bool = false,
    have_c_main: bool = false,
    reserved: u2 = 0,
} = .{},
2020-09-28 00:06:06 -07:00

2021-04-22 22:35:18 -07:00
job_queued_update_builtin_zig: bool = true,

2021-03-25 23:00:38 -07:00
compile_log_text: ArrayListUnmanaged(u8) = .{},
2021-01-16 22:51:01 -07:00

2021-04-26 20:41:07 -07:00
emit_h: ?*GlobalEmitH,
stage2: `zig test` now works with the LLVM backend
Frontend improvements:
* When compiling in `zig test` mode, put a task on the work queue to
analyze the main package root file. Normally, start code does
`_ = @import("root");` to make Zig analyze the user's code, however in
the case of `zig test`, the root source file is the test runner.
Without this change, no tests are picked up.
* In the main pipeline, once semantic analysis is finished, if there
are no compile errors, populate the `test_functions` Decl with the
set of test functions picked up from semantic analysis.
* Value: add `array` and `slice` Tags.
LLVM backend improvements:
* Fix incremental updates of globals. Previously the
value of a global would not get replaced with a new value.
* Fix LLVM type of arrays. They were incorrectly sending
the ABI size as the element count.
* Remove the FuncGen parameter from genTypedValue. This function is for
generating global constants and there is no function available when
it is being called.
- The `ref_val` case is now commented out. I'd like to eliminate
`ref_val` as one of the possible Value Tags. Instead it should
always be done via `decl_ref`.
* Implement constant value generation for slices, arrays, and structs.
* Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:06:42 -07:00
test_functions: std.AutoArrayHashMapUnmanaged(*Decl, void) = .{},

2021-08-05 16:37:21 -07:00
const MonomorphedFuncsSet = std.HashMapUnmanaged(
    *Fn,
    void,
    MonomorphedFuncsContext,
    std.hash_map.default_max_load_percentage,
);

const MonomorphedFuncsContext = struct {
    pub fn eql(ctx: @This(), a: *Fn, b: *Fn) bool {
        _ = ctx;
        return a == b;
    }

    /// Must match `Sema.GenericCallAdapter.hash`.
    pub fn hash(ctx: @This(), key: *Fn) u64 {
        _ = ctx;
        var hasher = std.hash.Wyhash.init(0);

        // The generic function Decl is guaranteed to be the first dependency
        // of each of its instantiations.
        const generic_owner_decl = key.owner_decl.dependencies.keys()[0];
        const generic_func = generic_owner_decl.val.castTag(.function).?.data;
        std.hash.autoHash(&hasher, @ptrToInt(generic_func));

        // This logic must be kept in sync with the logic in `analyzeCall` that
        // computes the hash.
        const comptime_args = key.comptime_args.?;
        const generic_ty_info = generic_owner_decl.ty.fnInfo();
        for (generic_ty_info.param_types) |param_ty, i| {
            if (generic_ty_info.paramIsComptime(i) and param_ty.tag() != .generic_poison) {
                comptime_args[i].val.hash(param_ty, &hasher);
            }
        }

        return hasher.final();
    }
};
2021-08-21 20:42:45 -07:00
pub const MemoizedCallSet = std.HashMapUnmanaged(
    MemoizedCall.Key,
    MemoizedCall.Result,
    MemoizedCall,
    std.hash_map.default_max_load_percentage,
);

pub const MemoizedCall = struct {
    pub const Key = struct {
        func: *Fn,
        args: []TypedValue,
    };

    pub const Result = struct {
        val: Value,
        arena: std.heap.ArenaAllocator.State,
    };

    pub fn eql(ctx: @This(), a: Key, b: Key) bool {
        _ = ctx;

        if (a.func != b.func) return false;

        assert(a.args.len == b.args.len);
        for (a.args) |a_arg, arg_i| {
            const b_arg = b.args[arg_i];
            if (!a_arg.eql(b_arg)) {
                return false;
            }
        }

        return true;
    }

    /// Must match `Sema.GenericCallAdapter.hash`.
    pub fn hash(ctx: @This(), key: Key) u64 {
        _ = ctx;

        var hasher = std.hash.Wyhash.init(0);

        // The generic function Decl is guaranteed to be the first dependency
        // of each of its instantiations.
        std.hash.autoHash(&hasher, @ptrToInt(key.func));

        // This logic must be kept in sync with the logic in `analyzeCall` that
        // computes the hash.
        for (key.args) |arg| {
            arg.hash(&hasher);
        }

        return hasher.final();
    }
};
2021-04-26 20:41:07 -07:00
/// A `Module` has zero or one of these depending on whether `-femit-h` is enabled.
pub const GlobalEmitH = struct {
    /// Where to put the output.
    loc: Compilation.EmitLoc,
    /// When emit_h is non-null, each Decl gets one more compile error slot for
    /// emit-h failing for that Decl. This table is also how we tell if a Decl has
    /// failed emit-h or succeeded.
    failed_decls: std.AutoArrayHashMapUnmanaged(*Decl, *ErrorMsg) = .{},
    /// Tracks all decls in order to iterate over them and emit .h code for them.
    decl_table: std.AutoArrayHashMapUnmanaged(*Decl, void) = .{},
};

2021-03-28 19:38:19 -07:00
pub const ErrorInt = u32;

2020-09-13 19:17:58 -07:00
pub const Export = struct {
    options: std.builtin.ExportOptions,
2021-03-15 23:38:38 -07:00
|
|
|
src: LazySrcLoc,
|
2020-09-13 19:17:58 -07:00
|
|
|
/// Represents the position of the export, if any, in the output file.
|
2020-10-07 20:32:02 +02:00
|
|
|
link: link.File.Export,
|
2020-09-13 19:17:58 -07:00
|
|
|
/// The Decl that performs the export. Note that this is *not* the Decl being exported.
|
|
|
|
|
owner_decl: *Decl,
|
|
|
|
|
/// The Decl being exported. Note this is *not* the Decl performing the export.
|
|
|
|
|
exported_decl: *Decl,
|
|
|
|
|
status: enum {
|
|
|
|
|
in_progress,
|
|
|
|
|
failed,
|
|
|
|
|
/// Indicates that the failure was due to a temporary issue, such as an I/O error
|
|
|
|
|
/// when writing to the output file. Retrying the export may succeed.
|
|
|
|
|
failed_retryable,
|
|
|
|
|
complete,
|
|
|
|
|
},
|
2021-05-14 17:41:22 -07:00
|
|
|
|
|
|
|
|
pub fn getSrcLoc(exp: Export) SrcLoc {
|
|
|
|
|
return .{
|
|
|
|
|
.file_scope = exp.owner_decl.namespace.file_scope,
|
|
|
|
|
.parent_decl_node = exp.owner_decl.src_node,
|
|
|
|
|
.lazy = exp.src,
|
|
|
|
|
};
|
|
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
};
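The `getSrcLoc` function above pairs a stored `LazySrcLoc` with the context needed to resolve it, so the actual position is only computed when an error is reported. A minimal sketch of that idea follows. The names `LazyLoc`, `Loc`, and `byteOffset` are hypothetical, and it uses plain byte offsets as a simplification rather than the AST-node-relative encoding the real type may use.

```zig
const std = @import("std");

/// Hypothetical, simplified stand-in for a lazily resolved source location.
const LazyLoc = union(enum) {
    /// Absolute byte offset into the file; nothing left to resolve.
    byte_abs: u32,
    /// Byte offset relative to the start of the parent declaration.
    byte_offset: u32,
};

/// Hypothetical stand-in for the context required to resolve a LazyLoc.
const Loc = struct {
    /// Byte offset at which the parent declaration starts (assumed known).
    parent_start: u32,
    lazy: LazyLoc,

    /// Meant to be called only on the error path, so the resolution work
    /// is skipped entirely when no compile error occurs.
    fn byteOffset(loc: Loc) u32 {
        return switch (loc.lazy) {
            .byte_abs => |off| off,
            .byte_offset => |off| loc.parent_start + off,
        };
    }
};

test "resolve a relative lazy location" {
    const loc = Loc{ .parent_start = 100, .lazy = .{ .byte_offset = 7 } };
    try std.testing.expectEqual(@as(u32, 107), loc.byteOffset());
}
```

The stored state is a tagged 4-byte payload; everything else is recomputed from the owning declaration when needed.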
|
|
|
|
|
|
2021-01-05 17:33:31 -07:00
|
|
|
/// When the Module `emit_h` field is non-null, each Decl is allocated via this struct, so that
|
|
|
|
|
/// there can be EmitH state attached to each Decl.
|
|
|
|
|
pub const DeclPlusEmitH = struct {
|
|
|
|
|
decl: Decl,
|
|
|
|
|
emit_h: EmitH,
|
|
|
|
|
};
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
pub const Decl = struct {
|
2021-05-07 18:52:11 -07:00
|
|
|
/// Allocated with Module's allocator; outlives the ZIR code.
|
2020-09-13 19:17:58 -07:00
|
|
|
name: [*:0]const u8,
|
2021-04-27 18:36:12 -07:00
|
|
|
/// The most recent Type of the Decl after a successful semantic analysis.
|
|
|
|
|
/// Populated when `has_tv`.
|
|
|
|
|
ty: Type,
|
|
|
|
|
/// The most recent Value of the Decl after a successful semantic analysis.
|
|
|
|
|
/// Populated when `has_tv`.
|
|
|
|
|
val: Value,
|
|
|
|
|
/// Populated when `has_tv`.
|
|
|
|
|
align_val: Value,
|
|
|
|
|
/// Populated when `has_tv`.
|
|
|
|
|
linksection_val: Value,
|
|
|
|
|
/// The memory for ty, val, align_val, linksection_val.
|
|
|
|
|
/// If this is `null` then there is no memory management needed.
|
|
|
|
|
value_arena: ?*std.heap.ArenaAllocator.State = null,
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
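The updated-file detection described in the commit message above can be sketched as a comparison between cached and fresh file metadata. This is only an illustration: `CachedStat`, `isStale`, and `countUpdated` are hypothetical names, the fresh values are passed in rather than obtained from a real stat call, and the source hash mentioned in the message is left out.

```zig
const std = @import("std");

/// Hypothetical cached metadata for one imported file.
const CachedStat = struct {
    mtime: i128,
    inode: u64,
    size: u64,

    /// A file is treated as updated when any cached field differs
    /// from the freshly observed value.
    fn isStale(cached: CachedStat, fresh: CachedStat) bool {
        return cached.mtime != fresh.mtime or
            cached.inode != fresh.inode or
            cached.size != fresh.size;
    }
};

/// Counts how many files changed. `cached` and `fresh` are parallel
/// slices standing in for iterating an import table and stat-ing each file.
fn countUpdated(cached: []const CachedStat, fresh: []const CachedStat) usize {
    std.debug.assert(cached.len == fresh.len);
    var n: usize = 0;
    var i: usize = 0;
    while (i < cached.len) : (i += 1) {
        if (cached[i].isStale(fresh[i])) n += 1;
    }
    return n;
}

test "only the modified file is reported" {
    const old = [_]CachedStat{
        .{ .mtime = 1, .inode = 10, .size = 100 },
        .{ .mtime = 2, .inode = 11, .size = 200 },
    };
    const new = [_]CachedStat{
        .{ .mtime = 1, .inode = 10, .size = 100 },
        .{ .mtime = 5, .inode = 11, .size = 210 },
    };
    try std.testing.expectEqual(@as(usize, 1), countUpdated(&old, &new));
}
```

Comparing cheap metadata first means file contents only need to be read for entries that look changed.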
|
|
|
/// The direct parent namespace of the Decl.
|
2020-09-13 19:17:58 -07:00
|
|
|
/// Reference to externally owned memory.
|
2021-05-04 11:08:40 -07:00
|
|
|
/// In the case of the Decl corresponding to a file, this is
|
|
|
|
|
/// the namespace of the struct, since there is no parent.
|
2021-04-09 23:17:50 -07:00
|
|
|
namespace: *Scope.Namespace,
|
2021-04-08 20:37:19 -07:00
|
|
|
|
|
|
|
|
/// An integer that can be checked against the corresponding incrementing
|
|
|
|
|
/// generation field of Module. This is used to determine whether `complete` status
|
|
|
|
|
/// represents pre- or post- re-analysis.
|
|
|
|
|
generation: u32,
|
2021-04-09 23:17:50 -07:00
|
|
|
/// The AST node index of this declaration.
|
2020-09-13 19:17:58 -07:00
|
|
|
/// Must be recomputed when the corresponding source file is modified.
|
2021-08-30 19:22:04 -07:00
|
|
|
src_node: Ast.Node.Index,
|
2021-05-01 21:57:52 -07:00
|
|
|
/// Line number corresponding to `src_node`. Stored separately so that source files
|
|
|
|
|
/// do not need to be loaded into memory in order to compute debug line numbers.
|
|
|
|
|
src_line: u32,
|
2021-04-28 16:55:22 -07:00
|
|
|
/// Index into the ZIR `extra` array of the entry in the parent's decl structure
|
|
|
|
|
/// (the part that says "for every decls_len"). The first item at this index is
|
2021-05-01 21:57:52 -07:00
|
|
|
/// the contents hash, followed by line, name, etc.
|
2021-05-05 13:16:14 -07:00
|
|
|
/// For anonymous decls and also the root Decl for a File, this is 0.
|
2021-04-28 16:55:22 -07:00
|
|
|
zir_decl_index: Zir.Inst.Index,
|
2021-04-08 20:37:19 -07:00
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
/// Represents the "shallow" analysis status. For example, for decls that are functions,
|
|
|
|
|
/// the function type is analyzed with this set to `in_progress`, however, the semantic
|
|
|
|
|
/// analysis of the function body is performed with this value set to `success`. Functions
|
|
|
|
|
/// have their own analysis status field.
|
|
|
|
|
analysis: enum {
|
|
|
|
|
/// This Decl corresponds to an AST Node that has not been referenced yet, and therefore
|
|
|
|
|
/// because of Zig's lazy declaration analysis, it will remain unanalyzed until referenced.
|
|
|
|
|
unreferenced,
|
2021-05-11 17:34:13 -07:00
|
|
|
/// Semantic analysis for this Decl is running right now.
|
|
|
|
|
/// This state detects dependency loops.
|
2020-09-13 19:17:58 -07:00
|
|
|
in_progress,
|
2021-05-11 17:34:13 -07:00
|
|
|
/// The file corresponding to this Decl had a parse error or ZIR error.
|
|
|
|
|
/// There will be a corresponding ErrorMsg in Module.failed_files.
|
|
|
|
|
file_failure,
|
2020-09-13 19:17:58 -07:00
|
|
|
/// This Decl might be OK but it depends on another one which did not successfully complete
|
|
|
|
|
/// semantic analysis.
|
|
|
|
|
dependency_failure,
|
|
|
|
|
/// Semantic analysis failure.
|
|
|
|
|
/// There will be a corresponding ErrorMsg in Module.failed_decls.
|
|
|
|
|
sema_failure,
|
|
|
|
|
/// There will be a corresponding ErrorMsg in Module.failed_decls.
|
|
|
|
|
/// This indicates the failure was something like running out of disk space,
|
|
|
|
|
/// and attempting semantic analysis again may succeed.
|
|
|
|
|
sema_failure_retryable,
|
|
|
|
|
/// There will be a corresponding ErrorMsg in Module.failed_decls.
|
|
|
|
|
codegen_failure,
|
|
|
|
|
/// There will be a corresponding ErrorMsg in Module.failed_decls.
|
|
|
|
|
/// This indicates the failure was something like running out of disk space,
|
|
|
|
|
/// and attempting codegen again may succeed.
|
|
|
|
|
codegen_failure_retryable,
|
|
|
|
|
/// Everything is done. During an update, this Decl may be out of date, depending
|
|
|
|
|
/// on its dependencies. The `generation` field can be used to determine if this
|
|
|
|
|
/// completion status occurred before or after a given update.
|
|
|
|
|
complete,
|
|
|
|
|
/// A Module update is in progress, and this Decl has been flagged as being known
|
|
|
|
|
/// to require re-analysis.
|
|
|
|
|
outdated,
|
|
|
|
|
},
|
2021-04-27 18:36:12 -07:00
|
|
|
/// Whether `ty`, `val`, `align_val`, and `linksection_val` are populated.
|
|
|
|
|
has_tv: bool,
|
2021-05-11 14:17:52 -07:00
|
|
|
/// If `true` it means the `Decl` is the resource owner of the type/value associated
|
|
|
|
|
/// with it. That means when `Decl` is destroyed, the cleanup code should additionally
|
|
|
|
|
/// check if the value owns a `Namespace`, and destroy that too.
|
|
|
|
|
owns_tv: bool,
|
2021-04-07 19:38:00 -07:00
|
|
|
/// This flag is set when this Decl is added to `Module.deletion_set`, and cleared
|
2020-09-13 19:17:58 -07:00
|
|
|
/// when removed.
|
|
|
|
|
deletion_flag: bool,
|
|
|
|
|
/// Whether the corresponding AST decl has a `pub` keyword.
|
|
|
|
|
is_pub: bool,
|
2021-04-27 18:36:12 -07:00
|
|
|
/// Whether the corresponding AST decl has an `export` keyword.
|
|
|
|
|
is_exported: bool,
|
2021-04-28 16:55:22 -07:00
|
|
|
/// Whether the ZIR code provides an align instruction.
|
|
|
|
|
has_align: bool,
|
|
|
|
|
/// Whether the ZIR code provides a linksection instruction.
|
|
|
|
|
has_linksection: bool,
|
stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must set the corresponding Decl's `alive`
field to true.
This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
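The cooperation between frontend and backend described in the commit message above can be sketched roughly as follows. `MiniDecl`, `markAlive`, and `processCodegenJob` are hypothetical stand-ins, and "sending to the linker" and "deleting" are reduced to setting flags so the control flow is visible.

```zig
const std = @import("std");

/// Hypothetical, pared-down Decl holding only what this sketch needs.
const MiniDecl = struct {
    /// Decls backed by an AST node would start as `true`;
    /// anonymous Decls would start as `false`.
    alive: bool,
    sent_to_linker: bool = false,
    deleted: bool = false,
};

/// Stand-in for a backend encountering a `decl_ref` Value
/// that points at `decl`.
fn markAlive(decl: *MiniDecl) void {
    decl.alive = true;
}

/// Stand-in for handling a `codegen_decl` job from the work queue:
/// alive Decls go to the linker, all others are dropped on the spot.
fn processCodegenJob(decl: *MiniDecl) void {
    if (decl.alive) {
        decl.sent_to_linker = true;
    } else {
        decl.deleted = true;
    }
}

test "unreferenced anonymous decl is deleted" {
    var referenced = MiniDecl{ .alive = false };
    var unused = MiniDecl{ .alive = false };

    // Only `referenced` is seen through a decl_ref during codegen.
    markAlive(&referenced);

    processCodegenJob(&referenced);
    processCodegenJob(&unused);

    try std.testing.expect(referenced.sent_to_linker);
    try std.testing.expect(unused.deleted);
}
```

Because marking happens while the owner's machine code is generated, the sweep decision can be made as soon as the job for the anonymous Decl is dequeued.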
|
|
|
/// Flag used by garbage collection to mark and sweep.
|
|
|
|
|
/// Decls which correspond to an AST node always have this field set to `true`.
|
|
|
|
|
/// Anonymous Decls are initialized with this field set to `false` and then it
|
|
|
|
|
/// is the responsibility of machine code backends to mark it `true` whenever
|
|
|
|
|
/// a `decl_ref` Value is encountered that points to this Decl.
|
|
|
|
|
/// When the `codegen_decl` job is encountered in the main work queue, if the
|
|
|
|
|
/// Decl is marked alive, then it sends the Decl to the linker. Otherwise it
|
|
|
|
|
/// deletes the Decl on the spot.
|
|
|
|
|
alive: bool,
|
2021-08-28 15:35:59 -07:00
|
|
|
/// Whether the Decl is a `usingnamespace` declaration.
|
|
|
|
|
is_usingnamespace: bool,
|
2020-09-13 19:17:58 -07:00
|
|
|
|
|
|
|
|
/// Represents the position of the code in the output file.
|
|
|
|
|
/// This is populated regardless of semantic analysis and code generation.
|
|
|
|
|
link: link.File.LinkBlock,
|
|
|
|
|
|
|
|
|
|
/// Represents the function in the linked output file, if the `Decl` is a function.
|
|
|
|
|
/// This is stored here and not in `Fn` because `Decl` survives across updates but
|
|
|
|
|
/// `Fn` does not.
|
|
|
|
|
/// TODO Look into making `Fn` a longer lived structure and moving this field there
|
|
|
|
|
/// to save on memory usage.
|
|
|
|
|
fn_link: link.File.LinkFn,
|
|
|
|
|
|
|
|
|
|
/// The shallow set of other decls whose typed_value could possibly change if this Decl's
|
|
|
|
|
/// typed_value is modified.
|
|
|
|
|
dependants: DepsTable = .{},
|
|
|
|
|
/// The shallow set of other decls whose typed_value changing indicates that this Decl's
|
|
|
|
|
/// typed_value may need to be regenerated.
|
|
|
|
|
dependencies: DepsTable = .{},
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
pub const DepsTable = std.AutoArrayHashMapUnmanaged(*Decl, void);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-04-29 20:34:33 -07:00
|
|
|
pub fn clearName(decl: *Decl, gpa: *Allocator) void {
|
2021-05-07 18:52:11 -07:00
|
|
|
gpa.free(mem.spanZ(decl.name));
|
2021-04-29 20:34:33 -07:00
|
|
|
decl.name = undefined;
|
|
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
pub fn destroy(decl: *Decl, module: *Module) void {
|
2021-01-05 17:33:31 -07:00
|
|
|
const gpa = module.gpa;
|
2021-05-07 16:05:44 -07:00
|
|
|
log.debug("destroy {*} ({s})", .{ decl, decl.name });
|
stage2: `zig test` now works with the LLVM backend
Frontend improvements:
* When compiling in `zig test` mode, put a task on the work queue to
analyze the main package root file. Normally, start code does
`_ = @import("root");` to make Zig analyze the user's code, however in
the case of `zig test`, the root source file is the test runner.
Without this change, no tests are picked up.
* In the main pipeline, once semantic analysis is finished, if there
are no compile errors, populate the `test_functions` Decl with the
set of test functions picked up from semantic analysis.
* Value: add `array` and `slice` Tags.
LLVM backend improvements:
* Fix incremental updates of globals. Previously the
value of a global would not get replaced with a new value.
* Fix LLVM type of arrays. They were incorrectly sending
the ABI size as the element count.
* Remove the FuncGen parameter from genTypedValue. This function is for
generating global constants and there is no function available when
it is being called.
- The `ref_val` case is now commented out. I'd like to eliminate
`ref_val` as one of the possible Value Tags. Instead it should
always be done via `decl_ref`.
* Implement constant value generation for slices, arrays, and structs.
* Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:06:42 -07:00
|
|
|
_ = module.test_functions.swapRemove(decl);
|
2021-05-11 22:12:36 -07:00
|
|
|
if (decl.deletion_flag) {
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(module.deletion_set.swapRemove(decl));
|
2021-05-11 22:12:36 -07:00
|
|
|
}
|
2021-04-27 18:36:12 -07:00
|
|
|
if (decl.has_tv) {
|
2021-05-07 18:52:11 -07:00
|
|
|
if (decl.getInnerNamespace()) |namespace| {
|
stage2: fix deletion of Decls that get re-referenced
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.
However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.
The solution is, when dealing with no-longer-referenced Decl
objects during an update, to clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
2021-05-18 12:35:36 -07:00
|
|
|
namespace.destroyDecls(module);
|
2021-03-20 22:40:08 -07:00
|
|
|
}
|
2021-04-28 22:43:26 -07:00
|
|
|
decl.clearValues(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
decl.dependants.deinit(gpa);
|
|
|
|
|
decl.dependencies.deinit(gpa);
|
2021-05-11 22:12:36 -07:00
|
|
|
decl.clearName(gpa);
|
2021-01-05 17:33:31 -07:00
|
|
|
if (module.emit_h != null) {
|
2021-03-15 23:38:38 -07:00
|
|
|
const decl_plus_emit_h = @fieldParentPtr(DeclPlusEmitH, "decl", decl);
|
2021-01-05 17:33:31 -07:00
|
|
|
decl_plus_emit_h.emit_h.fwd_decl.deinit(gpa);
|
|
|
|
|
gpa.destroy(decl_plus_emit_h);
|
|
|
|
|
} else {
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.
Now there are more types involved into source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have less
than 2 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
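The layout described in the commit message above (one tag plus 8 bytes of data per instruction, with larger payloads reached through a 4-byte index into `extra`) can be sketched roughly as follows. This is a minimal illustration, not the definitions from zir.zig: `Tag`, `PlNode`, `Data`, `Code`, `addOperands`, and `exampleSum` are hypothetical names, and the sketch assumes a recent Zig release rather than the 2021 compiler this listing targets.

```zig
const std = @import("std");

// Hypothetical, heavily simplified stand-ins; the real definitions live in zir.zig.
const Tag = enum(u8) { int, add };

// Payload shape for instructions too large for 8 bytes:
// 4 bytes of source info plus a 4-byte index into `extra`.
const PlNode = extern struct { src_node: i32, payload_index: u32 };

// Every instruction gets exactly 8 bytes; the tag says which field is valid.
const Data = extern union { int: u64, pl_node: PlNode };

comptime {
    std.debug.assert(@sizeOf(Data) == 8);
}

const Code = struct {
    tags: []const Tag,
    data: []const Data,
    extra: []const u32,
};

// Reads the two operand instruction indexes of an `add` out of `extra`.
fn addOperands(code: Code, inst: usize) [2]u32 {
    std.debug.assert(code.tags[inst] == .add);
    const i = code.data[inst].pl_node.payload_index;
    return .{ code.extra[i], code.extra[i + 1] };
}

// Builds a three-instruction program: %0 = 40, %1 = 2, %2 = add(%0, %1),
// then evaluates %2 by following its operands through `extra`.
fn exampleSum() u64 {
    const tags = [_]Tag{ .int, .int, .add };
    const data = [_]Data{
        .{ .int = 40 },
        .{ .int = 2 },
        .{ .pl_node = .{ .src_node = 0, .payload_index = 0 } },
    };
    const extra = [_]u32{ 0, 1 };
    const code = Code{ .tags = &tags, .data = &data, .extra = &extra };
    const ops = addOperands(code, 2);
    return code.data[ops[0]].int + code.data[ops[1]].int;
}

test "tag selects how the 8 bytes are read" {
    try std.testing.expectEqual(@as(u64, 42), exampleSum());
}
```

Keeping tags and data in separate flat arrays means no instruction is allocated on its own, which is the point the commit message makes about avoiding per-instruction allocation.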
            gpa.destroy(decl);
        }
    }

    pub fn clearValues(decl: *Decl, gpa: *Allocator) void {
        if (decl.getFunction()) |func| {
            func.deinit(gpa);
            gpa.destroy(func);
        }
        if (decl.getVariable()) |variable| {
            gpa.destroy(variable);
        }
        if (decl.value_arena) |arena_state| {
            arena_state.promote(gpa).deinit();
            decl.value_arena = null;
            decl.has_tv = false;
            decl.owns_tv = false;
        }
    }

    pub fn finalizeNewArena(decl: *Decl, arena: *std.heap.ArenaAllocator) !void {
        assert(decl.value_arena == null);
        const arena_state = try arena.allocator.create(std.heap.ArenaAllocator.State);
        arena_state.* = arena.state;
        decl.value_arena = arena_state;
    }

    /// This name is relative to the containing namespace of the decl.
    /// The memory is owned by the containing File ZIR.
    pub fn getName(decl: Decl) ?[:0]const u8 {
        const zir = decl.namespace.file_scope.zir;
        return decl.getNameZir(zir);
    }

    pub fn getNameZir(decl: Decl, zir: Zir) ?[:0]const u8 {
        assert(decl.zir_decl_index != 0);
        const name_index = zir.extra[decl.zir_decl_index + 5];
        if (name_index <= 1) return null;
        return zir.nullTerminatedString(name_index);
    }

    pub fn contentsHash(decl: Decl) std.zig.SrcHash {
        const zir = decl.namespace.file_scope.zir;
        return decl.contentsHashZir(zir);
    }

    pub fn contentsHashZir(decl: Decl, zir: Zir) std.zig.SrcHash {
        assert(decl.zir_decl_index != 0);
        const hash_u32s = zir.extra[decl.zir_decl_index..][0..4];
        const contents_hash = @bitCast(std.zig.SrcHash, hash_u32s.*);
        return contents_hash;
    }
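
The way `contentsHashZir` above reinterprets four consecutive `u32` words of `extra` as a hash can be sketched in isolation. This is only an illustration under stated assumptions: `SrcHash` here is a local stand-in assumed to be 16 bytes, `hashAt` is a hypothetical helper, and the single-argument `@bitCast` form is the one used by recent Zig releases, not the two-argument form in the listing.

```zig
const std = @import("std");

// Local stand-in for `std.zig.SrcHash`, assumed here to be 16 bytes.
const SrcHash = [16]u8;

// Hypothetical helper: views the four u32 words starting at `index` as a hash.
fn hashAt(extra: []const u32, index: usize) SrcHash {
    // Slicing with comptime-known bounds yields a pointer to a fixed-size array.
    const hash_u32s = extra[index..][0..4];
    // Same size on both sides (4 * 4 bytes == 16 bytes), so a bit cast is valid.
    return @bitCast(hash_u32s.*);
}

test "hash bytes match the underlying words" {
    const extra = [_]u32{ 99, 1, 2, 3, 4, 77 };
    const h = hashAt(&extra, 1);
    try std.testing.expectEqualSlices(u8, std.mem.asBytes(extra[1..5]), &h);
}
```

The comparison goes through `std.mem.asBytes` rather than hard-coded byte values, so the check does not depend on the endianness of the host.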

    pub fn zirBlockIndex(decl: *const Decl) Zir.Inst.Index {
        assert(decl.zir_decl_index != 0);
        const zir = decl.namespace.file_scope.zir;
        return zir.extra[decl.zir_decl_index + 6];
    }

    pub fn zirAlignRef(decl: Decl) Zir.Inst.Ref {
        if (!decl.has_align) return .none;
        assert(decl.zir_decl_index != 0);
        const zir = decl.namespace.file_scope.zir;
        return @intToEnum(Zir.Inst.Ref, zir.extra[decl.zir_decl_index + 6]);
    }

    pub fn zirLinksectionRef(decl: Decl) Zir.Inst.Ref {
        if (!decl.has_linksection) return .none;
        assert(decl.zir_decl_index != 0);
        const zir = decl.namespace.file_scope.zir;
        const extra_index = decl.zir_decl_index + 6 + @boolToInt(decl.has_align);
        return @intToEnum(Zir.Inst.Ref, zir.extra[extra_index]);
    }

    /// Returns true if and only if the Decl is the top level struct associated with a File.
    pub fn isRoot(decl: *const Decl) bool {
        if (decl.namespace.parent != null)
            return false;
        return decl == decl.namespace.ty.getOwnerDecl();
    }

    pub fn relativeToLine(decl: Decl, offset: u32) u32 {
        return decl.src_line + offset;
    }

    pub fn relativeToNodeIndex(decl: Decl, offset: i32) Ast.Node.Index {
        return @bitCast(Ast.Node.Index, offset + @bitCast(i32, decl.src_node));
    }

    pub fn nodeIndexToRelative(decl: Decl, node_index: Ast.Node.Index) i32 {
        return @bitCast(i32, node_index) - @bitCast(i32, decl.src_node);
    }
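
The two helpers above are inverses of each other: one turns an absolute AST node index into a signed offset from the Decl's own node, the other turns it back. A minimal standalone sketch of that round trip follows. It assumes node indexes are `u32` (the local `NodeIndex` alias is a stand-in for `Ast.Node.Index`), passes the Decl's node as a plain parameter instead of reading `decl.src_node`, and uses the cast syntax of recent Zig releases.

```zig
const std = @import("std");

// Stand-in for `Ast.Node.Index`, assumed here to be a u32.
const NodeIndex = u32;

// Signed distance from the Decl's node to `node_index`.
fn nodeIndexToRelative(src_node: NodeIndex, node_index: NodeIndex) i32 {
    return @as(i32, @bitCast(node_index)) - @as(i32, @bitCast(src_node));
}

// Recovers the absolute node index from a stored relative offset.
fn relativeToNodeIndex(src_node: NodeIndex, offset: i32) NodeIndex {
    return @bitCast(offset + @as(i32, @bitCast(src_node)));
}

test "relative offsets round-trip" {
    const decl_node: NodeIndex = 10;
    const rel = nodeIndexToRelative(decl_node, 7);
    try std.testing.expectEqual(@as(i32, -3), rel);
    try std.testing.expectEqual(@as(NodeIndex, 7), relativeToNodeIndex(decl_node, rel));
}
```

Because only the offset is stored, a `LazySrcLoc` built from it stays small, and the absolute position is computed only when an error actually has to be reported.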

    pub fn tokSrcLoc(decl: Decl, token_index: Ast.TokenIndex) LazySrcLoc {
        return .{ .token_offset = token_index - decl.srcToken() };
    }

    pub fn nodeSrcLoc(decl: Decl, node_index: Ast.Node.Index) LazySrcLoc {
        return .{ .node_offset = decl.nodeIndexToRelative(node_index) };
    }

    pub fn srcLoc(decl: Decl) SrcLoc {
        return decl.nodeOffsetSrcLoc(0);
    }

    pub fn nodeOffsetSrcLoc(decl: Decl, node_offset: i32) SrcLoc {
        return .{
            .file_scope = decl.getFileScope(),
            .parent_decl_node = decl.src_node,
            .lazy = .{ .node_offset = node_offset },
        };
    }

    pub fn srcToken(decl: Decl) Ast.TokenIndex {
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
        const tree = &decl.namespace.file_scope.tree;
        return tree.firstToken(decl.src_node);
    }

    pub fn srcByteOffset(decl: Decl) u32 {
        const tree = &decl.namespace.file_scope.tree;
        return tree.tokens.items(.start)[decl.srcToken()];
    }

stage2: implement simple enums
A simple enum is an enum which has an automatic integer tag type,
all tag values automatically assigned, and no top level declarations.
Such enums are created directly in AstGen and shared by all the
generic/comptime instantiations of the surrounding ZIR code. This
commit implements, but does not yet add any test cases for, simple enums.
A full enum is an enum for which any of the above conditions are not
true. Full enums are created in Sema, and therefore will create a unique
type per generic/comptime instantiation. This commit does not implement
full enums. However the `enum_decl_nonexhaustive` ZIR instruction is
added and the respective Type functions are filled out.
This commit makes an improvement to ZIR code, removing the decls array
and removing the decl_map from AstGen. Instead, decl_ref and
decl_val ZIR instructions index into the `owner_decl.dependencies`
ArrayHashMap. We already need this dependencies array for incremental
compilation purposes, and so repurposing it to also use it for ZIR decl
indexes makes for efficient memory usage.
Similarly, this commit fixes up incorrect memory management by removing
the `const` ZIR instruction. The two places it was used stored memory in
the AstGen arena, which may get freed after Sema. Now it properly sets
up a new anonymous Decl for error sets and uses a normal decl_val
instruction.
The other usage of `const` ZIR instruction was float literals. These are
now changed to use `float` ZIR instruction when the value fits inside
`zir.Inst.Data` and `float128` otherwise.
AstGen + Sema: implement int_to_enum and enum_to_int. No tests yet; I expect to
have to make some fixes before they will pass tests. Will do that in the
branch before merging.
AstGen: fix struct astgen incorrectly counting decls as fields.
Type/Value: give up on trying to exhaustively list every tag all the
time. This makes the file more manageable. Also found a bug with
i128/u128 this way, since the name of the function was more obvious when
looking at the tag values.
Type: implement abiAlignment and abiSize for structs. This will need to
get more sophisticated at some point, but for now it is progress.
Value: add new `enum_field_index` tag.
Value: add hash_u32, needed when using ArrayHashMap.
2021-04-06 17:43:56 -07:00
    pub fn renderFullyQualifiedName(decl: Decl, writer: anytype) !void {
        const unqualified_name = mem.spanZ(decl.name);
        return decl.namespace.renderFullyQualifiedName(unqualified_name, writer);
    }

    pub fn getFullyQualifiedName(decl: Decl, gpa: *Allocator) ![]u8 {
        var buffer = std.ArrayList(u8).init(gpa);
        defer buffer.deinit();
        try decl.renderFullyQualifiedName(buffer.writer());
        return buffer.toOwnedSlice();
    }

    pub fn typedValue(decl: Decl) error{AnalysisFail}!TypedValue {
        if (!decl.has_tv) return error.AnalysisFail;
        return TypedValue{
            .ty = decl.ty,
            .val = decl.val,
        };
    }

    pub fn value(decl: *Decl) error{AnalysisFail}!Value {
        return (try decl.typedValue()).val;
    }

    pub fn isFunction(decl: *Decl) !bool {
        const tv = try decl.typedValue();
        return tv.ty.zigTypeTag() == .Fn;
    }

    /// If the Decl has a value and it is a struct, return it,
    /// otherwise null.
    pub fn getStruct(decl: *Decl) ?*Struct {
        if (!decl.owns_tv) return null;
        const ty = (decl.val.castTag(.ty) orelse return null).data;
        const struct_obj = (ty.castTag(.@"struct") orelse return null).data;
        assert(struct_obj.owner_decl == decl);
        return struct_obj;
    }

    /// If the Decl has a value and it is a union, return it,
    /// otherwise null.
    pub fn getUnion(decl: *Decl) ?*Union {
        if (!decl.owns_tv) return null;
        const ty = (decl.val.castTag(.ty) orelse return null).data;
        const union_obj = (ty.cast(Type.Payload.Union) orelse return null).data;
        assert(union_obj.owner_decl == decl);
        return union_obj;
    }

    /// If the Decl has a value and it is a function, return it,
    /// otherwise null.
    pub fn getFunction(decl: *Decl) ?*Fn {
        if (!decl.owns_tv) return null;
        const func = (decl.val.castTag(.function) orelse return null).data;
        assert(func.owner_decl == decl);
        return func;
    }

    pub fn getVariable(decl: *Decl) ?*Var {
        if (!decl.owns_tv) return null;
        const variable = (decl.val.castTag(.variable) orelse return null).data;
        assert(variable.owner_decl == decl);
        return variable;
    }

    /// Gets the namespace that this Decl creates by being a struct, union,
    /// enum, or opaque.
    /// Only returns it if the Decl is the owner.
    pub fn getInnerNamespace(decl: *Decl) ?*Scope.Namespace {
        if (!decl.owns_tv) return null;
        const ty = (decl.val.castTag(.ty) orelse return null).data;
        switch (ty.tag()) {
            .@"struct" => {
                const struct_obj = ty.castTag(.@"struct").?.data;
                assert(struct_obj.owner_decl == decl);
                return &struct_obj.namespace;
            },
            .enum_full, .enum_nonexhaustive => {
                const enum_obj = ty.cast(Type.Payload.EnumFull).?.data;
                assert(enum_obj.owner_decl == decl);
                return &enum_obj.namespace;
            },
            .empty_struct => {
                return ty.castTag(.empty_struct).?.data;
            },
            .@"opaque" => {
                @panic("TODO opaque types");
            },
            .@"union", .union_tagged => {
                const union_obj = ty.cast(Type.Payload.Union).?.data;
                assert(union_obj.owner_decl == decl);
                return &union_obj.namespace;
            },

            else => return null,
        }
    }

2021-03-15 23:38:38 -07:00
|
|
|
pub fn dump(decl: *Decl) void {
|
|
|
|
|
const loc = std.zig.findLineColumn(decl.scope.source.bytes, decl.src);
|
2021-01-02 19:03:14 -07:00
|
|
|
std.debug.print("{s}:{d}:{d} name={s} status={s}", .{
|
2021-03-15 23:38:38 -07:00
|
|
|
decl.scope.sub_file_path,
|
2020-09-13 19:17:58 -07:00
|
|
|
loc.line + 1,
|
|
|
|
|
loc.column + 1,
|
2021-03-15 23:38:38 -07:00
|
|
|
mem.spanZ(decl.name),
|
|
|
|
|
@tagName(decl.analysis),
|
2020-09-13 19:17:58 -07:00
|
|
|
});
|
2021-04-27 18:36:12 -07:00
|
|
|
if (decl.has_tv) {
|
|
|
|
|
std.debug.print(" ty={} val={}", .{ decl.ty, decl.val });
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
std.debug.print("\n", .{});
|
|
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
pub fn getFileScope(decl: Decl) *Scope.File {
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope`; the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
|
|
|
return decl.namespace.file_scope;
|
2021-01-01 19:24:02 -07:00
|
|
|
}
|
|
|
|
|
|
2021-01-05 17:33:31 -07:00
|
|
|
pub fn getEmitH(decl: *Decl, module: *Module) *EmitH {
|
|
|
|
|
assert(module.emit_h != null);
|
|
|
|
|
const decl_plus_emit_h = @fieldParentPtr(DeclPlusEmitH, "decl", decl);
|
|
|
|
|
return &decl_plus_emit_h.emit_h;
|
|
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
fn removeDependant(decl: *Decl, other: *Decl) void {
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(decl.dependants.swapRemove(other));
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
fn removeDependency(decl: *Decl, other: *Decl) void {
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(decl.dependencies.swapRemove(other));
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
};
|
|
|
|
|
|
2021-01-05 17:33:31 -07:00
|
|
|
/// This state is attached to every Decl when Module emit_h is non-null.
|
|
|
|
|
pub const EmitH = struct {
|
2021-03-25 23:00:38 -07:00
|
|
|
fwd_decl: ArrayListUnmanaged(u8) = .{},
|
2021-01-05 17:33:31 -07:00
|
|
|
};
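The `getEmitH` accessor above relies on each `Decl` being embedded in a larger struct that also holds the `EmitH` state, so a pointer to the `Decl` field is enough to recover its sibling. A minimal sketch of that pattern follows. `Inner`, `Extra`, `InnerPlusExtra`, and `getExtra` are hypothetical stand-ins, not names from this file, and the three-argument `@fieldParentPtr` form is the one used in `getEmitH`; other Zig releases may spell this builtin differently.

```zig
const std = @import("std");

// Hypothetical stand-ins; the real `Decl` and `EmitH` carry far more state.
const Inner = struct { id: u32 };
const Extra = struct { note: []const u8 };

// Assumed layout: the base struct is embedded as a field next to the extra state.
const InnerPlusExtra = struct {
    inner: Inner,
    extra: Extra,
};

// Recover the enclosing struct from a pointer to its `inner` field.
// Only valid when `inner` really lives inside an `InnerPlusExtra`.
fn getExtra(inner: *Inner) *Extra {
    const parent = @fieldParentPtr(InnerPlusExtra, "inner", inner);
    return &parent.extra;
}

test "recover sibling field through the parent pointer" {
    var both = InnerPlusExtra{
        .inner = .{ .id = 1 },
        .extra = .{ .note = "fwd" },
    };
    const extra = getExtra(&both.inner);
    try std.testing.expect(extra == &both.extra);
}
```

This is presumably why `getEmitH` asserts `module.emit_h != null` first: if the `Decl` was not allocated inside the combined struct, the computed parent pointer would refer to memory that is not there.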
|
|
|
|
|
|
2021-03-28 19:38:19 -07:00
|
|
|
/// Represents the data that an explicit error set syntax provides.
|
|
|
|
|
pub const ErrorSet = struct {
|
2021-05-04 11:08:40 -07:00
|
|
|
/// The Decl that corresponds to the error set itself.
|
2021-03-28 19:38:19 -07:00
|
|
|
owner_decl: *Decl,
|
|
|
|
|
/// Offset from Decl node index, points to the error set AST node.
|
|
|
|
|
node_offset: i32,
|
|
|
|
|
names_len: u32,
|
|
|
|
|
/// The string bytes are stored in the owner Decl arena.
|
|
|
|
|
/// They are in the same order they appear in the AST.
|
2021-05-04 13:58:08 -07:00
|
|
|
/// The length is given by `names_len`.
|
2021-03-28 19:38:19 -07:00
|
|
|
names_ptr: [*]const []const u8,
|
2021-04-07 11:26:07 -07:00
|
|
|
|
|
|
|
|
pub fn srcLoc(self: ErrorSet) SrcLoc {
|
|
|
|
|
return .{
|
2021-04-16 14:44:02 -07:00
|
|
|
.file_scope = self.owner_decl.getFileScope(),
|
|
|
|
|
.parent_decl_node = self.owner_decl.src_node,
|
2021-04-07 11:26:07 -07:00
|
|
|
.lazy = .{ .node_offset = self.node_offset },
|
|
|
|
|
};
|
|
|
|
|
}
|
2021-03-28 19:38:19 -07:00
|
|
|
};
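`ErrorSet` stores its names as a many-item pointer (`names_ptr`) plus a separate length (`names_len`) rather than as a slice. A sketch of how such a pair can be turned back into a slice is shown below. `NameList` and its `names` helper are hypothetical and only mirror the two fields; whether the real code wraps this in a helper is not shown in this listing.

```zig
const std = @import("std");

// Hypothetical stand-in that mirrors only the two name-related fields of `ErrorSet`.
const NameList = struct {
    names_len: u32,
    names_ptr: [*]const []const u8,

    // Slicing a many-item pointer with an explicit length yields an ordinary slice.
    fn names(self: NameList) []const []const u8 {
        return self.names_ptr[0..self.names_len];
    }
};

test "many-item pointer plus length forms a slice" {
    const storage = [_][]const u8{ "OutOfMemory", "FileNotFound" };
    const list = NameList{
        .names_len = storage.len,
        // A pointer to an array coerces to a many-item pointer.
        .names_ptr = &storage,
    };
    try std.testing.expectEqual(@as(usize, 2), list.names().len);
    try std.testing.expectEqualStrings("FileNotFound", list.names()[1]);
}
```

Keeping the length in a `u32` instead of carrying a full slice saves a few bytes per error set on 64-bit targets, at the cost of rebuilding the slice at each use.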
|
|
|
|
|
|
2021-04-01 22:34:40 -07:00
|
|
|
/// Represents the data that a struct declaration provides.
|
|
|
|
|
pub const Struct = struct {
|
2021-05-04 11:08:40 -07:00
|
|
|
/// The Decl that corresponds to the struct itself.
|
2021-04-01 22:34:40 -07:00
|
|
|
owner_decl: *Decl,
|
|
|
|
|
/// Set of field names in declaration order.
|
|
|
|
|
fields: std.StringArrayHashMapUnmanaged(Field),
|
|
|
|
|
/// Represents the declarations inside this struct.
|
2021-04-09 23:17:50 -07:00
|
|
|
namespace: Scope.Namespace,
|
2021-04-02 21:06:09 -07:00
|
|
|
/// Offset from `owner_decl`, points to the struct AST node.
|
2021-04-01 22:34:40 -07:00
|
|
|
node_offset: i32,
|
2021-05-03 11:46:02 -07:00
|
|
|
/// Index of the struct_decl ZIR instruction.
|
|
|
|
|
zir_index: Zir.Inst.Index,
|
2021-04-01 22:34:40 -07:00
|
|
|
|
2021-05-02 18:50:01 -07:00
|
|
|
layout: std.builtin.TypeInfo.ContainerLayout,
|
|
|
|
|
status: enum {
|
|
|
|
|
none,
|
2021-05-03 18:35:37 -07:00
|
|
|
field_types_wip,
|
2021-05-02 18:50:01 -07:00
|
|
|
have_field_types,
|
2021-05-03 18:35:37 -07:00
|
|
|
layout_wip,
|
2021-05-02 18:50:01 -07:00
|
|
|
have_layout,
|
|
|
|
|
},
|
2021-07-23 22:23:03 -07:00
|
|
|
/// If true, definitely nonzero size at runtime. If false, resolving the fields
|
|
|
|
|
/// is necessary to determine whether it has bits at runtime.
|
|
|
|
|
known_has_bits: bool,
|
2021-05-02 18:50:01 -07:00
|
|
|
|
2021-08-20 15:23:55 -07:00
|
|
|
/// The `Type` and `Value` memory is owned by the arena of the Struct's owner_decl.
|
2021-04-01 22:34:40 -07:00
|
|
|
pub const Field = struct {
|
2021-04-29 17:13:18 -07:00
|
|
|
/// Uses `noreturn` to indicate `anytype`.
|
2021-05-02 18:50:01 -07:00
|
|
|
/// undefined until `status` is `have_field_types` or `have_layout`.
|
2021-04-01 22:34:40 -07:00
|
|
|
ty: Type,
|
|
|
|
|
abi_align: Value,
|
|
|
|
|
/// Uses `unreachable_value` to indicate no default.
|
|
|
|
|
default_val: Value,
|
2021-05-02 18:50:01 -07:00
|
|
|
/// undefined until `status` is `have_layout`.
|
|
|
|
|
offset: u32,
|
2021-04-29 16:57:13 -07:00
|
|
|
is_comptime: bool,
|
2021-04-01 22:34:40 -07:00
|
|
|
};
|
stage2: progress towards basic structs
Introduce `ResultLoc.none_or_ref` which is used by field access
expressions to avoid unnecessary loads when the field access itself
will do the load. This turns:
```zig
p.y - p.x - p.x
```
from
```zir
%14 = load(%4) node_offset:8:12
%15 = field_val(%14, "y") node_offset:8:13
%16 = load(%4) node_offset:8:18
%17 = field_val(%16, "x") node_offset:8:19
%18 = sub(%15, %17) node_offset:8:16
%19 = load(%4) node_offset:8:24
%20 = field_val(%19, "x") node_offset:8:25
```
to
```zir
%14 = field_val(%4, "y") node_offset:8:13
%15 = field_val(%4, "x") node_offset:8:19
%16 = sub(%14, %15) node_offset:8:16
%17 = field_val(%4, "x") node_offset:8:25
```
Much more compact. This requires `Sema.zirFieldVal` to support both
pointers and non-pointers.
C backend: Implement typedefs for struct types, as well as the following
TZIR instructions:
* mul
* mulwrap
* addwrap
* subwrap
* ref
* struct_field_ptr
Note that add, addwrap, sub, subwrap, mul, mulwrap instructions are all
incorrect currently and need to be updated to properly handle wrapping
and non wrapping for signed and unsigned.
C backend: change indentation delta to 1, to make the output smaller and
to process fewer bytes.
I promise I will add a test case as soon as I fix those warnings that
are being printed for my test case.
2021-04-02 19:11:51 -07:00
|
|
|
|
2021-04-02 21:06:09 -07:00
|
|
|
pub fn getFullyQualifiedName(s: *Struct, gpa: *Allocator) ![]u8 {
|
stage2: implement simple enums
A simple enum is an enum which has an automatic integer tag type,
all tag values automatically assigned, and no top level declarations.
Such enums are created directly in AstGen and shared by all the
generic/comptime instantiations of the surrounding ZIR code. This
commit implements, but does not yet add any test cases for, simple enums.
A full enum is an enum for which any of the above conditions are not
true. Full enums are created in Sema, and therefore will create a unique
type per generic/comptime instantiation. This commit does not implement
full enums. However the `enum_decl_nonexhaustive` ZIR instruction is
added and the respective Type functions are filled out.
This commit makes an improvement to ZIR code, removing the decls array
and removing the decl_map from AstGen. Instead, decl_ref and
decl_val ZIR instructions index into the `owner_decl.dependencies`
ArrayHashMap. We already need this dependencies array for incremental
compilation purposes, and so repurposing it to also use it for ZIR decl
indexes makes for efficient memory usage.
Similarly, this commit fixes up incorrect memory management by removing
the `const` ZIR instruction. The two places it was used stored memory in
the AstGen arena, which may get freed after Sema. Now it properly sets
up a new anonymous Decl for error sets and uses a normal decl_val
instruction.
The other usage of `const` ZIR instruction was float literals. These are
now changed to use `float` ZIR instruction when the value fits inside
`zir.Inst.Data` and `float128` otherwise.
AstGen + Sema: implement int_to_enum and enum_to_int. No tests yet; I expect to
have to make some fixes before they will pass tests. Will do that in the
branch before merging.
AstGen: fix struct astgen incorrectly counting decls as fields.
Type/Value: give up on trying to exhaustively list every tag all the
time. This makes the file more manageable. Also found a bug with
i128/u128 this way, since the name of the function was more obvious when
looking at the tag values.
Type: implement abiAlignment and abiSize for structs. This will need to
get more sophisticated at some point, but for now it is progress.
Value: add new `enum_field_index` tag.
Value: add hash_u32, needed when using ArrayHashMap.
2021-04-06 17:43:56 -07:00
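For illustration, here is how the simple/full split described in the commit message above maps onto Zig source. This is only a sketch: the type names (`Color`, `Level`, `Status`, `Axis`) are made up for the example and do not come from the compiler, and the comments restate the three conditions from the message rather than anything Sema reports.

```zig
const std = @import("std");

// Simple: automatic tag type, automatic tag values, no declarations.
const Color = enum { red, green, blue };

// Full: the integer tag type is written explicitly.
const Level = enum(u8) { low, high };

// Full: at least one tag value is assigned explicitly.
const Status = enum(u8) { ok = 0, failed = 7 };

// Full: the enum contains a top level declaration.
const Axis = enum {
    x,
    y,

    pub fn isX(self: Axis) bool {
        return self == .x;
    }
};

pub fn main() void {
    const c = Color.green;
    const a = Axis.x;
    std.debug.print("{} {}\n", .{ c == .green, a.isX() });
}
```

Under the scheme in the message, only `Color` would be created directly in AstGen; the other three would go through Sema.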
|
|
|
return s.owner_decl.getFullyQualifiedName(gpa);
|
2021-04-02 21:06:09 -07:00
    }

    pub fn srcLoc(s: Struct) SrcLoc {
        return .{
2021-04-16 14:44:02 -07:00
|
|
|
.file_scope = s.owner_decl.getFileScope(),
|
|
|
|
|
.parent_decl_node = s.owner_decl.src_node,
|
2021-04-02 21:06:09 -07:00
|
|
|
.lazy = .{ .node_offset = s.node_offset },
|
|
|
|
|
};
|
2021-04-02 19:11:51 -07:00
|
|
|
}
|
2021-05-07 14:18:14 -07:00
    pub fn haveFieldTypes(s: Struct) bool {
        return switch (s.status) {
            .none,
            .field_types_wip,
            => false,
            .have_field_types,
            .layout_wip,
            .have_layout,
            => true,
        };
    }
2021-04-01 22:34:40 -07:00
|
|
|
};
|
|
|
|
|
|
2021-04-06 17:43:56 -07:00
/// Represents the data that an enum declaration provides, when the fields
/// are auto-numbered, and there are no declarations. The integer tag type
/// is inferred to be the smallest power of two unsigned int that fits
/// the number of fields.
pub const EnumSimple = struct {
2021-05-04 11:08:40 -07:00
|
|
|
/// The Decl that corresponds to the enum itself.
|
2021-04-06 17:43:56 -07:00
    owner_decl: *Decl,
    /// Set of field names in declaration order.
    fields: std.StringArrayHashMapUnmanaged(void),
    /// Offset from `owner_decl`, points to the enum decl AST node.
    node_offset: i32,
2021-04-07 11:26:07 -07:00
|
|
|
|
|
|
|
|
pub fn srcLoc(self: EnumSimple) SrcLoc {
|
|
|
|
|
return .{
|
2021-04-16 14:44:02 -07:00
|
|
|
.file_scope = self.owner_decl.getFileScope(),
|
|
|
|
|
.parent_decl_node = self.owner_decl.src_node,
|
2021-04-07 11:26:07 -07:00
            .lazy = .{ .node_offset = self.node_offset },
        };
    }
2021-04-06 17:43:56 -07:00
};

/// Represents the data that an enum declaration provides, when there is
/// at least one tag value explicitly specified, or at least one declaration.
pub const EnumFull = struct {
2021-05-04 11:08:40 -07:00
|
|
|
/// The Decl that corresponds to the enum itself.
|
2021-04-06 17:43:56 -07:00
    owner_decl: *Decl,
    /// An integer type which is used for the numerical value of the enum.
    /// Whether zig chooses this type or the user specifies it, it is stored here.
    tag_ty: Type,
    /// Set of field names in declaration order.
    fields: std.StringArrayHashMapUnmanaged(void),
    /// Maps integer tag value to field index.
    /// Entries are in declaration order, same as `fields`.
    /// If this hash map is empty, it means the enum tags are auto-numbered.
    values: ValueMap,
    /// Represents the declarations inside this enum.
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
|
|
|
namespace: Scope.Namespace,
|
2021-04-06 17:43:56 -07:00
|
|
|
/// Offset from `owner_decl`, points to the enum decl AST node.
|
|
|
|
|
node_offset: i32,
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
pub const ValueMap = std.ArrayHashMapUnmanaged(Value, void, Value.ArrayHashContext, false);
|
2021-04-07 11:26:07 -07:00
|
|
|
|
|
|
|
|
pub fn srcLoc(self: EnumFull) SrcLoc {
|
|
|
|
|
return .{
|
2021-04-16 14:44:02 -07:00
|
|
|
.file_scope = self.owner_decl.getFileScope(),
|
|
|
|
|
.parent_decl_node = self.owner_decl.src_node,
|
2021-04-07 11:26:07 -07:00
            .lazy = .{ .node_offset = self.node_offset },
        };
    }
2021-04-06 17:43:56 -07:00
|
|
|
};
|
|
|
|
|
|
2021-05-07 14:18:14 -07:00
pub const Union = struct {
    /// The Decl that corresponds to the union itself.
    owner_decl: *Decl,
    /// An enum type which is used for the tag of the union.
    /// This type is created even for untagged unions, even when the memory
    /// layout does not store the tag.
    /// Whether zig chooses this type or the user specifies it, it is stored here.
    /// This will be set to the null type until status is `have_field_types`.
    tag_ty: Type,
    /// Set of field names in declaration order.
    fields: std.StringArrayHashMapUnmanaged(Field),
    /// Represents the declarations inside this union.
    namespace: Scope.Namespace,
    /// Offset from `owner_decl`, points to the union decl AST node.
    node_offset: i32,
    /// Index of the union_decl ZIR instruction.
    zir_index: Zir.Inst.Index,

    layout: std.builtin.TypeInfo.ContainerLayout,
    status: enum {
        none,
        field_types_wip,
        have_field_types,
        layout_wip,
        have_layout,
    },

    pub const Field = struct {
        /// undefined until `status` is `have_field_types` or `have_layout`.
        ty: Type,
        abi_align: Value,
    };

    pub fn getFullyQualifiedName(s: *Union, gpa: *Allocator) ![]u8 {
        return s.owner_decl.getFullyQualifiedName(gpa);
    }

    pub fn srcLoc(self: Union) SrcLoc {
        return .{
            .file_scope = self.owner_decl.getFileScope(),
            .parent_decl_node = self.owner_decl.src_node,
            .lazy = .{ .node_offset = self.node_offset },
        };
    }
};
2021-03-20 22:40:08 -07:00
|
|
|
/// Some Fn struct memory is owned by the Decl's TypedValue.Managed arena allocator.
|
2020-12-28 17:15:29 -07:00
|
|
|
/// Extern functions do not have this data structure; they are represented by
|
|
|
|
|
/// the `Decl` only, with a `Value` tag of `extern_fn`.
|
2020-09-13 19:17:58 -07:00
|
|
|
pub const Fn = struct {
|
2021-05-04 11:08:40 -07:00
|
|
|
/// The Decl that corresponds to the function itself.
|
2021-01-02 12:32:30 -07:00
|
|
|
owner_decl: *Decl,
|
2021-08-03 22:34:22 -07:00
|
|
|
/// If this is not null, this function is a generic function instantiation, and
|
2021-08-06 16:24:39 -07:00
    /// there is a `TypedValue` here for each parameter of the function.
    /// Non-comptime parameters are marked with a `generic_poison` for the value.
    /// Non-anytype parameters are marked with a `generic_poison` for the type.
2021-08-03 22:34:22 -07:00
|
|
|
comptime_args: ?[*]TypedValue = null,
|
2021-04-30 14:36:02 -07:00
    /// The ZIR instruction that is a function instruction. Use this to find
    /// the body. We store this rather than the body directly so that when ZIR
    /// is regenerated on update(), we can map this to the new corresponding
    /// ZIR instruction.
    zir_body_inst: Zir.Inst.Index,
2021-05-01 21:57:52 -07:00
    /// Relative to owner Decl.
    lbrace_line: u32,
    /// Relative to owner Decl.
    rbrace_line: u32,
    lbrace_column: u16,
    rbrace_column: u16,
2021-01-02 13:40:23 -07:00
|
|
|
state: Analysis,
|
2021-07-07 00:39:23 -07:00
|
|
|
is_cold: bool = false,
|
2021-01-02 13:40:23 -07:00
    pub const Analysis = enum {
        queued,
        /// This function intentionally only has ZIR generated because it is marked
        /// inline, which means no runtime version of the function will be generated.
        inline_only,
2020-09-13 19:17:58 -07:00
|
|
|
in_progress,
|
2021-01-02 13:40:23 -07:00
|
|
|
/// There will be a corresponding ErrorMsg in Module.failed_decls
|
2020-09-13 19:17:58 -07:00
|
|
|
sema_failure,
|
2021-01-02 13:40:23 -07:00
|
|
|
/// This Fn might be OK but it depends on another Decl which did not
|
|
|
|
|
/// successfully complete semantic analysis.
|
2020-09-13 19:17:58 -07:00
|
|
|
dependency_failure,
|
2021-01-02 13:40:23 -07:00
|
|
|
success,
|
2020-09-13 19:17:58 -07:00
|
|
|
};
|
|
|
|
|
|
2021-06-19 21:10:22 -04:00
|
|
|
pub fn deinit(func: *Fn, gpa: *Allocator) void {
|
2021-07-07 20:47:21 -07:00
        if (func.getInferredErrorSet()) |map| {
            map.deinit(gpa);
        }
    }

    pub fn getInferredErrorSet(func: *Fn) ?*std.StringHashMapUnmanaged(void) {
        const ret_ty = func.owner_decl.ty.fnReturnType();
2021-08-05 19:15:59 -07:00
        if (ret_ty.tag() == .generic_poison) {
            return null;
        }
2021-07-07 20:47:21 -07:00
        if (ret_ty.zigTypeTag() == .ErrorUnion) {
            if (ret_ty.errorUnionSet().castTag(.error_set_inferred)) |payload| {
                return &payload.data.map;
            }
        }
        return null;
2021-06-19 21:10:22 -04:00
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
};
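As a reminder of what `getInferredErrorSet` is looking for: a function whose return type is written `!T` has its error set inferred from the body, while one written `E!T` names the set explicitly. The following is a small sketch in user-level Zig; the functions `parseDigit` and `parseDigitExplicit` and the set `ParseError` are hypothetical names for the example and have nothing to do with the compiler source.

```zig
const std = @import("std");

// Inferred error set: `!u32` leaves the set of possible errors to the compiler.
fn parseDigit(c: u8) !u32 {
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

// Explicit error set: the set is named, so there is nothing to infer.
const ParseError = error{NotADigit};

fn parseDigitExplicit(c: u8) ParseError!u32 {
    if (c < '0' or c > '9') return ParseError.NotADigit;
    return c - '0';
}

pub fn main() void {
    // Both calls fall back to 0 when an error is returned.
    const a = parseDigit('7') catch 0;
    const b = parseDigitExplicit('x') catch 0;
    std.debug.print("{} {}\n", .{ a, b });
}
```

Going by the code above, only a function like `parseDigit` should yield a non-null map, since the lookup succeeds only when the error set of the return type carries the `error_set_inferred` tag.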
pub const Var = struct {
    /// if is_extern == true this is undefined
    init: Value,
    owner_decl: *Decl,

    is_extern: bool,
    is_mutable: bool,
    is_threadlocal: bool,
};

pub const Scope = struct {
    tag: Tag,

    pub fn cast(base: *Scope, comptime T: type) ?*T {
        if (base.tag != T.base_tag)
            return null;

        return @fieldParentPtr(T, "base", base);
    }
2021-03-15 23:38:38 -07:00
|
|
|
pub fn ownerDecl(scope: *Scope) ?*Decl {
|
|
|
|
|
return switch (scope.tag) {
|
2021-03-17 22:54:56 -07:00
|
|
|
.block => scope.cast(Block).?.sema.owner_decl,
|
2021-01-16 22:51:01 -07:00
|
|
|
.file => null,
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
instead, Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in term, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
|
|
|
.namespace => null,
|
2021-01-16 22:51:01 -07:00
|
|
|
};
|
|
|
|
|
}
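The updated-file detection described in the commit message above (stat each file in `import_table` and compare the stored metadata) can be sketched roughly as follows. `FileStamp` and `changed` are hypothetical names used only for illustration; the real fields live in `Scope.File`, and the source hash mentioned in the message is left out here.

```zig
const std = @import("std");

/// Hypothetical snapshot of the per-file metadata the commit message
/// says is kept (mtime, inode, size). Not the real `Scope.File` layout.
const FileStamp = struct {
    mtime: i128,
    inode: u64,
    size: u64,

    /// Returns true when any of the cached fields differs from a fresh stat.
    fn changed(old: FileStamp, new: FileStamp) bool {
        return old.mtime != new.mtime or
            old.inode != new.inode or
            old.size != new.size;
    }
};

pub fn main() void {
    const cached = FileStamp{ .mtime = 100, .inode = 7, .size = 42 };
    const same = cached;
    const touched = FileStamp{ .mtime = 101, .inode = 7, .size = 42 };
    std.debug.print("{any} {any}\n", .{ cached.changed(same), cached.changed(touched) });
}
```

Comparing cheap stat fields first means the file contents only need to be re-read when one of them differs.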
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
pub fn srcDecl(scope: *Scope) ?*Decl {
|
|
|
|
|
return switch (scope.tag) {
|
|
|
|
|
.block => scope.cast(Block).?.src_decl,
|
2020-09-13 19:17:58 -07:00
|
|
|
.file => null,
|
2021-05-04 12:32:22 -07:00
|
|
|
.namespace => scope.cast(Namespace).?.getDecl(),
|
2020-09-13 19:17:58 -07:00
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-09 23:17:50 -07:00
|
|
|
/// Asserts the scope has a parent which is a Namespace and returns it.
|
|
|
|
|
pub fn namespace(scope: *Scope) *Namespace {
|
2021-03-15 23:38:38 -07:00
|
|
|
switch (scope.tag) {
|
2021-04-09 23:17:50 -07:00
|
|
|
.block => return scope.cast(Block).?.sema.owner_decl.namespace,
|
2021-05-11 23:20:22 -07:00
|
|
|
.file => return scope.cast(File).?.root_decl.?.namespace,
|
2021-04-09 23:17:50 -07:00
|
|
|
.namespace => return scope.cast(Namespace).?,
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-09 23:17:50 -07:00
|
|
|
/// Asserts the scope has a parent which is a Namespace or File and
|
2020-09-13 19:17:58 -07:00
|
|
|
/// returns the sub_file_path field.
|
|
|
|
|
pub fn subFilePath(base: *Scope) []const u8 {
|
|
|
|
|
switch (base.tag) {
|
2021-04-09 23:17:50 -07:00
|
|
|
.namespace => return @fieldParentPtr(Namespace, "base", base).file_scope.sub_file_path,
|
2020-09-13 19:17:58 -07:00
|
|
|
.file => return @fieldParentPtr(File, "base", base).sub_file_path,
|
|
|
|
|
.block => unreachable,
|
|
|
|
|
}
|
|
|
|
|
}
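The `subFilePath` function above relies on every scope type embedding a `base: Scope` field, so that a `*Scope` can be turned back into its enclosing struct after switching on `tag`. A minimal, self-contained sketch of that pattern follows. The types are trimmed-down stand-ins, not the real definitions, and the sketch assumes a newer Zig release (0.12 or later) in which `@fieldParentPtr` takes only the field name and the field pointer, which differs from the three-argument form used in the listing.

```zig
const std = @import("std");

// Hypothetical, reduced versions of the real types; only the fields
// needed to show the tag + `base` downcast are kept.
const Scope = struct {
    tag: Tag,

    const Tag = enum { file, namespace };

    const File = struct {
        base: Scope = .{ .tag = .file },
        sub_file_path: []const u8,
    };

    const Namespace = struct {
        base: Scope = .{ .tag = .namespace },
        file_scope: *File,
    };

    /// Switches on the tag, then recovers the enclosing struct from the
    /// pointer to its `base` field.
    fn subFilePath(base: *Scope) []const u8 {
        switch (base.tag) {
            .file => {
                const file: *File = @alignCast(@fieldParentPtr("base", base));
                return file.sub_file_path;
            },
            .namespace => {
                const ns: *Namespace = @alignCast(@fieldParentPtr("base", base));
                return ns.file_scope.sub_file_path;
            },
        }
    }
};

pub fn main() void {
    var file = Scope.File{ .sub_file_path = "src/main.zig" };
    var ns = Scope.Namespace{ .file_scope = &file };
    std.debug.print("{s}\n", .{Scope.subFilePath(&ns.base)});
}
```

The downcast is only sound when the tag matches the struct that actually contains the `base` field, which is why each prong checks the tag before casting.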
|
|
|
|
|
|
2021-01-16 22:51:01 -07:00
|
|
|
/// When called from inside a Block Scope, chases the src_decl, not the owner_decl.
|
2020-11-16 20:45:53 +02:00
|
|
|
pub fn getFileScope(base: *Scope) *Scope.File {
|
2020-10-06 13:56:26 +03:00
|
|
|
var cur = base;
|
|
|
|
|
while (true) {
|
|
|
|
|
cur = switch (cur.tag) {
|
2021-04-09 23:17:50 -07:00
|
|
|
.namespace => return @fieldParentPtr(Namespace, "base", cur).file_scope,
|
2020-11-16 20:45:53 +02:00
|
|
|
.file => return @fieldParentPtr(File, "base", cur),
|
2021-04-09 23:17:50 -07:00
|
|
|
.block => return @fieldParentPtr(Block, "base", cur).src_decl.namespace.file_scope,
|
2021-01-28 22:38:25 +02:00
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
pub const Tag = enum {
|
|
|
|
|
/// .zig source code.
|
|
|
|
|
file,
|
2021-04-09 23:17:50 -07:00
|
|
|
/// Namespace owned by structs, enums, unions, and opaques for decls.
|
|
|
|
|
namespace,
|
2020-09-13 19:17:58 -07:00
|
|
|
block,
|
|
|
|
|
};
|
|
|
|
|
|
2021-04-09 23:17:50 -07:00
|
|
|
/// The container that structs, enums, unions, and opaques have.
|
|
|
|
|
pub const Namespace = struct {
|
|
|
|
|
pub const base_tag: Tag = .namespace;
|
2020-09-13 19:17:58 -07:00
|
|
|
base: Scope = Scope{ .tag = base_tag },
|
|
|
|
|
|
2021-04-09 23:17:50 -07:00
|
|
|
parent: ?*Namespace,
|
2020-09-13 19:17:58 -07:00
|
|
|
file_scope: *Scope.File,
|
2021-04-09 23:17:50 -07:00
|
|
|
/// Will be a struct, enum, union, or opaque.
|
2020-09-09 17:41:51 +03:00
|
|
|
ty: Type,
|
2021-04-09 23:17:50 -07:00
|
|
|
/// Direct children of the namespace. Used during an update to detect
|
|
|
|
|
/// which decls have been added/removed from source.
|
2021-04-26 20:41:07 -07:00
|
|
|
/// Declaration order is preserved via entry order.
|
2021-05-07 18:52:11 -07:00
|
|
|
/// Key memory is owned by `decl.name`.
|
2021-04-26 20:41:07 -07:00
|
|
|
/// TODO save memory with https://github.com/ziglang/zig/issues/8619.
|
stage2: type declarations ZIR encode AnonNameStrategy
which can be either parent, func, or anon. Here's the enum reproduced in
the commit message for convenience:
```zig
pub const NameStrategy = enum(u2) {
/// Use the same name as the parent declaration name.
/// e.g. `const Foo = struct {...};`.
parent,
/// Use the name of the currently executing comptime function call,
/// with the current parameters. e.g. `ArrayList(i32)`.
func,
/// Create an anonymous name for this declaration.
/// Like this: "ParentDeclName_struct_69"
anon,
};
```
With this information in the ZIR, a future commit can improve the
names of structs, unions, enums, and opaques.
In order to accomplish this, the following ZIR instruction forms were
removed and replaced with Extended op codes:
* struct_decl
* struct_decl_packed
* struct_decl_extern
* union_decl
* union_decl_packed
* union_decl_extern
* enum_decl
* enum_decl_nonexhaustive
Because these are extended opcodes, one more u32 is needed; however, we
more than make up for it by repurposing the 16 "small" bits to provide
shorter encodings for when decls_len == 0, fields_len == 0, a source node
is not provided, etc. There tends to be no downside, and in fact sometimes
upsides, to using an extended op code when there is a need for flag
bits, which is the case for all three of these. Likewise, the container
layout can be encoded in these bits rather than into the opcode.
The following 4 ZIR instructions were added, netting a total of 4 freed
up ZIR enum tags for future use:
* opaque_decl_anon
* opaque_decl_func
* error_set_decl_anon
* error_set_decl_func
This is so that opaques and error sets can have the same name hint as
structs, enums, and unions.
`std.builtin.ContainerLayout` gets an explicit integer tag type so that
it can be used inside packed structs.
This commit also makes `Module.Namespace` use a separate set for
anonymous decls, thus allowing anonymous decls to share the same
`Decl.name` as their owner `Decl` objects.
2021-05-10 21:34:43 -07:00
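As a rough illustration of the "small" bits idea from the commit message above, here is a minimal sketch. The field names, their order, the `Layout` enum, and the `trailingWords` helper are all hypothetical and are not taken from zir.zig; only `NameStrategy` mirrors the enum quoted in the message. It assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Mirrors the enum quoted in the commit message.
const NameStrategy = enum(u2) { parent, func, anon };

// Hypothetical stand-in for a container layout enum with an explicit
// integer tag type, so that it can live inside a packed struct.
const Layout = enum(u2) { auto_layout, extern_layout, packed_layout };

// Hypothetical layout of the 16 "small" bits of an extended instruction.
// Each flag says whether an optional trailing u32 is present.
const Small = packed struct {
    has_src_node: bool,
    has_fields_len: bool,
    has_decls_len: bool,
    name_strategy: NameStrategy,
    layout: Layout,
    _unused: u9 = 0,
};

comptime {
    // 3 flag bits + 2 + 2 + 9 padding bits = 16 bits.
    std.debug.assert(@bitSizeOf(Small) == 16);
}

// Counts how many optional u32 words follow the instruction.
fn trailingWords(small: Small) u32 {
    var n: u32 = 0;
    if (small.has_src_node) n += 1;
    if (small.has_fields_len) n += 1;
    if (small.has_decls_len) n += 1;
    return n;
}

pub fn main() void {
    // An empty container needs no optional words at all.
    const empty_container = Small{
        .has_src_node = false,
        .has_fields_len = false,
        .has_decls_len = false,
        .name_strategy = .parent,
        .layout = .auto_layout,
    };
    std.debug.assert(trailingWords(empty_container) == 0);

    // A container with fields and decls pays only for what it uses.
    const full_container = Small{
        .has_src_node = true,
        .has_fields_len = true,
        .has_decls_len = true,
        .name_strategy = .anon,
        .layout = .packed_layout,
    };
    std.debug.assert(trailingWords(full_container) == 3);
}
```

The point of the sketch is that the flags, the name strategy, and the layout all fit in bits that would otherwise go unused, so the common empty cases get a shorter encoding.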
|
|
|
/// Anonymous decls are not stored here; they are kept in `anon_decls` instead.
|
2021-04-26 20:41:07 -07:00
|
|
|
decls: std.StringArrayHashMapUnmanaged(*Decl) = .{},
|
|
|
|
|
|
2021-05-10 21:34:43 -07:00
|
|
|
anon_decls: std.AutoArrayHashMapUnmanaged(*Decl, void) = .{},
|
|
|
|
|
|
2021-08-28 15:35:59 -07:00
|
|
|
/// Key is usingnamespace Decl itself. To find the namespace being included,
|
|
|
|
|
/// the Decl Value has to be resolved as a Type which has a Namespace.
|
|
|
|
|
/// Value is whether the usingnamespace decl is marked `pub`.
|
|
|
|
|
usingnamespace_set: std.AutoHashMapUnmanaged(*Decl, bool) = .{},
|
|
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
pub fn deinit(ns: *Namespace, mod: *Module) void {
|
stage2: fix deletion of Decls that get re-referenced
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.
However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.
The solution is, when dealing with no-longer-referenced Decl objects
during an update, to clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
2021-05-18 12:35:36 -07:00
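The difference between deleting and clearing a Decl, as described in the commit message above, can be sketched with a toy namespace. This is not the compiler's code: `Decl`, `Namespace`, `clear`, and `reference` here are simplified, hypothetical stand-ins, and the sketch assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Toy Decl: a fresh stub is simply "not analyzed".
const Decl = struct {
    analyzed: bool = false,

    // Reset to the state a fresh scan would leave behind.
    fn clear(decl: *Decl) void {
        decl.* = .{};
    }
};

const Namespace = struct {
    decls: std.StringArrayHashMapUnmanaged(Decl) = .{},

    // Referencing a Decl triggers (pretend) semantic analysis.
    fn reference(ns: *Namespace, name: []const u8) error{NoSuchMember}!void {
        const decl = ns.decls.getPtr(name) orelse return error.NoSuchMember;
        decl.analyzed = true;
    }
};

pub fn main() !void {
    const gpa = std.heap.page_allocator;

    // Old behavior: the unreferenced Decl is removed from the namespace,
    // so a later reference wrongly reports a missing member.
    var old = Namespace{};
    defer old.decls.deinit(gpa);
    try old.decls.put(gpa, "ExportOptions", .{});
    try old.reference("ExportOptions");
    _ = old.decls.swapRemove("ExportOptions");
    if (old.reference("ExportOptions")) |_| {
        unreachable;
    } else |err| {
        std.debug.assert(err == error.NoSuchMember);
    }

    // New behavior: the Decl is cleared back to a stub and stays in the
    // namespace, so it can be analyzed again when re-referenced.
    var new = Namespace{};
    defer new.decls.deinit(gpa);
    try new.decls.put(gpa, "ExportOptions", .{});
    try new.reference("ExportOptions");
    new.decls.getPtr("ExportOptions").?.clear();
    std.debug.assert(!new.decls.get("ExportOptions").?.analyzed);
    try new.reference("ExportOptions");
    std.debug.assert(new.decls.get("ExportOptions").?.analyzed);
}
```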
|
|
|
ns.destroyDecls(mod);
|
2021-04-30 11:07:31 -07:00
|
|
|
ns.* = undefined;
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
pub fn destroyDecls(ns: *Namespace, mod: *Module) void {
|
2021-04-26 20:41:07 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
log.debug("destroyDecls {*}", .{ns});
|
2021-05-04 11:08:40 -07:00
|
|
|
|
2021-04-30 11:07:31 -07:00
|
|
|
var decls = ns.decls;
|
|
|
|
|
ns.decls = .{};
|
|
|
|
|
|
2021-05-10 21:34:43 -07:00
|
|
|
var anon_decls = ns.anon_decls;
|
|
|
|
|
ns.anon_decls = .{};
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
for (decls.values()) |value| {
|
|
|
|
|
value.destroy(mod);
|
2021-04-26 20:41:07 -07:00
|
|
|
}
|
2021-04-30 11:07:31 -07:00
|
|
|
decls.deinit(gpa);
|
2021-05-10 21:34:43 -07:00
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
for (anon_decls.keys()) |key| {
|
|
|
|
|
key.destroy(mod);
|
2021-05-10 21:34:43 -07:00
|
|
|
}
|
|
|
|
|
anon_decls.deinit(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-05-11 22:12:36 -07:00
|
|
|
pub fn deleteAllDecls(
|
|
|
|
|
ns: *Namespace,
|
|
|
|
|
mod: *Module,
|
|
|
|
|
outdated_decls: ?*std.AutoArrayHashMap(*Decl, void),
|
|
|
|
|
) !void {
|
|
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
|
|
|
|
|
log.debug("deleteAllDecls {*}", .{ns});
|
|
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
var decls = ns.decls;
|
2021-05-11 22:12:36 -07:00
|
|
|
ns.decls = .{};
|
|
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
var anon_decls = ns.anon_decls;
|
2021-05-11 22:12:36 -07:00
|
|
|
ns.anon_decls = .{};
|
|
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
// TODO rework this code to not panic on OOM.
|
|
|
|
|
// (might want to coordinate with the clearDecl function)
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
for (decls.values()) |child_decl| {
|
2021-05-18 12:35:36 -07:00
|
|
|
mod.clearDecl(child_decl, outdated_decls) catch @panic("out of memory");
|
|
|
|
|
child_decl.destroy(mod);
|
|
|
|
|
}
|
|
|
|
|
decls.deinit(gpa);
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
for (anon_decls.keys()) |child_decl| {
|
2021-05-18 12:35:36 -07:00
|
|
|
mod.clearDecl(child_decl, outdated_decls) catch @panic("out of memory");
|
|
|
|
|
child_decl.destroy(mod);
|
2021-05-10 21:34:43 -07:00
|
|
|
}
|
2021-05-18 12:35:36 -07:00
|
|
|
anon_decls.deinit(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
stage2: implement simple enums
A simple enum is an enum which has an automatic integer tag type,
all tag values automatically assigned, and no top level declarations.
Such enums are created directly in AstGen and shared by all the
generic/comptime instantiations of the surrounding ZIR code. This
commit implements, but does not yet add any test cases for, simple enums.
A full enum is an enum for which any of the above conditions are not
true. Full enums are created in Sema, and therefore will create a unique
type per generic/comptime instantiation. This commit does not implement
full enums. However the `enum_decl_nonexhaustive` ZIR instruction is
added and the respective Type functions are filled out.
This commit makes an improvement to ZIR code, removing the decls array
and removing the decl_map from AstGen. Instead, decl_ref and
decl_val ZIR instructions index into the `owner_decl.dependencies`
ArrayHashMap. We already need this dependencies array for incremental
compilation purposes, and so repurposing it to also use it for ZIR decl
indexes makes for efficient memory usage.
Similarly, this commit fixes up incorrect memory management by removing
the `const` ZIR instruction. The two places it was used stored memory in
the AstGen arena, which may get freed after Sema. Now it properly sets
up a new anonymous Decl for error sets and uses a normal decl_val
instruction.
The other usage of `const` ZIR instruction was float literals. These are
now changed to use `float` ZIR instruction when the value fits inside
`zir.Inst.Data` and `float128` otherwise.
AstGen + Sema: implement int_to_enum and enum_to_int. No tests yet; I expect to
have to make some fixes before they will pass tests. Will do that in the
branch before merging.
AstGen: fix struct astgen incorrectly counting decls as fields.
Type/Value: give up on trying to exhaustively list every tag all the
time. This makes the file more manageable. Also found a bug with
i128/u128 this way, since the name of the function was more obvious when
looking at the tag values.
Type: implement abiAlignment and abiSize for structs. This will need to
get more sophisticated at some point, but for now it is progress.
Value: add new `enum_field_index` tag.
Value: add hash_u32, needed when using ArrayHashMap.
2021-04-06 17:43:56 -07:00
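The choice between the `float` and `float128` instructions mentioned in the commit message above can be pictured as a round-trip check. This is only a sketch under the assumption that "fits" means the value survives narrowing to `f64` and widening back; `fitsInF64` is a hypothetical helper, not a function from AstGen. It assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Returns true when `value` can be stored as an f64 without losing
// precision, i.e. narrowing and widening back gives the same number.
fn fitsInF64(value: f128) bool {
    const narrowed: f64 = @floatCast(value);
    const widened: f128 = narrowed;
    return widened == value;
}

pub fn main() void {
    // 0.5 is exactly representable in f64.
    std.debug.assert(fitsInF64(0.5));

    // 1 + 2^-100 needs more mantissa bits than f64 provides,
    // so it would need the wider encoding.
    const needs_f128: f128 = 1.0 + 0x1p-100;
    std.debug.assert(!fitsInF64(needs_f128));
}
```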
|
|
|
|
2021-05-08 14:34:30 -07:00
|
|
|
// This renders e.g. "std.fs.Dir.OpenOptions"
|
|
|
|
|
pub fn renderFullyQualifiedName(
|
|
|
|
|
ns: Namespace,
|
|
|
|
|
name: []const u8,
|
|
|
|
|
writer: anytype,
|
|
|
|
|
) @TypeOf(writer).Error!void {
|
|
|
|
|
if (ns.parent) |parent| {
|
|
|
|
|
const decl = ns.getDecl();
|
|
|
|
|
try parent.renderFullyQualifiedName(mem.spanZ(decl.name), writer);
|
|
|
|
|
} else {
|
|
|
|
|
try ns.file_scope.renderFullyQualifiedName(writer);
|
|
|
|
|
}
|
|
|
|
|
if (name.len != 0) {
|
|
|
|
|
try writer.writeAll(".");
|
|
|
|
|
try writer.writeAll(name);
|
|
|
|
|
}
|
2021-04-06 17:43:56 -07:00
|
|
|
}
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the look out for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
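The updated-file detection described in the commit message above boils down to comparing cached stat fields against fresh ones. The following is a minimal sketch of that comparison only; `FileStat` and `isUpdated` are hypothetical names, the source hash mentioned in the message is left out, and no real file system calls are made.

```zig
const std = @import("std");

// Hypothetical snapshot of the stat fields kept per file.
const FileStat = struct {
    mtime: i128,
    inode: u64,
    size: u64,
};

// A file counts as updated when any cached field differs from the
// freshly observed one.
fn isUpdated(cached: FileStat, current: FileStat) bool {
    return cached.mtime != current.mtime or
        cached.inode != current.inode or
        cached.size != current.size;
}

pub fn main() void {
    const cached = FileStat{ .mtime = 1_000, .inode = 42, .size = 128 };

    // Nothing changed: no need to reload the file.
    std.debug.assert(!isUpdated(cached, cached));

    // Same inode and size, but a newer mtime: treat as updated.
    const touched = FileStat{ .mtime = 2_000, .inode = 42, .size = 128 };
    std.debug.assert(isUpdated(cached, touched));

    // Replaced by a different file with the same mtime and size.
    const replaced = FileStat{ .mtime = 1_000, .inode = 43, .size = 128 };
    std.debug.assert(isUpdated(cached, replaced));
}
```

During an update, a check like this would run once per entry in the import table, with `current` filled in from a stat of the file on disk.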
|
|
|
|
|
|
|
|
pub fn getDecl(ns: Namespace) *Decl {
|
|
|
|
|
return ns.ty.getOwnerDecl();
|
|
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
|
|
pub const File = struct {
|
|
|
|
|
pub const base_tag: Tag = .file;
|
|
|
|
|
base: Scope = Scope{ .tag = base_tag },
|
2021-02-11 23:29:55 -07:00
|
|
|
status: enum {
|
|
|
|
|
never_loaded,
|
2021-04-28 22:43:26 -07:00
|
|
|
retryable_failure,
|
2021-04-14 11:26:53 -07:00
|
|
|
parse_failure,
|
|
|
|
|
astgen_failure,
|
2021-04-28 22:43:26 -07:00
|
|
|
success_zir,
|
2021-02-11 23:29:55 -07:00
|
|
|
},
|
2021-04-09 23:17:50 -07:00
|
|
|
source_loaded: bool,
|
2021-04-14 11:26:53 -07:00
|
|
|
tree_loaded: bool,
|
|
|
|
|
zir_loaded: bool,
|
2020-09-13 19:17:58 -07:00
|
|
|
/// Relative to the owning package's root_src_dir.
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
instead, Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in term, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the look out for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
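The stat-based update check described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the compiler's implementation: `StatSnapshot` and `isOutdated` are hypothetical names, and the inode is modeled as a plain `u64` rather than `std.fs.File.INode`.

```zig
const std = @import("std");

/// Hypothetical snapshot of the stat fields a file keeps between updates.
const StatSnapshot = struct {
    size: u64,
    inode: u64, // assumption: stands in for std.fs.File.INode
    mtime: i128,
};

/// Returns true when any recorded field differs from the fresh stat,
/// meaning the file should be treated as updated.
fn isOutdated(old: StatSnapshot, new: StatSnapshot) bool {
    return old.size != new.size or
        old.inode != new.inode or
        old.mtime != new.mtime;
}

test "a changed mtime marks the file as outdated" {
    const before = StatSnapshot{ .size = 10, .inode = 7, .mtime = 100 };
    var after = before;
    try std.testing.expect(!isOutdated(before, after));
    after.mtime = 101;
    try std.testing.expect(isOutdated(before, after));
}
```

Comparing all three fields is cheap and avoids reading file contents; a source hash, as the message notes, would only be needed as a further check.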
2021-04-09 23:17:50 -07:00
/// Memory is stored in gpa, owned by File.
2020-09-13 19:17:58 -07:00
sub_file_path: []const u8,
2021-04-09 23:17:50 -07:00
/// Whether this is populated depends on `source_loaded`.
source: [:0]const u8,
/// Whether this is populated depends on `status`.
stat_size: u64,
/// Whether this is populated depends on `status`.
stat_inode: std.fs.File.INode,
/// Whether this is populated depends on `status`.
stat_mtime: i128,
2021-04-14 11:26:53 -07:00
/// Whether this is populated or not depends on `tree_loaded`.
2021-08-30 19:22:04 -07:00
tree: Ast,
2021-04-14 11:26:53 -07:00
/// Whether this is populated or not depends on `zir_loaded`.
zir: Zir,
2020-10-06 13:56:26 +03:00
/// Package that this file is a part of, managed externally.
pkg: *Package,
2021-05-11 23:20:22 -07:00
/// The Decl of the struct that represents this File.
root_decl: ?*Decl,
2020-09-13 19:17:58 -07:00
2021-05-06 17:20:45 -07:00
/// Used by change detection algorithm, after astgen, contains the
/// set of decls that existed in the previous ZIR but not in the new one.
deleted_decls: std.ArrayListUnmanaged(*Decl) = .{},
/// Used by change detection algorithm, after astgen, contains the
/// set of decls that existed both in the previous ZIR and in the new one,
/// but their source code has been modified.
outdated_decls: std.ArrayListUnmanaged(*Decl) = .{},
2021-05-11 17:34:13 -07:00
/// The most recent successful ZIR for this file, with no errors.
/// This is only populated when a previously successful ZIR
/// newly introduces compile errors during an update. When ZIR is
/// successful, this field is unloaded.
prev_zir: ?*Zir = null,
2021-03-15 23:38:38 -07:00
pub fn unload(file: *File, gpa: *Allocator) void {
2021-04-09 23:17:50 -07:00
    file.unloadTree(gpa);
    file.unloadSource(gpa);
2021-04-14 11:26:53 -07:00
    file.unloadZir(gpa);
2021-04-09 23:17:50 -07:00
}
2020-09-13 19:17:58 -07:00
2021-04-09 23:17:50 -07:00
pub fn unloadTree(file: *File, gpa: *Allocator) void {
2021-04-14 11:26:53 -07:00
    if (file.tree_loaded) {
        file.tree_loaded = false;
2021-04-09 23:17:50 -07:00
        file.tree.deinit(gpa);
2020-09-13 19:17:58 -07:00
    }
2021-04-09 23:17:50 -07:00
}

pub fn unloadSource(file: *File, gpa: *Allocator) void {
    if (file.source_loaded) {
        file.source_loaded = false;
        gpa.free(file.source);
    }
}
2021-04-14 11:26:53 -07:00
pub fn unloadZir(file: *File, gpa: *Allocator) void {
    if (file.zir_loaded) {
        file.zir_loaded = false;
        file.zir.deinit(gpa);
    }
}
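The three `unload*` functions above share one pattern: a boolean flag records whether a field currently holds valid data, and unloading clears the flag before releasing the memory. A minimal sketch of that pattern follows. `Resource` and its methods are hypothetical names, and the sketch assumes a Zig version where `std.mem.Allocator` is passed by value, unlike the `*Allocator` seen in this file.

```zig
const std = @import("std");

/// Hypothetical stand-in for one lazily loaded resource of a file.
const Resource = struct {
    loaded: bool = false,
    /// Only valid while `loaded` is true.
    bytes: []u8 = undefined,

    fn load(self: *Resource, gpa: std.mem.Allocator, len: usize) !void {
        if (self.loaded) return;
        self.bytes = try gpa.alloc(u8, len);
        self.loaded = true;
    }

    /// Safe to call repeatedly: once the flag is cleared,
    /// a second call does nothing.
    fn unload(self: *Resource, gpa: std.mem.Allocator) void {
        if (self.loaded) {
            self.loaded = false;
            gpa.free(self.bytes);
        }
    }
};

test "unload is idempotent" {
    var res = Resource{};
    try res.load(std.testing.allocator, 16);
    res.unload(std.testing.allocator);
    res.unload(std.testing.allocator); // no double free
    try std.testing.expect(!res.loaded);
}
```

Because the flag guards every access, the field itself can stay `undefined` while unloaded, which is why the struct needs no optional types for these members.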
2021-04-26 20:41:07 -07:00
pub fn deinit(file: *File, mod: *Module) void {
    const gpa = mod.gpa;
2021-04-30 11:07:31 -07:00
    log.debug("deinit File {s}", .{file.sub_file_path});
2021-05-06 17:20:45 -07:00
    file.deleted_decls.deinit(gpa);
    file.outdated_decls.deinit(gpa);
2021-05-11 23:20:22 -07:00
    if (file.root_decl) |root_decl| {
        root_decl.destroy(mod);
2021-04-30 11:07:31 -07:00
    }
2021-04-16 19:45:58 -07:00
    gpa.free(file.sub_file_path);
2021-03-15 23:38:38 -07:00
    file.unload(gpa);
2021-05-11 17:34:13 -07:00
    if (file.prev_zir) |prev_zir| {
        prev_zir.deinit(gpa);
        gpa.destroy(prev_zir);
    }
2021-03-15 23:38:38 -07:00
    file.* = undefined;
2020-09-13 19:17:58 -07:00
}
2021-04-09 23:17:50 -07:00
pub fn getSource(file: *File, gpa: *Allocator) ![:0]const u8 {
    if (file.source_loaded) return file.source;
2021-04-16 19:45:58 -07:00
    const root_dir_path = file.pkg.root_src_directory.path orelse ".";
    log.debug("File.getSource, not cached. pkgdir={s} sub_file_path={s}", .{
        root_dir_path, file.sub_file_path,
    });
2021-04-09 23:17:50 -07:00
    // Keep track of inode, file size, mtime, hash so we can detect which files
    // have been modified when an incremental update is requested.
    var f = try file.pkg.root_src_directory.handle.openFile(file.sub_file_path, .{});
    defer f.close();

    const stat = try f.stat();

    if (stat.size > std.math.maxInt(u32))
        return error.FileTooBig;
2021-05-25 10:34:02 +08:00
    const source = try gpa.allocSentinel(u8, @intCast(usize, stat.size), 0);
2021-04-14 11:26:53 -07:00
    defer if (!file.source_loaded) gpa.free(source);
2021-04-09 23:17:50 -07:00
    const amt = try f.readAll(source);
    if (amt != stat.size)
        return error.UnexpectedEndOfFile;
2021-04-14 11:26:53 -07:00
    // Here we do not modify stat fields because this function is the one
    // used for error reporting. We need to keep the stat fields stale so that
    // astGenFile can know to regenerate ZIR.
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
instead, Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in term, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
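A rough sketch of the updated-file check described above, not the code from this commit. `FileStamp` and `isUpdated` are made-up names for illustration, the source hash is left out, and the real fields live in `Scope.File`:

```zig
const std = @import("std");

/// Hypothetical snapshot of the stat fields recorded for a file.
/// The names and integer widths here are assumptions for illustration only.
const FileStamp = struct {
    mtime: i128,
    inode: u64,
    size: u64,
};

/// Returns true when any recorded field differs from the freshly
/// stat'ed values, meaning the file should be treated as updated.
fn isUpdated(cached: FileStamp, fresh: FileStamp) bool {
    return cached.mtime != fresh.mtime or
        cached.inode != fresh.inode or
        cached.size != fresh.size;
}
```

During an update, a check like this would run once per entry while iterating over `import_table`, comparing the stored stamp against a new stat result.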
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the look out for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
            file.source = source;
            file.source_loaded = true;
2021-04-14 11:26:53 -07:00
            return source;
        }

2021-08-30 19:22:04 -07:00
        pub fn getTree(file: *File, gpa: *Allocator) !*const Ast {
2021-04-25 00:02:58 -07:00
            if (file.tree_loaded) return &file.tree;

            const source = try file.getSource(gpa);
            file.tree = try std.zig.parse(gpa, source);
            file.tree_loaded = true;
            return &file.tree;
        }

2021-04-26 20:41:07 -07:00
        pub fn destroy(file: *File, mod: *Module) void {
            const gpa = mod.gpa;
            file.deinit(mod);
2021-04-14 11:26:53 -07:00
            gpa.destroy(file);
        }

2021-05-08 14:34:30 -07:00
        pub fn renderFullyQualifiedName(file: File, writer: anytype) !void {
2021-04-30 11:07:31 -07:00
            // Convert all the slashes into dots and truncate the extension.
            const ext = std.fs.path.extension(file.sub_file_path);
            const noext = file.sub_file_path[0 .. file.sub_file_path.len - ext.len];
2021-05-08 14:34:30 -07:00
            for (noext) |byte| switch (byte) {
                '/', '\\' => try writer.writeByte('.'),
                else => try writer.writeByte(byte),
2021-04-30 11:07:31 -07:00
            };
2021-05-08 14:34:30 -07:00
        }

        pub fn fullyQualifiedNameZ(file: File, gpa: *Allocator) ![:0]u8 {
            var buf = std.ArrayList(u8).init(gpa);
            defer buf.deinit();
            try file.renderFullyQualifiedName(buf.writer());
            return buf.toOwnedSliceSentinel(0);
2021-04-30 11:07:31 -07:00
        }

2021-06-28 17:31:47 -04:00
        /// Returns the full path to this file relative to its package.
        pub fn fullPath(file: File, ally: *Allocator) ![]u8 {
            return file.pkg.root_src_directory.join(ally, &[_][]const u8{file.sub_file_path});
        }

2021-04-14 11:26:53 -07:00
        pub fn dumpSrc(file: *File, src: LazySrcLoc) void {
            const loc = std.zig.findLineColumn(file.source.bytes, src);
            std.debug.print("{s}:{d}:{d}\n", .{ file.sub_file_path, loc.line + 1, loc.column + 1 });
2020-09-13 19:17:58 -07:00
        }
2021-05-15 21:20:06 -07:00

        pub fn okToReportErrors(file: File) bool {
            return switch (file.status) {
                .parse_failure, .astgen_failure => false,
                else => true,
            };
        }
2020-09-13 19:17:58 -07:00
    };

2021-03-15 23:38:38 -07:00
    /// This is the context needed to semantically analyze ZIR instructions and
2021-04-26 20:41:07 -07:00
    /// produce AIR instructions.
2021-03-15 23:38:38 -07:00
    /// This is a temporary structure stored on the stack; references to it are valid only
2020-09-13 19:17:58 -07:00
    /// during semantic analysis of the block.
    pub const Block = struct {
        pub const base_tag: Tag = .block;
2021-01-02 22:42:07 -07:00

2020-09-13 19:17:58 -07:00
        base: Scope = Scope{ .tag = base_tag },
        parent: ?*Block,
2021-03-15 23:38:38 -07:00
        /// Shared among all child blocks.
        sema: *Sema,
2021-01-16 22:51:01 -07:00
        /// This Decl is the Decl according to the Zig source code corresponding to this Block.
2021-03-15 23:38:38 -07:00
        /// This can vary during inline or comptime function calls. See `Sema.owner_decl`
        /// for the one that will be the same for all Block instances.
2021-01-16 22:51:01 -07:00
        src_decl: *Decl,
2021-07-12 15:30:30 -07:00
        instructions: ArrayListUnmanaged(Air.Inst.Index),
stage2 generics improvements: anytype and param type exprs
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.
param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.
This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).
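To illustrate the idea of evaluating parameters independently, here is a toy sketch. It is not the Sema implementation: `ParamBody`, `evalParamType`, and `resolveParamTypes` are hypothetical names, and types are stood in for by strings:

```zig
const std = @import("std");

const ParamError = error{GenericPoison};

/// Hypothetical stand-in for the block body of a param type expression.
const ParamBody = struct {
    /// Index of an earlier parameter this type expression depends on, if any.
    depends_on: ?usize,
    type_name: []const u8,
};

/// Hypothetical evaluator: a type expression that depends on another
/// parameter cannot be resolved yet, so it reports GenericPoison.
fn evalParamType(body: ParamBody) ParamError![]const u8 {
    if (body.depends_on != null) return error.GenericPoison;
    return body.type_name;
}

/// Evaluates every parameter on its own. A poisoned parameter is recorded
/// as null instead of aborting resolution of the remaining parameters.
fn resolveParamTypes(bodies: []const ParamBody, out: []?[]const u8) void {
    std.debug.assert(out.len >= bodies.len);
    var i: usize = 0;
    while (i < bodies.len) : (i += 1) {
        if (evalParamType(bodies[i])) |ty| {
            out[i] = ty;
        } else |err| switch (err) {
            error.GenericPoison => out[i] = null,
        }
    }
}
```

With parameters shaped like `(comptime T: type, x: T, n: u32)`, only the slot for `x` ends up null in this sketch; the other two still get their types populated.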
It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.
Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema
`comptime_args_fn_inst` to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.
Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.
Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.
Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).
Tasks left in this branch:
* Implement the memoization table.
* Add test coverage.
* Improve error reporting and source locations for compile errors.
2021-08-04 21:11:31 -07:00
        // `param` instructions are collected here to be used by the `func` instruction.
        params: std.ArrayListUnmanaged(Param) = .{},
2021-05-07 20:03:27 -07:00
        label: ?*Label = null,
2021-01-02 22:42:07 -07:00
        inlining: ?*Inlining,
2021-06-06 21:08:31 +03:00
        /// If runtime_index is not 0 then one of these is guaranteed to be non null.
        runtime_cond: ?LazySrcLoc = null,
        runtime_loop: ?LazySrcLoc = null,
        /// Non zero if a non-inline loop or a runtime conditional has been encountered.
        /// Stores to comptime variables are only allowed when var.runtime_index <= runtime_index.
        runtime_index: u32 = 0,

2020-09-13 19:17:58 -07:00
        is_comptime: bool,
2021-01-02 22:42:07 -07:00

2021-06-23 14:32:21 -04:00
        /// when null, it is determined by build mode, changed by @setRuntimeSafety
        want_safety: ?bool = null,

2021-08-04 21:11:31 -07:00
        const Param = struct {
            /// `noreturn` means `anytype`.
            ty: Type,
            is_comptime: bool,
        };

2021-01-02 14:28:03 -07:00
        /// This `Block` maps a block ZIR instruction to the corresponding
2021-04-26 20:41:07 -07:00
        /// AIR instruction for break instruction analysis.
2021-01-02 14:28:03 -07:00
        pub const Label = struct {
2021-04-13 12:34:27 -07:00
            zir_block: Zir.Inst.Index,
2021-01-02 14:28:03 -07:00
            merges: Merges,
        };
2021-01-02 12:32:30 -07:00

2021-01-02 14:28:03 -07:00
        /// This `Block` indicates that an inline function call is happening
        /// and return instructions should be analyzed as a break instruction
2021-04-26 20:41:07 -07:00
        /// to this AIR block instruction.
2021-01-02 22:42:07 -07:00
        /// It is shared among all the blocks in an inline or comptime called
        /// function.
2021-01-02 14:28:03 -07:00
        pub const Inlining = struct {
2021-09-14 21:58:22 -07:00
            comptime_result: Air.Inst.Ref,
2021-01-02 14:28:03 -07:00
            merges: Merges,
        };

        pub const Merges = struct {
2021-07-12 15:30:30 -07:00
            block_inst: Air.Inst.Index,
2021-01-22 16:45:09 -07:00
            /// Separate array list from break_inst_list so that it can be passed directly
            /// to resolvePeerTypes.
2021-07-14 21:57:40 -07:00
            results: ArrayListUnmanaged(Air.Inst.Ref),
2021-01-22 16:45:09 -07:00
            /// Keeps track of the break instructions so that the operand can be replaced
            /// if we need to add type coercion at the end of block analysis.
            /// Same indexes, capacity, length as `results`.
2021-07-12 15:30:30 -07:00
            br_list: ArrayListUnmanaged(Air.Inst.Index),
2020-09-13 19:17:58 -07:00
        };
2021-01-01 19:24:02 -07:00

        /// For debugging purposes.
2021-03-15 23:38:38 -07:00
        pub fn dump(block: *Block, mod: Module) void {
2021-04-13 12:34:27 -07:00
            Zir.dumpBlock(mod, block);
2021-01-01 19:24:02 -07:00
        }
2021-01-22 16:45:09 -07:00

        pub fn makeSubBlock(parent: *Block) Block {
            return .{
                .parent = parent,
2021-03-17 22:54:56 -07:00
                .sema = parent.sema,
2021-01-22 16:45:09 -07:00
                .src_decl = parent.src_decl,
                .instructions = .{},
                .label = null,
                .inlining = parent.inlining,
                .is_comptime = parent.is_comptime,
2021-06-06 21:08:31 +03:00
                .runtime_cond = parent.runtime_cond,
                .runtime_loop = parent.runtime_loop,
                .runtime_index = parent.runtime_index,
2021-06-23 14:32:21 -04:00
                .want_safety = parent.want_safety,
2021-01-22 16:45:09 -07:00
            };
        }
2021-03-15 23:38:38 -07:00
|
|
|
|
|
|
|
|
pub fn wantSafety(block: *const Block) bool {
|
2021-06-23 14:32:21 -04:00
|
|
|
return block.want_safety orelse switch (block.sema.mod.optimizeMode()) {
|
2021-03-15 23:38:38 -07:00
|
|
|
.Debug => true,
|
|
|
|
|
.ReleaseSafe => true,
|
|
|
|
|
.ReleaseFast => false,
|
|
|
|
|
.ReleaseSmall => false,
|
|
|
|
|
};
|
|
|
|
|
}
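The `wantSafety` logic above can be tried in isolation. The following is a minimal sketch, not the compiler's code: `OptimizeMode` and the two-argument free function are hypothetical stand-ins for `block.want_safety` and `block.sema.mod.optimizeMode()`. It shows that an explicit per-block override always wins, and that the optimize mode only supplies the default.

```zig
const std = @import("std");

// Hypothetical stand-in for the result of `mod.optimizeMode()`.
const OptimizeMode = enum { Debug, ReleaseSafe, ReleaseFast, ReleaseSmall };

/// `override` plays the role of `block.want_safety`: null means "not set".
fn wantSafety(override: ?bool, mode: OptimizeMode) bool {
    return override orelse switch (mode) {
        .Debug, .ReleaseSafe => true,
        .ReleaseFast, .ReleaseSmall => false,
    };
}

test "explicit override wins over the optimize mode default" {
    try std.testing.expect(wantSafety(null, .Debug));
    try std.testing.expect(!wantSafety(null, .ReleaseFast));
    try std.testing.expect(wantSafety(true, .ReleaseFast));
    try std.testing.expect(!wantSafety(false, .Debug));
}
```

Because `makeSubBlock` copies `want_safety` from the parent, a sub-block inherits whatever override its parent had, and falls back to the optimize mode only when no enclosing block set one.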
|
|
|
|
|
|
|
|
|
|
pub fn getFileScope(block: *Block) *Scope.File {
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
|
|
|
return block.src_decl.namespace.file_scope;
|
2021-03-15 23:38:38 -07:00
|
|
|
}
|
2021-07-13 15:45:08 -07:00
|
|
|
|
2021-07-14 19:04:02 -07:00
|
|
|
pub fn addTy(
|
|
|
|
|
block: *Block,
|
|
|
|
|
tag: Air.Inst.Tag,
|
|
|
|
|
ty: Type,
|
|
|
|
|
) error{OutOfMemory}!Air.Inst.Ref {
|
|
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = tag,
|
|
|
|
|
.data = .{ .ty = ty },
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-13 15:45:08 -07:00
|
|
|
pub fn addTyOp(
|
|
|
|
|
block: *Block,
|
|
|
|
|
tag: Air.Inst.Tag,
|
|
|
|
|
ty: Type,
|
|
|
|
|
operand: Air.Inst.Ref,
|
|
|
|
|
) error{OutOfMemory}!Air.Inst.Ref {
|
2021-07-13 21:49:22 -07:00
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = tag,
|
|
|
|
|
.data = .{ .ty_op = .{
|
|
|
|
|
.ty = try block.sema.addType(ty),
|
|
|
|
|
.operand = operand,
|
|
|
|
|
} },
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-14 19:04:02 -07:00
|
|
|
pub fn addNoOp(block: *Block, tag: Air.Inst.Tag) error{OutOfMemory}!Air.Inst.Ref {
|
|
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = tag,
|
2021-07-14 22:44:57 -07:00
|
|
|
.data = .{ .no_op = {} },
|
2021-07-14 19:04:02 -07:00
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-13 21:49:22 -07:00
|
|
|
pub fn addUnOp(
|
|
|
|
|
block: *Block,
|
|
|
|
|
tag: Air.Inst.Tag,
|
|
|
|
|
operand: Air.Inst.Ref,
|
|
|
|
|
) error{OutOfMemory}!Air.Inst.Ref {
|
|
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = tag,
|
|
|
|
|
.data = .{ .un_op = operand },
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-14 19:04:02 -07:00
|
|
|
pub fn addBr(
|
|
|
|
|
block: *Block,
|
|
|
|
|
target_block: Air.Inst.Index,
|
|
|
|
|
operand: Air.Inst.Ref,
|
|
|
|
|
) error{OutOfMemory}!Air.Inst.Ref {
|
|
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = .br,
|
|
|
|
|
.data = .{ .br = .{
|
|
|
|
|
.block_inst = target_block,
|
|
|
|
|
.operand = operand,
|
|
|
|
|
} },
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-13 21:49:22 -07:00
|
|
|
pub fn addBinOp(
|
|
|
|
|
block: *Block,
|
|
|
|
|
tag: Air.Inst.Tag,
|
|
|
|
|
lhs: Air.Inst.Ref,
|
|
|
|
|
rhs: Air.Inst.Ref,
|
|
|
|
|
) error{OutOfMemory}!Air.Inst.Ref {
|
|
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = tag,
|
|
|
|
|
.data = .{ .bin_op = .{
|
|
|
|
|
.lhs = lhs,
|
|
|
|
|
.rhs = rhs,
|
|
|
|
|
} },
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-08-06 17:26:37 -07:00
|
|
|
pub fn addArg(block: *Block, ty: Type, name: u32) error{OutOfMemory}!Air.Inst.Ref {
|
|
|
|
|
return block.addInst(.{
|
|
|
|
|
.tag = .arg,
|
|
|
|
|
.data = .{ .ty_str = .{
|
|
|
|
|
.ty = try block.sema.addType(ty),
|
|
|
|
|
.str = name,
|
|
|
|
|
} },
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-13 21:49:22 -07:00
|
|
|
pub fn addInst(block: *Block, inst: Air.Inst) error{OutOfMemory}!Air.Inst.Ref {
|
2021-07-19 17:35:14 -07:00
|
|
|
return Air.indexToRef(try block.addInstAsIndex(inst));
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn addInstAsIndex(block: *Block, inst: Air.Inst) error{OutOfMemory}!Air.Inst.Index {
|
2021-07-13 15:45:08 -07:00
|
|
|
const sema = block.sema;
|
|
|
|
|
const gpa = sema.gpa;
|
|
|
|
|
|
|
|
|
|
try sema.air_instructions.ensureUnusedCapacity(gpa, 1);
|
|
|
|
|
try block.instructions.ensureUnusedCapacity(gpa, 1);
|
|
|
|
|
|
2021-07-13 21:49:22 -07:00
|
|
|
const result_index = @intCast(Air.Inst.Index, sema.air_instructions.len);
|
|
|
|
|
sema.air_instructions.appendAssumeCapacity(inst);
|
|
|
|
|
block.instructions.appendAssumeCapacity(result_index);
|
2021-07-19 17:35:14 -07:00
|
|
|
return result_index;
|
2021-07-13 15:45:08 -07:00
|
|
|
}
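`addInstAsIndex` reserves capacity in both lists before appending to either, so that an allocation failure cannot leave one list updated and the other not. The sketch below illustrates the same reserve-then-append ordering under stated assumptions: `appendInst` and its parameters are hypothetical, plain `std.ArrayListUnmanaged` lists stand in for `sema.air_instructions` and `block.instructions`, and the syntax targets a newer Zig (0.11 or later, with single-argument `@intCast` and by-value `std.mem.Allocator`, and a release where `.{}` still initializes an unmanaged list) rather than the 2021 dialect of the surrounding code.

```zig
const std = @import("std");

/// Hypothetical miniature of `addInstAsIndex`: `all` plays the role of
/// `sema.air_instructions`, `block` the role of `block.instructions`.
fn appendInst(
    gpa: std.mem.Allocator,
    all: *std.ArrayListUnmanaged(u8),
    block: *std.ArrayListUnmanaged(u32),
    inst: u8,
) error{OutOfMemory}!u32 {
    // Both reservations may fail, but neither list has been modified yet.
    try all.ensureUnusedCapacity(gpa, 1);
    try block.ensureUnusedCapacity(gpa, 1);

    // Nothing below can fail, so the two lists stay in sync.
    const index: u32 = @intCast(all.items.len);
    all.appendAssumeCapacity(inst);
    block.appendAssumeCapacity(index);
    return index;
}

test "indices are handed out in append order" {
    const gpa = std.testing.allocator;
    var all: std.ArrayListUnmanaged(u8) = .{};
    defer all.deinit(gpa);
    var block: std.ArrayListUnmanaged(u32) = .{};
    defer block.deinit(gpa);

    try std.testing.expectEqual(@as(u32, 0), try appendInst(gpa, &all, &block, 7));
    try std.testing.expectEqual(@as(u32, 1), try appendInst(gpa, &all, &block, 9));
    try std.testing.expectEqual(@as(usize, 2), block.items.len);
}
```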
|
stage2: more principled approach to comptime references
* AIR no longer has a `variables` array. Instead of the `varptr`
instruction, Sema emits a constant with a `decl_ref`.
* AIR no longer has a `ref` instruction. There is no longer any
instruction that takes a value and returns a pointer to it. If this
is desired, Sema must either create an anonymous Decl and return a
constant `decl_ref`, or in the case of a runtime value, emit an
`alloc` instruction, `store` the value to it, and then return the
`alloc`.
* The `ref_val` Value Tag is eliminated. `decl_ref` should be used
instead. Also added is `eu_payload_ptr` which points to the payload
of an error union, given an error union pointer.
In general, Sema should avoid calling `analyzeRef` if it can be helped.
For example in the case of field_val and elem_val, there should never be
a reason to create a temporary (alloc or decl). Recent previous commits
made progress along that front.
There is a new abstraction in Sema, which looks like this:
var anon_decl = try block.startAnonDecl();
defer anon_decl.deinit();
// here `anon_decl.arena()` may be used
const decl = try anon_decl.finish(ty, val);
// decl is typically now used with `decl_ref`.
This pattern is used to upgrade `ref_val` usages to `decl_ref` usages.
Additional improvements:
* Sema: fix source location resolution for calling convention
expression.
* Sema: properly report "unable to resolve comptime value" for loads of
global variables. There is now a set of functions which can be
called if the callee wants to obtain the Value even if the tag is
`variable` (indicating comptime-known address but runtime-known value).
* Sema: `coerce` resolves builtin types before checking equality.
* Sema: fix `u1_type` missing from `addType`, making this type have a
slightly more efficient representation in AIR.
* LLVM backend: fix `genTypedValue` for tags `decl_ref` and `variable`
to properly do an LLVMConstBitCast.
* Remove unused parameter from `Value.toEnum`.
After this commit, some test cases are no longer passing. This is due to
the more principled approach to comptime references causing more
anonymous decls to get sent to the linker for codegen. However, in all
these cases the decls are not actually referenced by the runtime machine
code. A future commit in this branch will implement garbage collection
of decls so that unused decls do not get sent to the linker for codegen.
This will make the tests go back to passing.
2021-07-29 15:59:51 -07:00
|
|
|
|
|
|
|
|
pub fn startAnonDecl(block: *Block) !WipAnonDecl {
|
|
|
|
|
return WipAnonDecl{
|
|
|
|
|
.block = block,
|
|
|
|
|
.new_decl_arena = std.heap.ArenaAllocator.init(block.sema.gpa),
|
|
|
|
|
.finished = false,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub const WipAnonDecl = struct {
|
|
|
|
|
block: *Scope.Block,
|
|
|
|
|
new_decl_arena: std.heap.ArenaAllocator,
|
|
|
|
|
finished: bool,
|
|
|
|
|
|
|
|
|
|
pub fn arena(wad: *WipAnonDecl) *Allocator {
|
|
|
|
|
return &wad.new_decl_arena.allocator;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn deinit(wad: *WipAnonDecl) void {
|
|
|
|
|
if (!wad.finished) {
|
|
|
|
|
wad.new_decl_arena.deinit();
|
|
|
|
|
}
|
|
|
|
|
wad.* = undefined;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn finish(wad: *WipAnonDecl, ty: Type, val: Value) !*Decl {
|
|
|
|
|
const new_decl = try wad.block.sema.mod.createAnonymousDecl(&wad.block.base, .{
|
|
|
|
|
.ty = ty,
|
|
|
|
|
.val = val,
|
|
|
|
|
});
|
|
|
|
|
errdefer wad.block.sema.mod.deleteAnonDecl(&wad.block.base, new_decl);
|
|
|
|
|
try new_decl.finalizeNewArena(&wad.new_decl_arena);
|
|
|
|
|
wad.finished = true;
|
|
|
|
|
return new_decl;
|
|
|
|
|
}
|
|
|
|
|
};
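The `finished` flag in `WipAnonDecl` lets callers `defer deinit()` unconditionally: the arena is freed on every early-exit path, but survives once `finish` has handed it to the new Decl. The sketch below shows that ownership hand-off in isolation. It is an illustration only: `WipArena` and its `start` and `finish` functions are hypothetical, the hand-off returns the arena to the caller instead of attaching it to a Decl, and the code assumes a newer Zig (0.11 or later) where `ArenaAllocator.allocator()` is a method and `std.mem.Allocator` is passed by value.

```zig
const std = @import("std");

/// Hypothetical stand-in for `WipAnonDecl`: an arena that `deinit` frees
/// unless `finish` has transferred it to a longer-lived owner.
const WipArena = struct {
    arena: std.heap.ArenaAllocator,
    finished: bool = false,

    fn start(gpa: std.mem.Allocator) WipArena {
        return .{ .arena = std.heap.ArenaAllocator.init(gpa) };
    }

    /// Safe to call on every path, so callers can simply `defer` it.
    fn deinit(wip: *WipArena) void {
        if (!wip.finished) wip.arena.deinit();
        wip.* = undefined;
    }

    /// Transfers ownership of the arena to the caller.
    fn finish(wip: *WipArena) std.heap.ArenaAllocator {
        wip.finished = true;
        return wip.arena;
    }
};

test "abandoned work is freed, finished work is handed off" {
    const gpa = std.testing.allocator;

    // Abandoned path: `finish` is never reached, so `deinit` frees the arena.
    {
        var wip = WipArena.start(gpa);
        defer wip.deinit();
        _ = try wip.arena.allocator().alloc(u8, 16);
    }

    // Success path: the caller now owns the arena and must free it.
    var wip = WipArena.start(gpa);
    defer wip.deinit();
    const bytes = try wip.arena.allocator().alloc(u8, 16);
    const owned = wip.finish();
    defer owned.deinit();
    try std.testing.expectEqual(@as(usize, 16), bytes.len);
}
```

`std.testing.allocator` reports leaks, so the test would fail if either path forgot to release the arena.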
|
2020-09-13 19:17:58 -07:00
|
|
|
};
|
|
|
|
|
};
|
|
|
|
|
|
2021-01-16 22:51:01 -07:00
|
|
|
/// This struct holds data necessary to construct API-facing `AllErrors.Message`.
|
|
|
|
|
/// Its memory is managed with the general purpose allocator so that instances
|
|
|
|
|
/// can be created and destroyed in response to incremental updates.
|
|
|
|
|
/// In some cases, the Scope.File could have been inferred from where the ErrorMsg
|
|
|
|
|
/// is stored. For example, if it is stored in Module.failed_decls, then the Scope.File
|
|
|
|
|
/// would be determined by the Decl Scope. However, the data structure contains the field
|
|
|
|
|
/// anyway so that `ErrorMsg` can be reused for error notes, which may be in a different
|
|
|
|
|
/// file than the parent error message. It also simplifies processing of error messages.
|
|
|
|
|
pub const ErrorMsg = struct {
|
|
|
|
|
src_loc: SrcLoc,
|
|
|
|
|
msg: []const u8,
|
|
|
|
|
notes: []ErrorMsg = &.{},
|
|
|
|
|
|
|
|
|
|
pub fn create(
|
|
|
|
|
gpa: *Allocator,
|
|
|
|
|
src_loc: SrcLoc,
|
|
|
|
|
comptime format: []const u8,
|
|
|
|
|
args: anytype,
|
|
|
|
|
) !*ErrorMsg {
|
2021-03-15 23:38:38 -07:00
|
|
|
const err_msg = try gpa.create(ErrorMsg);
|
|
|
|
|
errdefer gpa.destroy(err_msg);
|
|
|
|
|
err_msg.* = try init(gpa, src_loc, format, args);
|
|
|
|
|
return err_msg;
|
2021-01-16 22:51:01 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Assumes the ErrorMsg struct and msg were both allocated with `gpa`,
|
|
|
|
|
/// as well as all notes.
|
2021-03-15 23:38:38 -07:00
|
|
|
pub fn destroy(err_msg: *ErrorMsg, gpa: *Allocator) void {
|
|
|
|
|
err_msg.deinit(gpa);
|
|
|
|
|
gpa.destroy(err_msg);
|
2021-01-16 22:51:01 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn init(
|
|
|
|
|
gpa: *Allocator,
|
|
|
|
|
src_loc: SrcLoc,
|
|
|
|
|
comptime format: []const u8,
|
|
|
|
|
args: anytype,
|
|
|
|
|
) !ErrorMsg {
|
|
|
|
|
return ErrorMsg{
|
|
|
|
|
.src_loc = src_loc,
|
|
|
|
|
.msg = try std.fmt.allocPrint(gpa, format, args),
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
    pub fn deinit(err_msg: *ErrorMsg, gpa: *Allocator) void {
        for (err_msg.notes) |*note| {
2021-01-16 22:51:01 -07:00
            note.deinit(gpa);
        }
2021-03-15 23:38:38 -07:00
        gpa.free(err_msg.notes);
        gpa.free(err_msg.msg);
        err_msg.* = undefined;
2021-01-16 22:51:01 -07:00
    }
};

/// Canonical reference to a position within a source file.
pub const SrcLoc = struct {
2021-04-16 14:44:02 -07:00
    file_scope: *Scope.File,
    /// Might be 0 depending on tag of `lazy`.
2021-08-30 19:22:04 -07:00
    parent_decl_node: Ast.Node.Index,
2021-04-16 14:44:02 -07:00
    /// Relative to `parent_decl_node`.
2021-03-15 23:38:38 -07:00
    lazy: LazySrcLoc,

2021-08-30 19:22:04 -07:00
    pub fn declSrcToken(src_loc: SrcLoc) Ast.TokenIndex {
2021-04-16 14:44:02 -07:00
        const tree = src_loc.file_scope.tree;
        return tree.firstToken(src_loc.parent_decl_node);
    }
2021-03-15 23:38:38 -07:00
2021-08-30 19:22:04 -07:00
    pub fn declRelativeToNodeIndex(src_loc: SrcLoc, offset: i32) Ast.TokenIndex {
        return @bitCast(Ast.Node.Index, offset + @bitCast(i32, src_loc.parent_decl_node));
2021-03-15 23:38:38 -07:00
    }
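
The arithmetic in `declRelativeToNodeIndex` can be tried in isolation. The following is a minimal sketch, not the compiler's code: it assumes `Ast.Node.Index` is a `u32`, uses the hypothetical names `NodeIndex` and `relativeToNodeIndex`, and is written with the single-argument `@bitCast` of Zig 0.11 and later rather than the two-argument form in the listing.

```zig
const std = @import("std");

/// Hypothetical stand-in for `Ast.Node.Index`; assumed here to be a `u32`.
const NodeIndex = u32;

/// Resolve a signed, decl-relative node offset to an absolute node index.
/// The parent index is reinterpreted as an `i32` so that a negative offset
/// can refer to a node that precedes the parent declaration's node.
fn relativeToNodeIndex(parent_decl_node: NodeIndex, offset: i32) NodeIndex {
    const parent: i32 = @bitCast(parent_decl_node);
    return @bitCast(parent + offset);
}

test "decl-relative offsets resolve to absolute node indices" {
    try std.testing.expectEqual(@as(NodeIndex, 103), relativeToNodeIndex(100, 3));
    try std.testing.expectEqual(@as(NodeIndex, 98), relativeToNodeIndex(100, -2));
}
```

Storing only a small signed offset per location, and adding the parent declaration's node back in on demand, is what lets a lazily resolved source location carry very little state.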
2021-04-28 22:43:26 -07:00
    pub fn byteOffset(src_loc: SrcLoc, gpa: *Allocator) !u32 {
2021-03-15 23:38:38 -07:00
        switch (src_loc.lazy) {
            .unneeded => unreachable,
2021-04-20 17:03:18 -07:00
            .entire_file => return 0,
2021-03-15 23:38:38 -07:00
            .byte_abs => |byte_index| return byte_index,

            .token_abs => |tok_index| {
2021-04-28 22:43:26 -07:00
                const tree = try src_loc.file_scope.getTree(gpa);
2021-03-15 23:38:38 -07:00
                const token_starts = tree.tokens.items(.start);
                return token_starts[tok_index];
            },
2021-03-25 00:37:52 -07:00
            .node_abs => |node| {
2021-04-28 22:43:26 -07:00
                const tree = try src_loc.file_scope.getTree(gpa);
2021-03-17 22:54:56 -07:00
                const token_starts = tree.tokens.items(.start);
2021-03-25 00:37:52 -07:00
                const tok_index = tree.firstToken(node);
2021-03-17 22:54:56 -07:00
                return token_starts[tok_index];
            },
2021-03-15 23:38:38 -07:00
            .byte_offset => |byte_off| {
2021-04-28 22:43:26 -07:00
                const tree = try src_loc.file_scope.getTree(gpa);
2021-04-16 14:44:02 -07:00
                const token_starts = tree.tokens.items(.start);
                return token_starts[src_loc.declSrcToken()] + byte_off;
2021-03-15 23:38:38 -07:00
|
|
|
},
|
|
|
|
|
.token_offset => |tok_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const tok_index = src_loc.declSrcToken() + tok_off;
|
2021-03-15 23:38:38 -07:00
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-03-26 23:46:37 -07:00
|
|
|
.node_offset, .node_offset_bin_op => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-04-28 22:43:26 -07:00
|
|
|
assert(src_loc.file_scope.tree_loaded);
|
2021-03-23 23:13:01 -07:00
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
2021-03-25 00:37:52 -07:00
|
|
|
const tok_index = main_tokens[node];
|
2021-03-15 23:38:38 -07:00
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-04-02 21:06:09 -07:00
|
|
|
.node_offset_back2tok => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-04-02 21:06:09 -07:00
|
|
|
const tok_index = tree.firstToken(node) - 2;
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-03-26 18:35:15 -07:00
|
|
|
.node_offset_var_decl_ty => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-26 18:35:15 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
|
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.global_var_decl => tree.globalVarDecl(node),
|
|
|
|
|
.local_var_decl => tree.localVarDecl(node),
|
|
|
|
|
.simple_var_decl => tree.simpleVarDecl(node),
|
|
|
|
|
.aligned_var_decl => tree.alignedVarDecl(node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const tok_index = if (full.ast.type_node != 0) blk: {
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
break :blk main_tokens[full.ast.type_node];
|
|
|
|
|
} else blk: {
|
|
|
|
|
break :blk full.ast.mut_token + 1; // the name token
|
|
|
|
|
};
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-09-14 21:58:22 -07:00
|
|
|
.node_offset_builtin_call_arg0 => |n| return src_loc.byteOffsetBuiltinCallArg(gpa, n, 0),
|
|
|
|
|
.node_offset_builtin_call_arg1 => |n| return src_loc.byteOffsetBuiltinCallArg(gpa, n, 1),
|
|
|
|
|
.node_offset_builtin_call_arg2 => |n| return src_loc.byteOffsetBuiltinCallArg(gpa, n, 2),
|
|
|
|
|
.node_offset_builtin_call_arg3 => |n| return src_loc.byteOffsetBuiltinCallArg(gpa, n, 3),
|
|
|
|
|
.node_offset_builtin_call_arg4 => |n| return src_loc.byteOffsetBuiltinCallArg(gpa, n, 4),
|
|
|
|
|
.node_offset_builtin_call_arg5 => |n| return src_loc.byteOffsetBuiltinCallArg(gpa, n, 5),
|
2021-03-31 23:00:00 -07:00
|
|
|
.node_offset_array_access_index => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[node_datas[node].rhs];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_slice_sentinel => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.slice_open => tree.sliceOpen(node),
|
|
|
|
|
.slice => tree.slice(node),
|
|
|
|
|
.slice_sentinel => tree.sliceSentinel(node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[full.ast.sentinel];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_call_func => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-08-30 19:22:04 -07:00
|
|
|
var params: [1]Ast.Node.Index = undefined;
|
2021-03-31 23:00:00 -07:00
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.call_one,
|
|
|
|
|
.call_one_comma,
|
|
|
|
|
.async_call_one,
|
|
|
|
|
.async_call_one_comma,
|
|
|
|
|
=> tree.callOne(&params, node),
|
|
|
|
|
|
|
|
|
|
.call,
|
|
|
|
|
.call_comma,
|
|
|
|
|
.async_call,
|
|
|
|
|
.async_call_comma,
|
|
|
|
|
=> tree.callFull(node),
|
|
|
|
|
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[full.ast.fn_expr];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_field_name => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-04-02 21:06:09 -07:00
|
|
|
const tok_index = switch (node_tags[node]) {
|
|
|
|
|
.field_access => node_datas[node].rhs,
|
|
|
|
|
else => tree.firstToken(node) - 2,
|
|
|
|
|
};
|
2021-03-31 23:00:00 -07:00
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_deref_ptr => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const tok_index = node_datas[node].lhs;
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_asm_source => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.asm_simple => tree.asmSimple(node),
|
|
|
|
|
.@"asm" => tree.asmFull(node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[full.ast.template];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_asm_ret_ty => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.asm_simple => tree.asmSimple(node),
|
|
|
|
|
.@"asm" => tree.asmFull(node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
stage2 generics improvements: anytype and param type exprs
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.
param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.
This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).
It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.
Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema,
`comptime_args_fn_inst`, to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.
Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.
Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.
Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).
Tasks left in this branch:
* Implement the memoization table.
* Add test coverage.
* Improve error reporting and source locations for compile errors.
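
The `error.GenericPoison` mechanism described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not Sema's code: `ParamTypeExpr`, `evalParamType`, and `resolveParam` are hypothetical names, and a type is modeled as a plain string. Only the error name `GenericPoison` comes from the commit message.

```zig
const std = @import("std");

/// Hypothetical model of a parameter's type expression: either it can be
/// evaluated on its own, or it depends on an earlier parameter.
const ParamTypeExpr = union(enum) {
    concrete: []const u8, // type name known without any other parameter
    depends_on_prev: void, // e.g. `b: @TypeOf(a)` in a generic function
};

/// Evaluating a dependent type expression fails with the poison error.
fn evalParamType(expr: ParamTypeExpr) error{GenericPoison}![]const u8 {
    switch (expr) {
        .concrete => |name| return name,
        .depends_on_prev => return error.GenericPoison,
    }
}

/// Plays the role of the param instruction: it catches the poison error so
/// that one dependent parameter does not stop the others from resolving.
fn resolveParam(expr: ParamTypeExpr) ?[]const u8 {
    const name = evalParamType(expr) catch |err| switch (err) {
        error.GenericPoison => return null,
    };
    return name;
}

test "independent parameters still get types" {
    try std.testing.expectEqualStrings("u32", resolveParam(.{ .concrete = "u32" }).?);
    try std.testing.expect(resolveParam(.{ .depends_on_prev = {} }) == null);
}
```

Because each parameter is resolved separately, a `null` result for one parameter leaves the types of the others populated, which matches the behavior the message describes for functions with comptime or anytype parameters.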
2021-08-04 21:11:31 -07:00
|
|
|
const asm_output = full.outputs[0];
|
|
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const ret_ty_node = node_datas[asm_output].lhs;
|
2021-03-31 23:00:00 -07:00
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
2021-08-04 21:11:31 -07:00
|
|
|
const tok_index = main_tokens[ret_ty_node];
|
2021-03-31 23:00:00 -07:00
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-03-25 00:37:52 -07:00
|
|
|
|
|
|
|
|
.node_offset_for_cond, .node_offset_if_cond => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-25 00:37:52 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-03-26 18:35:15 -07:00
|
|
|
const src_node = switch (node_tags[node]) {
|
2021-03-25 00:37:52 -07:00
|
|
|
.if_simple => tree.ifSimple(node).ast.cond_expr,
|
|
|
|
|
.@"if" => tree.ifFull(node).ast.cond_expr,
|
|
|
|
|
.while_simple => tree.whileSimple(node).ast.cond_expr,
|
|
|
|
|
.while_cont => tree.whileCont(node).ast.cond_expr,
|
|
|
|
|
.@"while" => tree.whileFull(node).ast.cond_expr,
|
|
|
|
|
.for_simple => tree.forSimple(node).ast.cond_expr,
|
|
|
|
|
.@"for" => tree.forFull(node).ast.cond_expr,
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
2021-03-26 18:35:15 -07:00
|
|
|
const tok_index = main_tokens[src_node];
|
2021-03-25 00:37:52 -07:00
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-03-26 23:46:37 -07:00
|
|
|
.node_offset_bin_lhs => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-26 23:46:37 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const src_node = node_datas[node].lhs;
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[src_node];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
.node_offset_bin_rhs => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-26 23:46:37 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const src_node = node_datas[node].rhs;
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[src_node];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-03-31 23:00:00 -07:00
|
|
|
|
|
|
|
|
.node_offset_switch_operand => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const src_node = node_datas[node].lhs;
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[src_node];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
.node_offset_switch_special_prong => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const switch_node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const node_tags = tree.nodes.items(.tag);
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
2021-08-30 19:22:04 -07:00
|
|
|
const extra = tree.extraData(node_datas[switch_node].rhs, Ast.Node.SubRange);
|
2021-03-31 23:00:00 -07:00
|
|
|
const case_nodes = tree.extra_data[extra.start..extra.end];
|
|
|
|
|
for (case_nodes) |case_node| {
|
|
|
|
|
const case = switch (node_tags[case_node]) {
|
|
|
|
|
.switch_case_one => tree.switchCaseOne(case_node),
|
|
|
|
|
.switch_case => tree.switchCase(case_node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const is_special = (case.ast.values.len == 0) or
|
|
|
|
|
(case.ast.values.len == 1 and
|
|
|
|
|
node_tags[case.ast.values[0]] == .identifier and
|
|
|
|
|
mem.eql(u8, tree.tokenSlice(main_tokens[case.ast.values[0]]), "_"));
|
|
|
|
|
if (!is_special) continue;
|
|
|
|
|
|
|
|
|
|
const tok_index = main_tokens[case_node];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
} else unreachable;
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
.node_offset_switch_range => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-05-02 15:06:32 -07:00
|
|
|
const switch_node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const node_tags = tree.nodes.items(.tag);
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
2021-08-30 19:22:04 -07:00
|
|
|
const extra = tree.extraData(node_datas[switch_node].rhs, Ast.Node.SubRange);
|
2021-03-31 23:00:00 -07:00
|
|
|
const case_nodes = tree.extra_data[extra.start..extra.end];
|
|
|
|
|
for (case_nodes) |case_node| {
|
|
|
|
|
const case = switch (node_tags[case_node]) {
|
|
|
|
|
.switch_case_one => tree.switchCaseOne(case_node),
|
|
|
|
|
.switch_case => tree.switchCase(case_node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const is_special = (case.ast.values.len == 0) or
|
|
|
|
|
(case.ast.values.len == 1 and
|
|
|
|
|
node_tags[case.ast.values[0]] == .identifier and
|
|
|
|
|
mem.eql(u8, tree.tokenSlice(main_tokens[case.ast.values[0]]), "_"));
|
|
|
|
|
if (is_special) continue;
|
|
|
|
|
|
|
|
|
|
for (case.ast.values) |item_node| {
|
|
|
|
|
if (node_tags[item_node] == .switch_range) {
|
|
|
|
|
const tok_index = main_tokens[item_node];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
} else unreachable;
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
.node_offset_fn_type_cc => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
stage2: more principled approach to comptime references
* AIR no longer has a `variables` array. Instead of the `varptr`
instruction, Sema emits a constant with a `decl_ref`.
* AIR no longer has a `ref` instruction. There is no longer any
instruction that takes a value and returns a pointer to it. If this
is desired, Sema must either create an anonymous Decl and return a
constant `decl_ref`, or in the case of a runtime value, emit an
`alloc` instruction, `store` the value to it, and then return the
`alloc`.
* The `ref_val` Value Tag is eliminated. `decl_ref` should be used
instead. Also added is `eu_payload_ptr` which points to the payload
of an error union, given an error union pointer.
In general, Sema should avoid calling `analyzeRef` if it can be helped.
For example in the case of field_val and elem_val, there should never be
a reason to create a temporary (alloc or decl). Recent previous commits
made progress along that front.
There is a new abstraction in Sema, which looks like this:
var anon_decl = try block.startAnonDecl();
defer anon_decl.deinit();
// here `anon_decl.arena()` may be used
const decl = try anon_decl.finish(ty, val);
// decl is typically now used with `decl_ref`.
This pattern is used to upgrade `ref_val` usages to `decl_ref` usages.
Additional improvements:
* Sema: fix source location resolution for calling convention
expression.
* Sema: properly report "unable to resolve comptime value" for loads of
global variables. There is now a set of functions which can be
called if the callee wants to obtain the Value even if the tag is
`variable` (indicating comptime-known address but runtime-known value).
* Sema: `coerce` resolves builtin types before checking equality.
* Sema: fix `u1_type` missing from `addType`, making this type have a
slightly more efficient representation in AIR.
* LLVM backend: fix `genTypedValue` for tags `decl_ref` and `variable`
to properly do an LLVMConstBitCast.
* Remove unused parameter from `Value.toEnum`.
After this commit, some test cases are no longer passing. This is due to
the more principled approach to comptime references causing more
anonymous decls to get sent to the linker for codegen. However, in all
these cases the decls are not actually referenced by the runtime machine
code. A future commit in this branch will implement garbage collection
of decls so that unused decls do not get sent to the linker for codegen.
This will make the tests go back to passing.
2021-07-29 15:59:51 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-08-30 19:22:04 -07:00
|
|
|
var params: [1]Ast.Node.Index = undefined;
|
2021-03-31 23:00:00 -07:00
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.fn_proto_simple => tree.fnProtoSimple(&params, node),
|
|
|
|
|
.fn_proto_multi => tree.fnProtoMulti(node),
|
|
|
|
|
.fn_proto_one => tree.fnProtoOne(&params, node),
|
|
|
|
|
.fn_proto => tree.fnProto(node),
|
2021-07-29 15:59:51 -07:00
|
|
|
.fn_decl => switch (node_tags[node_datas[node].lhs]) {
|
|
|
|
|
.fn_proto_simple => tree.fnProtoSimple(&params, node_datas[node].lhs),
|
|
|
|
|
.fn_proto_multi => tree.fnProtoMulti(node_datas[node].lhs),
|
|
|
|
|
.fn_proto_one => tree.fnProtoOne(&params, node_datas[node].lhs),
|
|
|
|
|
.fn_proto => tree.fnProto(node_datas[node].lhs),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
},
|
2021-03-31 23:00:00 -07:00
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[full.ast.callconv_expr];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
.node_offset_fn_type_ret_ty => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-03-31 23:00:00 -07:00
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-04-16 14:44:02 -07:00
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-08-30 19:22:04 -07:00
|
|
|
var params: [1]Ast.Node.Index = undefined;
|
2021-03-31 23:00:00 -07:00
|
|
|
const full = switch (node_tags[node]) {
|
|
|
|
|
.fn_proto_simple => tree.fnProtoSimple(&params, node),
|
|
|
|
|
.fn_proto_multi => tree.fnProtoMulti(node),
|
|
|
|
|
.fn_proto_one => tree.fnProtoOne(&params, node),
|
|
|
|
|
.fn_proto => tree.fnProto(node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[full.ast.return_type];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-04-23 18:28:46 -07:00
|
|
|
|
|
|
|
|
.node_offset_anyframe_type => |node_off| {
|
2021-04-28 22:43:26 -07:00
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
2021-04-23 18:28:46 -07:00
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const parent_node = src_loc.declRelativeToNodeIndex(node_off);
|
|
|
|
|
const node = node_datas[parent_node].rhs;
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[node];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
2021-05-04 14:40:59 -07:00
|
|
|
|
|
|
|
|
.node_offset_lib_name => |node_off| {
|
|
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
|
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const node_tags = tree.nodes.items(.tag);
|
|
|
|
|
const parent_node = src_loc.declRelativeToNodeIndex(node_off);
|
2021-08-30 19:22:04 -07:00
|
|
|
var params: [1]Ast.Node.Index = undefined;
|
2021-05-04 14:40:59 -07:00
|
|
|
const full = switch (node_tags[parent_node]) {
|
|
|
|
|
.fn_proto_simple => tree.fnProtoSimple(&params, parent_node),
|
|
|
|
|
.fn_proto_multi => tree.fnProtoMulti(parent_node),
|
|
|
|
|
.fn_proto_one => tree.fnProtoOne(&params, parent_node),
|
|
|
|
|
.fn_proto => tree.fnProto(parent_node),
|
|
|
|
|
.fn_decl => blk: {
|
|
|
|
|
const fn_proto = node_datas[parent_node].lhs;
|
|
|
|
|
break :blk switch (node_tags[fn_proto]) {
|
|
|
|
|
.fn_proto_simple => tree.fnProtoSimple(&params, fn_proto),
|
|
|
|
|
.fn_proto_multi => tree.fnProtoMulti(fn_proto),
|
|
|
|
|
.fn_proto_one => tree.fnProtoOne(&params, fn_proto),
|
|
|
|
|
.fn_proto => tree.fnProto(fn_proto),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
},
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const tok_index = full.lib_name.?;
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
},
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
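
The "tag plus 8 bytes of data, with larger payloads in `extra`" layout
described above can be sketched roughly as follows. This is only an
illustration under stated assumptions: `ExampleInst`, `ExampleCode`, and
`addOperands` are hypothetical names, and the real definitions in zir.zig
differ.

```zig
const std = @import("std");

// Hypothetical, simplified instruction: a tag plus a small data payload.
const ExampleInst = struct {
    tag: Tag,
    data: Data,

    const Tag = enum { int, add };

    const Data = union {
        // Small instruction: the value fits in the 8 bytes directly.
        int: u64,
        // Larger instruction: 4 bytes index into the shared `extra` array.
        payload_index: u32,
    };
};

// Hypothetical "finished" code: immutable slices shared by all users.
const ExampleCode = struct {
    instructions: []const ExampleInst,
    extra: []const u32,

    // For an `add`, the two operands live in `extra` at the payload index.
    fn addOperands(code: ExampleCode, inst: ExampleInst) [2]u32 {
        std.debug.assert(inst.tag == .add);
        const i = inst.data.payload_index;
        return [2]u32{ code.extra[i], code.extra[i + 1] };
    }
};

test "operands of a large instruction are read from extra" {
    const insts = [_]ExampleInst{
        .{ .tag = .int, .data = .{ .int = 42 } },
        .{ .tag = .add, .data = .{ .payload_index = 0 } },
    };
    const extra = [_]u32{ 7, 9 };
    const code = ExampleCode{ .instructions = &insts, .extra = &extra };
    const ops = code.addOperands(code.instructions[1]);
    std.debug.assert(ops[0] == 7 and ops[1] == 9);
}
```

The tag decides which union field is valid, so no per-instruction heap
allocation is needed in this sketch.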
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
it also meant other things. This was error-prone and also made us do
unnecessary work and store unnecessary bytes in memory.
Now there are more types involved in source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have fewer
than 1 << 32 bytes, so that a byte offset fits in a `u32`.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
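
As a rough illustration of the lazy approach, a location can be stored as
a small tagged value and only turned into an absolute byte offset when an
error is actually reported. The sketch below is an assumption-laden
stand-in, not the real `LazySrcLoc`: `ExampleLazyLoc`, `resolve`, and the
`decl_byte_start` parameter are hypothetical, and only two tags are
modeled.

```zig
const std = @import("std");

// Hypothetical, cut-down stand-in for a lazy source location.
const ExampleLazyLoc = union(enum) {
    /// Byte offset from the start of the file.
    byte_abs: u32,
    /// Byte offset from the start of the containing Decl.
    byte_offset: u32,

    /// Resolve to an absolute byte offset. `decl_byte_start` is an assumed
    /// parameter: the byte offset at which the containing Decl begins.
    fn resolve(loc: ExampleLazyLoc, decl_byte_start: u32) u32 {
        return switch (loc) {
            .byte_abs => |off| off,
            .byte_offset => |off| decl_byte_start + off,
        };
    }
};

test "a relative location follows its Decl when the Decl moves" {
    const loc = ExampleLazyLoc{ .byte_offset = 30 };
    std.debug.assert(loc.resolve(100) == 130);
    std.debug.assert(loc.resolve(250) == 280);
}
```

Because the stored value is relative, nothing needs to be updated when the
Decl shifts within the file; the work happens only inside `resolve`.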
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
|
|
|
}
|
|
|
|
|
}
|
2021-09-14 21:58:22 -07:00
|
|
|
|
|
|
|
|
pub fn byteOffsetBuiltinCallArg(
|
|
|
|
|
src_loc: SrcLoc,
|
|
|
|
|
gpa: *Allocator,
|
|
|
|
|
node_off: i32,
|
|
|
|
|
arg_index: u32,
|
|
|
|
|
) !u32 {
|
|
|
|
|
const tree = try src_loc.file_scope.getTree(gpa);
|
|
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const node_tags = tree.nodes.items(.tag);
|
|
|
|
|
const node = src_loc.declRelativeToNodeIndex(node_off);
|
|
|
|
|
const param = switch (node_tags[node]) {
|
|
|
|
|
.builtin_call_two, .builtin_call_two_comma => switch (arg_index) {
|
|
|
|
|
0 => node_datas[node].lhs,
|
|
|
|
|
1 => node_datas[node].rhs,
|
|
|
|
|
else => unreachable,
|
|
|
|
|
},
|
|
|
|
|
.builtin_call, .builtin_call_comma => tree.extra_data[node_datas[node].lhs + arg_index],
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const tok_index = main_tokens[param];
|
|
|
|
|
const token_starts = tree.tokens.items(.start);
|
|
|
|
|
return token_starts[tok_index];
|
|
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
|
|
/// Resolving a source location into a byte offset may require doing work
|
|
|
|
|
/// that we would rather not do unless the error actually occurs.
|
|
|
|
|
/// Therefore we need a data structure that contains the information necessary
|
|
|
|
|
/// to lazily produce a `SrcLoc` as required.
|
|
|
|
|
/// Most of the offsets in this data structure are relative to the containing Decl.
|
|
|
|
|
/// This makes the source location resolve properly even when a Decl gets
|
|
|
|
|
/// shifted up or down in the file, as long as the Decl's contents itself
|
|
|
|
|
/// do not change.
|
|
|
|
|
pub const LazySrcLoc = union(enum) {
|
|
|
|
|
/// When this tag is set, the code that constructed this `LazySrcLoc` is asserting
|
|
|
|
|
/// that all code paths which would need to resolve the source location are
|
|
|
|
|
/// unreachable. If you are debugging this tag incorrectly being this value,
|
|
|
|
|
/// look into using reverse-continue with a memory watchpoint to see where the
|
|
|
|
|
/// value is being set to this tag.
|
|
|
|
|
unneeded,
|
2021-04-14 11:26:53 -07:00
|
|
|
/// Means the source location points to an entire file; not any particular
|
|
|
|
|
/// location within the file. `file_scope` union field will be active.
|
|
|
|
|
entire_file,
|
2021-03-15 23:38:38 -07:00
|
|
|
/// The source location points to a byte offset within a source file,
|
|
|
|
|
/// offset from 0. The source file is determined contextually.
|
|
|
|
|
/// Inside a `SrcLoc`, the `file_scope` union field will be active.
|
|
|
|
|
byte_abs: u32,
|
|
|
|
|
/// The source location points to a token within a source file,
|
|
|
|
|
/// offset from 0. The source file is determined contextually.
|
|
|
|
|
/// Inside a `SrcLoc`, the `file_scope` union field will be active.
|
|
|
|
|
token_abs: u32,
|
2021-03-17 22:54:56 -07:00
|
|
|
/// The source location points to an AST node within a source file,
|
|
|
|
|
/// offset from 0. The source file is determined contextually.
|
|
|
|
|
/// Inside a `SrcLoc`, the `file_scope` union field will be active.
|
|
|
|
|
node_abs: u32,
|
2021-03-15 23:38:38 -07:00
|
|
|
/// The source location points to a byte offset within a source file,
|
|
|
|
|
/// offset from the byte offset of the Decl within the file.
|
|
|
|
|
/// The Decl is determined contextually.
|
|
|
|
|
byte_offset: u32,
|
|
|
|
|
/// This data is the offset into the token list from the Decl token.
|
|
|
|
|
/// The Decl is determined contextually.
|
|
|
|
|
token_offset: u32,
|
|
|
|
|
/// The source location points to an AST node, which is this value offset
|
|
|
|
|
/// from its containing Decl node AST index.
|
|
|
|
|
/// The Decl is determined contextually.
|
2021-03-20 17:09:06 -07:00
|
|
|
node_offset: i32,
|
2021-04-02 21:06:09 -07:00
|
|
|
/// The source location points to two tokens left of the first token of an AST node,
|
|
|
|
|
/// which is this value offset from its containing Decl node AST index.
|
|
|
|
|
/// The Decl is determined contextually.
|
|
|
|
|
node_offset_back2tok: i32,
|
2021-03-15 23:38:38 -07:00
|
|
|
/// The source location points to a variable declaration type expression,
|
|
|
|
|
/// found by taking this AST node index offset from the containing
|
|
|
|
|
/// Decl AST node, which points to a variable declaration AST node. Next, navigate
|
|
|
|
|
/// to the type expression.
|
|
|
|
|
/// The Decl is determined contextually.
|
2021-03-20 17:09:06 -07:00
|
|
|
node_offset_var_decl_ty: i32,
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.
Now there are more types involved into source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have less
than 2 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
/// The source location points to a for loop condition expression,
/// found by taking this AST node index offset from the containing
/// Decl AST node, which points to a for loop AST node. Next, navigate
/// to the condition expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_for_cond: i32,
2021-03-15 23:38:38 -07:00
/// The source location points to the first parameter of a builtin
/// function call, found by taking this AST node index offset from the containing
/// Decl AST node, which points to a builtin call AST node. Next, navigate
/// to the first parameter.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_builtin_call_arg0: i32,
2021-03-15 23:38:38 -07:00
/// Same as `node_offset_builtin_call_arg0` except arg index 1.
2021-03-20 17:09:06 -07:00
node_offset_builtin_call_arg1: i32,
2021-09-14 21:58:22 -07:00
node_offset_builtin_call_arg2: i32,
node_offset_builtin_call_arg3: i32,
node_offset_builtin_call_arg4: i32,
node_offset_builtin_call_arg5: i32,
2021-03-15 23:38:38 -07:00
/// The source location points to the index expression of an array access
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to an array access AST node. Next, navigate
/// to the index expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_array_access_index: i32,
2021-03-15 23:38:38 -07:00
/// The source location points to the sentinel expression of a slice
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to a slice AST node. Next, navigate
/// to the sentinel expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_slice_sentinel: i32,
2021-03-17 22:54:56 -07:00
/// The source location points to the callee expression of a function
/// call expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to a function call AST node. Next, navigate
/// to the callee expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_call_func: i32,
2021-04-02 21:06:09 -07:00
/// The payload is offset from the containing Decl AST node.
/// The source location points to the field name of:
/// * a field access expression (`a.b`), or
/// * the operand ("b" node) of a field initialization expression (`.a = b`)
2021-03-17 22:54:56 -07:00
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_field_name: i32,
2021-03-17 22:54:56 -07:00
/// The source location points to the pointer of a pointer deref expression,
/// found by taking this AST node index offset from the containing
/// Decl AST node, which points to a pointer deref AST node. Next, navigate
/// to the pointer expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_deref_ptr: i32,
2021-03-17 22:54:56 -07:00
/// The source location points to the assembly source code of an inline assembly
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to an inline assembly AST node. Next, navigate
/// to the asm template source code.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_asm_source: i32,
2021-03-17 22:54:56 -07:00
/// The source location points to the return type of an inline assembly
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to an inline assembly AST node. Next, navigate
/// to the return type expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_asm_ret_ty: i32,
2021-03-17 22:54:56 -07:00
/// The source location points to the condition expression of an if
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to an if expression AST node. Next, navigate
/// to the condition expression.
/// The Decl is determined contextually.
2021-03-20 17:09:06 -07:00
node_offset_if_cond: i32,
2021-03-21 19:23:12 -07:00
/// The source location points to a binary expression, such as `a + b`, found
/// by taking this AST node index offset from the containing Decl AST node.
/// The Decl is determined contextually.
node_offset_bin_op: i32,
/// The source location points to the LHS of a binary expression, found
/// by taking this AST node index offset from the containing Decl AST node,
/// which points to a binary expression AST node. Next, navigate to the LHS.
/// The Decl is determined contextually.
node_offset_bin_lhs: i32,
/// The source location points to the RHS of a binary expression, found
/// by taking this AST node index offset from the containing Decl AST node,
/// which points to a binary expression AST node. Next, navigate to the RHS.
/// The Decl is determined contextually.
node_offset_bin_rhs: i32,
stage2: guidance on how to implement switch expressions
Here's what I think the ZIR should be. AstGen is not yet implemented to
match this, and the main implementation of analyzeSwitch in Sema is not
yet implemented to match it either.
Here are some example byte size reductions from master branch, with the
ZIR memory layout from this commit:
```
switch (foo) {
a => 1,
b => 2,
c => 3,
d => 4,
}
```
184 bytes (master) => 40 bytes (this branch)
```
switch (foo) {
a, b => 1,
c..d, e, f => 2,
g => 3,
else => 4,
}
```
240 bytes (master) => 80 bytes (this branch)
2021-03-28 23:12:26 -07:00
/// The source location points to the operand of a switch expression, found
/// by taking this AST node index offset from the containing Decl AST node,
/// which points to a switch expression AST node. Next, navigate to the operand.
/// The Decl is determined contextually.
node_offset_switch_operand: i32,
2021-03-29 21:59:08 -07:00
/// The source location points to the else/`_` prong of a switch expression, found
/// by taking this AST node index offset from the containing Decl AST node,
/// which points to a switch expression AST node. Next, navigate to the else/`_` prong.
/// The Decl is determined contextually.
node_offset_switch_special_prong: i32,
/// The source location points to all the ranges of a switch expression, found
/// by taking this AST node index offset from the containing Decl AST node,
/// which points to a switch expression AST node. Next, navigate to any of the
/// range nodes. The error applies to all of them.
/// The Decl is determined contextually.
node_offset_switch_range: i32,
2021-03-31 21:36:32 -07:00
/// The source location points to the calling convention of a function type
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to a function type AST node. Next, navigate to
/// the calling convention node.
/// The Decl is determined contextually.
node_offset_fn_type_cc: i32,
/// The source location points to the return type of a function type
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to a function type AST node. Next, navigate to
/// the return type node.
/// The Decl is determined contextually.
node_offset_fn_type_ret_ty: i32,
2021-04-23 18:28:46 -07:00
/// The source location points to the type expression of an `anyframe->T`
/// expression, found by taking this AST node index offset from the containing
/// Decl AST node, which points to an `anyframe->T` expression AST node. Next, navigate
/// to the type expression.
/// The Decl is determined contextually.
node_offset_anyframe_type: i32,
2021-05-04 14:40:59 -07:00
/// The source location points to the string literal of `extern "foo"`, found
/// by taking this AST node index offset from the containing
/// Decl AST node, which points to a function prototype or variable declaration
/// expression AST node. Next, navigate to the string literal of the `extern "foo"`.
/// The Decl is determined contextually.
node_offset_lib_name: i32,
2021-03-17 00:56:08 -07:00
/// Upgrade to a `SrcLoc` based on the `Decl` or file in the provided scope.
pub fn toSrcLoc(lazy: LazySrcLoc, scope: *Scope) SrcLoc {
return switch (lazy) {
.unneeded,
2021-04-14 11:26:53 -07:00
.entire_file,
2021-03-17 00:56:08 -07:00
.byte_abs,
.token_abs,
2021-03-17 22:54:56 -07:00
.node_abs,
2021-03-17 00:56:08 -07:00
=> .{
2021-04-16 14:44:02 -07:00
.file_scope = scope.getFileScope(),
.parent_decl_node = 0,
2021-03-17 00:56:08 -07:00
.lazy = lazy,
},

.byte_offset,
.token_offset,
.node_offset,
2021-04-02 21:06:09 -07:00
.node_offset_back2tok,
2021-03-17 00:56:08 -07:00
.node_offset_var_decl_ty,
.node_offset_for_cond,
.node_offset_builtin_call_arg0,
.node_offset_builtin_call_arg1,
2021-09-14 21:58:22 -07:00
.node_offset_builtin_call_arg2,
.node_offset_builtin_call_arg3,
.node_offset_builtin_call_arg4,
.node_offset_builtin_call_arg5,
2021-03-17 00:56:08 -07:00
.node_offset_array_access_index,
.node_offset_slice_sentinel,
2021-03-17 22:54:56 -07:00
.node_offset_call_func,
.node_offset_field_name,
.node_offset_deref_ptr,
.node_offset_asm_source,
.node_offset_asm_ret_ty,
.node_offset_if_cond,
2021-03-21 19:23:12 -07:00
.node_offset_bin_op,
.node_offset_bin_lhs,
.node_offset_bin_rhs,
2021-03-28 23:12:26 -07:00
.node_offset_switch_operand,
2021-03-29 21:59:08 -07:00
.node_offset_switch_special_prong,
.node_offset_switch_range,
2021-03-31 21:36:32 -07:00
.node_offset_fn_type_cc,
.node_offset_fn_type_ret_ty,
2021-04-23 18:28:46 -07:00
.node_offset_anyframe_type,
2021-05-04 14:40:59 -07:00
.node_offset_lib_name,
2021-03-17 00:56:08 -07:00
=> .{
2021-04-16 14:44:02 -07:00
.file_scope = scope.getFileScope(),
.parent_decl_node = scope.srcDecl().?.src_node,
2021-03-17 00:56:08 -07:00
.lazy = lazy,
},
};
}
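The two prongs of `toSrcLoc` split the tags into absolute ones, which need no Decl, and relative ones, which record the Decl's AST node so the offset can be applied later. The sketch below illustrates that split with trimmed-down stand-ins. The names `upgrade` and `absoluteNode`, the reduced tag list, and the use of plain `i32` node indices are assumptions made for illustration, not the compiler's actual definitions.

```zig
const std = @import("std");

// Hypothetical, reduced version of the union: one void tag, one absolute
// tag, and two tags whose payload is an offset from the Decl's AST node.
const LazySrcLoc = union(enum) {
    unneeded,
    node_abs: i32,
    node_offset: i32,
    node_offset_if_cond: i32,
};

// Hypothetical, reduced SrcLoc: the file scope is left out.
const SrcLoc = struct {
    parent_decl_node: i32,
    lazy: LazySrcLoc,
};

// Absolute tags ignore the Decl; relative tags remember its AST node.
fn upgrade(lazy: LazySrcLoc, decl_src_node: i32) SrcLoc {
    return switch (lazy) {
        .unneeded, .node_abs => SrcLoc{ .parent_decl_node = 0, .lazy = lazy },
        .node_offset, .node_offset_if_cond => SrcLoc{
            .parent_decl_node = decl_src_node,
            .lazy = lazy,
        },
    };
}

// Turn an upgraded location into an absolute AST node index, if it has one.
fn absoluteNode(loc: SrcLoc) ?i32 {
    return switch (loc.lazy) {
        .unneeded => null,
        .node_abs => |n| n,
        .node_offset, .node_offset_if_cond => |off| loc.parent_decl_node + off,
    };
}

test "if condition offset is applied to the Decl node" {
    const loc = upgrade(LazySrcLoc{ .node_offset_if_cond = 3 }, 10);
    try std.testing.expect(absoluteNode(loc).? == 13);
}
```

In the real union, a tag such as `node_offset_if_cond` requires one more step after the addition: navigating from the `if` node to its condition expression. That step is omitted here.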
2021-03-17 22:54:56 -07:00
/// Upgrade to a `SrcLoc` based on the `Decl` provided.
pub fn toSrcLocWithDecl(lazy: LazySrcLoc, decl: *Decl) SrcLoc {
return switch (lazy) {
.unneeded,
2021-04-14 11:26:53 -07:00
.entire_file,
2021-03-17 22:54:56 -07:00
.byte_abs,
.token_abs,
.node_abs,
=> .{
2021-04-16 14:44:02 -07:00
.file_scope = decl.getFileScope(),
.parent_decl_node = 0,
2021-03-17 22:54:56 -07:00
.lazy = lazy,
},

.byte_offset,
.token_offset,
.node_offset,
2021-04-02 21:06:09 -07:00
.node_offset_back2tok,
2021-03-17 22:54:56 -07:00
.node_offset_var_decl_ty,
.node_offset_for_cond,
.node_offset_builtin_call_arg0,
.node_offset_builtin_call_arg1,
2021-09-14 21:58:22 -07:00
.node_offset_builtin_call_arg2,
.node_offset_builtin_call_arg3,
.node_offset_builtin_call_arg4,
.node_offset_builtin_call_arg5,
2021-03-17 22:54:56 -07:00
.node_offset_array_access_index,
.node_offset_slice_sentinel,
.node_offset_call_func,
.node_offset_field_name,
.node_offset_deref_ptr,
.node_offset_asm_source,
.node_offset_asm_ret_ty,
.node_offset_if_cond,
2021-03-21 19:23:12 -07:00
.node_offset_bin_op,
.node_offset_bin_lhs,
.node_offset_bin_rhs,
2021-03-28 23:12:26 -07:00
.node_offset_switch_operand,
2021-03-29 21:59:08 -07:00
.node_offset_switch_special_prong,
.node_offset_switch_range,
2021-03-31 21:36:32 -07:00
.node_offset_fn_type_cc,
.node_offset_fn_type_ret_ty,
2021-04-23 18:28:46 -07:00
.node_offset_anyframe_type,
2021-05-04 14:40:59 -07:00
.node_offset_lib_name,
2021-03-17 22:54:56 -07:00
=> .{
2021-04-16 14:44:02 -07:00
.file_scope = decl.getFileScope(),
.parent_decl_node = decl.src_node,
2021-03-17 22:54:56 -07:00
.lazy = lazy,
},
};
}
2021-01-16 22:51:01 -07:00
};
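The `unneeded` tag exists so that callers can skip computing a source location on the path where no compile error occurs. If an error does occur, the operation is retried with a real location. The sketch below shows that retry shape under stated assumptions: `Src`, `checkNonZero`, `computeSrc`, and `analyze` are hypothetical names, and the error set is reduced to the two members the example needs.

```zig
const std = @import("std");

// Reduced error set; the real one has more members.
const CompileError = error{ AnalysisFail, NeededSourceLocation };

// Hypothetical stand-in for a lazy location: absent, or a byte offset.
const Src = union(enum) {
    unneeded,
    byte_abs: u32,
};

// Hypothetical check that only looks at `src` when it has to report an error.
fn checkNonZero(value: i32, src: Src) CompileError!void {
    if (value != 0) return;
    switch (src) {
        // The caller claimed no location was needed, but one was.
        .unneeded => return error.NeededSourceLocation,
        // A real compiler would record the error at this offset first.
        .byte_abs => return error.AnalysisFail,
    }
}

// Stand-in for the expensive work that is skipped on the happy path.
fn computeSrc() Src {
    return Src{ .byte_abs = 42 };
}

// First attempt without a location; retry with one only if it was needed.
fn analyze(value: i32) CompileError!void {
    checkNonZero(value, Src{ .unneeded = {} }) catch |err| switch (err) {
        error.NeededSourceLocation => return checkNonZero(value, computeSrc()),
        else => return err,
    };
}

test "location is only computed on failure" {
    try analyze(1);
    try std.testing.expectError(error.AnalysisFail, analyze(0));
}
```

The trade-off is that the failing path does the check twice. That cost is paid only when a compile error is being reported anyway.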
2021-07-14 12:16:48 -07:00
pub const SemaError = error{ OutOfMemory, AnalysisFail };
stage2 generics improvements: anytype and param type exprs
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.
param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.
This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).
It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.
Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema
`comptime_args_fn_inst` to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.
Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.
Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.
Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).
Tasks left in this branch:
* Implement the memoization table.
* Add test coverage.
* Improve error reporting and source locations for compile errors.
2021-08-04 21:11:31 -07:00
pub const CompileError = error{
    OutOfMemory,
    /// When this is returned, the compile error for the failure has already been recorded.
    AnalysisFail,
    /// Returned when a compile error needed to be reported but a provided LazySrcLoc was set
    /// to the `unneeded` tag. The source location was, in fact, needed. It is expected that
    /// somewhere up the call stack, the operation will be retried after doing expensive work
    /// to compute a source location.
    NeededSourceLocation,
    /// A Type or Value was needed to be used during semantic analysis, but it was not available
    /// because the function is generic. This is only seen when analyzing the body of a param
    /// instruction.
    GenericPoison,
2021-09-14 21:58:22 -07:00
    /// In a comptime scope, a return instruction was encountered. This error is only seen when
    /// doing a comptime function call.
    ComptimeReturn,
2021-08-04 21:11:31 -07:00
};
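The retry described in the `NeededSourceLocation` doc comment can be sketched as below. This is only an illustration under stated assumptions: `LazySrc`, `checkNonNegative`, `computeSrc`, and `analyze` are hypothetical names made up for the example and are not the compiler's real API.

```zig
const std = @import("std");

// Hypothetical, simplified stand-in for a lazily computed source location.
const LazySrc = union(enum) {
    unneeded,
    byte_offset: u32,
};

const CheckError = error{ AnalysisFail, NeededSourceLocation };

// Hypothetical check that fails when `value` is negative. Reporting the
// failure requires a real source location.
fn checkNonNegative(value: i32, src: LazySrc) CheckError!void {
    if (value >= 0) return;
    switch (src) {
        // The caller gambled that no location would be needed and lost.
        .unneeded => return error.NeededSourceLocation,
        // A real compiler would record the compile error here.
        .byte_offset => return error.AnalysisFail,
    }
}

// Stand-in for the expensive work of computing a source location.
fn computeSrc() LazySrc {
    return LazySrc{ .byte_offset = 42 };
}

// Try the cheap path first; pay for the location and retry only when the
// callee reports that it was actually needed.
fn analyze(value: i32) CheckError!void {
    checkNonNegative(value, .unneeded) catch |err| switch (err) {
        error.NeededSourceLocation => return checkNonNegative(value, computeSrc()),
        else => |e| return e,
    };
}
```

The point of the pattern in this sketch is that the common, error-free path never computes a source location at all.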
2020-09-13 19:17:58 -07:00
2021-03-15 23:38:38 -07:00
pub fn deinit(mod: *Module) void {
    const gpa = mod.gpa;
2020-09-13 19:17:58 -07:00
2021-06-03 15:39:26 -05:00
    for (mod.import_table.keys()) |key| {
        gpa.free(key);
    }
    for (mod.import_table.values()) |value| {
        value.destroy(mod);
stage2: fix deletion of Decls that get re-referenced
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.
However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.
The solution, when dealing with Decl objects that are no longer referenced
during an update, is to clear them to the state they would be in
on a fresh scanDecls, rather than completely deleting them.
2021-05-18 12:35:36 -07:00
    }
    mod.import_table.deinit(gpa);
    mod.deletion_set.deinit(gpa);
2021-07-23 22:23:03 -07:00
    // The callsite of `Compilation.create` owns the `main_pkg`, however
2021-04-08 20:52:02 -07:00
    // Module owns the builtin and std packages that it adds.
2021-07-23 22:23:03 -07:00
    if (mod.main_pkg.table.fetchRemove("builtin")) |kv| {
2021-06-03 15:39:26 -05:00
        gpa.free(kv.key);
        kv.value.destroy(gpa);
2021-04-08 20:52:02 -07:00
    }
2021-07-23 22:23:03 -07:00
    if (mod.main_pkg.table.fetchRemove("std")) |kv| {
2021-06-03 15:39:26 -05:00
        gpa.free(kv.key);
        kv.value.destroy(gpa);
2021-04-08 20:52:02 -07:00
    }
2021-07-23 22:23:03 -07:00
    if (mod.main_pkg.table.fetchRemove("root")) |kv| {
2021-06-03 15:39:26 -05:00
        gpa.free(kv.key);
2021-04-08 20:52:02 -07:00
    }
2021-07-23 22:23:03 -07:00
    if (mod.root_pkg != mod.main_pkg) {
        mod.root_pkg.destroy(gpa);
    }
2021-04-08 20:52:02 -07:00
2021-03-15 23:38:38 -07:00
    mod.compile_log_text.deinit(gpa);
2021-01-16 22:51:01 -07:00
2021-03-15 23:38:38 -07:00
    mod.zig_cache_artifact_directory.handle.close();
2021-04-25 10:43:07 -07:00
    mod.local_zir_cache.handle.close();
    mod.global_zir_cache.handle.close();
2020-09-13 19:17:58 -07:00
2021-06-03 15:39:26 -05:00
    for (mod.failed_decls.values()) |value| {
        value.destroy(gpa);
2020-09-13 19:17:58 -07:00
    }
2021-03-15 23:38:38 -07:00
    mod.failed_decls.deinit(gpa);
2020-09-13 19:17:58 -07:00
2021-04-26 20:41:07 -07:00
    if (mod.emit_h) |emit_h| {
2021-06-03 15:39:26 -05:00
        for (emit_h.failed_decls.values()) |value| {
            value.destroy(gpa);
2021-04-26 20:41:07 -07:00
        }
        emit_h.failed_decls.deinit(gpa);
        emit_h.decl_table.deinit(gpa);
        gpa.destroy(emit_h);
2021-01-05 17:33:31 -07:00
    }
2021-06-03 15:39:26 -05:00
    for (mod.failed_files.values()) |value| {
        if (value) |msg| msg.destroy(gpa);
2020-09-13 19:17:58 -07:00
    }
2021-03-15 23:38:38 -07:00
    mod.failed_files.deinit(gpa);
2020-09-13 19:17:58 -07:00
2021-06-03 15:39:26 -05:00
    for (mod.failed_exports.values()) |value| {
        value.destroy(gpa);
2020-09-13 19:17:58 -07:00
    }
2021-03-15 23:38:38 -07:00
    mod.failed_exports.deinit(gpa);
2020-09-13 19:17:58 -07:00
|
|
|
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.
Now there are more types involved into source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have less
than 2 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
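The per-instruction layout described in this message can be pictured with a minimal sketch. It assumes a recent Zig (0.11 or newer); `Tag`, `Data`, the field names, and `srcNode` are hypothetical stand-ins invented for illustration, not the actual definitions in zir.zig. The point is only that the tag selects how the 8 bytes are read, and that only some forms spend bytes on a source location:

```zig
const std = @import("std");

/// Hypothetical, cut-down stand-in for `zir.Inst.Tag`.
const Tag = enum(u8) {
    /// Binary op; in this sketch it stores no source location of its own.
    add,
    /// String literal; refers to a range of `string_bytes`.
    str,
    /// Larger instruction; its operands would live in `extra`.
    call,
};

/// Hypothetical 8-byte payload. An `extern union` has no hidden safety tag,
/// so its size is exactly 8 bytes; the `Tag` says which field is active.
const Data = extern union {
    bin: extern struct { lhs: u32, rhs: u32 },
    str: extern struct { start: u32, len: u32 },
    /// Only this form pays for a source location (assumed here to be a
    /// node offset) next to its 4-byte index into `extra`.
    pl_node: extern struct { src_node: i32, payload_index: u32 },
};

comptime {
    std.debug.assert(@sizeOf(Data) == 8);
}

/// Returns the source node offset if this kind of instruction stores one.
fn srcNode(tag: Tag, data: Data) ?i32 {
    return switch (tag) {
        .add, .str => null,
        .call => data.pl_node.src_node,
    };
}
```

For example, `srcNode(.call, Data{ .pl_node = .{ .src_node = -3, .payload_index = 0 } })` yields `-3`, while any `.add` instruction yields `null`.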
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
|
|
|
mod.compile_log_decls.deinit(gpa);
|
2020-11-21 21:12:33 -05:00
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
for (mod.decl_exports.values()) |export_list| {
|
2020-09-13 19:17:58 -07:00
|
|
|
gpa.free(export_list);
|
|
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
mod.decl_exports.deinit(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
for (mod.export_owners.values()) |value| {
|
|
|
|
|
freeExportList(gpa, value);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
mod.export_owners.deinit(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-08-21 20:42:45 -07:00
|
|
|
{
|
|
|
|
|
var it = mod.global_error_set.keyIterator();
|
|
|
|
|
while (it.next()) |key| {
|
|
|
|
|
gpa.free(key.*);
|
|
|
|
|
}
|
|
|
|
|
mod.global_error_set.deinit(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2020-09-09 18:21:50 +03:00
|
|
|
|
2021-03-26 17:54:41 -04:00
|
|
|
mod.error_name_list.deinit(gpa);
|
2021-07-27 15:44:21 -07:00
|
|
|
mod.test_functions.deinit(gpa);
|
2021-08-05 16:37:21 -07:00
|
|
|
mod.monomorphed_funcs.deinit(gpa);
|
2021-08-21 20:42:45 -07:00
|
|
|
|
|
|
|
|
{
|
|
|
|
|
var it = mod.memoized_calls.iterator();
|
|
|
|
|
while (it.next()) |entry| {
|
|
|
|
|
gpa.free(entry.key_ptr.args);
|
|
|
|
|
entry.value_ptr.arena.promote(gpa).deinit();
|
|
|
|
|
}
|
|
|
|
|
mod.memoized_calls.deinit(gpa);
|
|
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
fn freeExportList(gpa: *Allocator, export_list: []*Export) void {
|
|
|
|
|
for (export_list) |exp| {
|
|
|
|
|
gpa.free(exp.options.name);
|
|
|
|
|
gpa.destroy(exp);
|
|
|
|
|
}
|
|
|
|
|
gpa.free(export_list);
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-25 00:02:58 -07:00
|
|
|
const data_has_safety_tag = @sizeOf(Zir.Inst.Data) != 8;
|
|
|
|
|
// TODO This is taking advantage of matching stage1 debug union layout.
|
|
|
|
|
// We need a better language feature for initializing a union with
|
|
|
|
|
// a runtime known tag.
|
|
|
|
|
const Stage1DataLayout = extern struct {
|
|
|
|
|
data: [8]u8 align(8),
|
2021-04-26 20:41:07 -07:00
|
|
|
safety_tag: u8,
|
2021-04-25 00:02:58 -07:00
|
|
|
};
|
|
|
|
|
comptime {
|
|
|
|
|
if (data_has_safety_tag) {
|
|
|
|
|
assert(@sizeOf(Stage1DataLayout) == @sizeOf(Zir.Inst.Data));
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-06-21 13:25:48 -07:00
|
|
|
pub fn astGenFile(mod: *Module, file: *Scope.File) !void {
|
2021-04-15 20:34:21 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-04-14 11:26:53 -07:00
|
|
|
const comp = mod.comp;
|
|
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
|
|
|
|
|
// In any case we need to examine the stat of the file to determine the course of action.
|
2021-04-25 00:02:58 -07:00
|
|
|
var source_file = try file.pkg.root_src_directory.handle.openFile(file.sub_file_path, .{});
|
|
|
|
|
defer source_file.close();
|
2021-04-14 11:26:53 -07:00
|
|
|
|
2021-04-25 00:02:58 -07:00
|
|
|
const stat = try source_file.stat();
|
|
|
|
|
|
2021-07-23 22:23:03 -07:00
|
|
|
const want_local_cache = file.pkg == mod.main_pkg;
|
2021-04-25 00:02:58 -07:00
|
|
|
const digest = hash: {
|
|
|
|
|
var path_hash: Cache.HashHelper = .{};
|
2021-07-03 11:47:58 -07:00
|
|
|
path_hash.addBytes(build_options.version);
|
2021-04-25 00:02:58 -07:00
|
|
|
if (!want_local_cache) {
|
|
|
|
|
path_hash.addOptionalBytes(file.pkg.root_src_directory.path);
|
|
|
|
|
}
|
|
|
|
|
path_hash.addBytes(file.sub_file_path);
|
|
|
|
|
break :hash path_hash.final();
|
|
|
|
|
};
|
2021-04-25 10:43:07 -07:00
|
|
|
const cache_directory = if (want_local_cache) mod.local_zir_cache else mod.global_zir_cache;
|
|
|
|
|
const zir_dir = cache_directory.handle;
|
2021-04-25 00:02:58 -07:00
|
|
|
|
|
|
|
|
var cache_file: ?std.fs.File = null;
|
|
|
|
|
defer if (cache_file) |f| f.close();
|
|
|
|
|
|
2021-04-14 11:26:53 -07:00
|
|
|
// Determine whether we need to reload the file from disk and redo parsing and AstGen.
|
|
|
|
|
switch (file.status) {
|
2021-04-25 00:02:58 -07:00
|
|
|
.never_loaded, .retryable_failure => cached: {
|
|
|
|
|
// First, load the cached ZIR code, if any.
|
|
|
|
|
log.debug("AstGen checking cache: {s} (local={}, digest={s})", .{
|
|
|
|
|
file.sub_file_path, want_local_cache, &digest,
|
|
|
|
|
});
|
|
|
|
|
|
|
|
|
|
// We ask for a lock in order to coordinate with other zig processes.
|
|
|
|
|
// If another process is already working on this file, we will get the cached
|
|
|
|
|
// version. Likewise if we're working on AstGen and another process asks for
|
|
|
|
|
// the cached file, they'll get it.
|
|
|
|
|
cache_file = zir_dir.openFile(&digest, .{ .lock = .Shared }) catch |err| switch (err) {
|
|
|
|
|
error.PathAlreadyExists => unreachable, // opening for reading
|
|
|
|
|
error.NoSpaceLeft => unreachable, // opening for reading
|
|
|
|
|
error.NotDir => unreachable, // no dir components
|
|
|
|
|
error.InvalidUtf8 => unreachable, // it's a hex encoded name
|
|
|
|
|
error.BadPathName => unreachable, // it's a hex encoded name
|
|
|
|
|
error.NameTooLong => unreachable, // it's a fixed size name
|
|
|
|
|
error.PipeBusy => unreachable, // it's not a pipe
|
|
|
|
|
error.WouldBlock => unreachable, // not asking for non-blocking I/O
|
|
|
|
|
|
|
|
|
|
error.SymLinkLoop,
|
|
|
|
|
error.FileNotFound,
|
|
|
|
|
error.Unexpected,
|
|
|
|
|
=> break :cached,
|
|
|
|
|
|
|
|
|
|
else => |e| return e, // Retryable errors are handled at callsite.
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
// First we read the header to determine the lengths of arrays.
|
|
|
|
|
const header = cache_file.?.reader().readStruct(Zir.Header) catch |err| switch (err) {
|
|
|
|
|
// This can happen if Zig bails out of this function between creating
|
|
|
|
|
// the cached file and writing it.
|
|
|
|
|
error.EndOfStream => break :cached,
|
|
|
|
|
else => |e| return e,
|
|
|
|
|
};
|
|
|
|
|
const unchanged_metadata =
|
|
|
|
|
stat.size == header.stat_size and
|
|
|
|
|
stat.mtime == header.stat_mtime and
|
|
|
|
|
stat.inode == header.stat_inode;
|
|
|
|
|
|
|
|
|
|
if (!unchanged_metadata) {
|
|
|
|
|
log.debug("AstGen cache stale: {s}", .{file.sub_file_path});
|
|
|
|
|
break :cached;
|
|
|
|
|
}
|
2021-04-26 20:41:07 -07:00
|
|
|
log.debug("AstGen cache hit: {s} instructions_len={d}", .{
|
|
|
|
|
file.sub_file_path, header.instructions_len,
|
|
|
|
|
});
|
2021-04-25 00:02:58 -07:00
|
|
|
|
|
|
|
|
var instructions: std.MultiArrayList(Zir.Inst) = .{};
|
|
|
|
|
defer instructions.deinit(gpa);
|
|
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
try instructions.setCapacity(gpa, header.instructions_len);
|
|
|
|
|
instructions.len = header.instructions_len;
|
2021-04-25 00:02:58 -07:00
|
|
|
|
|
|
|
|
var zir: Zir = .{
|
|
|
|
|
.instructions = instructions.toOwnedSlice(),
|
|
|
|
|
.string_bytes = &.{},
|
|
|
|
|
.extra = &.{},
|
|
|
|
|
};
|
|
|
|
|
var keep_zir = false;
|
|
|
|
|
defer if (!keep_zir) zir.deinit(gpa);
|
|
|
|
|
|
|
|
|
|
zir.string_bytes = try gpa.alloc(u8, header.string_bytes_len);
|
|
|
|
|
zir.extra = try gpa.alloc(u32, header.extra_len);
|
|
|
|
|
|
|
|
|
|
const safety_buffer = if (data_has_safety_tag)
|
|
|
|
|
try gpa.alloc([8]u8, header.instructions_len)
|
|
|
|
|
else
|
|
|
|
|
undefined;
|
|
|
|
|
defer if (data_has_safety_tag) gpa.free(safety_buffer);
|
|
|
|
|
|
|
|
|
|
const data_ptr = if (data_has_safety_tag)
|
|
|
|
|
@ptrCast([*]u8, safety_buffer.ptr)
|
|
|
|
|
else
|
|
|
|
|
@ptrCast([*]u8, zir.instructions.items(.data).ptr);
|
|
|
|
|
|
|
|
|
|
var iovecs = [_]std.os.iovec{
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = @ptrCast([*]u8, zir.instructions.items(.tag).ptr),
|
|
|
|
|
.iov_len = header.instructions_len,
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = data_ptr,
|
|
|
|
|
.iov_len = header.instructions_len * 8,
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = zir.string_bytes.ptr,
|
|
|
|
|
.iov_len = header.string_bytes_len,
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = @ptrCast([*]u8, zir.extra.ptr),
|
|
|
|
|
.iov_len = header.extra_len * 4,
|
|
|
|
|
},
|
|
|
|
|
};
|
|
|
|
|
const amt_read = try cache_file.?.readvAll(&iovecs);
|
|
|
|
|
const amt_expected = zir.instructions.len * 9 +
|
|
|
|
|
zir.string_bytes.len +
|
|
|
|
|
zir.extra.len * 4;
|
|
|
|
|
if (amt_read != amt_expected) {
|
|
|
|
|
log.warn("unexpected EOF reading cached ZIR for {s}", .{file.sub_file_path});
|
|
|
|
|
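// No explicit `zir.deinit(gpa)` here: `keep_zir` is still false, so the deferred deinit above frees `zir` when we break.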
|
|
|
|
|
break :cached;
|
|
|
|
|
}
|
|
|
|
|
if (data_has_safety_tag) {
|
|
|
|
|
const tags = zir.instructions.items(.tag);
|
|
|
|
|
for (zir.instructions.items(.data)) |*data, i| {
|
|
|
|
|
const union_tag = Zir.Inst.Tag.data_tags[@enumToInt(tags[i])];
|
|
|
|
|
const as_struct = @ptrCast(*Stage1DataLayout, data);
|
|
|
|
|
as_struct.* = .{
|
|
|
|
|
.safety_tag = @enumToInt(union_tag),
|
|
|
|
|
.data = safety_buffer[i],
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
keep_zir = true;
|
|
|
|
|
file.zir = zir;
|
|
|
|
|
file.zir_loaded = true;
|
|
|
|
|
file.stat_size = header.stat_size;
|
|
|
|
|
file.stat_inode = header.stat_inode;
|
|
|
|
|
file.stat_mtime = header.stat_mtime;
|
2021-04-28 22:43:26 -07:00
|
|
|
file.status = .success_zir;
|
2021-04-25 00:02:58 -07:00
|
|
|
log.debug("AstGen cached success: {s}", .{file.sub_file_path});
|
|
|
|
|
|
|
|
|
|
// TODO don't report compile errors until Sema @importFile
|
|
|
|
|
if (file.zir.hasCompileErrors()) {
|
|
|
|
|
{
|
|
|
|
|
const lock = comp.mutex.acquire();
|
|
|
|
|
defer lock.release();
|
|
|
|
|
try mod.failed_files.putNoClobber(gpa, file, null);
|
|
|
|
|
}
|
|
|
|
|
file.status = .astgen_failure;
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
}
|
|
|
|
|
return;
|
2021-04-15 20:34:21 -07:00
|
|
|
},
|
2021-05-11 17:34:13 -07:00
|
|
|
.parse_failure, .astgen_failure, .success_zir => {
|
2021-04-14 11:26:53 -07:00
|
|
|
const unchanged_metadata =
|
|
|
|
|
stat.size == file.stat_size and
|
|
|
|
|
stat.mtime == file.stat_mtime and
|
|
|
|
|
stat.inode == file.stat_inode;
|
|
|
|
|
|
|
|
|
|
if (unchanged_metadata) {
|
|
|
|
|
log.debug("unmodified metadata of file: {s}", .{file.sub_file_path});
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
log.debug("metadata changed: {s}", .{file.sub_file_path});
|
|
|
|
|
},
|
|
|
|
|
}
|
2021-04-25 00:02:58 -07:00
|
|
|
if (cache_file) |f| {
|
|
|
|
|
f.close();
|
|
|
|
|
cache_file = null;
|
|
|
|
|
}
|
|
|
|
|
cache_file = zir_dir.createFile(&digest, .{ .lock = .Exclusive }) catch |err| switch (err) {
|
|
|
|
|
error.NotDir => unreachable, // no dir components
|
|
|
|
|
error.InvalidUtf8 => unreachable, // it's a hex encoded name
|
|
|
|
|
error.BadPathName => unreachable, // it's a hex encoded name
|
|
|
|
|
error.NameTooLong => unreachable, // it's a fixed size name
|
|
|
|
|
error.PipeBusy => unreachable, // it's not a pipe
|
|
|
|
|
error.WouldBlock => unreachable, // not asking for non-blocking I/O
|
|
|
|
|
error.FileNotFound => unreachable, // no dir components
|
|
|
|
|
|
|
|
|
|
else => |e| {
|
|
|
|
|
const pkg_path = file.pkg.root_src_directory.path orelse ".";
|
|
|
|
|
const cache_path = cache_directory.path orelse ".";
|
2021-04-25 10:43:07 -07:00
|
|
|
log.warn("unable to save cached ZIR code for {s}/{s} to {s}/{s}: {s}", .{
|
2021-04-25 00:02:58 -07:00
|
|
|
pkg_path, file.sub_file_path, cache_path, &digest, @errorName(e),
|
|
|
|
|
});
|
|
|
|
|
return;
|
|
|
|
|
},
|
|
|
|
|
};
|
|
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
mod.lockAndClearFileCompileError(file);
|
|
|
|
|
|
2021-05-12 22:02:44 -07:00
|
|
|
// If the previous ZIR does not have compile errors, keep it around
|
|
|
|
|
// in case parsing or new ZIR fails. In case of successful ZIR update
|
|
|
|
|
// at the end of this function we will free it.
|
|
|
|
|
// We keep the previous ZIR loaded so that we can use it
|
|
|
|
|
// for the update next time it does not have any compile errors. This avoids
|
|
|
|
|
// needlessly tossing out semantic analysis work when an error is
|
|
|
|
|
// temporarily introduced.
|
|
|
|
|
if (file.zir_loaded and !file.zir.hasCompileErrors()) {
|
|
|
|
|
assert(file.prev_zir == null);
|
|
|
|
|
const prev_zir_ptr = try gpa.create(Zir);
|
|
|
|
|
file.prev_zir = prev_zir_ptr;
|
|
|
|
|
prev_zir_ptr.* = file.zir;
|
|
|
|
|
file.zir = undefined;
|
|
|
|
|
file.zir_loaded = false;
|
|
|
|
|
}
|
2021-04-14 11:26:53 -07:00
|
|
|
file.unload(gpa);
|
|
|
|
|
|
|
|
|
|
if (stat.size > std.math.maxInt(u32))
|
|
|
|
|
return error.FileTooBig;
|
|
|
|
|
|
2021-05-25 10:34:02 +08:00
|
|
|
const source = try gpa.allocSentinel(u8, @intCast(usize, stat.size), 0);
|
2021-04-14 11:26:53 -07:00
|
|
|
defer if (!file.source_loaded) gpa.free(source);
|
2021-04-25 00:02:58 -07:00
|
|
|
const amt = try source_file.readAll(source);
|
2021-04-14 11:26:53 -07:00
|
|
|
if (amt != stat.size)
|
|
|
|
|
return error.UnexpectedEndOfFile;
|
|
|
|
|
|
|
|
|
|
file.stat_size = stat.size;
|
|
|
|
|
file.stat_inode = stat.inode;
|
|
|
|
|
file.stat_mtime = stat.mtime;
|
|
|
|
|
file.source = source;
|
|
|
|
|
file.source_loaded = true;
|
|
|
|
|
|
|
|
|
|
file.tree = try std.zig.parse(gpa, source);
|
|
|
|
|
defer if (!file.tree_loaded) file.tree.deinit(gpa);
|
|
|
|
|
|
|
|
|
|
if (file.tree.errors.len != 0) {
|
|
|
|
|
const parse_err = file.tree.errors[0];
|
|
|
|
|
|
|
|
|
|
var msg = std.ArrayList(u8).init(gpa);
|
|
|
|
|
defer msg.deinit();
|
|
|
|
|
|
|
|
|
|
const token_starts = file.tree.tokens.items(.start);
|
2021-07-01 00:14:58 -07:00
|
|
|
const token_tags = file.tree.tokens.items(.tag);
|
2021-04-14 11:26:53 -07:00
|
|
|
|
|
|
|
|
try file.tree.renderError(parse_err, msg.writer());
|
|
|
|
|
const err_msg = try gpa.create(ErrorMsg);
|
|
|
|
|
err_msg.* = .{
|
|
|
|
|
.src_loc = .{
|
2021-04-16 14:44:02 -07:00
|
|
|
.file_scope = file,
|
|
|
|
|
.parent_decl_node = 0,
|
2021-04-14 11:26:53 -07:00
|
|
|
.lazy = .{ .byte_abs = token_starts[parse_err.token] },
|
|
|
|
|
},
|
|
|
|
|
.msg = msg.toOwnedSlice(),
|
|
|
|
|
};
|
2021-07-01 00:14:58 -07:00
|
|
|
if (token_tags[parse_err.token] == .invalid) {
|
|
|
|
|
const bad_off = @intCast(u32, file.tree.tokenSlice(parse_err.token).len);
|
2021-07-02 12:33:05 -07:00
|
|
|
const byte_abs = token_starts[parse_err.token] + bad_off;
|
2021-07-01 00:14:58 -07:00
|
|
|
try mod.errNoteNonLazy(.{
|
|
|
|
|
.file_scope = file,
|
|
|
|
|
.parent_decl_node = 0,
|
2021-07-02 12:33:05 -07:00
|
|
|
.lazy = .{ .byte_abs = byte_abs },
|
2021-07-02 15:27:00 -07:00
|
|
|
}, err_msg, "invalid byte: '{'}'", .{std.zig.fmtEscapes(source[byte_abs..][0..1])});
|
2021-07-01 00:14:58 -07:00
|
|
|
}
|
2021-04-14 11:26:53 -07:00
|
|
|
|
|
|
|
|
{
|
|
|
|
|
const lock = comp.mutex.acquire();
|
|
|
|
|
defer lock.release();
|
|
|
|
|
try mod.failed_files.putNoClobber(gpa, file, err_msg);
|
|
|
|
|
}
|
|
|
|
|
file.status = .parse_failure;
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
}
|
|
|
|
|
file.tree_loaded = true;
|
|
|
|
|
|
2021-05-02 17:08:19 -07:00
|
|
|
file.zir = try AstGen.generate(gpa, file.tree);
|
2021-04-14 11:26:53 -07:00
|
|
|
file.zir_loaded = true;
|
2021-04-28 22:43:26 -07:00
|
|
|
file.status = .success_zir;
|
2021-04-25 00:02:58 -07:00
|
|
|
log.debug("AstGen fresh success: {s}", .{file.sub_file_path});
|
|
|
|
|
|
|
|
|
|
const safety_buffer = if (data_has_safety_tag)
|
|
|
|
|
try gpa.alloc([8]u8, file.zir.instructions.len)
|
|
|
|
|
else
|
|
|
|
|
undefined;
|
|
|
|
|
defer if (data_has_safety_tag) gpa.free(safety_buffer);
|
|
|
|
|
const data_ptr = if (data_has_safety_tag)
|
2021-08-28 15:15:46 -07:00
|
|
|
if (file.zir.instructions.len == 0)
|
|
|
|
|
@as([*]const u8, undefined)
|
|
|
|
|
else
|
|
|
|
|
@ptrCast([*]const u8, safety_buffer.ptr)
|
2021-04-25 00:02:58 -07:00
|
|
|
else
|
|
|
|
|
@ptrCast([*]const u8, file.zir.instructions.items(.data).ptr);
|
|
|
|
|
if (data_has_safety_tag) {
|
|
|
|
|
// The `Data` union has a safety tag but in the file format we store it without.
|
|
|
|
|
for (file.zir.instructions.items(.data)) |*data, i| {
|
|
|
|
|
const as_struct = @ptrCast(*const Stage1DataLayout, data);
|
|
|
|
|
safety_buffer[i] = as_struct.data;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const header: Zir.Header = .{
|
|
|
|
|
.instructions_len = @intCast(u32, file.zir.instructions.len),
|
|
|
|
|
.string_bytes_len = @intCast(u32, file.zir.string_bytes.len),
|
|
|
|
|
.extra_len = @intCast(u32, file.zir.extra.len),
|
|
|
|
|
|
|
|
|
|
.stat_size = stat.size,
|
|
|
|
|
.stat_inode = stat.inode,
|
|
|
|
|
.stat_mtime = stat.mtime,
|
|
|
|
|
};
|
|
|
|
|
var iovecs = [_]std.os.iovec_const{
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = @ptrCast([*]const u8, &header),
|
|
|
|
|
.iov_len = @sizeOf(Zir.Header),
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = @ptrCast([*]const u8, file.zir.instructions.items(.tag).ptr),
|
|
|
|
|
.iov_len = file.zir.instructions.len,
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = data_ptr,
|
|
|
|
|
.iov_len = file.zir.instructions.len * 8,
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = file.zir.string_bytes.ptr,
|
|
|
|
|
.iov_len = file.zir.string_bytes.len,
|
|
|
|
|
},
|
|
|
|
|
.{
|
|
|
|
|
.iov_base = @ptrCast([*]const u8, file.zir.extra.ptr),
|
|
|
|
|
.iov_len = file.zir.extra.len * 4,
|
|
|
|
|
},
|
|
|
|
|
};
|
|
|
|
|
cache_file.?.writevAll(&iovecs) catch |err| {
|
|
|
|
|
const pkg_path = file.pkg.root_src_directory.path orelse ".";
|
|
|
|
|
const cache_path = cache_directory.path orelse ".";
|
2021-04-25 10:43:07 -07:00
|
|
|
log.warn("unable to write cached ZIR code for {s}/{s} to {s}/{s}: {s}", .{
|
2021-04-25 00:02:58 -07:00
|
|
|
pkg_path, file.sub_file_path, cache_path, &digest, @errorName(err),
|
|
|
|
|
});
|
|
|
|
|
};
|
2021-04-14 11:26:53 -07:00
|
|
|
|
2021-05-12 22:02:44 -07:00
|
|
|
if (file.zir.hasCompileErrors()) {
|
|
|
|
|
{
|
|
|
|
|
const lock = comp.mutex.acquire();
|
|
|
|
|
defer lock.release();
|
|
|
|
|
try mod.failed_files.putNoClobber(gpa, file, null);
|
|
|
|
|
}
|
|
|
|
|
file.status = .astgen_failure;
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if (file.prev_zir) |prev_zir| {
|
2021-04-28 16:55:22 -07:00
|
|
|
// Iterate over all Namespace objects contained within this File, looking at the
|
|
|
|
|
// previous and new ZIR together and update the references to point
|
|
|
|
|
// to the new one. For example, Decl name, Decl zir_decl_index, and Namespace
|
|
|
|
|
// decl_table keys need to get updated to point to the new memory, even if the
|
|
|
|
|
// underlying source code is unchanged.
|
|
|
|
|
// We do not need to hold any locks at this time because all the Decl and Namespace
|
|
|
|
|
// objects being touched are specific to this File, and the only other concurrent
|
|
|
|
|
// tasks are touching other File objects.
|
2021-05-12 22:02:44 -07:00
|
|
|
try updateZirRefs(gpa, file, prev_zir.*);
|
2021-05-06 17:20:45 -07:00
|
|
|
// At this point, `file.outdated_decls` and `file.deleted_decls` are populated,
|
|
|
|
|
// and semantic analysis will deal with them properly.
|
2021-05-12 22:02:44 -07:00
|
|
|
// No need to keep previous ZIR.
|
|
|
|
|
prev_zir.deinit(gpa);
|
|
|
|
|
gpa.destroy(prev_zir);
|
|
|
|
|
file.prev_zir = null;
|
|
|
|
|
} else if (file.root_decl) |root_decl| {
|
|
|
|
|
// This is an update, but it is the first time the File has succeeded
|
|
|
|
|
// ZIR. We must mark it outdated since we have already tried to
|
|
|
|
|
// semantically analyze it.
|
|
|
|
|
try file.outdated_decls.resize(gpa, 1);
|
|
|
|
|
file.outdated_decls.items[0] = root_decl;
|
2021-04-14 11:26:53 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-05 13:16:14 -07:00
|
|
|
/// Patch ups:
|
|
|
|
|
/// * Struct.zir_index
|
2021-05-07 14:18:14 -07:00
|
|
|
/// * Decl.zir_index
|
2021-05-05 16:56:24 -07:00
|
|
|
/// * Fn.zir_body_inst
|
2021-05-05 13:16:14 -07:00
|
|
|
/// * Decl.zir_decl_index
|
2021-05-06 17:20:45 -07:00
|
|
|
fn updateZirRefs(gpa: *Allocator, file: *Scope.File, old_zir: Zir) !void {
|
2021-05-05 13:16:14 -07:00
|
|
|
const new_zir = file.zir;
|
|
|
|
|
|
|
|
|
|
// Maps from old ZIR to new ZIR, struct_decl, enum_decl, etc. Any instruction which
|
|
|
|
|
// creates a namespace gets mapped from old to new here.
|
|
|
|
|
var inst_map: std.AutoHashMapUnmanaged(Zir.Inst.Index, Zir.Inst.Index) = .{};
|
|
|
|
|
defer inst_map.deinit(gpa);
|
|
|
|
|
// Maps from old ZIR to new ZIR, the extra data index for the sub-decl item.
|
|
|
|
|
// e.g. the thing that Decl.zir_decl_index points to.
|
|
|
|
|
var extra_map: std.AutoHashMapUnmanaged(u32, u32) = .{};
|
|
|
|
|
defer extra_map.deinit(gpa);
|
|
|
|
|
|
|
|
|
|
try mapOldZirToNew(gpa, old_zir, new_zir, &inst_map, &extra_map);
|
|
|
|
|
|
2021-05-06 17:20:45 -07:00
|
|
|
// Walk the Decl graph, updating ZIR indexes, strings, and populating
|
|
|
|
|
// the deleted and outdated lists.
|
2021-05-05 13:16:14 -07:00
|
|
|
|
|
|
|
|
var decl_stack: std.ArrayListUnmanaged(*Decl) = .{};
|
|
|
|
|
defer decl_stack.deinit(gpa);
|
|
|
|
|
|
2021-05-11 23:20:22 -07:00
|
|
|
const root_decl = file.root_decl.?;
|
2021-05-05 13:16:14 -07:00
|
|
|
try decl_stack.append(gpa, root_decl);
|
|
|
|
|
|
2021-05-06 17:20:45 -07:00
|
|
|
file.deleted_decls.clearRetainingCapacity();
|
|
|
|
|
file.outdated_decls.clearRetainingCapacity();
|
|
|
|
|
|
|
|
|
|
// The root decl is always outdated; otherwise we would not have had
|
|
|
|
|
// to re-generate ZIR for the File.
|
|
|
|
|
try file.outdated_decls.append(gpa, root_decl);
|
2021-05-05 13:16:14 -07:00
|
|
|
|
|
|
|
|
while (decl_stack.popOrNull()) |decl| {
|
|
|
|
|
// Anonymous decls and the root decl have this set to 0. We still need
|
|
|
|
|
// to walk them but we do not need to modify this value.
|
2021-05-06 17:20:45 -07:00
|
|
|
// Anonymous decls should not be marked outdated. They will be re-generated
|
|
|
|
|
// if their owner decl is marked outdated.
|
2021-05-05 13:16:14 -07:00
|
|
|
if (decl.zir_decl_index != 0) {
|
2021-05-12 22:02:44 -07:00
|
|
|
const old_zir_decl_index = decl.zir_decl_index;
|
|
|
|
|
const new_zir_decl_index = extra_map.get(old_zir_decl_index) orelse {
|
|
|
|
|
log.debug("updateZirRefs {s}: delete {*} ({s})", .{
|
|
|
|
|
file.sub_file_path, decl, decl.name,
|
|
|
|
|
});
|
2021-05-06 17:20:45 -07:00
|
|
|
try file.deleted_decls.append(gpa, decl);
|
2021-05-05 13:16:14 -07:00
|
|
|
continue;
|
|
|
|
|
};
|
2021-05-12 22:02:44 -07:00
|
|
|
const old_hash = decl.contentsHashZir(old_zir);
|
|
|
|
|
decl.zir_decl_index = new_zir_decl_index;
|
2021-05-05 13:16:14 -07:00
|
|
|
const new_hash = decl.contentsHashZir(new_zir);
|
|
|
|
|
if (!std.zig.srcHashEql(old_hash, new_hash)) {
|
2021-05-12 22:02:44 -07:00
|
|
|
log.debug("updateZirRefs {s}: outdated {*} ({s}) {d} => {d}", .{
|
|
|
|
|
file.sub_file_path, decl, decl.name, old_zir_decl_index, new_zir_decl_index,
|
|
|
|
|
});
|
2021-05-06 17:20:45 -07:00
|
|
|
try file.outdated_decls.append(gpa, decl);
|
2021-05-12 22:02:44 -07:00
|
|
|
} else {
|
|
|
|
|
log.debug("updateZirRefs {s}: unchanged {*} ({s}) {d} => {d}", .{
|
|
|
|
|
file.sub_file_path, decl, decl.name, old_zir_decl_index, new_zir_decl_index,
|
|
|
|
|
});
|
2021-05-05 13:16:14 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-11 22:12:36 -07:00
|
|
|
if (!decl.owns_tv) continue;
|
2021-05-05 13:16:14 -07:00
|
|
|
|
|
|
|
|
if (decl.getStruct()) |struct_obj| {
|
|
|
|
|
struct_obj.zir_index = inst_map.get(struct_obj.zir_index) orelse {
|
2021-05-06 17:20:45 -07:00
|
|
|
try file.deleted_decls.append(gpa, decl);
|
2021-05-05 13:16:14 -07:00
|
|
|
continue;
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-07 14:18:14 -07:00
|
|
|
if (decl.getUnion()) |union_obj| {
|
|
|
|
|
union_obj.zir_index = inst_map.get(union_obj.zir_index) orelse {
|
|
|
|
|
try file.deleted_decls.append(gpa, decl);
|
|
|
|
|
continue;
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-05 16:56:24 -07:00
|
|
|
if (decl.getFunction()) |func| {
|
|
|
|
|
func.zir_body_inst = inst_map.get(func.zir_body_inst) orelse {
|
2021-05-06 17:20:45 -07:00
|
|
|
try file.deleted_decls.append(gpa, decl);
|
2021-05-05 16:56:24 -07:00
|
|
|
continue;
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-07 18:52:11 -07:00
|
|
|
if (decl.getInnerNamespace()) |namespace| {
|
2021-06-03 15:39:26 -05:00
|
|
|
for (namespace.decls.values()) |sub_decl| {
|
2021-05-05 13:16:14 -07:00
|
|
|
try decl_stack.append(gpa, sub_decl);
|
|
|
|
|
}
|
2021-06-03 15:39:26 -05:00
|
|
|
for (namespace.anon_decls.keys()) |sub_decl| {
|
stage2: type declarations ZIR encode AnonNameStrategy
which can be either parent, func, or anon. Here's the enum reproduced in
the commit message for convenience:
```zig
pub const NameStrategy = enum(u2) {
/// Use the same name as the parent declaration name.
/// e.g. `const Foo = struct {...};`.
parent,
/// Use the name of the currently executing comptime function call,
/// with the current parameters. e.g. `ArrayList(i32)`.
func,
/// Create an anonymous name for this declaration.
/// Like this: "ParentDeclName_struct_69"
anon,
};
```
With this information in the ZIR, a future commit can improve the
names of structs, unions, enums, and opaques.
In order to accomplish this, the following ZIR instruction forms were
removed and replaced with Extended op codes:
* struct_decl
* struct_decl_packed
* struct_decl_extern
* union_decl
* union_decl_packed
* union_decl_extern
* enum_decl
* enum_decl_nonexhaustive
By being extended opcodes, one more u32 is needed, however we more than
make up for it by repurposing the 16 "small" bits to provide shorter
encodings for when decls_len == 0, fields_len == 0, a source node is not
provided, etc. There tends to be no downside, and in fact sometimes
upsides, to using an extended op code when there is a need for flag
bits, which is the case for all three of these. Likewise, the container
layout can be encoded in these bits rather than into the opcode.
The following 4 ZIR instructions were added, netting a total of 4 freed
up ZIR enum tags for future use:
* opaque_decl_anon
* opaque_decl_func
* error_set_decl_anon
* error_set_decl_func
This is so that opaques and error sets can have the same name hint as
structs, enums, and unions.
`std.builtin.ContainerLayout` gets an explicit integer tag type so that
it can be used inside packed structs.
This commit also makes `Module.Namespace` use a separate set for
anonymous decls, thus allowing anonymous decls to share the same
`Decl.name` as their owner `Decl` objects.
2021-05-10 21:34:43 -07:00
|
|
|
try decl_stack.append(gpa, sub_decl);
|
|
|
|
|
}
|
2021-05-05 13:16:14 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn mapOldZirToNew(
|
|
|
|
|
gpa: *Allocator,
|
|
|
|
|
old_zir: Zir,
|
|
|
|
|
new_zir: Zir,
|
|
|
|
|
inst_map: *std.AutoHashMapUnmanaged(Zir.Inst.Index, Zir.Inst.Index),
|
|
|
|
|
extra_map: *std.AutoHashMapUnmanaged(u32, u32),
|
|
|
|
|
) Allocator.Error!void {
|
|
|
|
|
// Contain ZIR indexes of declaration instructions.
|
|
|
|
|
const MatchedZirDecl = struct {
|
|
|
|
|
old_inst: Zir.Inst.Index,
|
|
|
|
|
new_inst: Zir.Inst.Index,
|
|
|
|
|
};
|
|
|
|
|
var match_stack: std.ArrayListUnmanaged(MatchedZirDecl) = .{};
|
|
|
|
|
defer match_stack.deinit(gpa);
|
|
|
|
|
|
2021-05-06 17:20:45 -07:00
|
|
|
const old_main_struct_inst = old_zir.getMainStruct();
|
|
|
|
|
const new_main_struct_inst = new_zir.getMainStruct();
|
2021-05-05 13:16:14 -07:00
|
|
|
|
|
|
|
|
try match_stack.append(gpa, .{
|
|
|
|
|
.old_inst = old_main_struct_inst,
|
|
|
|
|
.new_inst = new_main_struct_inst,
|
|
|
|
|
});
|
|
|
|
|
|
2021-05-05 16:56:24 -07:00
|
|
|
var old_decls = std.ArrayList(Zir.Inst.Index).init(gpa);
|
|
|
|
|
defer old_decls.deinit();
|
|
|
|
|
var new_decls = std.ArrayList(Zir.Inst.Index).init(gpa);
|
|
|
|
|
defer new_decls.deinit();
|
|
|
|
|
|
2021-05-05 13:16:14 -07:00
|
|
|
while (match_stack.popOrNull()) |match_item| {
|
|
|
|
|
try inst_map.put(gpa, match_item.old_inst, match_item.new_inst);
|
|
|
|
|
|
|
|
|
|
// Maps name to extra index of decl sub item.
|
|
|
|
|
var decl_map: std.StringHashMapUnmanaged(u32) = .{};
|
|
|
|
|
defer decl_map.deinit(gpa);
|
|
|
|
|
|
|
|
|
|
{
|
|
|
|
|
var old_decl_it = old_zir.declIterator(match_item.old_inst);
|
|
|
|
|
while (old_decl_it.next()) |old_decl| {
|
|
|
|
|
try decl_map.put(gpa, old_decl.name, old_decl.sub_index);
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
var new_decl_it = new_zir.declIterator(match_item.new_inst);
|
|
|
|
|
while (new_decl_it.next()) |new_decl| {
|
|
|
|
|
const old_extra_index = decl_map.get(new_decl.name) orelse continue;
|
|
|
|
|
const new_extra_index = new_decl.sub_index;
|
|
|
|
|
try extra_map.put(gpa, old_extra_index, new_extra_index);
|
|
|
|
|
|
2021-05-05 16:56:24 -07:00
|
|
|
try old_zir.findDecls(&old_decls, old_extra_index);
|
|
|
|
|
try new_zir.findDecls(&new_decls, new_extra_index);
|
|
|
|
|
var i: usize = 0;
|
|
|
|
|
while (true) : (i += 1) {
|
|
|
|
|
if (i >= old_decls.items.len) break;
|
|
|
|
|
if (i >= new_decls.items.len) break;
|
|
|
|
|
try match_stack.append(gpa, .{
|
|
|
|
|
.old_inst = old_decls.items[i],
|
|
|
|
|
.new_inst = new_decls.items[i],
|
|
|
|
|
});
|
|
|
|
|
}
|
2021-05-05 13:16:14 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-07-14 12:16:48 -07:00
|
|
|
pub fn ensureDeclAnalyzed(mod: *Module, decl: *Decl) SemaError!void {
|
2020-09-13 19:17:58 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
|
|
|
|
const subsequent_analysis = switch (decl.analysis) {
|
|
|
|
|
.in_progress => unreachable,
|
|
|
|
|
|
2021-05-11 17:34:13 -07:00
|
|
|
.file_failure,
|
2020-09-13 19:17:58 -07:00
|
|
|
.sema_failure,
|
|
|
|
|
.sema_failure_retryable,
|
|
|
|
|
.codegen_failure,
|
|
|
|
|
.dependency_failure,
|
|
|
|
|
.codegen_failure_retryable,
|
|
|
|
|
=> return error.AnalysisFail,
|
|
|
|
|
|
|
|
|
|
.complete => return,
|
|
|
|
|
|
|
|
|
|
.outdated => blk: {
|
2021-05-06 17:20:45 -07:00
|
|
|
log.debug("re-analyzing {*} ({s})", .{ decl, decl.name });
|
2020-09-13 19:17:58 -07:00
|
|
|
|
|
|
|
|
// The exports this Decl performs will be re-discovered, so we remove them here
|
|
|
|
|
// prior to re-analysis.
|
2021-01-16 22:51:01 -07:00
|
|
|
mod.deleteDeclExports(decl);
|
2020-09-13 19:17:58 -07:00
|
|
|
// Dependencies will be re-discovered, so we remove them here prior to re-analysis.
|
2021-06-03 15:39:26 -05:00
|
|
|
for (decl.dependencies.keys()) |dep| {
|
2020-09-13 19:17:58 -07:00
|
|
|
dep.removeDependant(decl);
|
2021-05-14 17:41:22 -07:00
|
|
|
if (dep.dependants.count() == 0 and !dep.deletion_flag) {
|
|
|
|
|
log.debug("insert {*} ({s}) dependant {*} ({s}) into deletion set", .{
|
|
|
|
|
decl, decl.name, dep, dep.name,
|
|
|
|
|
});
|
2020-09-13 19:17:58 -07:00
|
|
|
// We don't perform a deletion here, because this Decl or another one
|
|
|
|
|
// may end up referencing it before the update is complete.
|
|
|
|
|
dep.deletion_flag = true;
|
2021-04-07 19:38:00 -07:00
|
|
|
try mod.deletion_set.put(mod.gpa, dep, {});
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
decl.dependencies.clearRetainingCapacity();
|
|
|
|
|
|
|
|
|
|
break :blk true;
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
.unreferenced => false,
|
|
|
|
|
};
|
|
|
|
|
|
2021-04-14 11:26:53 -07:00
|
|
|
const type_changed = mod.semaDecl(decl) catch |err| switch (err) {
|
2021-05-14 17:41:22 -07:00
|
|
|
error.AnalysisFail => {
|
|
|
|
|
if (decl.analysis == .in_progress) {
|
|
|
|
|
// If this decl caused the compile error, the analysis field would
|
|
|
|
|
// be changed to indicate it was this Decl's fault. Because this
|
|
|
|
|
// did not happen, we infer here that it was a dependency failure.
|
|
|
|
|
decl.analysis = .dependency_failure;
|
|
|
|
|
}
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
},
|
stage2 generics improvements: anytype and param type exprs
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.
param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.
This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).
It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.
Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema
`comptime_args_fn_inst` to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.
Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.
Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.
Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).
Tasks left in this branch:
* Implement the memoization table.
* Add test coverage.
* Improve error reporting and source locations for compile errors.
2021-08-04 21:11:31 -07:00
|
|
|
error.NeededSourceLocation => unreachable,
|
|
|
|
|
error.GenericPoison => unreachable,
|
|
|
|
|
else => |e| {
|
2021-01-16 22:51:01 -07:00
|
|
|
decl.analysis = .sema_failure_retryable;
|
2021-05-14 17:41:22 -07:00
|
|
|
try mod.failed_decls.ensureUnusedCapacity(mod.gpa, 1);
|
2021-01-16 22:51:01 -07:00
|
|
|
mod.failed_decls.putAssumeCapacityNoClobber(decl, try ErrorMsg.create(
|
|
|
|
|
mod.gpa,
|
|
|
|
|
decl.srcLoc(),
|
|
|
|
|
"unable to analyze: {s}",
|
2021-08-04 21:11:31 -07:00
|
|
|
.{@errorName(e)},
|
2021-01-16 22:51:01 -07:00
|
|
|
));
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
},
|
|
|
|
|
};
|
2020-09-13 19:17:58 -07:00
|
|
|
|
|
|
|
|
if (subsequent_analysis) {
|
|
|
|
|
// We may need to chase the dependants and re-analyze them.
|
|
|
|
|
// However, if the decl is a function, and the type is the same, we do not need to.
|
2021-04-27 18:36:12 -07:00
|
|
|
if (type_changed or decl.ty.zigTypeTag() != .Fn) {
|
2021-06-03 15:39:26 -05:00
|
|
|
for (decl.dependants.keys()) |dep| {
|
2020-09-13 19:17:58 -07:00
|
|
|
switch (dep.analysis) {
|
|
|
|
|
.unreferenced => unreachable,
|
2021-05-14 17:41:22 -07:00
|
|
|
.in_progress => continue, // already doing analysis, ok
|
2020-09-13 19:17:58 -07:00
|
|
|
.outdated => continue, // already queued for update
|
|
|
|
|
|
2021-05-11 17:34:13 -07:00
|
|
|
.file_failure,
|
2020-09-13 19:17:58 -07:00
|
|
|
.dependency_failure,
|
|
|
|
|
.sema_failure,
|
|
|
|
|
.sema_failure_retryable,
|
|
|
|
|
.codegen_failure,
|
|
|
|
|
.codegen_failure_retryable,
|
|
|
|
|
.complete,
|
2021-01-16 22:51:01 -07:00
|
|
|
=> if (dep.generation != mod.generation) {
|
|
|
|
|
try mod.markOutdatedDecl(dep);
|
2020-09-13 19:17:58 -07:00
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
pub fn semaPkg(mod: *Module, pkg: *Package) !void {
|
2021-06-21 13:27:52 -07:00
|
|
|
const file = (try mod.importPkg(pkg)).file;
|
2021-04-26 20:41:07 -07:00
|
|
|
return mod.semaFile(file);
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-11 17:34:13 -07:00
|
|
|
/// Regardless of the file status, will create a `Decl` so that we
|
|
|
|
|
/// can track dependencies and re-analyze when the file becomes outdated.
|
2021-07-14 12:16:48 -07:00
|
|
|
pub fn semaFile(mod: *Module, file: *Scope.File) SemaError!void {
|
2021-04-26 20:41:07 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-05-11 23:20:22 -07:00
|
|
|
if (file.root_decl != null) return;
|
2021-04-26 20:41:07 -07:00
|
|
|
|
|
|
|
|
const gpa = mod.gpa;
|
2021-04-30 11:07:31 -07:00
|
|
|
var new_decl_arena = std.heap.ArenaAllocator.init(gpa);
|
|
|
|
|
errdefer new_decl_arena.deinit();
|
|
|
|
|
|
|
|
|
|
const struct_obj = try new_decl_arena.allocator.create(Module.Struct);
|
|
|
|
|
const struct_ty = try Type.Tag.@"struct".create(&new_decl_arena.allocator, struct_obj);
|
|
|
|
|
const struct_val = try Value.Tag.ty.create(&new_decl_arena.allocator, struct_ty);
|
|
|
|
|
struct_obj.* = .{
|
|
|
|
|
.owner_decl = undefined, // set below
|
|
|
|
|
.fields = .{},
|
|
|
|
|
.node_offset = 0, // it's the struct for the root file
|
2021-05-11 17:34:13 -07:00
|
|
|
.zir_index = undefined, // set below
|
2021-05-02 18:50:01 -07:00
|
|
|
.layout = .Auto,
|
|
|
|
|
.status = .none,
|
2021-07-23 22:23:03 -07:00
|
|
|
.known_has_bits = undefined,
|
2021-04-30 11:07:31 -07:00
|
|
|
.namespace = .{
|
|
|
|
|
.parent = null,
|
|
|
|
|
.ty = struct_ty,
|
|
|
|
|
.file_scope = file,
|
|
|
|
|
},
|
2021-04-26 20:41:07 -07:00
|
|
|
};
|
2021-04-30 11:07:31 -07:00
|
|
|
const new_decl = try mod.allocateNewDecl(&struct_obj.namespace, 0);
|
2021-05-11 23:20:22 -07:00
|
|
|
file.root_decl = new_decl;
|
2021-04-30 11:07:31 -07:00
|
|
|
struct_obj.owner_decl = new_decl;
|
2021-05-01 21:57:52 -07:00
|
|
|
new_decl.src_line = 0;
|
2021-04-30 11:07:31 -07:00
|
|
|
new_decl.name = try file.fullyQualifiedNameZ(gpa);
|
|
|
|
|
new_decl.is_pub = true;
|
|
|
|
|
new_decl.is_exported = false;
|
|
|
|
|
new_decl.has_align = false;
|
|
|
|
|
new_decl.has_linksection = false;
|
|
|
|
|
new_decl.ty = struct_ty;
|
|
|
|
|
new_decl.val = struct_val;
|
|
|
|
|
new_decl.has_tv = true;
|
2021-05-11 14:17:52 -07:00
|
|
|
new_decl.owns_tv = true;
|
stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must mark the corresponding Decl `alive`
field to true.
This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
|
|
|
new_decl.alive = true; // This Decl corresponds to a File and is therefore always alive.
|
2021-05-11 17:34:13 -07:00
|
|
|
new_decl.analysis = .in_progress;
|
2021-04-30 11:07:31 -07:00
|
|
|
new_decl.generation = mod.generation;
|
2021-04-27 18:36:12 -07:00
|
|
|
|
2021-05-11 17:34:13 -07:00
|
|
|
if (file.status == .success_zir) {
|
|
|
|
|
assert(file.zir_loaded);
|
|
|
|
|
const main_struct_inst = file.zir.getMainStruct();
|
|
|
|
|
struct_obj.zir_index = main_struct_inst;
|
|
|
|
|
|
|
|
|
|
var sema_arena = std.heap.ArenaAllocator.init(gpa);
|
|
|
|
|
defer sema_arena.deinit();
|
|
|
|
|
|
|
|
|
|
var sema: Sema = .{
|
|
|
|
|
.mod = mod,
|
|
|
|
|
.gpa = gpa,
|
|
|
|
|
.arena = &sema_arena.allocator,
|
|
|
|
|
.code = file.zir,
|
|
|
|
|
.owner_decl = new_decl,
|
|
|
|
|
.namespace = &struct_obj.namespace,
|
|
|
|
|
.func = null,
|
2021-08-06 16:24:39 -07:00
|
|
|
.fn_ret_ty = Type.initTag(.void),
|
2021-05-11 17:34:13 -07:00
|
|
|
.owner_func = null,
|
|
|
|
|
};
|
2021-05-17 17:39:52 -07:00
|
|
|
defer sema.deinit();
|
2021-05-11 17:34:13 -07:00
|
|
|
var block_scope: Scope.Block = .{
|
|
|
|
|
.parent = null,
|
|
|
|
|
.sema = &sema,
|
|
|
|
|
.src_decl = new_decl,
|
|
|
|
|
.instructions = .{},
|
|
|
|
|
.inlining = null,
|
|
|
|
|
.is_comptime = true,
|
|
|
|
|
};
|
|
|
|
|
defer block_scope.instructions.deinit(gpa);
|
2021-04-26 20:41:07 -07:00
|
|
|
|
2021-05-11 23:20:22 -07:00
|
|
|
if (sema.analyzeStructDecl(new_decl, main_struct_inst, struct_obj)) |_| {
|
|
|
|
|
new_decl.analysis = .complete;
|
|
|
|
|
} else |err| switch (err) {
|
|
|
|
|
error.OutOfMemory => return error.OutOfMemory,
|
|
|
|
|
error.AnalysisFail => {},
|
|
|
|
|
}
|
2021-05-11 17:34:13 -07:00
|
|
|
} else {
|
|
|
|
|
new_decl.analysis = .file_failure;
|
|
|
|
|
}
|
2021-04-30 11:07:31 -07:00
|
|
|
|
2021-05-11 17:34:13 -07:00
|
|
|
try new_decl.finalizeNewArena(&new_decl_arena);
|
2021-04-26 20:41:07 -07:00
|
|
|
}
|
|
|
|
|
|
2021-02-11 23:29:55 -07:00
|
|
|
/// Returns `true` if the Decl type changed.
|
|
|
|
|
/// Returns `true` if this is the first time analyzing the Decl.
|
|
|
|
|
/// Returns `false` otherwise.
|
2021-04-14 11:26:53 -07:00
|
|
|
fn semaDecl(mod: *Module, decl: *Decl) !bool {
|
2020-09-13 19:17:58 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-05-11 17:34:13 -07:00
|
|
|
if (decl.namespace.file_scope.status != .success_zir) {
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-27 18:36:12 -07:00
|
|
|
const gpa = mod.gpa;
|
2021-05-06 17:20:45 -07:00
|
|
|
const zir = decl.namespace.file_scope.zir;
|
|
|
|
|
const zir_datas = zir.instructions.items(.data);
|
2021-04-27 18:36:12 -07:00
|
|
|
|
|
|
|
|
decl.analysis = .in_progress;
|
|
|
|
|
|
|
|
|
|
var analysis_arena = std.heap.ArenaAllocator.init(gpa);
|
|
|
|
|
defer analysis_arena.deinit();
|
|
|
|
|
|
|
|
|
|
var sema: Sema = .{
|
|
|
|
|
.mod = mod,
|
|
|
|
|
.gpa = gpa,
|
|
|
|
|
.arena = &analysis_arena.allocator,
|
|
|
|
|
.code = zir,
|
|
|
|
|
.owner_decl = decl,
|
|
|
|
|
.namespace = decl.namespace,
|
|
|
|
|
.func = null,
|
2021-08-06 16:24:39 -07:00
|
|
|
.fn_ret_ty = Type.initTag(.void),
|
2021-04-27 18:36:12 -07:00
|
|
|
.owner_func = null,
|
|
|
|
|
};
|
2021-05-17 17:39:52 -07:00
|
|
|
defer sema.deinit();
|
2021-05-06 17:20:45 -07:00
|
|
|
|
|
|
|
|
if (decl.isRoot()) {
|
2021-05-07 14:18:14 -07:00
|
|
|
log.debug("semaDecl root {*} ({s})", .{ decl, decl.name });
|
2021-05-06 17:20:45 -07:00
|
|
|
const main_struct_inst = zir.getMainStruct();
|
|
|
|
|
const struct_obj = decl.getStruct().?;
|
2021-05-11 23:20:22 -07:00
|
|
|
// This might not have gotten set in `semaFile` if the first time had
|
|
|
|
|
// a ZIR failure, so we set it here in case.
|
|
|
|
|
struct_obj.zir_index = main_struct_inst;
|
2021-05-06 17:20:45 -07:00
|
|
|
try sema.analyzeStructDecl(decl, main_struct_inst, struct_obj);
|
|
|
|
|
decl.analysis = .complete;
|
|
|
|
|
decl.generation = mod.generation;
|
|
|
|
|
return false;
|
|
|
|
|
}
|
2021-07-20 15:22:37 -07:00
|
|
|
log.debug("semaDecl {*} ({s})", .{ decl, decl.name });
|
2021-05-06 17:20:45 -07:00
|
|
|
|
2021-04-27 18:36:12 -07:00
|
|
|
var block_scope: Scope.Block = .{
|
|
|
|
|
.parent = null,
|
|
|
|
|
.sema = &sema,
|
|
|
|
|
.src_decl = decl,
|
|
|
|
|
.instructions = .{},
|
|
|
|
|
.inlining = null,
|
|
|
|
|
.is_comptime = true,
|
|
|
|
|
};
|
2021-08-04 21:11:31 -07:00
|
|
|
defer {
|
|
|
|
|
block_scope.instructions.deinit(gpa);
|
|
|
|
|
block_scope.params.deinit(gpa);
|
|
|
|
|
}
|
2021-04-27 18:36:12 -07:00
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
const zir_block_index = decl.zirBlockIndex();
|
2021-04-28 22:43:26 -07:00
|
|
|
const inst_data = zir_datas[zir_block_index].pl_node;
|
2021-04-27 18:36:12 -07:00
|
|
|
const extra = zir.extraData(Zir.Inst.Block, inst_data.payload_index);
|
|
|
|
|
const body = zir.extra[extra.end..][0..extra.data.body_len];
|
|
|
|
|
const break_index = try sema.analyzeBody(&block_scope, body);
|
2021-04-28 22:43:26 -07:00
|
|
|
const result_ref = zir_datas[break_index].@"break".operand;
|
2021-05-12 22:50:47 -07:00
|
|
|
const src: LazySrcLoc = .{ .node_offset = 0 };
|
stage2: more principled approach to comptime references
* AIR no longer has a `variables` array. Instead of the `varptr`
instruction, Sema emits a constant with a `decl_ref`.
* AIR no longer has a `ref` instruction. There is no longer any
instruction that takes a value and returns a pointer to it. If this
is desired, Sema must either create an anonymous Decl and return a
constant `decl_ref`, or in the case of a runtime value, emit an
`alloc` instruction, `store` the value to it, and then return the
`alloc`.
* The `ref_val` Value Tag is eliminated. `decl_ref` should be used
instead. Also added is `eu_payload_ptr` which points to the payload
of an error union, given an error union pointer.
In general, Sema should avoid calling `analyzeRef` if it can be helped.
For example in the case of field_val and elem_val, there should never be
a reason to create a temporary (alloc or decl). Recent previous commits
made progress along that front.
There is a new abstraction in Sema, which looks like this:
var anon_decl = try block.startAnonDecl();
defer anon_decl.deinit();
// here `anon_decl.arena()` may be used
const decl = try anon_decl.finish(ty, val);
// decl is typically now used with `decl_ref`.
This pattern is used to upgrade `ref_val` usages to `decl_ref` usages.
Additional improvements:
* Sema: fix source location resolution for calling convention
expression.
* Sema: properly report "unable to resolve comptime value" for loads of
global variables. There is now a set of functions which can be
called if the callee wants to obtain the Value even if the tag is
`variable` (indicating comptime-known address but runtime-known value).
* Sema: `coerce` resolves builtin types before checking equality.
* Sema: fix `u1_type` missing from `addType`, making this type have a
slightly more efficient representation in AIR.
* LLVM backend: fix `genTypedValue` for tags `decl_ref` and `variable`
to properly do an LLVMConstBitCast.
* Remove unused parameter from `Value.toEnum`.
After this commit, some test cases are no longer passing. This is due to
the more principled approach to comptime references causing more
anonymous decls to get sent to the linker for codegen. However, in all
these cases the decls are not actually referenced by the runtime machine
code. A future commit in this branch will implement garbage collection
of decls so that unused decls do not get sent to the linker for codegen.
This will make the tests go back to passing.
2021-07-29 15:59:51 -07:00
|
|
|
const decl_tv = try sema.resolveInstValue(&block_scope, src, result_ref);
|
2021-04-28 22:43:26 -07:00
|
|
|
const align_val = blk: {
|
|
|
|
|
const align_ref = decl.zirAlignRef();
|
|
|
|
|
if (align_ref == .none) break :blk Value.initTag(.null_value);
|
2021-05-06 17:20:45 -07:00
|
|
|
break :blk (try sema.resolveInstConst(&block_scope, src, align_ref)).val;
|
2021-04-28 22:43:26 -07:00
|
|
|
};
|
|
|
|
|
const linksection_val = blk: {
|
|
|
|
|
const linksection_ref = decl.zirLinksectionRef();
|
|
|
|
|
if (linksection_ref == .none) break :blk Value.initTag(.null_value);
|
2021-05-06 17:20:45 -07:00
|
|
|
break :blk (try sema.resolveInstConst(&block_scope, src, linksection_ref)).val;
|
2021-04-28 22:43:26 -07:00
|
|
|
};
|
2021-08-20 15:23:55 -07:00
|
|
|
// Note this resolves the type of the Decl, not the value; if this Decl
|
|
|
|
|
// is a struct, for example, this resolves `type` (which needs no resolution),
|
|
|
|
|
// not the struct itself.
|
stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must mark the corresponding Decl `alive`
field to true.
This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
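The alive-marking scheme described above can be sketched roughly as follows. This is a minimal illustration, not the compiler's code: the `Decl` struct here is a hypothetical stand-in with only a name and the `alive` flag, and `markDeclAlive` and `shouldEmit` are made-up helper names.

```zig
const std = @import("std");

// Hypothetical stand-in for the compiler's `Decl`; only `alive` matters here.
const Decl = struct {
    name: []const u8,
    alive: bool = false,
};

// Assumed backend behavior: on seeing a `decl_ref` Value, mark its Decl alive.
fn markDeclAlive(decl: *Decl) void {
    decl.alive = true;
}

// Assumed frontend behavior: a Decl is sent to the linker only if some
// backend marked it alive; otherwise it would be deleted instead.
fn shouldEmit(decl: *const Decl) bool {
    return decl.alive;
}

test "unreferenced anon decl is dropped" {
    var used = Decl{ .name = "used" };
    const unused = Decl{ .name = "unused" };
    markDeclAlive(&used);
    try std.testing.expect(shouldEmit(&used));
    try std.testing.expect(!shouldEmit(&unused));
}
```

In this sketch the decision is a single flag check, which is why intermediate comptime values that no backend ever references would never reach the object file.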
2021-07-29 19:30:37 -07:00
|
|
|
try sema.resolveTypeLayout(&block_scope, src, decl_tv.ty);
|
2021-04-27 18:36:12 -07:00
|
|
|
|
2021-04-28 22:43:26 -07:00
|
|
|
// We need the memory for the Type to go into the arena for the Decl
|
|
|
|
|
var decl_arena = std.heap.ArenaAllocator.init(gpa);
|
|
|
|
|
errdefer decl_arena.deinit();
|
|
|
|
|
const decl_arena_state = try decl_arena.allocator.create(std.heap.ArenaAllocator.State);
|
|
|
|
|
|
2021-08-28 15:35:59 -07:00
|
|
|
if (decl.is_usingnamespace) {
|
|
|
|
|
const ty_ty = Type.initTag(.type);
|
|
|
|
|
if (!decl_tv.ty.eql(ty_ty)) {
|
|
|
|
|
return mod.fail(&block_scope.base, src, "expected type, found {}", .{decl_tv.ty});
|
|
|
|
|
}
|
|
|
|
|
var buffer: Value.ToTypeBuffer = undefined;
|
|
|
|
|
const ty = decl_tv.val.toType(&buffer);
|
|
|
|
|
if (ty.getNamespace() == null) {
|
|
|
|
|
return mod.fail(&block_scope.base, src, "type {} has no namespace", .{ty});
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
decl.ty = ty_ty;
|
|
|
|
|
decl.val = try Value.Tag.ty.create(&decl_arena.allocator, ty);
|
|
|
|
|
decl.align_val = Value.initTag(.null_value);
|
|
|
|
|
decl.linksection_val = Value.initTag(.null_value);
|
|
|
|
|
decl.has_tv = true;
|
|
|
|
|
decl.owns_tv = false;
|
|
|
|
|
decl_arena_state.* = decl_arena.state;
|
|
|
|
|
decl.value_arena = decl_arena_state;
|
|
|
|
|
decl.analysis = .complete;
|
|
|
|
|
decl.generation = mod.generation;
|
|
|
|
|
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-11 14:17:52 -07:00
|
|
|
if (decl_tv.val.castTag(.function)) |fn_payload| {
|
2021-07-20 15:22:37 -07:00
|
|
|
const func = fn_payload.data;
|
|
|
|
|
const owns_tv = func.owner_decl == decl;
|
|
|
|
|
if (owns_tv) {
|
|
|
|
|
var prev_type_has_bits = false;
|
|
|
|
|
var prev_is_inline = false;
|
|
|
|
|
var type_changed = true;
|
|
|
|
|
|
|
|
|
|
if (decl.has_tv) {
|
|
|
|
|
prev_type_has_bits = decl.ty.hasCodeGenBits();
|
|
|
|
|
type_changed = !decl.ty.eql(decl_tv.ty);
|
|
|
|
|
if (decl.getFunction()) |prev_func| {
|
|
|
|
|
prev_is_inline = prev_func.state == .inline_only;
|
|
|
|
|
}
|
|
|
|
|
decl.clearValues(gpa);
|
2021-04-28 22:43:26 -07:00
|
|
|
}
|
|
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
decl.ty = try decl_tv.ty.copy(&decl_arena.allocator);
|
|
|
|
|
decl.val = try decl_tv.val.copy(&decl_arena.allocator);
|
|
|
|
|
decl.align_val = try align_val.copy(&decl_arena.allocator);
|
|
|
|
|
decl.linksection_val = try linksection_val.copy(&decl_arena.allocator);
|
|
|
|
|
decl.has_tv = true;
|
|
|
|
|
decl.owns_tv = owns_tv;
|
|
|
|
|
decl_arena_state.* = decl_arena.state;
|
|
|
|
|
decl.value_arena = decl_arena_state;
|
|
|
|
|
decl.analysis = .complete;
|
|
|
|
|
decl.generation = mod.generation;
|
|
|
|
|
|
|
|
|
|
const is_inline = decl_tv.ty.fnCallingConvention() == .Inline;
|
|
|
|
|
if (!is_inline and decl_tv.ty.hasCodeGenBits()) {
|
|
|
|
|
// We don't fully codegen the decl until later, but we do need to reserve a global
|
2021-07-29 19:30:37 -07:00
|
|
|
// offset table index for it. This allows us to codegen decls out of dependency
|
|
|
|
|
// order, increasing how many computations can be done in parallel.
|
2021-07-20 15:22:37 -07:00
|
|
|
try mod.comp.bin_file.allocateDeclIndexes(decl);
|
|
|
|
|
try mod.comp.work_queue.writeItem(.{ .codegen_func = func });
|
|
|
|
|
if (type_changed and mod.emit_h != null) {
|
|
|
|
|
try mod.comp.work_queue.writeItem(.{ .emit_h_decl = decl });
|
|
|
|
|
}
|
|
|
|
|
} else if (!prev_is_inline and prev_type_has_bits) {
|
|
|
|
|
mod.comp.bin_file.freeDecl(decl);
|
2021-04-28 22:43:26 -07:00
|
|
|
}
|
|
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
if (decl.is_exported) {
|
|
|
|
|
const export_src = src; // TODO make this point at `export` token
|
|
|
|
|
if (is_inline) {
|
|
|
|
|
return mod.fail(&block_scope.base, export_src, "export of inline function", .{});
|
|
|
|
|
}
|
|
|
|
|
// The scope needs to have the decl in it.
|
|
|
|
|
try mod.analyzeExport(&block_scope.base, export_src, mem.spanZ(decl.name), decl);
|
2021-04-28 22:43:26 -07:00
|
|
|
}
|
2021-07-20 15:22:37 -07:00
|
|
|
return type_changed or is_inline != prev_is_inline;
|
2021-04-28 22:43:26 -07:00
|
|
|
}
|
2021-07-20 15:22:37 -07:00
|
|
|
}
|
|
|
|
|
var type_changed = true;
|
|
|
|
|
if (decl.has_tv) {
|
|
|
|
|
type_changed = !decl.ty.eql(decl_tv.ty);
|
|
|
|
|
decl.clearValues(gpa);
|
|
|
|
|
}
|
2021-04-28 22:43:26 -07:00
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
decl.owns_tv = false;
|
|
|
|
|
var queue_linker_work = false;
|
|
|
|
|
if (decl_tv.val.castTag(.variable)) |payload| {
|
|
|
|
|
const variable = payload.data;
|
|
|
|
|
if (variable.owner_decl == decl) {
|
|
|
|
|
decl.owns_tv = true;
|
|
|
|
|
queue_linker_work = true;
|
2021-05-11 14:17:52 -07:00
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
const copied_init = try variable.init.copy(&decl_arena.allocator);
|
|
|
|
|
variable.init = copied_init;
|
2021-05-11 14:17:52 -07:00
|
|
|
}
|
2021-07-20 15:22:37 -07:00
|
|
|
} else if (decl_tv.val.castTag(.extern_fn)) |payload| {
|
|
|
|
|
const owner_decl = payload.data;
|
|
|
|
|
if (decl == owner_decl) {
|
|
|
|
|
decl.owns_tv = true;
|
|
|
|
|
queue_linker_work = true;
|
|
|
|
|
}
|
|
|
|
|
}
|
2021-05-11 14:17:52 -07:00
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
decl.ty = try decl_tv.ty.copy(&decl_arena.allocator);
|
|
|
|
|
decl.val = try decl_tv.val.copy(&decl_arena.allocator);
|
|
|
|
|
decl.align_val = try align_val.copy(&decl_arena.allocator);
|
|
|
|
|
decl.linksection_val = try linksection_val.copy(&decl_arena.allocator);
|
|
|
|
|
decl.has_tv = true;
|
|
|
|
|
decl_arena_state.* = decl_arena.state;
|
|
|
|
|
decl.value_arena = decl_arena_state;
|
|
|
|
|
decl.analysis = .complete;
|
|
|
|
|
decl.generation = mod.generation;
|
2021-05-07 16:05:44 -07:00
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
if (queue_linker_work and decl.ty.hasCodeGenBits()) {
|
|
|
|
|
try mod.comp.bin_file.allocateDeclIndexes(decl);
|
|
|
|
|
try mod.comp.work_queue.writeItem(.{ .codegen_decl = decl });
|
2021-05-07 16:05:44 -07:00
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
if (type_changed and mod.emit_h != null) {
|
|
|
|
|
try mod.comp.work_queue.writeItem(.{ .emit_h_decl = decl });
|
2021-05-11 14:17:52 -07:00
|
|
|
}
|
2021-07-20 15:22:37 -07:00
|
|
|
}
|
2021-08-28 15:35:59 -07:00
|
|
|
// In case this Decl is a struct or union, we need to resolve the fields
|
|
|
|
|
// while we still have the `Sema` in scope, so that the field type expressions
|
|
|
|
|
// can use the resolved AIR instructions that they possibly reference.
|
|
|
|
|
// We do this after the decl is populated and set to `complete` so that a `Decl`
|
|
|
|
|
// may reference itself.
|
|
|
|
|
try sema.resolvePendingTypes(&block_scope);
|
2021-05-11 14:17:52 -07:00
|
|
|
|
2021-07-20 15:22:37 -07:00
|
|
|
if (decl.is_exported) {
|
|
|
|
|
const export_src = src; // TODO point to the export token
|
|
|
|
|
// The scope needs to have the decl in it.
|
|
|
|
|
try mod.analyzeExport(&block_scope.base, export_src, mem.spanZ(decl.name), decl);
|
2021-04-28 22:43:26 -07:00
|
|
|
}
|
2021-07-20 15:22:37 -07:00
|
|
|
|
|
|
|
|
return type_changed;
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
stage2: implement simple enums
A simple enum is an enum which has an automatic integer tag type,
all tag values automatically assigned, and no top level declarations.
Such enums are created directly in AstGen and shared by all the
generic/comptime instantiations of the surrounding ZIR code. This
commit implements, but does not yet add any test cases for, simple enums.
A full enum is an enum for which any of the above conditions are not
true. Full enums are created in Sema, and therefore will create a unique
type per generic/comptime instantiation. This commit does not implement
full enums. However the `enum_decl_nonexhaustive` ZIR instruction is
added and the respective Type functions are filled out.
This commit makes an improvement to ZIR code, removing the decls array
and removing the decl_map from AstGen. Instead, decl_ref and
decl_val ZIR instructions index into the `owner_decl.dependencies`
ArrayHashMap. We already need this dependencies array for incremental
compilation purposes, and so repurposing it to also use it for ZIR decl
indexes makes for efficient memory usage.
Similarly, this commit fixes up incorrect memory management by removing
the `const` ZIR instruction. The two places it was used stored memory in
the AstGen arena, which may get freed after Sema. Now it properly sets
up a new anonymous Decl for error sets and uses a normal decl_val
instruction.
The other usage of `const` ZIR instruction was float literals. These are
now changed to use `float` ZIR instruction when the value fits inside
`zir.Inst.Data` and `float128` otherwise.
AstGen + Sema: implement int_to_enum and enum_to_int. No tests yet; I expect to
have to make some fixes before they will pass tests. Will do that in the
branch before merging.
AstGen: fix struct astgen incorrectly counting decls as fields.
Type/Value: give up on trying to exhaustively list every tag all the
time. This makes the file more manageable. Also found a bug with
i128/u128 this way, since the name of the function was more obvious when
looking at the tag values.
Type: implement abiAlignment and abiSize for structs. This will need to
get more sophisticated at some point, but for now it is progress.
Value: add new `enum_field_index` tag.
Value: add hash_u32, needed when using ArrayHashMap.
2021-04-06 17:43:56 -07:00
|
|
|
/// Records that `depender` depends on `dependee`, updating both decls' dependency sets.
|
2021-04-14 11:26:53 -07:00
|
|
|
pub fn declareDeclDependency(mod: *Module, depender: *Decl, dependee: *Decl) !void {
|
2021-05-14 17:41:22 -07:00
|
|
|
if (depender == dependee) return;
|
|
|
|
|
|
|
|
|
|
log.debug("{*} ({s}) depends on {*} ({s})", .{
|
|
|
|
|
depender, depender.name, dependee, dependee.name,
|
|
|
|
|
});
|
|
|
|
|
|
|
|
|
|
try depender.dependencies.ensureUnusedCapacity(mod.gpa, 1);
|
|
|
|
|
try dependee.dependants.ensureUnusedCapacity(mod.gpa, 1);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-04-07 19:38:00 -07:00
|
|
|
if (dependee.deletion_flag) {
|
|
|
|
|
dependee.deletion_flag = false;
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(mod.deletion_set.swapRemove(dependee));
|
2021-04-07 19:38:00 -07:00
|
|
|
}
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
dependee.dependants.putAssumeCapacity(depender, {});
|
2021-04-14 11:26:53 -07:00
|
|
|
depender.dependencies.putAssumeCapacity(dependee, {});
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-04-16 17:28:28 -07:00
|
|
|
pub const ImportFileResult = struct {
|
|
|
|
|
file: *Scope.File,
|
|
|
|
|
is_new: bool,
|
|
|
|
|
};
|
|
|
|
|
|
2021-06-21 13:27:52 -07:00
|
|
|
pub fn importPkg(mod: *Module, pkg: *Package) !ImportFileResult {
|
2021-04-16 19:45:58 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
|
|
|
|
|
// The resolved path is used as the key in the import table, to detect if
|
|
|
|
|
// an import refers to the same file as another, despite different relative paths
|
|
|
|
|
// or differently mapped package names.
|
|
|
|
|
const resolved_path = try std.fs.path.resolve(gpa, &[_][]const u8{
|
|
|
|
|
pkg.root_src_directory.path orelse ".", pkg.root_src_path,
|
|
|
|
|
});
|
|
|
|
|
var keep_resolved_path = false;
|
|
|
|
|
defer if (!keep_resolved_path) gpa.free(resolved_path);
|
|
|
|
|
|
|
|
|
|
const gop = try mod.import_table.getOrPut(gpa, resolved_path);
|
|
|
|
|
if (gop.found_existing) return ImportFileResult{
|
2021-06-03 15:39:26 -05:00
|
|
|
.file = gop.value_ptr.*,
|
2021-04-16 19:45:58 -07:00
|
|
|
.is_new = false,
|
|
|
|
|
};
|
|
|
|
|
keep_resolved_path = true; // It's now owned by import_table.
|
|
|
|
|
|
|
|
|
|
const sub_file_path = try gpa.dupe(u8, pkg.root_src_path);
|
|
|
|
|
errdefer gpa.free(sub_file_path);
|
|
|
|
|
|
|
|
|
|
const new_file = try gpa.create(Scope.File);
|
|
|
|
|
errdefer gpa.destroy(new_file);
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
gop.value_ptr.* = new_file;
|
2021-04-16 19:45:58 -07:00
|
|
|
new_file.* = .{
|
|
|
|
|
.sub_file_path = sub_file_path,
|
|
|
|
|
.source = undefined,
|
|
|
|
|
.source_loaded = false,
|
|
|
|
|
.tree_loaded = false,
|
|
|
|
|
.zir_loaded = false,
|
|
|
|
|
.stat_size = undefined,
|
|
|
|
|
.stat_inode = undefined,
|
|
|
|
|
.stat_mtime = undefined,
|
|
|
|
|
.tree = undefined,
|
|
|
|
|
.zir = undefined,
|
|
|
|
|
.status = .never_loaded,
|
|
|
|
|
.pkg = pkg,
|
2021-05-11 23:20:22 -07:00
|
|
|
.root_decl = null,
|
2021-04-16 19:45:58 -07:00
|
|
|
};
|
|
|
|
|
return ImportFileResult{
|
|
|
|
|
.file = new_file,
|
|
|
|
|
.is_new = true,
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-16 17:28:28 -07:00
|
|
|
pub fn importFile(
|
|
|
|
|
mod: *Module,
|
2021-04-16 19:45:58 -07:00
|
|
|
cur_file: *Scope.File,
|
2021-04-16 17:28:28 -07:00
|
|
|
import_string: []const u8,
|
|
|
|
|
) !ImportFileResult {
|
2021-04-16 19:45:58 -07:00
|
|
|
if (cur_file.pkg.table.get(import_string)) |pkg| {
|
2021-06-21 13:27:52 -07:00
|
|
|
return mod.importPkg(pkg);
|
2021-04-16 19:45:58 -07:00
|
|
|
}
|
2021-06-22 16:11:02 -07:00
|
|
|
if (!mem.endsWith(u8, import_string, ".zig")) {
|
|
|
|
|
return error.PackageNotFound;
|
|
|
|
|
}
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy in how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
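The updated-file detection described in this message can be sketched as a comparison of cached and fresh stat data. This is only an illustration under assumptions: `FileStat` and `isUpdated` are hypothetical names, the field types are guesses, and the source-hash comparison mentioned above is left out.

```zig
const std = @import("std");

// Hypothetical subset of the per-file metadata kept for change detection.
const FileStat = struct {
    size: u64,
    inode: u64,
    mtime: i128,
};

// A file counts as updated if any cached stat field differs from a fresh stat.
fn isUpdated(cached: FileStat, current: FileStat) bool {
    return cached.size != current.size or
        cached.inode != current.inode or
        cached.mtime != current.mtime;
}

test "mtime change marks file as updated" {
    const cached = FileStat{ .size = 10, .inode = 7, .mtime = 100 };
    var current = cached;
    try std.testing.expect(!isUpdated(cached, current));
    current.mtime = 101;
    try std.testing.expect(isUpdated(cached, current));
}
```

During an update, a check like this would presumably run once per entry of `import_table`, with `current` filled in from a fresh stat of the file.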
2021-04-09 23:17:50 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
|
2021-04-16 19:45:58 -07:00
|
|
|
// The resolved path is used as the key in the import table, to detect if
|
|
|
|
|
// an import refers to the same file as another, despite different relative paths
|
|
|
|
|
// or differently mapped package names.
|
|
|
|
|
const cur_pkg_dir_path = cur_file.pkg.root_src_directory.path orelse ".";
|
|
|
|
|
const resolved_path = try std.fs.path.resolve(gpa, &[_][]const u8{
|
|
|
|
|
cur_pkg_dir_path, cur_file.sub_file_path, "..", import_string,
|
|
|
|
|
});
|
2021-04-09 23:17:50 -07:00
|
|
|
var keep_resolved_path = false;
|
|
|
|
|
defer if (!keep_resolved_path) gpa.free(resolved_path);
|
|
|
|
|
|
|
|
|
|
const gop = try mod.import_table.getOrPut(gpa, resolved_path);
|
2021-04-16 17:28:28 -07:00
|
|
|
if (gop.found_existing) return ImportFileResult{
|
2021-06-03 15:39:26 -05:00
|
|
|
.file = gop.value_ptr.*,
|
2021-04-16 17:28:28 -07:00
|
|
|
.is_new = false,
|
|
|
|
|
};
|
2021-04-16 19:45:58 -07:00
|
|
|
keep_resolved_path = true; // It's now owned by import_table.
|
2021-04-09 23:17:50 -07:00
|
|
|
|
2021-04-16 19:45:58 -07:00
|
|
|
const new_file = try gpa.create(Scope.File);
|
|
|
|
|
errdefer gpa.destroy(new_file);
|
2021-04-09 23:17:50 -07:00
|
|
|
|
2021-04-16 19:45:58 -07:00
|
|
|
const resolved_root_path = try std.fs.path.resolve(gpa, &[_][]const u8{cur_pkg_dir_path});
|
|
|
|
|
defer gpa.free(resolved_root_path);
|
|
|
|
|
|
|
|
|
|
if (!mem.startsWith(u8, resolved_path, resolved_root_path)) {
|
|
|
|
|
return error.ImportOutsidePkgPath;
|
2021-04-09 23:17:50 -07:00
|
|
|
}
|
2021-04-16 19:45:58 -07:00
|
|
|
// +1 for the directory separator here.
|
|
|
|
|
const sub_file_path = try gpa.dupe(u8, resolved_path[resolved_root_path.len + 1 ..]);
|
|
|
|
|
errdefer gpa.free(sub_file_path);
|
|
|
|
|
|
|
|
|
|
log.debug("new importFile. resolved_root_path={s}, resolved_path={s}, sub_file_path={s}, import_string={s}", .{
|
|
|
|
|
resolved_root_path, resolved_path, sub_file_path, import_string,
|
|
|
|
|
});
|
2021-04-09 23:17:50 -07:00
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
gop.value_ptr.* = new_file;
|
2021-04-09 23:17:50 -07:00
|
|
|
new_file.* = .{
|
2021-04-16 19:45:58 -07:00
|
|
|
.sub_file_path = sub_file_path,
|
2021-04-09 23:17:50 -07:00
|
|
|
.source = undefined,
|
|
|
|
|
.source_loaded = false,
|
2021-04-14 11:26:53 -07:00
|
|
|
.tree_loaded = false,
|
|
|
|
|
.zir_loaded = false,
|
2021-04-09 23:17:50 -07:00
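The updated-file detection described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from `Module.zig`: `CachedStat` and `isUpdated` are hypothetical names, and the rule that any differing stat field marks a file as updated is an assumption.

```zig
const std = @import("std");

// Hypothetical stand-in for the stat fields that the commit message says
// are kept in `Scope.File` (size, inode, mtime).
const CachedStat = struct {
    size: u64,
    inode: u64,
    mtime: i128,
};

// Assumption: a file counts as updated when any cached stat field differs
// from a fresh stat of the same path.
fn isUpdated(cached: CachedStat, fresh: CachedStat) bool {
    return cached.size != fresh.size or
        cached.inode != fresh.inode or
        cached.mtime != fresh.mtime;
}

test "a changed mtime marks the file as updated" {
    const old = CachedStat{ .size = 10, .inode = 7, .mtime = 100 };
    var fresh = old;
    fresh.mtime = 101;
    try std.testing.expect(isUpdated(old, fresh));
    try std.testing.expect(!isUpdated(old, old));
}
```

Comparing stat fields first is cheap; a source hash, which the message also mentions, would only need to be consulted when the stat data disagrees.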
|
|
|
.stat_size = undefined,
|
|
|
|
|
.stat_inode = undefined,
|
|
|
|
|
.stat_mtime = undefined,
|
|
|
|
|
.tree = undefined,
|
2021-04-14 11:26:53 -07:00
|
|
|
.zir = undefined,
|
2021-04-09 23:17:50 -07:00
|
|
|
.status = .never_loaded,
|
2021-04-16 19:45:58 -07:00
|
|
|
.pkg = cur_file.pkg,
|
2021-05-11 23:20:22 -07:00
|
|
|
.root_decl = null,
|
2021-04-09 23:17:50 -07:00
|
|
|
};
|
2021-04-16 17:28:28 -07:00
|
|
|
return ImportFileResult{
|
|
|
|
|
.file = new_file,
|
|
|
|
|
.is_new = true,
|
|
|
|
|
};
|
2021-04-09 23:17:50 -07:00
|
|
|
}
|
|
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
pub fn scanNamespace(
|
2021-04-12 16:44:51 -07:00
|
|
|
mod: *Module,
|
|
|
|
|
namespace: *Scope.Namespace,
|
2021-04-26 20:41:07 -07:00
|
|
|
extra_start: usize,
|
|
|
|
|
decls_len: u32,
|
|
|
|
|
parent_decl: *Decl,
|
2021-07-14 12:16:48 -07:00
|
|
|
) SemaError!usize {
|
2020-09-13 19:17:58 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
const zir = namespace.file_scope.zir;
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
try mod.comp.work_queue.ensureUnusedCapacity(decls_len);
|
|
|
|
|
try namespace.decls.ensureCapacity(gpa, decls_len);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
const bit_bags_count = std.math.divCeil(usize, decls_len, 8) catch unreachable;
|
|
|
|
|
var extra_index = extra_start + bit_bags_count;
|
|
|
|
|
var bit_bag_index: usize = extra_start;
|
|
|
|
|
var cur_bit_bag: u32 = undefined;
|
|
|
|
|
var decl_i: u32 = 0;
|
2021-04-28 16:55:22 -07:00
|
|
|
var scan_decl_iter: ScanDeclIter = .{
|
|
|
|
|
.module = mod,
|
|
|
|
|
.namespace = namespace,
|
|
|
|
|
.parent_decl = parent_decl,
|
|
|
|
|
};
|
2021-04-26 20:41:07 -07:00
|
|
|
while (decl_i < decls_len) : (decl_i += 1) {
|
|
|
|
|
if (decl_i % 8 == 0) {
|
|
|
|
|
cur_bit_bag = zir.extra[bit_bag_index];
|
|
|
|
|
bit_bag_index += 1;
|
|
|
|
|
}
|
2021-04-28 16:55:22 -07:00
|
|
|
const flags = @truncate(u4, cur_bit_bag);
|
2021-04-28 23:16:13 -07:00
|
|
|
cur_bit_bag >>= 4;
|
|
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
const decl_sub_index = extra_index;
|
2021-05-01 21:57:52 -07:00
|
|
|
extra_index += 7; // src_hash(4) + line(1) + name(1) + value(1)
|
2021-04-28 16:55:22 -07:00
|
|
|
extra_index += @truncate(u1, flags >> 2);
|
|
|
|
|
extra_index += @truncate(u1, flags >> 3);
|
2021-02-11 23:29:55 -07:00
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
try scanDecl(&scan_decl_iter, decl_sub_index, flags);
|
2021-04-26 20:41:07 -07:00
|
|
|
}
|
|
|
|
|
return extra_index;
|
2021-02-11 23:29:55 -07:00
|
|
|
}
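The bit-bag walk in `scanNamespace` can be exercised in isolation. The sketch below assumes only what the function above shows: four flag bits per declaration, eight declarations per `u32` bag, and seven fixed `extra` words per declaration plus one each for the align and linksection bits. `declTableEnd` is a hypothetical name, and the masks stand in for the `@truncate` calls so the block does not depend on a particular builtin syntax.

```zig
const std = @import("std");

// Hypothetical helper: returns the index just past the declaration table,
// mirroring how `scanNamespace` advances `extra_index`.
fn declTableEnd(extra: []const u32, extra_start: usize, decls_len: usize) usize {
    // One u32 bag holds 4 flag bits for each of 8 declarations.
    const bit_bags_count = (decls_len + 7) / 8;
    var extra_index: usize = extra_start + bit_bags_count;
    var bit_bag_index: usize = extra_start;
    var cur_bit_bag: u32 = 0;
    var decl_i: usize = 0;
    while (decl_i < decls_len) : (decl_i += 1) {
        if (decl_i % 8 == 0) {
            cur_bit_bag = extra[bit_bag_index];
            bit_bag_index += 1;
        }
        const flags = cur_bit_bag & 0xF;
        cur_bit_bag >>= 4;
        extra_index += 7; // src_hash(4) + line(1) + name(1) + value(1)
        extra_index += (flags >> 2) & 1; // has_align
        extra_index += (flags >> 3) & 1; // has_linksection
    }
    return extra_index;
}

test "two decls, one with align and one with linksection" {
    // decl 0: flags 0b0101 (pub + align); decl 1: flags 0b1000 (linksection).
    var extra = [_]u32{0} ** 17;
    extra[0] = 0b1000_0101;
    // 1 bag word + (7 + 1) + (7 + 1) = 17
    try std.testing.expectEqual(@as(usize, 17), declTableEnd(&extra, 0, 2));
}
```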
|
|
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
const ScanDeclIter = struct {
|
|
|
|
|
module: *Module,
|
2021-04-09 23:17:50 -07:00
|
|
|
namespace: *Scope.Namespace,
|
2021-04-26 20:41:07 -07:00
|
|
|
parent_decl: *Decl,
|
2021-04-28 16:55:22 -07:00
|
|
|
usingnamespace_index: usize = 0,
|
|
|
|
|
comptime_index: usize = 0,
|
|
|
|
|
unnamed_test_index: usize = 0,
|
|
|
|
|
};
|
|
|
|
|
|
2021-07-14 12:16:48 -07:00
|
|
|
fn scanDecl(iter: *ScanDeclIter, decl_sub_index: usize, flags: u4) SemaError!void {
|
2021-02-11 23:29:55 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
const mod = iter.module;
|
|
|
|
|
const namespace = iter.namespace;
|
2021-04-26 20:41:07 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
const zir = namespace.file_scope.zir;
|
2021-04-27 18:36:12 -07:00
|
|
|
|
2021-04-28 16:55:22 -07:00
|
|
|
// zig fmt: off
|
|
|
|
|
const is_pub = (flags & 0b0001) != 0;
|
2021-08-28 15:35:59 -07:00
|
|
|
const export_bit = (flags & 0b0010) != 0;
|
2021-04-28 16:55:22 -07:00
|
|
|
const has_align = (flags & 0b0100) != 0;
|
|
|
|
|
const has_linksection = (flags & 0b1000) != 0;
|
|
|
|
|
// zig fmt: on
|
2021-04-26 20:41:07 -07:00
|
|
|
|
2021-05-01 21:57:52 -07:00
|
|
|
const line = iter.parent_decl.relativeToLine(zir.extra[decl_sub_index + 4]);
|
|
|
|
|
const decl_name_index = zir.extra[decl_sub_index + 5];
|
|
|
|
|
const decl_index = zir.extra[decl_sub_index + 6];
|
2021-04-28 16:55:22 -07:00
|
|
|
const decl_block_inst_data = zir.instructions.items(.data)[decl_index].pl_node;
|
|
|
|
|
const decl_node = iter.parent_decl.relativeToNodeIndex(decl_block_inst_data.src_node);
|
|
|
|
|
|
|
|
|
|
// Every Decl needs a name.
|
2021-05-07 18:52:11 -07:00
|
|
|
var is_named_test = false;
|
|
|
|
|
const decl_name: [:0]const u8 = switch (decl_name_index) {
|
2021-04-28 16:55:22 -07:00
|
|
|
0 => name: {
|
2021-08-28 15:35:59 -07:00
|
|
|
if (export_bit) {
|
2021-04-28 16:55:22 -07:00
|
|
|
const i = iter.usingnamespace_index;
|
|
|
|
|
iter.usingnamespace_index += 1;
|
2021-05-07 16:18:49 -07:00
|
|
|
break :name try std.fmt.allocPrintZ(gpa, "usingnamespace_{d}", .{i});
|
2021-04-28 16:55:22 -07:00
|
|
|
} else {
|
|
|
|
|
const i = iter.comptime_index;
|
|
|
|
|
iter.comptime_index += 1;
|
2021-05-07 16:18:49 -07:00
|
|
|
break :name try std.fmt.allocPrintZ(gpa, "comptime_{d}", .{i});
|
2021-04-28 16:55:22 -07:00
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
1 => name: {
|
|
|
|
|
const i = iter.unnamed_test_index;
|
|
|
|
|
iter.unnamed_test_index += 1;
|
2021-05-07 16:18:49 -07:00
|
|
|
break :name try std.fmt.allocPrintZ(gpa, "test_{d}", .{i});
|
2021-04-28 16:55:22 -07:00
|
|
|
},
|
2021-05-07 18:52:11 -07:00
|
|
|
else => name: {
|
|
|
|
|
const raw_name = zir.nullTerminatedString(decl_name_index);
|
|
|
|
|
if (raw_name.len == 0) {
|
|
|
|
|
is_named_test = true;
|
|
|
|
|
const test_name = zir.nullTerminatedString(decl_name_index + 1);
|
|
|
|
|
break :name try std.fmt.allocPrintZ(gpa, "test.{s}", .{test_name});
|
|
|
|
|
} else {
|
|
|
|
|
break :name try gpa.dupeZ(u8, raw_name);
|
|
|
|
|
}
|
|
|
|
|
},
|
2021-05-02 14:58:27 -07:00
|
|
|
};
|
2021-08-28 15:35:59 -07:00
|
|
|
const is_exported = export_bit and decl_name_index != 0;
|
|
|
|
|
const is_usingnamespace = export_bit and decl_name_index == 0;
|
|
|
|
|
if (is_usingnamespace) try namespace.usingnamespace_set.ensureUnusedCapacity(gpa, 1);
|
2021-04-26 21:34:40 -07:00
|
|
|
|
2021-04-26 20:41:07 -07:00
|
|
|
// We create a Decl for it regardless of analysis status.
|
2021-04-28 16:55:22 -07:00
|
|
|
const gop = try namespace.decls.getOrPut(gpa, decl_name);
|
2021-04-26 20:41:07 -07:00
|
|
|
if (!gop.found_existing) {
|
2021-04-27 18:36:12 -07:00
|
|
|
const new_decl = try mod.allocateNewDecl(namespace, decl_node);
|
2021-08-28 15:35:59 -07:00
|
|
|
if (is_usingnamespace) {
|
|
|
|
|
namespace.usingnamespace_set.putAssumeCapacity(new_decl, is_pub);
|
|
|
|
|
}
|
2021-05-07 18:52:11 -07:00
|
|
|
log.debug("scan new {*} ({s}) into {*}", .{ new_decl, decl_name, namespace });
|
2021-05-01 21:57:52 -07:00
|
|
|
new_decl.src_line = line;
|
2021-04-28 16:55:22 -07:00
|
|
|
new_decl.name = decl_name;
|
2021-06-03 15:39:26 -05:00
|
|
|
gop.value_ptr.* = new_decl;
|
2021-04-26 21:34:40 -07:00
|
|
|
// Exported decls, comptime decls, usingnamespace decls, and
|
|
|
|
|
// test decls if in test mode, get analyzed.
|
2021-07-23 22:23:03 -07:00
|
|
|
const decl_pkg = namespace.file_scope.pkg;
|
2021-04-26 21:34:40 -07:00
|
|
|
const want_analysis = is_exported or switch (decl_name_index) {
|
2021-08-28 15:35:59 -07:00
|
|
|
0 => true, // comptime or usingnamespace decl
|
2021-07-23 22:23:03 -07:00
|
|
|
1 => blk: {
|
|
|
|
|
// test decl with no name. Skip the part where we check against
|
|
|
|
|
// the test name filter.
|
|
|
|
|
if (!mod.comp.bin_file.options.is_test) break :blk false;
|
|
|
|
|
if (decl_pkg != mod.main_pkg) break :blk false;
|
stage2: `zig test` now works with the LLVM backend
Frontend improvements:
* When compiling in `zig test` mode, put a task on the work queue to
analyze the main package root file. Normally, start code does
`_ = @import("root");` to make Zig analyze the user's code; however, in
the case of `zig test`, the root source file is the test runner.
Without this change, no tests are picked up.
* In the main pipeline, once semantic analysis is finished, if there
are no compile errors, populate the `test_functions` Decl with the
set of test functions picked up from semantic analysis.
* Value: add `array` and `slice` Tags.
LLVM backend improvements:
* Fix incremental updates of globals. Previously the
value of a global would not get replaced with a new value.
* Fix LLVM type of arrays. They were incorrectly sending
the ABI size as the element count.
* Remove the FuncGen parameter from genTypedValue. This function is for
generating global constants and there is no function available when
it is being called.
- The `ref_val` case is now commented out. I'd like to eliminate
`ref_val` as one of the possible Value Tags. Instead it should
always be done via `decl_ref`.
* Implement constant value generation for slices, arrays, and structs.
* Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:06:42 -07:00
|
|
|
try mod.test_functions.put(gpa, new_decl, {});
|
2021-07-23 22:23:03 -07:00
|
|
|
break :blk true;
|
|
|
|
|
},
|
|
|
|
|
else => blk: {
|
|
|
|
|
if (!is_named_test) break :blk false;
|
|
|
|
|
if (!mod.comp.bin_file.options.is_test) break :blk false;
|
|
|
|
|
if (decl_pkg != mod.main_pkg) break :blk false;
|
|
|
|
|
// TODO check the name against --test-filter
|
2021-07-27 14:06:42 -07:00
|
|
|
try mod.test_functions.put(gpa, new_decl, {});
|
2021-07-23 22:23:03 -07:00
|
|
|
break :blk true;
|
|
|
|
|
},
|
2021-04-26 21:34:40 -07:00
|
|
|
};
|
|
|
|
|
if (want_analysis) {
|
2021-04-26 20:41:07 -07:00
|
|
|
mod.comp.work_queue.writeItemAssumeCapacity(.{ .analyze_decl = new_decl });
|
2021-02-11 23:29:55 -07:00
|
|
|
}
|
2021-04-26 20:41:07 -07:00
|
|
|
new_decl.is_pub = is_pub;
|
2021-04-28 16:55:22 -07:00
|
|
|
new_decl.is_exported = is_exported;
|
2021-08-28 15:35:59 -07:00
|
|
|
new_decl.is_usingnamespace = is_usingnamespace;
|
2021-04-28 16:55:22 -07:00
|
|
|
new_decl.has_align = has_align;
|
|
|
|
|
new_decl.has_linksection = has_linksection;
|
|
|
|
|
new_decl.zir_decl_index = @intCast(u32, decl_sub_index);
|
stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must set the corresponding Decl's `alive`
field to true.
This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
|
|
|
new_decl.alive = true; // This Decl corresponds to an AST node and therefore always alive.
|
2021-04-26 20:41:07 -07:00
|
|
|
return;
|
2021-02-11 23:29:55 -07:00
|
|
|
}
|
2021-05-07 18:52:11 -07:00
|
|
|
gpa.free(decl_name);
|
2021-06-03 15:39:26 -05:00
|
|
|
const decl = gop.value_ptr.*;
|
2021-05-12 22:02:44 -07:00
|
|
|
log.debug("scan existing {*} ({s}) of {*}", .{ decl, decl.name, namespace });
|
2021-04-26 20:41:07 -07:00
|
|
|
// Update the AST node of the decl; even if its contents are unchanged, it may
|
|
|
|
|
// have been re-ordered.
|
|
|
|
|
decl.src_node = decl_node;
|
2021-05-01 21:57:52 -07:00
|
|
|
decl.src_line = line;
|
2021-04-28 16:55:22 -07:00
|
|
|
|
2021-04-27 18:36:12 -07:00
|
|
|
decl.is_pub = is_pub;
|
|
|
|
|
decl.is_exported = is_exported;
|
2021-08-28 15:35:59 -07:00
|
|
|
decl.is_usingnamespace = is_usingnamespace;
|
2021-04-28 16:55:22 -07:00
|
|
|
decl.has_align = has_align;
|
|
|
|
|
decl.has_linksection = has_linksection;
|
|
|
|
|
decl.zir_decl_index = @intCast(u32, decl_sub_index);
|
2021-06-19 21:10:22 -04:00
|
|
|
if (decl.getFunction()) |_| {
|
2021-05-06 17:20:45 -07:00
|
|
|
switch (mod.comp.bin_file.tag) {
|
2021-04-26 20:41:07 -07:00
|
|
|
.coff => {
|
|
|
|
|
// TODO Implement for COFF
|
|
|
|
|
},
|
|
|
|
|
.elf => if (decl.fn_link.elf.len != 0) {
|
|
|
|
|
// TODO Look into detecting when this would be unnecessary by storing enough state
|
|
|
|
|
// in `Decl` to notice that the line number did not change.
|
|
|
|
|
mod.comp.work_queue.writeItemAssumeCapacity(.{ .update_line_number = decl });
|
|
|
|
|
},
|
|
|
|
|
.macho => if (decl.fn_link.macho.len != 0) {
|
|
|
|
|
// TODO Look into detecting when this would be unnecessary by storing enough state
|
|
|
|
|
// in `Decl` to notice that the line number did not change.
|
|
|
|
|
mod.comp.work_queue.writeItemAssumeCapacity(.{ .update_line_number = decl });
|
|
|
|
|
},
|
2021-06-01 16:07:08 -04:00
|
|
|
.plan9 => {
|
|
|
|
|
// TODO implement for plan9
|
|
|
|
|
},
|
2021-04-26 20:41:07 -07:00
|
|
|
.c, .wasm, .spirv => {},
|
2021-05-06 17:20:45 -07:00
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
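The naming scheme for unnamed declarations in `scanDecl` can be shown on its own. This sketch assumes the same three counters as `ScanDeclIter` and covers only the `decl_name_index` values 0 and 1. `Counters`, `syntheticName`, and `error.NamedDecl` are hypothetical, and it writes into a caller-provided buffer with `std.fmt.bufPrint` instead of allocating as the original does.

```zig
const std = @import("std");

const Counters = struct {
    usingnamespace_index: usize = 0,
    comptime_index: usize = 0,
    unnamed_test_index: usize = 0,
};

// Hypothetical helper mirroring the `0` and `1` arms of the
// `decl_name_index` switch; named decls are out of scope here.
fn syntheticName(buf: []u8, c: *Counters, decl_name_index: u32, export_bit: bool) ![]u8 {
    switch (decl_name_index) {
        0 => {
            if (export_bit) {
                const i = c.usingnamespace_index;
                c.usingnamespace_index += 1;
                return std.fmt.bufPrint(buf, "usingnamespace_{d}", .{i});
            } else {
                const i = c.comptime_index;
                c.comptime_index += 1;
                return std.fmt.bufPrint(buf, "comptime_{d}", .{i});
            }
        },
        1 => {
            const i = c.unnamed_test_index;
            c.unnamed_test_index += 1;
            return std.fmt.bufPrint(buf, "test_{d}", .{i});
        },
        else => return error.NamedDecl,
    }
}

test "each kind of unnamed decl has its own counter" {
    var buf: [32]u8 = undefined;
    var c = Counters{};
    try std.testing.expectEqualStrings("comptime_0", try syntheticName(&buf, &c, 0, false));
    try std.testing.expectEqualStrings("comptime_1", try syntheticName(&buf, &c, 0, false));
    try std.testing.expectEqualStrings("usingnamespace_0", try syntheticName(&buf, &c, 0, true));
    try std.testing.expectEqualStrings("test_0", try syntheticName(&buf, &c, 1, false));
}
```

Because the export bit has no meaning for a nameless decl, the original reuses it to tell `usingnamespace` apart from `comptime` when `decl_name_index` is 0.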
|
|
|
|
|
|
stage2: fix deletion of Decls that get re-referenced
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.
However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.
The solution is, when dealing with Decl objects that are no longer
referenced during an update, to clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
2021-05-18 12:35:36 -07:00
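The difference between clearing and deleting can be illustrated with a toy model. This is only a sketch of the idea in the commit message above: `StubDecl`, `lookup`, and `clear` are hypothetical, and a plain array stands in for the namespace's decl table.

```zig
const std = @import("std");

// Hypothetical stand-in for a Decl: a name plus analysis state.
const StubDecl = struct {
    name: []const u8,
    analyzed: bool = false,
};

// Linear lookup over a toy namespace.
fn lookup(decls: []StubDecl, name: []const u8) ?*StubDecl {
    for (decls) |*d| {
        if (std.mem.eql(u8, d.name, name)) return d;
    }
    return null;
}

// Reset the decl to its freshly scanned state instead of removing it.
fn clear(d: *StubDecl) void {
    d.analyzed = false;
}

test "a cleared decl is still found by name" {
    var decls = [_]StubDecl{
        .{ .name = "ExportOptions", .analyzed = true },
    };
    clear(&decls[0]);
    const found = lookup(&decls, "ExportOptions");
    try std.testing.expect(found != null);
    try std.testing.expect(!found.?.analyzed);
}
```

Had the entry been removed instead, the later lookup would return `null`, which corresponds to the incorrect "no such member" error the commit describes.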
|
|
|
/// Make it as if the semantic analysis for this Decl never happened.
|
|
|
|
|
pub fn clearDecl(
|
2021-04-07 19:38:00 -07:00
|
|
|
mod: *Module,
|
|
|
|
|
decl: *Decl,
|
|
|
|
|
outdated_decls: ?*std.AutoArrayHashMap(*Decl, void),
|
2021-05-11 22:12:36 -07:00
|
|
|
) Allocator.Error!void {
|
2021-02-11 23:29:55 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
log.debug("clearing {*} ({s})", .{ decl, decl.name });
|
2021-04-07 19:38:00 -07:00
|
|
|
|
2021-05-11 22:12:36 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
try mod.deletion_set.ensureUnusedCapacity(gpa, decl.dependencies.count());
|
|
|
|
|
|
2021-04-07 19:38:00 -07:00
|
|
|
if (outdated_decls) |map| {
|
|
|
|
|
_ = map.swapRemove(decl);
|
2021-05-07 18:52:11 -07:00
|
|
|
try map.ensureUnusedCapacity(decl.dependants.count());
|
2021-04-07 19:38:00 -07:00
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
// Remove itself from its dependencies.
|
2021-06-03 15:39:26 -05:00
|
|
|
for (decl.dependencies.keys()) |dep| {
|
2020-09-13 19:17:58 -07:00
|
|
|
dep.removeDependant(decl);
|
2021-06-03 15:39:26 -05:00
|
|
|
if (dep.dependants.count() == 0 and !dep.deletion_flag) {
|
2021-07-07 00:39:23 -07:00
|
|
|
log.debug("insert {*} ({s}) dependant {*} ({s}) into deletion set", .{
|
|
|
|
|
decl, decl.name, dep, dep.name,
|
|
|
|
|
});
|
2020-09-13 19:17:58 -07:00
|
|
|
// We don't recursively perform a deletion here, because during the update,
|
|
|
|
|
// another reference to it may turn up.
|
|
|
|
|
dep.deletion_flag = true;
|
2021-04-07 19:38:00 -07:00
|
|
|
mod.deletion_set.putAssumeCapacity(dep, {});
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
|
2021-05-18 12:35:36 -07:00
|
|
|
decl.dependencies.clearRetainingCapacity();
|
|
|
|
|
|
2021-04-07 19:38:00 -07:00
|
|
|
// Anything that depends on this deleted decl needs to be re-analyzed.
|
2021-06-03 15:39:26 -05:00
|
|
|
for (decl.dependants.keys()) |dep| {
|
2020-09-13 19:17:58 -07:00
|
|
|
dep.removeDependency(decl);
|
2021-04-07 19:38:00 -07:00
|
|
|
if (outdated_decls) |map| {
|
|
|
|
|
map.putAssumeCapacity(dep, {});
|
|
|
|
|
} else if (std.debug.runtime_safety) {
|
|
|
|
|
// If `outdated_decls` is `null`, it means we're being called from
|
|
|
|
|
// `Compilation` after `performAllTheWork` and we cannot queue up any
|
|
|
|
|
// more work. `dep` must necessarily be another Decl that is no longer
|
|
|
|
|
// being referenced, and will be in the `deletion_set`. Otherwise,
|
|
|
|
|
// something has gone wrong.
|
|
|
|
|
assert(mod.deletion_set.contains(dep));
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
|
2021-05-18 12:35:36 -07:00
|
|
|
decl.dependants.clearRetainingCapacity();
|
|
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
if (mod.failed_decls.fetchSwapRemove(decl)) |kv| {
|
|
|
|
|
kv.value.destroy(gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-04-26 20:41:07 -07:00
|
|
|
if (mod.emit_h) |emit_h| {
|
2021-06-03 15:39:26 -05:00
|
|
|
if (emit_h.failed_decls.fetchSwapRemove(decl)) |kv| {
|
|
|
|
|
kv.value.destroy(gpa);
|
2021-04-26 20:41:07 -07:00
|
|
|
}
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(emit_h.decl_table.swapRemove(decl));
|
2021-01-05 17:33:31 -07:00
|
|
|
}
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
also to mean other things. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.
Now there are more types involved in source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have fewer
than 2^32 bytes (4 GiB).
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
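As a rough illustration of the "very little stored state" idea, a lazy location can be a small tagged union that is only resolved to a byte offset when an error actually needs to be reported. The type `LazyLoc`, its three variants, and the `byteOffset` helper below are hypothetical names for this sketch and are not the real `LazySrcLoc` definition.

```zig
const std = @import("std");

// Hypothetical, cut-down stand-in for a lazily resolved source location.
const LazyLoc = union(enum) {
    /// Nothing to resolve: an absolute byte offset into the file.
    byte_abs: u32,
    /// Absolute token index; the byte offset is looked up on demand.
    token_abs: u32,
    /// Token index relative to the first token of the enclosing declaration.
    token_offset: u32,

    /// Only called on the error path, so the lookup cost is paid
    /// only when a compile error is actually reported.
    fn byteOffset(loc: LazyLoc, token_starts: []const u32, decl_first_token: u32) u32 {
        return switch (loc) {
            .byte_abs => |off| off,
            .token_abs => |tok| token_starts[tok],
            .token_offset => |rel| token_starts[decl_first_token + rel],
        };
    }
};
```

Each value here stores a single 32-bit number next to its tag; everything else (the token start table, the declaration's first token) is supplied by the caller at resolution time.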
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
|
|
|
_ = mod.compile_log_decls.swapRemove(decl);
|
|
|
|
|
mod.deleteDeclExports(decl);
|
2021-05-11 22:12:36 -07:00
|
|
|
|
|
|
|
|
if (decl.has_tv) {
|
2021-05-17 13:44:20 -07:00
|
|
|
if (decl.ty.hasCodeGenBits()) {
|
|
|
|
|
mod.comp.bin_file.freeDecl(decl);
|
stage2: fix deletion of Decls that get re-referenced
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including from the containing Namespace.
However, an update does not always delete the containing Namespace. For
example, if `std.builtin.ExportOptions` is no longer referenced but
`std.builtin` is still referenced, and then `ExportOptions` gets
referenced again, the Namespace would be missing the Decl, so we would
get an incorrect "no such member" error.
The solution is, when dealing with Decl objects that are no longer
referenced during an update, to clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
2021-05-18 12:35:36 -07:00
|
|
|
|
|
|
|
|
// TODO instead of a union, put this memory trailing Decl objects,
|
|
|
|
|
// and allow it to be variably sized.
|
|
|
|
|
decl.link = switch (mod.comp.bin_file.tag) {
|
|
|
|
|
.coff => .{ .coff = link.File.Coff.TextBlock.empty },
|
|
|
|
|
.elf => .{ .elf = link.File.Elf.TextBlock.empty },
|
|
|
|
|
.macho => .{ .macho = link.File.MachO.TextBlock.empty },
|
2021-06-01 22:48:20 -04:00
|
|
|
.plan9 => .{ .plan9 = link.File.Plan9.DeclBlock.empty },
|
2021-05-18 12:35:36 -07:00
|
|
|
.c => .{ .c = link.File.C.DeclBlock.empty },
|
|
|
|
|
.wasm => .{ .wasm = link.File.Wasm.DeclBlock.empty },
|
|
|
|
|
.spirv => .{ .spirv = {} },
|
|
|
|
|
};
|
|
|
|
|
decl.fn_link = switch (mod.comp.bin_file.tag) {
|
|
|
|
|
.coff => .{ .coff = {} },
|
|
|
|
|
.elf => .{ .elf = link.File.Elf.SrcFn.empty },
|
|
|
|
|
.macho => .{ .macho = link.File.MachO.SrcFn.empty },
|
2021-06-01 22:48:20 -04:00
|
|
|
.plan9 => .{ .plan9 = {} },
|
2021-05-18 12:35:36 -07:00
|
|
|
.c => .{ .c = link.File.C.FnBlock.empty },
|
|
|
|
|
.wasm => .{ .wasm = link.File.Wasm.FnData.empty },
|
|
|
|
|
.spirv => .{ .spirv = .{} },
|
|
|
|
|
};
|
2021-05-17 13:44:20 -07:00
|
|
|
}
|
2021-05-11 22:12:36 -07:00
|
|
|
if (decl.getInnerNamespace()) |namespace| {
|
|
|
|
|
try namespace.deleteAllDecls(mod, outdated_decls);
|
|
|
|
|
}
|
|
|
|
|
decl.clearValues(gpa);
|
2021-05-07 18:52:11 -07:00
|
|
|
}
|
2021-01-05 17:33:31 -07:00
|
|
|
|
2021-05-18 12:35:36 -07:00
|
|
|
if (decl.deletion_flag) {
|
|
|
|
|
decl.deletion_flag = false;
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(mod.deletion_set.swapRemove(decl));
|
2021-05-18 12:35:36 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
decl.analysis = .unreferenced;
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must set the corresponding Decl's `alive`
field to true.
This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
|
|
|
pub fn deleteUnusedDecl(mod: *Module, decl: *Decl) void {
    log.debug("deleteUnusedDecl {*} ({s})", .{ decl, decl.name });

    // TODO: remove `allocateDeclIndexes` and change the API so that the linker backends
    // are required to notice the first time `updateDecl` happens and keep track
    // of it themselves. However they can rely on getting a `freeDecl` call if any
    // `updateDecl` or `updateFunc` calls happen. This will allow us to avoid any call
    // into the linker backend here, since the linker backend will never have been told
    // about the Decl in the first place.
    // Until then, we did call `allocateDeclIndexes` on this anonymous Decl and so we
    // must call `freeDecl` in the linker backend now.
    if (decl.has_tv) {
        if (decl.ty.hasCodeGenBits()) {
            mod.comp.bin_file.freeDecl(decl);
        }
    }

    const dependants = decl.dependants.keys();
    assert(dependants[0].namespace.anon_decls.swapRemove(decl));

    for (dependants) |dep| {
        dep.removeDependency(decl);
    }

    for (decl.dependencies.keys()) |dep| {
        dep.removeDependant(decl);
    }
    decl.destroy(mod);
}

pub fn deleteAnonDecl(mod: *Module, scope: *Scope, decl: *Decl) void {
    log.debug("deleteAnonDecl {*} ({s})", .{ decl, decl.name });
    const scope_decl = scope.ownerDecl().?;
    assert(scope_decl.namespace.anon_decls.swapRemove(decl));
    decl.destroy(mod);
}
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
/// Delete all the Export objects that are caused by this Decl. Re-analysis of
|
|
|
|
|
/// this Decl will cause them to be re-created (or not).
|
2021-03-15 23:38:38 -07:00
|
|
|
fn deleteDeclExports(mod: *Module, decl: *Decl) void {
|
2021-06-03 15:39:26 -05:00
|
|
|
const kv = mod.export_owners.fetchSwapRemove(decl) orelse return;
|
2020-09-13 19:17:58 -07:00
|
|
|
|
|
|
|
|
for (kv.value) |exp| {
|
2021-06-03 15:39:26 -05:00
|
|
|
if (mod.decl_exports.getPtr(exp.exported_decl)) |value_ptr| {
|
2020-09-13 19:17:58 -07:00
|
|
|
// Remove exports with owner_decl matching the regenerating decl.
|
2021-06-03 15:39:26 -05:00
|
|
|
const list = value_ptr.*;
|
2020-09-13 19:17:58 -07:00
|
|
|
var i: usize = 0;
|
|
|
|
|
var new_len = list.len;
|
|
|
|
|
while (i < new_len) {
|
|
|
|
|
if (list[i].owner_decl == decl) {
|
|
|
|
|
mem.copy(*Export, list[i..], list[i + 1 .. new_len]);
|
|
|
|
|
new_len -= 1;
|
|
|
|
|
} else {
|
|
|
|
|
i += 1;
|
|
|
|
|
}
|
|
|
|
|
}
|
2021-06-03 15:39:26 -05:00
|
|
|
value_ptr.* = mod.gpa.shrink(list, new_len);
|
2020-09-13 19:17:58 -07:00
|
|
|
if (new_len == 0) {
|
2021-06-03 15:39:26 -05:00
|
|
|
assert(mod.decl_exports.swapRemove(exp.exported_decl));
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
if (mod.comp.bin_file.cast(link.File.Elf)) |elf| {
|
2020-10-07 20:32:02 +02:00
|
|
|
elf.deleteExport(exp.link.elf);
|
|
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
if (mod.comp.bin_file.cast(link.File.MachO)) |macho| {
|
2020-10-07 20:32:02 +02:00
|
|
|
macho.deleteExport(exp.link.macho);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-06-03 15:39:26 -05:00
|
|
|
if (mod.failed_exports.fetchSwapRemove(exp)) |failed_kv| {
|
|
|
|
|
failed_kv.value.destroy(mod.gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
mod.gpa.free(exp.options.name);
|
|
|
|
|
mod.gpa.destroy(exp);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
mod.gpa.free(kv.value);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
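The removal loop in `deleteDeclExports` is an instance of in-place filtering: scan with an index, and whenever an element must go, shift the tail left by one and shrink the logical length instead of advancing. A minimal standalone sketch of that pattern follows; the helper name `removeAll` and the use of `u32` elements are assumptions for illustration. The shift is written as an explicit front-to-back loop because the destination starts before the source, so each element has to be read before it is overwritten.

```zig
const std = @import("std");

/// Removes every element equal to `unwanted` from `list` in place and
/// returns the new logical length. Hypothetical helper for illustration.
fn removeAll(list: []u32, unwanted: u32) usize {
    var i: usize = 0;
    var new_len: usize = list.len;
    while (i < new_len) {
        if (list[i] == unwanted) {
            // Shift the tail left by one, copying front to back.
            var j = i;
            while (j + 1 < new_len) : (j += 1) {
                list[j] = list[j + 1];
            }
            new_len -= 1;
            // Do not advance `i`: the element now at `i` is unchecked.
        } else {
            i += 1;
        }
    }
    return new_len;
}
```

The caller is expected to treat only `list[0..new_len]` as live afterwards, which corresponds to the shrink that follows the loop in the function above.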
|
|
|
|
|
|
2021-07-14 12:16:48 -07:00
|
|
|
pub fn analyzeFnBody(mod: *Module, decl: *Decl, func: *Fn) SemaError!Air {
|
2020-09-13 19:17:58 -07:00
|
|
|
const tracy = trace(@src());
|
|
|
|
|
defer tracy.end();
|
|
|
|
|
|
2021-07-11 16:32:11 -07:00
|
|
|
const gpa = mod.gpa;
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
// Use the Decl's arena for function memory.
|
2021-07-11 16:32:11 -07:00
|
|
|
var arena = decl.value_arena.?.promote(gpa);
|
2021-04-27 18:36:12 -07:00
|
|
|
defer decl.value_arena.?.* = arena.state;
|
2021-03-15 23:38:38 -07:00
|
|
|
|
2021-04-27 18:36:12 -07:00
|
|
|
const fn_ty = decl.ty;
|
2021-03-15 23:38:38 -07:00
|
|
|
|
2021-04-30 23:11:20 -07:00
|
|
|
var sema: Sema = .{
|
|
|
|
|
.mod = mod,
|
2021-07-11 16:32:11 -07:00
|
|
|
.gpa = gpa,
|
2021-04-30 23:11:20 -07:00
|
|
|
.arena = &arena.allocator,
|
2021-07-12 15:30:30 -07:00
|
|
|
.code = decl.namespace.file_scope.zir,
|
2021-04-30 23:11:20 -07:00
|
|
|
.owner_decl = decl,
|
|
|
|
|
.namespace = decl.namespace,
|
|
|
|
|
.func = func,
|
2021-08-06 16:24:39 -07:00
|
|
|
.fn_ret_ty = func.owner_decl.ty.fnReturnType(),
|
2021-04-30 23:11:20 -07:00
|
|
|
.owner_func = func,
|
|
|
|
|
};
|
2021-05-17 17:39:52 -07:00
|
|
|
defer sema.deinit();
|
2021-04-30 23:11:20 -07:00
|
|
|
|
2021-07-11 16:32:11 -07:00
|
|
|
// First few indexes of extra are reserved and set at the end.
|
|
|
|
|
const reserved_count = @typeInfo(Air.ExtraIndex).Enum.fields.len;
|
|
|
|
|
try sema.air_extra.ensureTotalCapacity(gpa, reserved_count);
|
|
|
|
|
sema.air_extra.items.len += reserved_count;
|
|
|
|
|
|
2021-04-30 23:11:20 -07:00
|
|
|
var inner_block: Scope.Block = .{
|
|
|
|
|
.parent = null,
|
|
|
|
|
.sema = &sema,
|
|
|
|
|
.src_decl = decl,
|
|
|
|
|
.instructions = .{},
|
|
|
|
|
.inlining = null,
|
|
|
|
|
.is_comptime = false,
|
|
|
|
|
};
|
2021-07-11 16:32:11 -07:00
|
|
|
defer inner_block.instructions.deinit(gpa);
|
2021-04-30 23:11:20 -07:00
|
|
|
|
2021-08-03 17:29:59 -07:00
|
|
|
const fn_info = sema.code.getFnInfo(func.zir_body_inst);
|
|
|
|
|
const zir_tags = sema.code.instructions.items(.tag);
|
|
|
|
|
|
|
|
|
|
// Here we are performing "runtime semantic analysis" for a function body, which means
|
|
|
|
|
// we must map the parameter ZIR instructions to `arg` AIR instructions.
|
|
|
|
|
// AIR requires the `arg` parameters to be the first N instructions.
|
2021-08-03 22:34:22 -07:00
|
|
|
// This could be a generic function instantiation, however, in which case we need to
|
|
|
|
|
// map the comptime parameters to constant values and only emit arg AIR instructions
|
|
|
|
|
// for the runtime ones.
|
|
|
|
|
const runtime_params_len = @intCast(u32, fn_ty.fnParamLen());
|
|
|
|
|
try inner_block.instructions.ensureTotalCapacity(gpa, runtime_params_len);
|
|
|
|
|
try sema.air_instructions.ensureUnusedCapacity(gpa, fn_info.total_params_len * 2); // * 2 for the `addType`
|
|
|
|
|
try sema.inst_map.ensureUnusedCapacity(gpa, fn_info.total_params_len);
|
2021-08-03 17:29:59 -07:00
|
|
|
|
stage2 generics improvements: anytype and param type exprs
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.
param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.
This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).
It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.
Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema
`comptime_args_fn_inst` to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.
Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.
Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.
Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).
Tasks left in this branch:
* Implement the memoization table.
* Add test coverage.
* Improve error reporting and source locations for compile errors.
2021-08-04 21:11:31 -07:00
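The commit message above describes splitting the parameters of a generic function into comptime-known values and runtime arguments, which is why the loop that follows keeps two counters. A rough sketch of that bookkeeping, under simplifying assumptions: `Mapped` and `mapParams` are hypothetical names, comptime values are plain `i64`s, and `null` stands in for the "generic poison" marker that flags a runtime parameter.

```zig
const std = @import("std");

// Hypothetical result of mapping one parameter.
const Mapped = union(enum) {
    /// Parameter is comptime-known: it becomes a constant.
    constant: i64,
    /// Parameter is runtime: it becomes the Nth `arg` instruction.
    arg: u32,
};

/// Maps every parameter and returns how many runtime `arg`s were emitted.
/// `out` must be at least as long as `comptime_args`.
fn mapParams(comptime_args: []const ?i64, out: []Mapped) u32 {
    var total_index: usize = 0; // counts every parameter
    var runtime_index: u32 = 0; // counts only runtime parameters
    while (total_index < comptime_args.len) : (total_index += 1) {
        if (comptime_args[total_index]) |val| {
            // A comptime value exists, so no `arg` is emitted.
            out[total_index] = .{ .constant = val };
        } else {
            out[total_index] = .{ .arg = runtime_index };
            runtime_index += 1;
        }
    }
    return runtime_index;
}

pub fn main() void {
    // Models `fn f(comptime a, b, comptime c, d)` instantiated with a = 10, c = 30.
    const args = [_]?i64{ 10, null, 30, null };
    var out: [4]Mapped = undefined;
    const runtime_count = mapParams(&args, &out);
    std.debug.assert(runtime_count == 2);
    std.debug.assert(out[0].constant == 10);
    std.debug.assert(out[1].arg == 0);
    std.debug.assert(out[3].arg == 1);
}
```

Because the runtime counter advances only in the `else` branch, the emitted `arg` entries stay densely numbered regardless of how many comptime parameters sit between them.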
|
|
|
var runtime_param_index: usize = 0;
|
|
|
|
|
var total_param_index: usize = 0;
|
2021-08-03 17:29:59 -07:00
|
|
|
for (fn_info.param_body) |inst| {
|
|
|
|
|
const name = switch (zir_tags[inst]) {
|
|
|
|
|
.param, .param_comptime => blk: {
|
|
|
|
|
const inst_data = sema.code.instructions.items(.data)[inst].pl_tok;
|
|
|
|
|
const extra = sema.code.extraData(Zir.Inst.Param, inst_data.payload_index).data;
|
|
|
|
|
break :blk extra.name;
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
.param_anytype, .param_anytype_comptime => blk: {
|
|
|
|
|
const str_tok = sema.code.instructions.items(.data)[inst].str_tok;
|
|
|
|
|
break :blk str_tok.start;
|
|
|
|
|
},
|
|
|
|
|
|
|
|
|
|
else => continue,
|
|
|
|
|
};
|
2021-08-03 22:34:22 -07:00
|
|
|
if (func.comptime_args) |comptime_args| {
|
2021-08-04 21:11:31 -07:00
|
|
|
const arg_tv = comptime_args[total_param_index];
|
2021-08-06 16:24:39 -07:00
|
|
|
if (arg_tv.val.tag() != .generic_poison) {
|
2021-08-03 22:34:22 -07:00
|
|
|
// We have a comptime value for this parameter.
|
|
|
|
|
const arg = try sema.addConstant(arg_tv.ty, arg_tv.val);
|
|
|
|
|
sema.inst_map.putAssumeCapacityNoClobber(inst, arg);
|
2021-08-04 21:11:31 -07:00
|
|
|
total_param_index += 1;
|
2021-08-03 22:34:22 -07:00
|
|
|
continue;
|
|
|
|
|
}
|
|
|
|
|
}
|
2021-08-04 21:11:31 -07:00
|
|
|
const param_type = fn_ty.fnParamType(runtime_param_index);
|
2021-07-12 15:30:30 -07:00
|
|
|
const ty_ref = try sema.addType(param_type);
|
2021-07-13 15:45:08 -07:00
|
|
|
const arg_index = @intCast(u32, sema.air_instructions.len);
|
|
|
|
|
inner_block.instructions.appendAssumeCapacity(arg_index);
|
2021-08-03 17:29:59 -07:00
|
|
|
sema.air_instructions.appendAssumeCapacity(.{
|
2021-07-12 15:30:30 -07:00
|
|
|
.tag = .arg,
|
2021-08-03 17:29:59 -07:00
|
|
|
.data = .{ .ty_str = .{
|
|
|
|
|
.ty = ty_ref,
|
|
|
|
|
.str = name,
|
|
|
|
|
} },
|
2021-07-12 15:30:30 -07:00
|
|
|
});
|
2021-08-03 17:29:59 -07:00
|
|
|
sema.inst_map.putAssumeCapacityNoClobber(inst, Air.indexToRef(arg_index));
|
2021-08-04 21:11:31 -07:00
|
|
|
total_param_index += 1;
|
|
|
|
|
runtime_param_index += 1;
|
2021-07-12 15:30:30 -07:00
|
|
|
}
|
2021-04-30 23:11:20 -07:00
|
|
|
|
|
|
|
|
func.state = .in_progress;
|
|
|
|
|
log.debug("set {s} to in_progress", .{decl.name});
|
|
|
|
|
|
2021-08-03 17:29:59 -07:00
|
|
|
_ = sema.analyzeBody(&inner_block, fn_info.body) catch |err| switch (err) {
|
2021-09-14 21:58:22 -07:00
|
|
|
// TODO make these unreachable instead of @panic
|
2021-08-05 23:20:53 -07:00
|
|
|
error.NeededSourceLocation => @panic("zig compiler bug: NeededSourceLocation"),
|
|
|
|
|
error.GenericPoison => @panic("zig compiler bug: GenericPoison"),
|
2021-09-14 21:58:22 -07:00
|
|
|
error.ComptimeReturn => @panic("zig compiler bug: ComptimeReturn"),
|
2021-08-03 17:29:59 -07:00
|
|
|
else => |e| return e,
|
|
|
|
|
};
|
2021-04-30 23:11:20 -07:00
|
|
|
|
2021-07-11 16:32:11 -07:00
|
|
|
// Copy the block into place and mark that as the main block.
|
2021-07-15 15:52:06 -07:00
|
|
|
try sema.air_extra.ensureUnusedCapacity(gpa, @typeInfo(Air.Block).Struct.fields.len +
|
|
|
|
|
inner_block.instructions.items.len);
|
2021-07-12 15:30:30 -07:00
|
|
|
const main_block_index = sema.addExtraAssumeCapacity(Air.Block{
|
|
|
|
|
.body_len = @intCast(u32, inner_block.instructions.items.len),
|
|
|
|
|
});
|
|
|
|
|
sema.air_extra.appendSliceAssumeCapacity(inner_block.instructions.items);
|
|
|
|
|
sema.air_extra.items[@enumToInt(Air.ExtraIndex.main_block)] = main_block_index;
|
2021-07-11 16:32:11 -07:00
|
|
|
|
2021-04-30 23:11:20 -07:00
|
|
|
func.state = .success;
|
|
|
|
|
log.debug("set {s} to success", .{decl.name});
|
2021-07-11 16:32:11 -07:00
|
|
|
|
|
|
|
|
return Air{
|
|
|
|
|
.instructions = sema.air_instructions.toOwnedSlice(),
|
2021-07-12 15:30:30 -07:00
|
|
|
.extra = sema.air_extra.toOwnedSlice(gpa),
|
|
|
|
|
.values = sema.air_values.toOwnedSlice(gpa),
|
2021-07-11 16:32:11 -07:00
|
|
|
};
|
2020-09-13 19:17:58 -07:00
|
|
|
}
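The function above reserves the first few slots of `air_extra` up front and only writes the main block's index into its reserved slot after the body has been analyzed. That "reserve, append, back-patch" pattern can be sketched in isolation as follows. This is an illustration under assumptions: `buildExtra`, `main_block_slot`, and the fixed-size buffer are hypothetical, and a single reserved slot stands in for however many fields `Air.ExtraIndex` has.

```zig
const std = @import("std");

// Hypothetical layout: slot 0 of `extra` is reserved for the main block's index.
const main_block_slot: u32 = 0;
const reserved_count: u32 = 1;

/// Fills `extra` and returns the number of slots used.
/// The caller must provide room for reserved_count + 1 + body.len items.
fn buildExtra(extra: []u32, body: []const u32) u32 {
    // Reserve the leading slots; they are written once the body is known.
    var len: u32 = reserved_count;

    // Append the block payload: a body_len slot followed by the body itself.
    const main_block_index = len;
    len += 1;
    var body_len: u32 = 0;
    for (body) |inst| {
        extra[len] = inst;
        len += 1;
        body_len += 1;
    }
    extra[main_block_index] = body_len;

    // Back-patch the reserved slot so readers can find the main block.
    extra[main_block_slot] = main_block_index;
    return len;
}

pub fn main() void {
    var extra: [8]u32 = undefined;
    const body = [_]u32{ 5, 6, 7 };
    const len = buildExtra(&extra, &body);
    std.debug.assert(len == 5);
    std.debug.assert(extra[0] == 1); // main block starts right after the reserved slot
    std.debug.assert(extra[1] == 3); // body_len
    std.debug.assert(extra[4] == 7); // last body instruction
}
```

Reserving the slot first keeps its position fixed, so a consumer can always read index 0 to locate the main block no matter how much payload was appended during analysis.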
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
fn markOutdatedDecl(mod: *Module, decl: *Decl) !void {
|
2021-05-06 17:20:45 -07:00
|
|
|
log.debug("mark outdated {*} ({s})", .{ decl, decl.name });
|
2021-03-15 23:38:38 -07:00
|
|
|
try mod.comp.work_queue.writeItem(.{ .analyze_decl = decl });
|
2021-06-03 15:39:26 -05:00
|
|
|
if (mod.failed_decls.fetchSwapRemove(decl)) |kv| {
|
|
|
|
|
kv.value.destroy(mod.gpa);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-04-26 20:41:07 -07:00
|
|
|
if (mod.emit_h) |emit_h| {
|
2021-06-03 15:39:26 -05:00
|
|
|
if (emit_h.failed_decls.fetchSwapRemove(decl)) |kv| {
|
|
|
|
|
kv.value.destroy(mod.gpa);
|
2021-04-26 20:41:07 -07:00
|
|
|
}
|
2021-01-05 17:33:31 -07:00
|
|
|
}
|
2021-03-15 23:38:38 -07:00
|
|
|
_ = mod.compile_log_decls.swapRemove(decl);
|
2020-09-13 19:17:58 -07:00
|
|
|
decl.analysis = .outdated;
|
|
|
|
|
}
|
|
|
|
|
|
2021-08-30 19:22:04 -07:00
|
|
|
pub fn allocateNewDecl(mod: *Module, namespace: *Scope.Namespace, src_node: Ast.Node.Index) !*Decl {
|
2021-01-05 17:33:31 -07:00
|
|
|
// If we have emit-h then we must allocate a bigger structure to store the emit-h state.
|
|
|
|
|
const new_decl: *Decl = if (mod.emit_h != null) blk: {
|
|
|
|
|
const parent_struct = try mod.gpa.create(DeclPlusEmitH);
|
|
|
|
|
parent_struct.* = .{
|
|
|
|
|
.emit_h = .{},
|
|
|
|
|
.decl = undefined,
|
|
|
|
|
};
|
|
|
|
|
break :blk &parent_struct.decl;
|
|
|
|
|
} else try mod.gpa.create(Decl);
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
new_decl.* = .{
|
|
|
|
|
.name = "",
|
stage2: entry point via std lib and proper updated file detection
Instead of setting up the root_scope with the root source file, Module
now relies on the package table graph being set up properly, and inside
`update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope`: the root source file is no
longer special; it's just in the package table, mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy in how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that, as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
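The updated-file check described in the commit message above can be sketched roughly as follows. This is a minimal illustration only: `FileStat` and `statChanged` are hypothetical names, the field types are assumptions, and the real `Scope.File` also keeps a source hash that is left out here.

```zig
const std = @import("std");

/// Hypothetical subset of the per-file metadata that the commit message says
/// is kept in `Scope.File`; the field names and types are assumptions.
const FileStat = struct {
    mtime: i128,
    inode: u64,
    size: u64,
};

/// Returns true when any of the cheap stat fields differs, meaning the file
/// would need to be re-read during an update.
fn statChanged(cached: FileStat, fresh: FileStat) bool {
    return cached.mtime != fresh.mtime or
        cached.inode != fresh.inode or
        cached.size != fresh.size;
}

test "a newer mtime marks the file as updated" {
    const cached = FileStat{ .mtime = 100, .inode = 7, .size = 42 };
    var fresh = cached;
    try std.testing.expect(!statChanged(cached, fresh));
    fresh.mtime = 200;
    try std.testing.expect(statChanged(cached, fresh));
}
```

Comparing stat fields first keeps the common "nothing changed" case cheap; hashing the source would only be needed for files that fail this check.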
|
|
|
.namespace = namespace,
|
2021-04-08 20:37:19 -07:00
|
|
|
.src_node = src_node,
|
2021-05-01 21:57:52 -07:00
|
|
|
.src_line = undefined,
|
2021-04-27 18:36:12 -07:00
|
|
|
.has_tv = false,
|
2021-05-11 14:17:52 -07:00
|
|
|
.owns_tv = false,
|
2021-04-27 18:36:12 -07:00
|
|
|
.ty = undefined,
|
|
|
|
|
.val = undefined,
|
|
|
|
|
.align_val = undefined,
|
|
|
|
|
.linksection_val = undefined,
|
2020-09-13 19:17:58 -07:00
|
|
|
.analysis = .unreferenced,
|
|
|
|
|
.deletion_flag = false,
|
2021-05-05 13:16:14 -07:00
|
|
|
.zir_decl_index = 0,
|
2021-01-05 17:33:31 -07:00
|
|
|
.link = switch (mod.comp.bin_file.tag) {
|
2020-09-13 19:17:58 -07:00
|
|
|
.coff => .{ .coff = link.File.Coff.TextBlock.empty },
|
|
|
|
|
.elf => .{ .elf = link.File.Elf.TextBlock.empty },
|
|
|
|
|
.macho => .{ .macho = link.File.MachO.TextBlock.empty },
|
2021-06-01 22:48:20 -04:00
|
|
|
.plan9 => .{ .plan9 = link.File.Plan9.DeclBlock.empty },
|
2021-01-05 11:08:34 -07:00
|
|
|
.c => .{ .c = link.File.C.DeclBlock.empty },
|
2021-04-08 22:44:29 +02:00
|
|
|
.wasm => .{ .wasm = link.File.Wasm.DeclBlock.empty },
|
2021-01-19 00:34:44 +01:00
|
|
|
.spirv => .{ .spirv = {} },
|
2020-09-13 19:17:58 -07:00
|
|
|
},
|
2021-01-05 17:33:31 -07:00
|
|
|
.fn_link = switch (mod.comp.bin_file.tag) {
|
2020-09-13 19:17:58 -07:00
|
|
|
.coff => .{ .coff = {} },
|
|
|
|
|
.elf => .{ .elf = link.File.Elf.SrcFn.empty },
|
|
|
|
|
.macho => .{ .macho = link.File.MachO.SrcFn.empty },
|
2021-06-01 22:48:20 -04:00
|
|
|
.plan9 => .{ .plan9 = {} },
|
2021-01-05 11:08:34 -07:00
|
|
|
.c => .{ .c = link.File.C.FnBlock.empty },
|
2021-04-08 22:44:29 +02:00
|
|
|
.wasm => .{ .wasm = link.File.Wasm.FnData.empty },
|
2021-01-19 00:34:44 +01:00
|
|
|
.spirv => .{ .spirv = .{} },
|
2020-09-13 19:17:58 -07:00
|
|
|
},
|
|
|
|
|
.generation = 0,
|
|
|
|
|
.is_pub = false,
|
2021-04-27 18:36:12 -07:00
|
|
|
.is_exported = false,
|
2021-04-28 16:55:22 -07:00
|
|
|
.has_linksection = false,
|
|
|
|
|
.has_align = false,
|
stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must set the corresponding Decl's `alive`
field to true.
This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
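The mark-then-collect scheme from the commit message above can be sketched as follows. This is a simplified illustration, not the compiler's code: `Decl` here is a stripped-down stand-in, and `markDeclAlive` and `countEmitted` are hypothetical helpers.

```zig
const std = @import("std");

/// Hypothetical, stripped-down stand-in for `Module.Decl`.
const Decl = struct {
    name: []const u8,
    /// Starts out false, as in `allocateNewDecl`.
    alive: bool = false,
};

/// Hypothetical backend hook: called whenever codegen sees a `decl_ref`.
fn markDeclAlive(decl: *Decl) void {
    decl.alive = true;
}

/// Hypothetical frontend step: counts the Decls that would be sent to the
/// linker. Anything not marked alive would be deleted instead.
fn countEmitted(decls: []const Decl) usize {
    var n: usize = 0;
    for (decls) |d| {
        if (d.alive) n += 1;
    }
    return n;
}

test "unreferenced anon decls are not emitted" {
    var decls = [_]Decl{
        .{ .name = "main" },
        .{ .name = "main__anon_0" },
        .{ .name = "main__anon_1" },
    };
    markDeclAlive(&decls[0]);
    markDeclAlive(&decls[2]);
    try std.testing.expectEqual(@as(usize, 2), countEmitted(&decls));
}
```

Because the flag defaults to false, a Decl that no backend ever references is dropped without any extra bookkeeping in the frontend.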
|
|
|
.alive = false,
|
2021-08-28 15:35:59 -07:00
|
|
|
.is_usingnamespace = false,
|
2020-09-13 19:17:58 -07:00
|
|
|
};
|
|
|
|
|
return new_decl;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Get error value for error tag `name`.
|
2021-06-03 15:39:26 -05:00
|
|
|
pub fn getErrorValue(mod: *Module, name: []const u8) !std.StringHashMapUnmanaged(ErrorInt).KV {
|
2021-03-15 23:38:38 -07:00
|
|
|
const gop = try mod.global_error_set.getOrPut(mod.gpa, name);
|
2021-06-03 15:39:26 -05:00
|
|
|
if (gop.found_existing) {
|
|
|
|
|
return std.StringHashMapUnmanaged(ErrorInt).KV{
|
|
|
|
|
.key = gop.key_ptr.*,
|
|
|
|
|
.value = gop.value_ptr.*,
|
|
|
|
|
};
|
|
|
|
|
}
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-06-03 15:39:26 -05:00
|
|
|
errdefer assert(mod.global_error_set.remove(name));
|
2021-03-26 17:54:41 -04:00
|
|
|
try mod.error_name_list.ensureCapacity(mod.gpa, mod.error_name_list.items.len + 1);
|
2021-06-03 15:39:26 -05:00
|
|
|
gop.key_ptr.* = try mod.gpa.dupe(u8, name);
|
|
|
|
|
gop.value_ptr.* = @intCast(ErrorInt, mod.error_name_list.items.len);
|
|
|
|
|
mod.error_name_list.appendAssumeCapacity(gop.key_ptr.*);
|
|
|
|
|
return std.StringHashMapUnmanaged(ErrorInt).KV{
|
|
|
|
|
.key = gop.key_ptr.*,
|
|
|
|
|
.value = gop.value_ptr.*,
|
|
|
|
|
};
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2020-12-28 17:15:29 -07:00
|
|
|
pub fn analyzeExport(
|
2021-01-16 22:51:01 -07:00
|
|
|
mod: *Module,
|
2020-12-28 17:15:29 -07:00
|
|
|
scope: *Scope,
|
2021-03-15 23:38:38 -07:00
|
|
|
src: LazySrcLoc,
|
2020-12-28 17:15:29 -07:00
|
|
|
borrowed_symbol_name: []const u8,
|
|
|
|
|
exported_decl: *Decl,
|
|
|
|
|
) !void {
|
2021-01-16 22:51:01 -07:00
|
|
|
try mod.ensureDeclAnalyzed(exported_decl);
|
2021-04-27 18:36:12 -07:00
|
|
|
switch (exported_decl.ty.zigTypeTag()) {
|
2020-09-13 19:17:58 -07:00
|
|
|
.Fn => {},
|
2021-04-27 18:36:12 -07:00
|
|
|
else => return mod.fail(scope, src, "unable to export type '{}'", .{exported_decl.ty}),
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
stage2: improvements towards `zig test`
* Add AIR instruction: struct_field_val
- This is part of an effort to eliminate the AIR instruction `ref`.
- It's implemented for C backend and LLVM backend so far.
* Rename `resolvePossiblyUndefinedValue` to `resolveMaybeUndefVal` just
to save some columns on long lines.
* Sema: add `fieldVal` alongside `fieldPtr` (renamed from
`namedFieldPtr`). This is part of an effort to eliminate the AIR
instruction `ref`. The idea is to avoid unnecessary loads, stores,
stack usage, and IR instructions, by paying a DRY cost.
LLVM backend improvements:
* internal linkage vs exported linkage is implemented, along with
aliases. There is an issue with incremental updates due to missing
LLVM API for deleting aliases; see the relevant comment in this commit.
- `updateDeclExports` is hooked up to the LLVM backend now.
* Fix usage of `Type.tag() == .noreturn` rather than calling `isNoReturn()`.
* Properly mark global variables as mutable/constant.
* Fix llvm type generation of function pointers
* Fix codegen for calls of function pointers
* Implement llvm type generation of error unions and error sets.
* Implement AIR instructions: addwrap, subwrap, mul, mulwrap, div,
bit_and, bool_and, bit_or, bool_or, xor, struct_field_ptr,
struct_field_val, unwrap_errunion_err, add for floats, sub for
floats.
After this commit, `zig test` on a file with `test "example" {}`
correctly generates and executes a test binary. However the
`test_functions` slice is undefined and just happens to be going into
the .bss section, causing the length to be 0. The next step towards
`zig test` will be replacing the `test_functions` Decl Value with the
set of test function pointers, before it is sent to linker/codegen.
2021-07-26 19:12:34 -07:00
|
|
|
try mod.decl_exports.ensureUnusedCapacity(mod.gpa, 1);
|
|
|
|
|
try mod.export_owners.ensureUnusedCapacity(mod.gpa, 1);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-01-16 22:51:01 -07:00
|
|
|
const new_export = try mod.gpa.create(Export);
|
|
|
|
|
errdefer mod.gpa.destroy(new_export);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-01-16 22:51:01 -07:00
|
|
|
const symbol_name = try mod.gpa.dupe(u8, borrowed_symbol_name);
|
|
|
|
|
errdefer mod.gpa.free(symbol_name);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-01-16 22:51:01 -07:00
|
|
|
const owner_decl = scope.ownerDecl().?;
|
2020-09-13 19:17:58 -07:00
|
|
|
|
2021-05-03 20:05:29 -07:00
|
|
|
log.debug("exporting Decl '{s}' as symbol '{s}' from Decl '{s}'", .{
|
|
|
|
|
exported_decl.name, borrowed_symbol_name, owner_decl.name,
|
|
|
|
|
});
|
|
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
new_export.* = .{
|
|
|
|
|
.options = .{ .name = symbol_name },
|
|
|
|
|
.src = src,
|
2021-01-16 22:51:01 -07:00
|
|
|
.link = switch (mod.comp.bin_file.tag) {
|
2020-10-07 20:32:02 +02:00
|
|
|
.coff => .{ .coff = {} },
|
|
|
|
|
.elf => .{ .elf = link.File.Elf.Export{} },
|
|
|
|
|
.macho => .{ .macho = link.File.MachO.Export{} },
|
2021-06-01 22:48:20 -04:00
|
|
|
.plan9 => .{ .plan9 = null },
|
2020-10-07 20:32:02 +02:00
|
|
|
.c => .{ .c = {} },
|
|
|
|
|
.wasm => .{ .wasm = {} },
|
2021-01-19 00:34:44 +01:00
|
|
|
.spirv => .{ .spirv = {} },
|
2020-10-07 20:32:02 +02:00
|
|
|
},
|
2020-09-13 19:17:58 -07:00
|
|
|
.owner_decl = owner_decl,
|
|
|
|
|
.exported_decl = exported_decl,
|
|
|
|
|
.status = .in_progress,
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
// Add to export_owners table.
|
2021-01-16 22:51:01 -07:00
|
|
|
const eo_gop = mod.export_owners.getOrPutAssumeCapacity(owner_decl);
|
2020-09-13 19:17:58 -07:00
|
|
|
if (!eo_gop.found_existing) {
|
2021-06-03 15:39:26 -05:00
|
|
|
eo_gop.value_ptr.* = &[0]*Export{};
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-06-03 15:39:26 -05:00
|
|
|
eo_gop.value_ptr.* = try mod.gpa.realloc(eo_gop.value_ptr.*, eo_gop.value_ptr.len + 1);
|
|
|
|
|
eo_gop.value_ptr.*[eo_gop.value_ptr.len - 1] = new_export;
|
|
|
|
|
errdefer eo_gop.value_ptr.* = mod.gpa.shrink(eo_gop.value_ptr.*, eo_gop.value_ptr.len - 1);
|
2020-09-13 19:17:58 -07:00
|
|
|
|
|
|
|
|
// Add to exported_decl table.
|
2021-01-16 22:51:01 -07:00
|
|
|
const de_gop = mod.decl_exports.getOrPutAssumeCapacity(exported_decl);
|
2020-09-13 19:17:58 -07:00
|
|
|
if (!de_gop.found_existing) {
|
2021-06-03 15:39:26 -05:00
|
|
|
de_gop.value_ptr.* = &[0]*Export{};
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-06-03 15:39:26 -05:00
|
|
|
de_gop.value_ptr.* = try mod.gpa.realloc(de_gop.value_ptr.*, de_gop.value_ptr.len + 1);
|
|
|
|
|
de_gop.value_ptr.*[de_gop.value_ptr.len - 1] = new_export;
|
|
|
|
|
errdefer de_gop.value_ptr.* = mod.gpa.shrink(de_gop.value_ptr.*, de_gop.value_ptr.len - 1);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-05-10 22:50:00 -07:00
|
|
|
/// Takes ownership of `name` even if it returns an error.
|
|
|
|
|
pub fn createAnonymousDeclNamed(
|
|
|
|
|
mod: *Module,
|
|
|
|
|
scope: *Scope,
|
|
|
|
|
typed_value: TypedValue,
|
|
|
|
|
name: [:0]u8,
|
stage2: `zig test` now works with the LLVM backend
Frontend improvements:
* When compiling in `zig test` mode, put a task on the work queue to
analyze the main package root file. Normally, start code does
`_ = @import("root");` to make Zig analyze the user's code; however, in
the case of `zig test`, the root source file is the test runner.
Without this change, no tests are picked up.
* In the main pipeline, once semantic analysis is finished, if there
are no compile errors, populate the `test_functions` Decl with the
set of test functions picked up from semantic analysis.
* Value: add `array` and `slice` Tags.
LLVM backend improvements:
* Fix incremental updates of globals. Previously the
value of a global would not get replaced with a new value.
* Fix LLVM type of arrays. They were incorrectly sending
the ABI size as the element count.
* Remove the FuncGen parameter from genTypedValue. This function is for
generating global constants and there is no function available when
it is being called.
- The `ref_val` case is now commented out. I'd like to eliminate
`ref_val` as one of the possible Value Tags. Instead it should
always be done via `decl_ref`.
* Implement constant value generation for slices, arrays, and structs.
* Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:06:42 -07:00
|
|
|
) !*Decl {
|
|
|
|
|
return mod.createAnonymousDeclFromDeclNamed(scope.ownerDecl().?, typed_value, name);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn createAnonymousDecl(mod: *Module, scope: *Scope, typed_value: TypedValue) !*Decl {
|
|
|
|
|
return mod.createAnonymousDeclFromDecl(scope.ownerDecl().?, typed_value);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
pub fn createAnonymousDeclFromDecl(mod: *Module, owner_decl: *Decl, tv: TypedValue) !*Decl {
|
|
|
|
|
const name_index = mod.getNextAnonNameIndex();
|
|
|
|
|
const name = try std.fmt.allocPrintZ(mod.gpa, "{s}__anon_{d}", .{
|
|
|
|
|
owner_decl.name, name_index,
|
|
|
|
|
});
|
|
|
|
|
return mod.createAnonymousDeclFromDeclNamed(owner_decl, tv, name);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Takes ownership of `name` even if it returns an error.
|
|
|
|
|
pub fn createAnonymousDeclFromDeclNamed(
|
|
|
|
|
mod: *Module,
|
|
|
|
|
owner_decl: *Decl,
|
|
|
|
|
typed_value: TypedValue,
|
|
|
|
|
name: [:0]u8,
|
2021-05-10 22:50:00 -07:00
|
|
|
) !*Decl {
|
|
|
|
|
errdefer mod.gpa.free(name);
|
|
|
|
|
|
2021-07-27 14:06:42 -07:00
|
|
|
const namespace = owner_decl.namespace;
|
stage2: type declarations ZIR encode AnonNameStrategy
which can be either parent, func, or anon. Here's the enum reproduced in
the commit message for convenience:
```zig
pub const NameStrategy = enum(u2) {
/// Use the same name as the parent declaration name.
/// e.g. `const Foo = struct {...};`.
parent,
/// Use the name of the currently executing comptime function call,
/// with the current parameters. e.g. `ArrayList(i32)`.
func,
/// Create an anonymous name for this declaration.
/// Like this: "ParentDeclName_struct_69"
anon,
};
```
With this information in the ZIR, a future commit can improve the
names of structs, unions, enums, and opaques.
In order to accomplish this, the following ZIR instruction forms were
removed and replaced with Extended op codes:
* struct_decl
* struct_decl_packed
* struct_decl_extern
* union_decl
* union_decl_packed
* union_decl_extern
* enum_decl
* enum_decl_nonexhaustive
By being extended opcodes, one more u32 is needed, however we more than
make up for it by repurposing the 16 "small" bits to provide shorter
encodings for when decls_len == 0, fields_len == 0, a source node is not
provided, etc. There tends to be no downside, and in fact sometimes
upsides, to using an extended op code when there is a need for flag
bits, which is the case for all three of these. Likewise, the container
layout can be encoded in these bits rather than into the opcode.
The following 4 ZIR instructions were added, netting a total of 4 freed
up ZIR enum tags for future use:
* opaque_decl_anon
* opaque_decl_func
* error_set_decl_anon
* error_set_decl_func
This is so that opaques and error sets can have the same name hint as
structs, enums, and unions.
`std.builtin.ContainerLayout` gets an explicit integer tag type so that
it can be used inside packed structs.
This commit also makes `Module.Namespace` use a separate set for
anonymous decls, thus allowing anonymous decls to share the same
`Decl.name` as their owner `Decl` objects.
2021-05-10 21:34:43 -07:00
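As a rough illustration of the 16 "small" bits described in the commit message above, the sketch below packs a name strategy, a container layout, and three presence flags into a 16-bit packed struct. The `Small` and `Layout` types and every field name are hypothetical, chosen only for illustration; the actual encoding lives in the compiler's ZIR definitions and may differ.

```zig
const std = @import("std");

// Mirrors the enum quoted in the commit message.
pub const NameStrategy = enum(u2) { parent, func, anon };

// Hypothetical stand-in for `std.builtin.ContainerLayout`. The explicit
// `u2` tag type is what lets it be used as a packed struct field.
pub const Layout = enum(u2) { auto_layout, extern_layout, packed_layout };

// Hypothetical flag layout, not the real ZIR encoding.
pub const Small = packed struct {
    has_src_node: bool,
    has_fields_len: bool,
    has_decls_len: bool,
    name_strategy: NameStrategy,
    layout: Layout,
    _padding: u9 = 0,
};

comptime {
    // 3 flag bits + 2 + 2 + 9 padding bits = 16 bits.
    std.debug.assert(@bitSizeOf(Small) == 16);
}
```

Under this assumed layout, a declaration with `has_fields_len == false` would not need a trailing `u32` for the field count, which is the kind of shorter encoding the message refers to.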
|
|
|
try namespace.anon_decls.ensureUnusedCapacity(mod.gpa, 1);
|
|
|
|
|
|
stage2: `zig test` now works with the LLVM backend
Frontend improvements:
* When compiling in `zig test` mode, put a task on the work queue to
analyze the main package root file. Normally, start code does
`_ = @import("root");` to make Zig analyze the user's code, however in
the case of `zig test`, the root source file is the test runner.
Without this change, no tests are picked up.
* In the main pipeline, once semantic analysis is finished, if there
are no compile errors, populate the `test_functions` Decl with the
set of test functions picked up from semantic analysis.
* Value: add `array` and `slice` Tags.
LLVM backend improvements:
* Fix incremental updates of globals. Previously the
value of a global would not get replaced with a new value.
* Fix LLVM type of arrays. They were incorrectly sending
the ABI size as the element count.
* Remove the FuncGen parameter from genTypedValue. This function is for
generating global constants and there is no function available when
it is being called.
- The `ref_val` case is now commented out. I'd like to eliminate
`ref_val` as one of the possible Value Tags. Instead it should
always be done via `decl_ref`.
* Implement constant value generation for slices, arrays, and structs.
* Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:06:42 -07:00
|
|
|
const new_decl = try mod.allocateNewDecl(namespace, owner_decl.src_node);
|
2021-04-27 18:36:12 -07:00
|
|
|
|
stage2: type declarations ZIR encode AnonNameStrategy
2021-05-10 21:34:43 -07:00
|
|
|
new_decl.name = name;
|
stage2: `zig test` now works with the LLVM backend
2021-07-27 14:06:42 -07:00
|
|
|
new_decl.src_line = owner_decl.src_line;
|
2021-04-27 18:36:12 -07:00
|
|
|
new_decl.ty = typed_value.ty;
|
|
|
|
|
new_decl.val = typed_value.val;
|
|
|
|
|
new_decl.has_tv = true;
|
2020-09-13 19:17:58 -07:00
|
|
|
new_decl.analysis = .complete;
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
it also meant other things. This was error-prone and also made us do
unnecessary work and store unnecessary bytes in memory.
Now there are more types involved in source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have fewer
than 2 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
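The "Tag plus 8 bytes of data" layout described in the commit message above can be sketched roughly as follows. This is a minimal model under stated assumptions: the three-instruction `Tag`, the field names inside `Data`, and the `string` helper are all hypothetical, and the real `zir.Code` differs in its details (for example in how string lengths are stored).

```zig
const std = @import("std");

// Hypothetical, heavily reduced instruction set.
pub const Tag = enum(u8) { add, str, call };

// Every instruction gets exactly 8 bytes; `Tag` says which field is active.
pub const Data = extern union {
    // Small instruction: both operands fit inline.
    bin: extern struct { lhs: u32, rhs: u32 },
    // 4-byte reference into `string_bytes`, plus a length (an assumption).
    str: extern struct { start: u32, len: u32 },
    // Larger instruction: 4-byte index into `extra`.
    payload: extern struct { index: u32, unused: u32 },
};

comptime {
    std.debug.assert(@sizeOf(Data) == 8);
}

// A "finished", read-only set of instructions stored as parallel arrays.
pub const Code = struct {
    tags: []const Tag,
    data: []const Data,
    extra: []const u32,
    string_bytes: []const u8,

    // Hypothetical helper: look up the string a `str` instruction refers to.
    pub fn string(code: Code, inst: usize) []const u8 {
        std.debug.assert(code.tags[inst] == .str);
        const s = code.data[inst].str;
        return code.string_bytes[s.start .. s.start + s.len];
    }
};
```

Keeping tags and data in separate arrays means no instruction is allocated on its own, which matches the stated goal of the rework.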
|
|
|
new_decl.generation = mod.generation;
|
2020-09-13 19:17:58 -07:00
|
|
|
|
stage2: type declarations ZIR encode AnonNameStrategy
2021-05-10 21:34:43 -07:00
|
|
|
namespace.anon_decls.putAssumeCapacityNoClobber(new_decl, {});
|
|
|
|
|
|
2021-03-28 19:38:19 -07:00
|
|
|
// TODO: This generates the Decl into the machine code file if it is of a
|
|
|
|
|
// type that is non-zero size. We should be able to further improve the
|
|
|
|
|
// compiler to omit Decls which are only referenced at compile-time and not runtime.
|
2020-09-13 19:17:58 -07:00
|
|
|
if (typed_value.ty.hasCodeGenBits()) {
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
2021-03-15 23:38:38 -07:00
|
|
|
try mod.comp.bin_file.allocateDeclIndexes(new_decl);
|
|
|
|
|
try mod.comp.work_queue.writeItem(.{ .codegen_decl = new_decl });
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return new_decl;
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-10 22:50:00 -07:00
|
|
|
pub fn getNextAnonNameIndex(mod: *Module) usize {
|
stage2: type declarations ZIR encode AnonNameStrategy
2021-05-10 21:34:43 -07:00
|
|
|
return @atomicRmw(usize, &mod.next_anon_name_index, .Add, 1, .Monotonic);
|
|
|
|
|
}
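One plausible use of the counter returned by `getNextAnonNameIndex` is to build names of the "ParentDeclName_struct_69" form mentioned in the `NameStrategy` doc comment. The helper below is a hypothetical sketch of that formatting step only; `anonName` and its parameters are not taken from the compiler source.

```zig
const std = @import("std");

// Hypothetical helper: writes "<parent>_<kind>_<index>" into `buf` and
// returns the slice of `buf` that was written.
fn anonName(buf: []u8, parent: []const u8, kind: []const u8, index: usize) ![]u8 {
    return std.fmt.bufPrint(buf, "{s}_{s}_{d}", .{ parent, kind, index });
}
```

Because the index comes from an atomic counter, each call would get a distinct suffix even if several declarations share the same parent name.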
|
|
|
|
|
|
2021-03-23 16:12:26 -07:00
|
|
|
pub fn makeIntType(arena: *Allocator, signedness: std.builtin.Signedness, bits: u16) !Type {
|
2021-03-17 22:54:56 -07:00
|
|
|
const int_payload = try arena.create(Type.Payload.Bits);
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
2021-03-15 23:38:38 -07:00
|
|
|
int_payload.* = .{
|
|
|
|
|
.base = .{
|
2021-03-23 16:12:26 -07:00
|
|
|
.tag = switch (signedness) {
|
|
|
|
|
.signed => .int_signed,
|
|
|
|
|
.unsigned => .int_unsigned,
|
|
|
|
|
},
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
2021-03-15 23:38:38 -07:00
|
|
|
},
|
|
|
|
|
.data = bits,
|
2020-09-13 19:17:58 -07:00
|
|
|
};
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
2021-03-15 23:38:38 -07:00
|
|
|
return Type.initPayload(&int_payload.base);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
2021-03-15 23:38:38 -07:00
/// We don't return a pointer to the new error note because the pointer
/// becomes invalid when you add another one.
pub fn errNote(
    mod: *Module,
    scope: *Scope,
    src: LazySrcLoc,
    parent: *ErrorMsg,
    comptime format: []const u8,
    args: anytype,
2021-04-02 21:06:09 -07:00
) error{OutOfMemory}!void {
    return mod.errNoteNonLazy(src.toSrcLoc(scope), parent, format, args);
}

pub fn errNoteNonLazy(
    mod: *Module,
    src_loc: SrcLoc,
    parent: *ErrorMsg,
    comptime format: []const u8,
    args: anytype,
2021-03-15 23:38:38 -07:00
) error{OutOfMemory}!void {
    const msg = try std.fmt.allocPrint(mod.gpa, format, args);
    errdefer mod.gpa.free(msg);
2020-09-13 19:17:58 -07:00
2021-03-15 23:38:38 -07:00
    parent.notes = try mod.gpa.realloc(parent.notes, parent.notes.len + 1);
    parent.notes[parent.notes.len - 1] = .{
2021-04-02 21:06:09 -07:00
        .src_loc = src_loc,
2021-03-15 23:38:38 -07:00
        .msg = msg,
2021-01-16 22:51:01 -07:00
    };
}

pub fn errMsg(
    mod: *Module,
    scope: *Scope,
2021-03-15 23:38:38 -07:00
    src: LazySrcLoc,
2021-01-16 22:51:01 -07:00
    comptime format: []const u8,
    args: anytype,
) error{OutOfMemory}!*ErrorMsg {
2021-03-17 00:56:08 -07:00
    return ErrorMsg.create(mod.gpa, src.toSrcLoc(scope), format, args);
2021-01-16 22:51:01 -07:00
}

pub fn fail(
    mod: *Module,
    scope: *Scope,
2021-03-15 23:38:38 -07:00
    src: LazySrcLoc,
2021-01-16 22:51:01 -07:00
    comptime format: []const u8,
    args: anytype,
2021-07-14 12:16:48 -07:00
) CompileError {
2021-03-15 23:38:38 -07:00
    const err_msg = try mod.errMsg(scope, src, format, args);
2021-01-16 22:51:01 -07:00
    return mod.failWithOwnedErrorMsg(scope, err_msg);
2020-09-13 19:17:58 -07:00
}
2021-03-15 23:38:38 -07:00
/// Same as `fail`, except given a token index, and the function sets up the `LazySrcLoc`
/// for pointing at it relatively by subtracting from the containing `Decl`.
2020-09-13 19:17:58 -07:00
pub fn failTok(
2021-03-15 23:38:38 -07:00
    mod: *Module,
2020-09-13 19:17:58 -07:00
    scope: *Scope,
2021-08-30 19:22:04 -07:00
    token_index: Ast.TokenIndex,
2020-09-13 19:17:58 -07:00
    comptime format: []const u8,
    args: anytype,
2021-07-14 12:16:48 -07:00
) CompileError {
2021-03-20 17:09:06 -07:00
    const src = scope.srcDecl().?.tokSrcLoc(token_index);
2021-03-15 23:38:38 -07:00
|
|
|
return mod.fail(scope, src, format, args);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
/// Same as `fail`, except given an AST node index, and the function sets up the `LazySrcLoc`
|
|
|
|
|
/// for pointing at it relatively by subtracting from the containing `Decl`.
|
2020-09-13 19:17:58 -07:00
|
|
|
pub fn failNode(
|
2021-03-15 23:38:38 -07:00
|
|
|
mod: *Module,
|
2020-09-13 19:17:58 -07:00
|
|
|
scope: *Scope,
|
2021-08-30 19:22:04 -07:00
|
|
|
node_index: Ast.Node.Index,
|
2020-09-13 19:17:58 -07:00
|
|
|
comptime format: []const u8,
|
|
|
|
|
args: anytype,
|
2021-07-14 12:16:48 -07:00
|
|
|
) CompileError {
|
2021-03-20 17:09:06 -07:00
|
|
|
const src = scope.srcDecl().?.nodeSrcLoc(node_index);
|
2021-03-15 23:38:38 -07:00
|
|
|
return mod.fail(scope, src, format, args);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-07-14 12:16:48 -07:00
|
|
|
pub fn failWithOwnedErrorMsg(mod: *Module, scope: *Scope, err_msg: *ErrorMsg) CompileError {
|
2021-01-16 22:51:01 -07:00
|
|
|
@setCold(true);
|
2021-04-14 11:26:53 -07:00
|
|
|
|
2020-09-13 19:17:58 -07:00
|
|
|
{
|
2021-03-15 23:38:38 -07:00
|
|
|
errdefer err_msg.destroy(mod.gpa);
|
2021-07-14 12:16:48 -07:00
|
|
|
if (err_msg.src_loc.lazy == .unneeded) {
|
|
|
|
|
return error.NeededSourceLocation;
|
|
|
|
|
}
|
|
|
|
|
try mod.failed_decls.ensureUnusedCapacity(mod.gpa, 1);
|
|
|
|
|
try mod.failed_files.ensureUnusedCapacity(mod.gpa, 1);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
switch (scope.tag) {
|
|
|
|
|
.block => {
|
|
|
|
|
const block = scope.cast(Scope.Block).?;
|
2021-03-23 21:37:10 -07:00
|
|
|
if (block.sema.owner_func) |func| {
|
|
|
|
|
func.state = .sema_failure;
|
2020-09-13 19:17:58 -07:00
|
|
|
} else {
|
2021-03-23 21:37:10 -07:00
|
|
|
block.sema.owner_decl.analysis = .sema_failure;
|
|
|
|
|
block.sema.owner_decl.generation = mod.generation;
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-03-17 22:54:56 -07:00
|
|
|
mod.failed_decls.putAssumeCapacityNoClobber(block.sema.owner_decl, err_msg);
|
2020-09-13 19:17:58 -07:00
|
|
|
},
|
|
|
|
|
.file => unreachable,
|
stage2: entry point via std lib and proper updated file detection
Instead of Module setting up the root_scope with the root source file,
Module now relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in turn, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy in how it handles the top level Decl,
which more or less gets merged into the struct decl at the end of the
function. Be on the lookout for bugs related to that, as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.
2021-04-09 23:17:50 -07:00
|
|
|
.namespace => unreachable,
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
return error.AnalysisFail;
|
|
|
|
|
}
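The function above reserves capacity in `failed_decls` and `failed_files` before it changes any analysis state, so the later `putAssumeCapacityNoClobber` call cannot fail partway through. A minimal standalone sketch of that reserve-then-commit pattern follows; the `failed` map and its key and value types are hypothetical stand-ins, and the sketch assumes a reasonably recent Zig standard library rather than the 2021-era one used by this file.

```zig
const std = @import("std");

pub fn main() !void {
    const gpa = std.heap.page_allocator;

    // Hypothetical stand-in for a table such as `failed_decls`.
    var failed: std.AutoHashMapUnmanaged(u32, []const u8) = .{};
    defer failed.deinit(gpa);

    // Step 1: the only fallible step. If this returns an error,
    // nothing has been modified yet.
    try failed.ensureUnusedCapacity(gpa, 1);

    // Step 2: state changes that must not be left half-done.
    // Capacity is already reserved, so this insertion cannot fail.
    failed.putAssumeCapacityNoClobber(42, "example error message");

    std.debug.print("entries: {d}\n", .{failed.count()});
}
```

Doing every allocation up front keeps the error path simple: either nothing happened, or everything did.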
|
|
|
|
|
|
stage2: inferred local variables
This patch introduces the following new things:
Types:
- inferred_alloc
- This is a special value that tracks a set of types that have been stored
to an inferred allocation. It does not support most of the normal type queries.
However it does respond to `isConstPtr`, `ptrSize`, `zigTypeTag`, etc.
- The payload for this type simply points to the corresponding Value
payload.
Values:
- inferred_alloc
- This is a special value that tracks a set of types that have been stored
to an inferred allocation. It does not support any of the normal value queries.
ZIR instructions:
- store_to_inferred_ptr,
- Same as `store` but the type of the value being stored will be used to infer
the pointer type.
- resolve_inferred_alloc
- Each `store_to_inferred_ptr` puts the type of the stored value into a set,
and then `resolve_inferred_alloc` triggers peer type resolution on the set.
The operand is an `alloc_inferred` or `alloc_inferred_mut` instruction, which
is the allocation that needs to have its type inferred.
Changes to the C backend:
* Implements the bitcast instruction. If the source and dest types
are both pointers, uses a cast, otherwise uses memcpy.
* Tests are run with -Wno-declaration-after-statement. Someday we can
conform to this but not today.
In ZIR form it looks like this:
```zir
fn_body main { // unanalyzed
  %0 = dbg_stmt()
=>%1 = alloc_inferred()
  %2 = declval_in_module(Decl(add))
  %3 = deref(%2)
  %4 = param_type(%3, 0)
  %5 = const(TypedValue{ .ty = comptime_int, .val = 1})
  %6 = as(%4, %5)
  %7 = param_type(%3, 1)
  %8 = const(TypedValue{ .ty = comptime_int, .val = 2})
  %9 = as(%7, %8)
  %10 = call(%3, [%6, %9], modifier=auto)
=>%11 = store_to_inferred_ptr(%1, %10)
=>%12 = resolve_inferred_alloc(%1)
  %13 = dbg_stmt()
  %14 = ret_type()
  %15 = const(TypedValue{ .ty = comptime_int, .val = 3})
  %16 = sub(%10, %15)
  %17 = as(%14, %16)
  %18 = return(%17)
} // fn_body main
```
I have not played around with very many test cases yet. Some interesting
ones that I want to look at before merging:
```zig
var x = blk: {
    var y = foo();
    y.a = 1;
    break :blk y;
};
```
In the above test case, x and y are supposed to alias.
```zig
var x = if (bar()) blk: {
    var y = foo();
    y.a = 1;
    break :blk y;
} else blk: {
    var z = baz();
    z.b = 1;
    break :blk z;
};
```
In the above test case, x, y, and z are supposed to alias.
I also haven't tested with `var` instead of `const` yet.
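As a small illustration of the peer type resolution that `resolve_inferred_alloc` is described as triggering, the sketch below shows the user-visible effect in plain Zig: an inferred local initialized from branches of different integer types gets the resolved peer type. The helper name `widen` is hypothetical, and this demonstrates language behavior only, not the compiler's internal implementation.

```zig
const std = @import("std");

/// Illustrative helper: the type of `x` is inferred by peer type
/// resolution over both branches of the `if`.
fn widen(cond: bool, small: u8, big: u16) u16 {
    const x = if (cond) small else big;
    // Both u8 and u16 were "stored"; the resolved type is u16.
    comptime std.debug.assert(@TypeOf(x) == u16);
    return x;
}

pub fn main() void {
    // `@TypeOf` with several operands performs the same resolution.
    comptime std.debug.assert(@TypeOf(@as(u8, 1), @as(u16, 2)) == u16);
    std.debug.print("{d} {d}\n", .{ widen(true, 1, 2), widen(false, 1, 2) });
}
```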
2020-12-31 01:54:02 -07:00
|
|
|
pub fn simplePtrType(
|
2021-03-15 23:38:38 -07:00
|
|
|
arena: *Allocator,
|
2020-12-31 01:54:02 -07:00
|
|
|
elem_ty: Type,
|
|
|
|
|
mutable: bool,
|
|
|
|
|
size: std.builtin.TypeInfo.Pointer.Size,
|
|
|
|
|
) Allocator.Error!Type {
|
2020-09-13 19:17:58 -07:00
|
|
|
if (!mutable and size == .Slice and elem_ty.eql(Type.initTag(.u8))) {
|
|
|
|
|
return Type.initTag(.const_slice_u8);
|
|
|
|
|
}
|
|
|
|
|
// TODO stage1 type inference bug
|
|
|
|
|
const T = Type.Tag;
|
|
|
|
|
|
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.
Now there are more types involved in source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have less
than 2 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
        const type_payload = try arena.create(Type.Payload.ElemType);
2020-09-13 19:17:58 -07:00
        type_payload.* = .{
            .base = .{
                .tag = switch (size) {
                    .One => if (mutable) T.single_mut_pointer else T.single_const_pointer,
                    .Many => if (mutable) T.many_mut_pointer else T.many_const_pointer,
                    .C => if (mutable) T.c_mut_pointer else T.c_const_pointer,
                    .Slice => if (mutable) T.mut_slice else T.const_slice,
                },
            },
2020-12-30 19:57:11 -07:00
            .data = elem_ty,
2020-09-13 19:17:58 -07:00
        };
        return Type.initPayload(&type_payload.base);
    }

    pub fn ptrType(
2021-03-15 23:38:38 -07:00
        arena: *Allocator,
2020-09-13 19:17:58 -07:00
        elem_ty: Type,
        sentinel: ?Value,
        @"align": u32,
        bit_offset: u16,
        host_size: u16,
        mutable: bool,
        @"allowzero": bool,
        @"volatile": bool,
        size: std.builtin.TypeInfo.Pointer.Size,
    ) Allocator.Error!Type {
        assert(host_size == 0 or bit_offset < host_size * 8);

        // TODO check if type can be represented by simplePtrType
2021-03-15 23:38:38 -07:00
        return Type.Tag.pointer.create(arena, .{
2020-09-13 19:17:58 -07:00
            .pointee_type = elem_ty,
            .sentinel = sentinel,
            .@"align" = @"align",
            .bit_offset = bit_offset,
            .host_size = host_size,
            .@"allowzero" = @"allowzero",
            .mutable = mutable,
            .@"volatile" = @"volatile",
            .size = size,
2020-12-30 19:57:11 -07:00
        });
2020-09-13 19:17:58 -07:00
    }

2021-07-30 16:05:46 -07:00
    pub fn optionalType(arena: *Allocator, child_type: Type) Allocator.Error!Type {
2020-12-30 19:57:11 -07:00
        switch (child_type.tag()) {
            .single_const_pointer => return Type.Tag.optional_single_const_pointer.create(
2021-03-15 23:38:38 -07:00
                arena,
2020-12-30 19:57:11 -07:00
                child_type.elemType(),
            ),
            .single_mut_pointer => return Type.Tag.optional_single_mut_pointer.create(
2021-03-15 23:38:38 -07:00
                arena,
2020-12-30 19:57:11 -07:00
                child_type.elemType(),
            ),
2021-03-15 23:38:38 -07:00
            else => return Type.Tag.optional.create(arena, child_type),
2020-12-30 19:57:11 -07:00
        }
2020-09-13 19:17:58 -07:00
    }

2020-12-30 19:57:11 -07:00
    pub fn arrayType(
2021-03-15 23:38:38 -07:00
        arena: *Allocator,
2020-12-30 19:57:11 -07:00
        len: u64,
        sentinel: ?Value,
        elem_type: Type,
    ) Allocator.Error!Type {
2020-09-13 19:17:58 -07:00
        if (elem_type.eql(Type.initTag(.u8))) {
            if (sentinel) |some| {
2021-07-30 16:05:46 -07:00
                if (some.eql(Value.initTag(.zero), elem_type)) {
2021-03-15 23:38:38 -07:00
                    return Type.Tag.array_u8_sentinel_0.create(arena, len);
2020-09-13 19:17:58 -07:00
                }
            } else {
stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:
* `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
each instruction independently, there is now a Tag and 8 bytes of
data available for all ZIR instructions. Small instructions fit
within these 8 bytes; larger ones use 4 bytes for an index into
`extra`. There is also `string_bytes` so that we can have 4 byte
references to strings. `zir.Inst.Tag` describes how to interpret
those 8 bytes of data.
- This is shared by all `Block` scopes.
* `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
structure, the arrays are mutable, and get resized as we add/delete
things. There is extra state to keep track of things. This struct is
stored on the stack. Once it is finished, it produces an immutable
`zir.Code`, which will remain on the heap for the duration of a
function's existence.
- This is shared by all `GenZir` scopes.
* `Sema`: represents in-progress semantic analysis of a `zir.Code`.
This data is stored on the stack and is shared among all `Block`
scopes. It is now the main "self" argument to everything in the file
that was previously named `zir_sema.zig`.
Additionally, I moved some logic that was in `Module` into here.
`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.
astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.
I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.
Overhaul of Source Locations:
Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.
Now there are more types involved into source locations, and more ways
to describe a source location.
* AllErrors.Message: embrace the assumption that files always have fewer
than 1 << 32 bytes.
* SrcLoc gets more complicated, to model more complicated source
locations.
* Introduce LazySrcLoc, which can model interesting source locations
with very little stored state. Useful for avoiding doing unnecessary
work when no compile errors occur.
Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.
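As a rough illustration of the lazy approach, the sketch below stores only a small relative value and converts it to a byte offset when an error is actually reported. It is a simplified stand-in, not the real `LazySrcLoc`: the two variants shown, the `toByteOffset` helper, and the `token_starts` table are assumptions chosen to keep the example short.

```zig
const std = @import("std");

// Hypothetical, simplified stand-in for the real LazySrcLoc.
const LazySrcLoc = union(enum) {
    /// Already an absolute byte offset into the file.
    byte_abs: u32,
    /// Token index relative to the first token of the enclosing declaration.
    token_offset: u32,

    /// Meant to be called only once a compile error is being emitted.
    /// `token_starts` maps a token index to its byte offset (assumed input).
    fn toByteOffset(
        loc: LazySrcLoc,
        decl_first_token: u32,
        token_starts: []const u32,
    ) u32 {
        return switch (loc) {
            .byte_abs => |off| off,
            .token_offset => |rel| token_starts[decl_first_token + rel],
        };
    }
};

test "a relative location is resolved only on demand" {
    const token_starts = [_]u32{ 0, 4, 9, 15, 22 };
    const loc = LazySrcLoc{ .token_offset = 2 };
    // The declaration starts at token 1, so this is token 3, at byte 15.
    try std.testing.expectEqual(@as(u32, 15), loc.toByteOffset(1, &token_starts));
}
```

When no error occurs, `toByteOffset` is never called, so the only cost on the success path is the few bytes of stored state.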
Miscellaneous:
* std.zig: string literals have more helpful result values for
reporting errors. There is now a lower level API and a higher level
API.
- side note: I noticed that the string literal logic needs some love.
There is some unnecessarily hacky code there.
* cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
probably broke stuff and needs to get fixed.
* Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
think this is quite how this code will be organized. Need some more
careful planning about how to implement structs, unions, enums. They
need to be independent Decls, just like a top level function.
2021-03-15 23:38:38 -07:00
|
|
|
return Type.Tag.array_u8.create(arena, len);
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if (sentinel) |some| {
|
2021-03-15 23:38:38 -07:00
|
|
|
return Type.Tag.array_sentinel.create(arena, .{
|
2020-09-13 19:17:58 -07:00
|
|
|
.len = len,
|
|
|
|
|
.sentinel = some,
|
|
|
|
|
.elem_type = elem_type,
|
2020-12-30 19:57:11 -07:00
|
|
|
});
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
return Type.Tag.array.create(arena, .{
|
2020-09-13 19:17:58 -07:00
|
|
|
.len = len,
|
|
|
|
|
.elem_type = elem_type,
|
2020-12-30 19:57:11 -07:00
|
|
|
});
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2020-12-30 19:57:11 -07:00
|
|
|
pub fn errorUnionType(
|
2021-03-15 23:38:38 -07:00
|
|
|
arena: *Allocator,
|
2020-12-30 19:57:11 -07:00
|
|
|
error_set: Type,
|
|
|
|
|
payload: Type,
|
|
|
|
|
) Allocator.Error!Type {
|
2020-09-13 19:17:58 -07:00
|
|
|
assert(error_set.zigTypeTag() == .ErrorSet);
|
|
|
|
|
if (error_set.eql(Type.initTag(.anyerror)) and payload.eql(Type.initTag(.void))) {
|
|
|
|
|
return Type.initTag(.anyerror_void_error_union);
|
|
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
return Type.Tag.error_union.create(arena, .{
|
2020-09-13 19:17:58 -07:00
|
|
|
.error_set = error_set,
|
|
|
|
|
.payload = payload,
|
2020-12-30 19:57:11 -07:00
|
|
|
});
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
pub fn getTarget(mod: Module) Target {
|
|
|
|
|
return mod.comp.bin_file.options.target;
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
|
|
|
|
|
2021-03-15 23:38:38 -07:00
|
|
|
pub fn optimizeMode(mod: Module) std.builtin.Mode {
|
|
|
|
|
return mod.comp.bin_file.options.optimize_mode;
|
2020-09-13 19:17:58 -07:00
|
|
|
}
|
2021-04-28 16:55:22 -07:00
|
|
|
|
|
|
|
|
fn lockAndClearFileCompileError(mod: *Module, file: *Scope.File) void {
|
|
|
|
|
switch (file.status) {
|
2021-05-11 17:34:13 -07:00
|
|
|
.success_zir, .retryable_failure => {},
|
2021-04-28 16:55:22 -07:00
|
|
|
.never_loaded, .parse_failure, .astgen_failure => {
|
|
|
|
|
const lock = mod.comp.mutex.acquire();
|
|
|
|
|
defer lock.release();
|
2021-06-03 15:39:26 -05:00
|
|
|
if (mod.failed_files.fetchSwapRemove(file)) |kv| {
|
|
|
|
|
if (kv.value) |msg| msg.destroy(mod.gpa); // Delete previous error message.
|
2021-04-28 16:55:22 -07:00
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
}
|
2021-05-02 17:08:19 -07:00
|
|
|
|
|
|
|
|
pub const SwitchProngSrc = union(enum) {
|
|
|
|
|
scalar: u32,
|
|
|
|
|
multi: Multi,
|
|
|
|
|
range: Multi,
|
|
|
|
|
|
|
|
|
|
pub const Multi = struct {
|
|
|
|
|
prong: u32,
|
|
|
|
|
item: u32,
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
pub const RangeExpand = enum { none, first, last };
|
|
|
|
|
|
|
|
|
|
/// This function is intended to be called only when it is certain that we need
|
|
|
|
|
/// the LazySrcLoc in order to emit a compile error.
|
|
|
|
|
pub fn resolve(
|
|
|
|
|
prong_src: SwitchProngSrc,
|
2021-05-17 19:11:11 -07:00
|
|
|
gpa: *Allocator,
|
2021-05-02 17:08:19 -07:00
|
|
|
decl: *Decl,
|
|
|
|
|
switch_node_offset: i32,
|
|
|
|
|
range_expand: RangeExpand,
|
|
|
|
|
) LazySrcLoc {
|
|
|
|
|
@setCold(true);
|
2021-05-17 19:11:11 -07:00
|
|
|
const tree = decl.namespace.file_scope.getTree(gpa) catch |err| {
|
|
|
|
|
// In this case we emit a warning + a less precise source location.
|
|
|
|
|
log.warn("unable to load {s}: {s}", .{
|
|
|
|
|
decl.namespace.file_scope.sub_file_path, @errorName(err),
|
|
|
|
|
});
|
2021-05-17 19:30:38 -07:00
|
|
|
return LazySrcLoc{ .node_offset = 0 };
|
2021-05-17 19:11:11 -07:00
|
|
|
};
|
2021-05-02 17:08:19 -07:00
|
|
|
const switch_node = decl.relativeToNodeIndex(switch_node_offset);
|
|
|
|
|
const main_tokens = tree.nodes.items(.main_token);
|
|
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const node_tags = tree.nodes.items(.tag);
|
2021-08-30 19:22:04 -07:00
|
|
|
const extra = tree.extraData(node_datas[switch_node].rhs, Ast.Node.SubRange);
|
2021-05-02 17:08:19 -07:00
|
|
|
const case_nodes = tree.extra_data[extra.start..extra.end];
|
|
|
|
|
|
|
|
|
|
var multi_i: u32 = 0;
|
|
|
|
|
var scalar_i: u32 = 0;
|
|
|
|
|
for (case_nodes) |case_node| {
|
|
|
|
|
const case = switch (node_tags[case_node]) {
|
|
|
|
|
.switch_case_one => tree.switchCaseOne(case_node),
|
|
|
|
|
.switch_case => tree.switchCase(case_node),
|
|
|
|
|
else => unreachable,
|
|
|
|
|
};
|
|
|
|
|
if (case.ast.values.len == 0)
|
|
|
|
|
continue;
|
|
|
|
|
if (case.ast.values.len == 1 and
|
|
|
|
|
node_tags[case.ast.values[0]] == .identifier and
|
|
|
|
|
mem.eql(u8, tree.tokenSlice(main_tokens[case.ast.values[0]]), "_"))
|
|
|
|
|
{
|
|
|
|
|
continue;
|
|
|
|
|
}
|
|
|
|
|
const is_multi = case.ast.values.len != 1 or
|
|
|
|
|
node_tags[case.ast.values[0]] == .switch_range;
|
|
|
|
|
|
|
|
|
|
switch (prong_src) {
|
|
|
|
|
.scalar => |i| if (!is_multi and i == scalar_i) return LazySrcLoc{
|
|
|
|
|
.node_offset = decl.nodeIndexToRelative(case.ast.values[0]),
|
|
|
|
|
},
|
|
|
|
|
.multi => |s| if (is_multi and s.prong == multi_i) {
|
|
|
|
|
var item_i: u32 = 0;
|
|
|
|
|
for (case.ast.values) |item_node| {
|
|
|
|
|
if (node_tags[item_node] == .switch_range) continue;
|
|
|
|
|
|
|
|
|
|
if (item_i == s.item) return LazySrcLoc{
|
|
|
|
|
.node_offset = decl.nodeIndexToRelative(item_node),
|
|
|
|
|
};
|
|
|
|
|
item_i += 1;
|
|
|
|
|
} else unreachable;
|
|
|
|
|
},
|
|
|
|
|
.range => |s| if (is_multi and s.prong == multi_i) {
|
|
|
|
|
var range_i: u32 = 0;
|
|
|
|
|
for (case.ast.values) |range| {
|
|
|
|
|
if (node_tags[range] != .switch_range) continue;
|
|
|
|
|
|
|
|
|
|
if (range_i == s.item) switch (range_expand) {
|
|
|
|
|
.none => return LazySrcLoc{
|
|
|
|
|
.node_offset = decl.nodeIndexToRelative(range),
|
|
|
|
|
},
|
|
|
|
|
.first => return LazySrcLoc{
|
|
|
|
|
.node_offset = decl.nodeIndexToRelative(node_datas[range].lhs),
|
|
|
|
|
},
|
|
|
|
|
.last => return LazySrcLoc{
|
|
|
|
|
.node_offset = decl.nodeIndexToRelative(node_datas[range].rhs),
|
|
|
|
|
},
|
|
|
|
|
};
|
|
|
|
|
range_i += 1;
|
|
|
|
|
} else unreachable;
|
|
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
if (is_multi) {
|
|
|
|
|
multi_i += 1;
|
|
|
|
|
} else {
|
|
|
|
|
scalar_i += 1;
|
|
|
|
|
}
|
|
|
|
|
} else unreachable;
|
|
|
|
|
}
|
|
|
|
|
};
|
2021-05-03 11:46:02 -07:00
|
|
|
|
2021-07-09 20:43:19 +08:00
|
|
|
pub const PeerTypeCandidateSrc = union(enum) {
|
|
|
|
|
/// Do not print out error notes for candidate sources
|
|
|
|
|
none: void,
|
|
|
|
|
/// When we want to know the src of candidate i, look up at
|
|
|
|
|
/// index i in this slice
|
|
|
|
|
override: []LazySrcLoc,
|
|
|
|
|
/// resolvePeerTypes originates from a @TypeOf(...) call
|
|
|
|
|
typeof_builtin_call_node_offset: i32,
|
|
|
|
|
|
|
|
|
|
pub fn resolve(
|
|
|
|
|
self: PeerTypeCandidateSrc,
|
|
|
|
|
gpa: *Allocator,
|
|
|
|
|
decl: *Decl,
|
|
|
|
|
candidate_i: usize,
|
|
|
|
|
) ?LazySrcLoc {
|
|
|
|
|
@setCold(true);
|
|
|
|
|
|
|
|
|
|
switch (self) {
|
|
|
|
|
.none => {
|
|
|
|
|
return null;
|
|
|
|
|
},
|
|
|
|
|
.override => |candidate_srcs| {
|
|
|
|
|
return candidate_srcs[candidate_i];
|
|
|
|
|
},
|
|
|
|
|
.typeof_builtin_call_node_offset => |node_offset| {
|
2021-09-14 21:58:22 -07:00
|
|
|
switch (candidate_i) {
|
|
|
|
|
0 => return LazySrcLoc{ .node_offset_builtin_call_arg0 = node_offset },
|
|
|
|
|
1 => return LazySrcLoc{ .node_offset_builtin_call_arg1 = node_offset },
|
|
|
|
|
2 => return LazySrcLoc{ .node_offset_builtin_call_arg2 = node_offset },
|
|
|
|
|
3 => return LazySrcLoc{ .node_offset_builtin_call_arg3 = node_offset },
|
|
|
|
|
4 => return LazySrcLoc{ .node_offset_builtin_call_arg4 = node_offset },
|
|
|
|
|
5 => return LazySrcLoc{ .node_offset_builtin_call_arg5 = node_offset },
|
|
|
|
|
else => {},
|
2021-07-09 20:43:19 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
const tree = decl.namespace.file_scope.getTree(gpa) catch |err| {
|
|
|
|
|
// In this case we emit a warning + a less precise source location.
|
|
|
|
|
log.warn("unable to load {s}: {s}", .{
|
|
|
|
|
decl.namespace.file_scope.sub_file_path, @errorName(err),
|
|
|
|
|
});
|
|
|
|
|
return LazySrcLoc{ .node_offset = 0 };
|
|
|
|
|
};
|
|
|
|
|
const node = decl.relativeToNodeIndex(node_offset);
|
|
|
|
|
const node_datas = tree.nodes.items(.data);
|
|
|
|
|
const params = tree.extra_data[node_datas[node].lhs..node_datas[node].rhs];
|
|
|
|
|
|
|
|
|
|
return LazySrcLoc{ .node_abs = params[candidate_i] };
|
|
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
};
|
|
|
|
|
|
/// Called from `performAllTheWork`, after all AstGen workers have finished,
/// and before the main semantic analysis loop begins.
pub fn processOutdatedAndDeletedDecls(mod: *Module) !void {
    // Ultimately, the goal is to queue up `analyze_decl` tasks in the work queue
    // for the outdated decls, but we cannot queue up the tasks until after
    // we find out which ones have been deleted, otherwise there would be
    // deleted Decl pointers in the work queue.
    var outdated_decls = std.AutoArrayHashMap(*Decl, void).init(mod.gpa);
    defer outdated_decls.deinit();
    for (mod.import_table.values()) |file| {
        try outdated_decls.ensureUnusedCapacity(file.outdated_decls.items.len);
        for (file.outdated_decls.items) |decl| {
            outdated_decls.putAssumeCapacity(decl, {});
        }
        file.outdated_decls.clearRetainingCapacity();

        // Handle explicitly deleted decls from the source code. This is one of two
        // places that Decl deletions happen. The other is in `Compilation`, after
        // `performAllTheWork`, where we iterate over `Module.deletion_set` and
        // delete Decls which are no longer referenced.
        // If a Decl is explicitly deleted from source, and also no longer referenced,
        // it may be both in this `deleted_decls` set, as well as in the
        // `Module.deletion_set`. To avoid deleting it twice, we remove it from the
        // deletion set at this time.
        for (file.deleted_decls.items) |decl| {
            log.debug("deleted from source: {*} ({s})", .{ decl, decl.name });
stage2: fix deletion of Decls that get re-referenced
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.
However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.
The solution is, when dealing with no-longer-referenced Decl objects
during an update, to clear them to the state they would be in on a
fresh scanDecl, rather than completely deleting them.
2021-05-18 12:35:36 -07:00
            // Remove from the namespace it resides in, preserving declaration order.
            assert(decl.zir_decl_index != 0);
            _ = decl.namespace.decls.orderedRemove(mem.spanZ(decl.name));

            try mod.clearDecl(decl, &outdated_decls);
            decl.destroy(mod);
        }
        file.deleted_decls.clearRetainingCapacity();
    }
    // Finally we can queue up re-analysis tasks after we have processed
    // the deleted decls.
    for (outdated_decls.keys()) |key| {
        try mod.markOutdatedDecl(key);
    }
}

/// Called from `Compilation.update`, after everything is done, just before
/// reporting compile errors. In this function we emit exported symbol collision
/// errors and communicate exported symbols to the linker backend.
pub fn processExports(mod: *Module) !void {
    const gpa = mod.gpa;
    // Map symbol names to `Export` for name collision detection.
    var symbol_exports: std.StringArrayHashMapUnmanaged(*Export) = .{};
    defer symbol_exports.deinit(gpa);

    var it = mod.decl_exports.iterator();
    while (it.next()) |entry| {
        const exported_decl = entry.key_ptr.*;
        const exports = entry.value_ptr.*;
        for (exports) |new_export| {
            const gop = try symbol_exports.getOrPut(gpa, new_export.options.name);
            if (gop.found_existing) {
                new_export.status = .failed_retryable;
                try mod.failed_exports.ensureUnusedCapacity(gpa, 1);
                const src_loc = new_export.getSrcLoc();
                const msg = try ErrorMsg.create(gpa, src_loc, "exported symbol collision: {s}", .{
                    new_export.options.name,
                });
                errdefer msg.destroy(gpa);
                const other_export = gop.value_ptr.*;
                const other_src_loc = other_export.getSrcLoc();
                try mod.errNoteNonLazy(other_src_loc, msg, "other symbol here", .{});
                mod.failed_exports.putAssumeCapacityNoClobber(new_export, msg);
                new_export.status = .failed;
            } else {
                gop.value_ptr.* = new_export;
            }
        }
        mod.comp.bin_file.updateDeclExports(mod, exported_decl, exports) catch |err| switch (err) {
            error.OutOfMemory => return error.OutOfMemory,
            else => {
                const new_export = exports[0];
                new_export.status = .failed_retryable;
                try mod.failed_exports.ensureUnusedCapacity(gpa, 1);
                const src_loc = new_export.getSrcLoc();
                const msg = try ErrorMsg.create(gpa, src_loc, "unable to export: {s}", .{
                    @errorName(err),
                });
                mod.failed_exports.putAssumeCapacityNoClobber(new_export, msg);
            },
        };
    }
}

stage2: `zig test` now works with the LLVM backend
Frontend improvements:
* When compiling in `zig test` mode, put a task on the work queue to
analyze the main package root file. Normally, start code does
`_ = @import("root");` to make Zig analyze the user's code; however, in
the case of `zig test`, the root source file is the test runner.
Without this change, no tests are picked up.
* In the main pipeline, once semantic analysis is finished, if there
are no compile errors, populate the `test_functions` Decl with the
set of test functions picked up from semantic analysis.
* Value: add `array` and `slice` Tags.
LLVM backend improvements:
* Fix incremental updates of globals. Previously the
value of a global would not get replaced with a new value.
* Fix LLVM type of arrays. They were incorrectly sending
the ABI size as the element count.
* Remove the FuncGen parameter from genTypedValue. This function is for
generating global constants and there is no function available when
it is being called.
- The `ref_val` case is now commented out. I'd like to eliminate
`ref_val` as one of the possible Value Tags. Instead it should
always be done via `decl_ref`.
* Implement constant value generation for slices, arrays, and structs.
* Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:06:42 -07:00
pub fn populateTestFunctions(mod: *Module) !void {
    const gpa = mod.gpa;
    const builtin_pkg = mod.main_pkg.table.get("builtin").?;
    const builtin_file = (mod.importPkg(builtin_pkg) catch unreachable).file;
    const builtin_namespace = builtin_file.root_decl.?.namespace;
    const decl = builtin_namespace.decls.get("test_functions").?;
    var buf: Type.Payload.ElemType = undefined;
    const tmp_test_fn_ty = decl.ty.slicePtrFieldType(&buf).elemType();

    const array_decl = d: {
        // Add mod.test_functions to an array decl then make the test_functions
        // decl reference it as a slice.
        var new_decl_arena = std.heap.ArenaAllocator.init(gpa);
        errdefer new_decl_arena.deinit();
        const arena = &new_decl_arena.allocator;

        const test_fn_vals = try arena.alloc(Value, mod.test_functions.count());
        const array_decl = try mod.createAnonymousDeclFromDecl(decl, .{
            .ty = try Type.Tag.array.create(arena, .{
                .len = test_fn_vals.len,
                .elem_type = try tmp_test_fn_ty.copy(arena),
            }),
            .val = try Value.Tag.array.create(arena, test_fn_vals),
        });
        for (mod.test_functions.keys()) |test_decl, i| {
            const test_name_slice = mem.sliceTo(test_decl.name, 0);
            const test_name_decl = n: {
                var name_decl_arena = std.heap.ArenaAllocator.init(gpa);
                errdefer name_decl_arena.deinit();
                const bytes = try name_decl_arena.allocator.dupe(u8, test_name_slice);
                const test_name_decl = try mod.createAnonymousDeclFromDecl(array_decl, .{
                    .ty = try Type.Tag.array_u8.create(&name_decl_arena.allocator, bytes.len),
                    .val = try Value.Tag.bytes.create(&name_decl_arena.allocator, bytes),
                });
                try test_name_decl.finalizeNewArena(&name_decl_arena);
                break :n test_name_decl;
            };
            try mod.linkerUpdateDecl(test_name_decl);

            const field_vals = try arena.create([3]Value);
            field_vals.* = .{
                try Value.Tag.slice.create(arena, .{
                    .ptr = try Value.Tag.decl_ref.create(arena, test_name_decl),
                    .len = try Value.Tag.int_u64.create(arena, test_name_slice.len),
                }), // name
                try Value.Tag.decl_ref.create(arena, test_decl), // func
                Value.initTag(.null_value), // async_frame_size
            };
            test_fn_vals[i] = try Value.Tag.@"struct".create(arena, field_vals);
        }

        try array_decl.finalizeNewArena(&new_decl_arena);
        break :d array_decl;
    };
    try mod.linkerUpdateDecl(array_decl);

    {
        var arena_instance = decl.value_arena.?.promote(gpa);
        defer decl.value_arena.?.* = arena_instance.state;
        const arena = &arena_instance.allocator;

        decl.ty = try Type.Tag.const_slice.create(arena, try tmp_test_fn_ty.copy(arena));
        decl.val = try Value.Tag.slice.create(arena, .{
            .ptr = try Value.Tag.decl_ref.create(arena, array_decl),
            .len = try Value.Tag.int_u64.create(arena, mod.test_functions.count()),
        });
    }
    try mod.linkerUpdateDecl(decl);
}

pub fn linkerUpdateDecl(mod: *Module, decl: *Decl) !void {
    mod.comp.bin_file.updateDecl(mod, decl) catch |err| switch (err) {
        error.OutOfMemory => return error.OutOfMemory,
        error.AnalysisFail => {
            decl.analysis = .codegen_failure;
            return;
        },
        else => {
            const gpa = mod.gpa;
            try mod.failed_decls.ensureUnusedCapacity(gpa, 1);
            mod.failed_decls.putAssumeCapacityNoClobber(decl, try ErrorMsg.create(
                gpa,
                decl.srcLoc(),
                "unable to codegen: {s}",
                .{@errorName(err)},
            ));
            decl.analysis = .codegen_failure_retryable;
            return;
        },
    };
}