@TypeOf semantics

I want to write a semantics for (a useful subset of) Zig as a master’s thesis, so I’m doing preliminary research on how the language works.

There are a couple of corners in the documentation that leave me utterly bewildered. Foremost among them: for @TypeOf the docs say that ‘The expressions are evaluated, however they are guaranteed to have no runtime side-effects’. This makes no sense in the presence of things like calls to external functions: since those are black boxes to the compiler, it must treat them as observable behavior with potential side effects.

On the other hand, the only observable behavior I can find that distinguishes this ‘side-effect free evaluation’ from no evaluation at all is that @TypeOf(undefined) is documented to lead to a compile-time error rather than yielding noreturn.

This leads me to believe that the only parts that are in fact evaluated are the compile-time-known parts of the expression (since undefined has no run-time inputs, it can notionally be evaluated at compile-time, leading to a compiler error). Is this understanding correct, and if not, how is @TypeOf in the presence of external functions supposed to operate?

I think it just means that all instances of @TypeOf must resolve to a concrete type at compile time. Meaning there are no runtime reflection shenanigans happening. If your expression is not able to be evaluated at compile time, it will result in compiler error. I am not sure whether external functions can be evaluated by the virtual machine at compile time. I have never tried it, but I think it should be possible in theory!

AIUI extern functions are used to link in objects after compilation, so the compiler usually does not have any notion of what happens in them (if it can even be expressed in Zig or even LLVM in the first place), so there should be no way for the compiler to pre-evaluate it.

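const std = @import("std");
// Declared but not defined here; the compiler only knows the signature.
extern fn func(i32) u32;
pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    // Prints the result type of the call expression; func is never run.
    return stdout.print("{}", .{@TypeOf(func(0))});
}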

The program above compiles fine on 0.8.0 and prints u32 as expected, since the type of func(0) can be determined even if the function cannot be run.

Meanwhile, if I use @TypeOf(unreachable) I get a compiler error as described in the examples of the docs. Confusingly, the same ‘unreachable code’ message also appears for @TypeOf(return) or if I make func noreturn. Meanwhile the compiler crashes if I use @TypeOf(while(true){}). I just don’t understand what the type of the expression (which is supposed to be noreturn for all of these examples) has to do with evaluating the expression, since only explicit type subexpressions (like in a cast, or an explicitly typed variable definition in a block expression) would need to be evaluated to figure out the type of the full expression.

(Note: it seems I confused unreachable with undefined in the OP … undefined is not of type noreturn after all)

You don’t have to call a function to know what type it returns.

Of course you don’t; that’s the purpose of having a type system in the first place. Yet if the function is noreturn the compiler takes issue with it.

I was responding to the 2nd paragraph in your original post where you said you didn’t understand how @TypeOf could work with extern functions because they are black boxes that the compiler cannot execute with any guarantees, but I don’t understand why that’s a problem since @TypeOf doesn’t need to execute them.

Now that I’ve read your second post I see you did understand that @TypeOf doesn’t execute the function, so now I’m confused about the 2nd paragraph in your original post. I must be missing something. I think I was also confused by the unreachable/undefined substitutions…maybe you could restart your questions again in another way and I’ll understand them better?

Maybe you were taking issue with the documentation? ‘The expressions are evaluated, however they are guaranteed to have no runtime side-effects’

Maybe you were taking the word “evaluate” to mean “execute”? In this case I believe the intention is to say that the “type of the expression is evaluated” rather than to say the expression is “executed”. The documentation could word this better. Is that what your point was?
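To make the "type is evaluated, expression is not executed" reading concrete, here is a minimal sketch along the lines of the example in the language reference. The helper bump and the variable calls are names I made up for illustration, and I am assuming a compiler where std.testing.expect returns an error (0.8.0 or later):

```zig
const std = @import("std");

// Hypothetical helper: increments the pointee, so a real call is observable.
fn bump(ptr: *u32) u32 {
    ptr.* += 1;
    return ptr.*;
}

test "@TypeOf does not run its operand" {
    var calls: u32 = 0;

    // Only the result type of the call expression is determined here.
    const T = @TypeOf(bump(&calls));
    try std.testing.expect(T == u32);
    try std.testing.expect(calls == 0);

    // An ordinary call does run and mutates `calls`.
    _ = bump(&calls);
    try std.testing.expect(calls == 1);
}
```

If the operand were executed, the counter would already be 1 after the @TypeOf line.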

Well, I don’t quite know if the compiler or the docs are wrong, because the docs say that the type of while(true){} is noreturn (which matches the intuition) but doing @TypeOf(while(true){}) crashes the compiler with some LLVM-related message, which implies to me that it’s actually generating the code for while(true){} in order to find out the type (whether or not it actually tries to run it), even if just figuring out that it’s a loop without exits (which can be proven by evaluating the condition (which is comptime-known) and noting that there are no breaks in the body) should have sufficed.

And yes, I do take ‘evaluate’ to mean ‘execute’, since that’s how the word is used in the C/C++ specs IIRC. AFAIK the term for doing type propagation and running comptime blocks (as opposed to runtime execution) is ‘analyze’ in Zig, too, so if type analysis is meant, that would probably have been a better word (it still doesn’t explain the @TypeOf(while(true){}) behavior though; it also doesn’t explain why @TypeOf(unreachable) should be an error, since the type of unreachable is well-defined to be noreturn).
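As a small illustration that noreturn already takes part in ordinary type propagation, here is a sketch in which one branch of an if expression is unreachable. The function name pick is hypothetical, and this says nothing about what @TypeOf(unreachable) itself should do:

```zig
const std = @import("std");

// The else branch has type noreturn, which is compatible with the other
// branch, so the whole `if` expression can be returned as a u32.
fn pick(flag: bool) u32 {
    return if (flag) 1 else unreachable;
}

test "noreturn takes part in ordinary type propagation" {
    try std.testing.expect(@TypeOf(pick(true)) == u32);
    try std.testing.expect(pick(true) == 1);
}
```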

This is interesting. First, Zig does not need to generate LLVM code to find out the type for while(true){}. All types are determined and resolved before Zig generates LLVM IR, so this LLVM error you’re getting is a bug in the compiler and occurs after the type of while(true){} was already determined. My guess here is that when Zig is “evaluating” expressions at “comptime”, it keeps track of whether it’s inside a @TypeOf. I believe this “special handling” of evaluating @TypeOf combined with the while(true){} expression is exposing a corner case bug.

To find out more about what’s going on, let’s consider some examples. Here’s what happens when you use that expression as the return type of a function:

fn foo() @TypeOf(while(true){}) { }
test { _ = foo(); }
$ zig test example.zig
./example.zig:1:18: error: evaluation exceeded 1000 backwards branches
fn foo() @TypeOf(while(true){}) { }
                 ^
./example.zig:1:10: note: referenced here
fn foo() @TypeOf(while(true){}) { }
         ^
./example.zig:2:12: note: referenced here
test { _ = foo(); }

When we use this expression as a return type, the compiler is forced to evaluate it at comptime, and thus gets into an infinite loop. Note, however, that simply wrapping the expression inside a comptime{} block does not have the same effect (likely another bug).

Consider another example where we call an extern function inside our return type expression:

extern fn bar() u32;
fn foo() @TypeOf(bar()) { return 0; }
test { _ = foo(); }
$ zig test example.zig 
Test [1/1] test ""... 
All 1 tests passed.

In this case there is no issue, but we know it’s impossible for the compiler to call the bar function at compile time. So why does the compiler execute the return type expression for while(true){} but only evaluate the return type when it’s bar()? I think the answer is, as I said above, that the compiler keeps track of whether it’s currently inside a @TypeOf expression. When it encounters a function call, it will “short-circuit” executing the function and just grab its return type.

Let’s trick the compiler by forcing it to evaluate the bar() function by moving it outside a @TypeOf call:

fn MyTypeOf(x: anytype) type { return @TypeOf(x); }
extern fn bar() u32;
fn foo() MyTypeOf(bar()) { return 0; }
test { _ = foo(); }
$ zig test example.zig 
./example.zig:3:19: error: unable to evaluate constant expression
fn foo() MyTypeOf(bar()) { return 0; }
                  ^
./example.zig:3:22: note: referenced here
fn foo() MyTypeOf(bar()) { return 0; }
                     ^
./example.zig:4:12: note: referenced here
test { _ = foo(); }
           ^

Now we get a compiler error. This example is exactly the same as the previous one; the only difference is that we changed the order of evaluation by forcing Zig to “evaluate” the bar() call before it knew that the result was just going to be passed to @TypeOf and that only the type would be needed. This confirms that the compiler’s “comptime execution engine” has contextual knowledge about what it’s trying to evaluate. Depending on the surrounding context, the compiler may or may not execute function calls when it is evaluating an expression.
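For contrast, here is a sketch of the same MyTypeOf wrapper where bar is an ordinary Zig function instead of an extern one. The body of bar is my own stand-in, not something from the examples above; my expectation is that this compiles because the call can actually be evaluated at comptime:

```zig
const std = @import("std");

fn MyTypeOf(x: anytype) type {
    return @TypeOf(x);
}

// Hypothetical stand-in for the extern `bar`: its body is visible to the compiler.
fn bar() u32 {
    return 7;
}

// `bar()` is evaluated at comptime as an argument to MyTypeOf.
fn foo() MyTypeOf(bar()) {
    return 0;
}

test "comptime-evaluable argument to MyTypeOf" {
    try std.testing.expect(@TypeOf(foo()) == u32);
    try std.testing.expect(foo() == 0);
}
```

So the error in the extern case comes from the argument being evaluated as a value, which is only possible when the callee's body is known.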

I think the TL;DR is that it’s a mix of quirks of how the stage1 compiler is implemented plus customized behavior for certain kinds of introspection. For example, you were explicitly trying to evaluate unreachable, where normally reaching that would be a bug, so the compiler stops the evaluation and immediately points at the offending line. That is what you want when developing normally, but it might feel inelegant when you are explicitly trying to get the type of unreachable.

Note that since we don’t have a language specification yet (work on it has started but it’s nowhere close to finished), most of this discussion is basically about the C++ code of the stage1 compiler, which is going to be thrown away in a few months anyway.

The semantics I want to write is basically a proposal for a subset of a specification for the ‘full language’, so waiting for the spec to be more advanced would be very much counter to the point.

So this discussion (and the many others I will probably start in the future) will be about what the intended behavior is, not about documenting bugs in stage1. It would be great if we could distill a statement about how @TypeOf is supposed to work.

I personally don’t think @TypeOf(unreachable) (or for that matter @TypeOf(noreturn_func()) or @TypeOf(while(true){})) should be forbidden in the first place. If that were allowed, it would make a specification of the language semantics tremendously simpler, since the compiler would only need to do the normal type propagation, instead of it also emitting code in some special mode that only exists for the purpose of this construct. It would also get rid of a confusing paragraph in the high-level documentation.

However, if there is some real benefit to having @TypeOf(unreachable) raising a compiler error, I would be glad to hear the rationale. As it is, AFAIK Zig is alone among programming languages in having a typeof operator actually evaluate the contained expression (and error out if it doesn’t return), and I don’t know what the point is.

Well, but if it’s not in the spec, then the intended behavior is not yet well defined, and you’re left with having to infer it from the existing implementation and surrounding materials.

You might want to talk with SpexGuy (Martin Wickham on GitHub), who is in charge of writing the spec. He’s also here on the forum as @SpexGuy.

Thanks, I’ll try contacting him :)
Sounds like someone who should be interested in the same kinds of questions as I will have to ask in the foreseeable future.