Skip to content

Proposal to improve the ergonomics and precision of type inference in generic functions #9260

@ghost

Description

Problem

Generic functions, as currently supported in Zig have the form

fn eql(x: anytype, y: @TypeOf(x)) bool {
    return x == y;
}

This works well enough in general, but also has some problems and limitations:

1. Aesthetics

Besides the controversial keyword (#5893), the @TypeOf pattern is a bit of a hack, even though everybody is used to it by now. It somewhat obscures the fact that eql takes two arguments of the same type. It also leads to unnecessary verbosity in some cases:

fn foo(x: anytype, y: @TypeOf(x)) Container(@TypeOf(x)) {
    const T = @TypeOf(x);
    ...
}

2. Type signatures

The type of a simple function like same(x: i32, y: i32) bool {...}, is fn(i32, i32) bool. For the generic function eql, the intended type is something like fn<T>(T,T) bool. But the actual type inferred by the compiler seems to be fn(anytype,anytype) anytype. In my limited testing, I found that the compiler essentially considers all generic and parametric functions to have the same type. Only the number of arguments is checked. E.g., the following code compiles:

const eq: fn(anytype, f32) i32 = eql; // should have failed
assert(eq(1, 1));

It seems that the type check automatically simplifies to const eq: any_generic_function = eql;, which then succeeds. Maybe this is a limitation of the compiler that can be fixed independently, but then only in part. For example, the fact that both arguments of eql should have the same type is simply not expressible in the current system, since parameter names are not part of the type: fn(anytype, @TypeOf(?)) bool.

This state of affairs reduces the precision of type checks, and also discards valuable information that could be used by documentation generators, mataprogrammed interfaces and error messages.

3. Order dependence

Another minor nit is that generic functions constrain the order of function parameters in some cases. This, for example, is not possible:

fn find(haystack: ArrayList(@TypeOf(needle)), needle: anytype) usize {...} // error

The arguments need to be the other way round, although logically they can be in any order. This is an arbitrary, albeit minor, limitation.

Proposed solution

Change the fn(x: anytype) syntax to fn(x: infer T), which infers the type of x and binds it to T. Example:

// before:
fn eql(x: anytype, y: @TypeOf(x)) bool {...}
// after:
fn eql(x: infer T, y: T) bool {...}

// before:
fn find(needle: anytype, haystack: ArrayList(@TypeOf(needle))) usize {...}
// after:
fn find(needle: infer T, haystack: ArrayList(T)) usize {...}

This makes the intent much clearer, IMO. More importantly, the naming of inferred types allows us to talk about the types of generic functions with much greater precision:

// before:
eql: fn(anytype, anytype) anytype
// after:
eql: fn(infer T, T) bool

// before:
find: fn(anytype, anytype) anytype
// after:
find: fn(infer T, ArrayList(T)) usize

This forms the bare bones of the proposal, but there are additional possibilities that could be easily layered on top of it:

// Make type name optional:
fn wrapper(x: infer _) void { other_generic_function(x, 0); }

// Unrestrict order of inferred parameters:
fn find(haystack: ArrayList(T), needle: infer T) usize {...}
// Note: T can still only be inferred once, and only at a particular location, but it
// can be referred to before that. The procedure would be to resolve all infers first,
// and then use them in a second pass to finalize the types of the remaining arguments.

// Structural pattern matching on simple types:
fn foo(bar: ?*infer T, baz: []const T) bool {...}

// However, note that structural inference across custom types is not possible
// and will probably never be possible, because type constructors in Zig can be
// arbitrarily complex. 
fn convert(x: ArrayList(infer T)) HashSet(T) {...} // No can do. Use this instead:
fn convert(comptime T: type, x: ArrayList(T)) HashSet(T) {...}

Peer type resolution

Another straight-forward possibility is to allow peer type resolution, as suggested by @mlugg in this comment. Peer type resolution is enabled by using an infer with the same label more than once:

fn max(x: infer T, y: infer T) T {
    return if (x > y) x else y;
}

This allows us to call max(@as(u8, 5), @as(u16, 3)) and get a u16 as a result. If the function were instead declared as fn max(x: infer T, y: T) T { ... }, then a call with different argument types would be an error.

Nearly the same thing could be achieved by declaring fn max(x: infer _, y: infer _) @TypeOf(x, y) { ... }, but this is less nice syntactically. It may also lead to over-instantiation in some cases: max(u8, u16), max(u16, u8) and max(u16, u16) would produce separate instances. This can be avoided if the peer type resolution happens at the argument level and is not deferred to the return type calculation.

Similar proposals

This proposal is an improved (I hope) version of something suggested by @YohananDiamond here: #5893 (comment). Towards the end of the comment, the following is proposed:

fn max(generic a: T, b: T) T {
    return if (a > b) a else b;
}

In the same comment it is pointed out that this makes it somewhat unclear that T is being declared here. The present proposal addresses this shortcoming and generally makes it clear what and where exactly is inferred. It also leads to a more natural representation of the type of a generic function, but is otherwise only a minor reshuffling of keywords.

Edit: There's also the proposal #6997, which provides a partial solution to the type signature imprecision problem.

Summary

Advantages over status quo:

  • Preserve more type information for docgen, metaprogramming and error messages
  • Reduce visual noise and unintuitive keyword naming
  • Express intent more clearly

Optionally also:

  • Remove constraints on order of arguments
  • Allow structural pattern matching without std.meta in simple cases

It should also be mentioned that inference remains simple and fully deterministic and does not open the door to template metaprogamming or some such.

Metadata

Metadata

Assignees

No one assigned

    Labels

    proposalThis issue suggests modifications. If it also has the "accepted" label then it is planned.

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions