Everything You Always Wanted to Know About Type Inference - And a Little Bit More

The Go Blog

Everything You Always Wanted to Know About Type Inference - And a Little Bit More

Robert Griesemer
9 October 2023

This is the blog version of my talk on type inference at GopherCon 2023 in San Diego, slightly expanded and edited for clarity.

What is type inference?

Wikipedia defines type inference as follows:

Type inference is the ability to automatically deduce, either partially or fully, the type of an expression at compile time. The compiler is often able to infer the type of a variable or the type signature of a function, without explicit type annotations having been given.

The key phrase here is “automatically deduce … the type of an expression”. Go supported a basic form of type inference from the start:

const x = expr  // the type of x is the type of expr
var x = expr
x := expr

No explicit types are given in these declarations, and therefore the types of the constant and variables x on the left of = and := are the types of the respective initialization expressions, on the right. We say that the types are inferred from (the types of) their initialization expressions. With the introduction of generics in Go 1.18, Go’s type inference abilities were significantly expanded.

Why type inference?

In non-generic Go code, the effect of leaving away types is most pronounced in a short variable declaration. Such a declaration combines type inference and a little bit of syntactic sugar—the ability to leave away the var keyword—into one very compact statement. Consider the following map variable declaration:

var m map[string]int = map[string]int{}

vs

m := map[string]int{}

Omitting the type on the left of := removes repetition and at the same time increases readability.

Generic Go code has the potential to significantly increase the number of types appearing in code: without type inference, each generic function and type instantiation requires type arguments. Being able to omit them becomes even more important. Consider using the following two functions from the new slices package:

package slices
func BinarySearch[S ~[]E, E cmp.Ordered](x S, target E) (int, bool)
func Sort[S ~[]E, E cmp.Ordered](x S)

Without type inference, calling BinarySearch and Sort requires explicit type arguments:

type List []int
var list List
slices.Sort[List, int](list)
index, found := slices.BinarySearch[List, int](list, 42)

We’d rather not repeat [List, int] with each such generic function call. With type inference the code simplifies to:

type List []int
var list List
slices.Sort(list)
index, found := slices.BinarySearch(list, 42)

This is both cleaner and more compact. In fact it looks exactly like non-generic code, and type inference makes this possible.

Importantly, type inference is an optional mechanism: if type arguments make code clearer, by all means, write them down.

Type inference is a form of type pattern matching

Inference compares type patterns, where a type pattern is a type containing type parameters. For reasons that will become obvious in a bit, type parameters are sometimes also called type variables. Type pattern matching allows us to infer the types that need to go into these type variables. Let’s consider a short example:

// From the slices package
// func Sort[S ~[]E, E cmp.Ordered](x S)
type List []int
var list List
slices.Sort(list)

The Sort function call passes the list variable as function argument for the parameter x of slices.Sort. Therefore the type of list, which is List, must match the type of x, which is type parameter S. If S has the type List, this assignment becomes valid. In reality, the rules for assignments are complicated, but for now it’s good enough to assume that the types must be identical.

Once we have inferred the type for S, we can look at the type constraint for S. It says—because of the tilde ~ symbol—that the underlying type of S must be the slice []E. The underlying type of S is []int, therefore []int must match []E, and with that we can conclude that E must be int. We’ve been able to find types for S and E such that corresponding types match. Inference has succeeded!

Here’s a more complicated scenario where we have a lot of type parameters: S1, S2, E1, and E2 from slices.EqualFunc, and E1 and E2 from the generic function equal. The local function foo calls slices.EqualFunc with the equal function as an argument:

// From the slices package
// func EqualFunc[S1 ~[]E1, S2 ~[]E2, E1, E2 any](s1 S1, s2 S2, eq func(E1, E2) bool) bool
// Local code
func equal[E1, E2 comparable](E1, E2) bool { … }
func foo(list1 []int, list2 []float64) {
    …
    if slices.EqualFunc(list1, list2, equal) {
        …
    }
    …
}

This is an example where type inference really shines as we can potentially leave away six type arguments, one for each of the type parameters. The type pattern matching approach still works, but we can see how it may get complicated quickly because the number of type relationships is proliferating. We need a systematic approach to determine which type parameters and which types get involved with which patterns.

It helps to look at type inference in a slightly different way.

Type equations

We can reframe type inference as a problem of solving type equations. Solving equations is something that we are all familiar with from high school algebra. Luckily, solving type equations is a simpler problem as we will see shortly.

Let’s look again at our earlier example:

// From the slices package
// func Sort[S ~[]E, E cmp.Ordered](x S)
type List []int
var list List
slices.Sort(list)

Inference succeeds if the type equations below can be solved. Here stands for is identical to, and under(S) represents the underlying type of S:

S ≡ List        // find S such that S ≡ List is true
under(S) ≡ []E  // find E such that under(S) ≡ []E is true

The type parameters are the variables in the equations. Solving the equations means finding values (type arguments) for these variables (type parameters), such that the equations become true. This view makes the type inference problem more tractable because it gives us a formal framework that allows us to write down the information that flows into inference.

Being precise with type relations

Until now we have simply talked about types having to be identical. But for actual Go code that is too strong a requirement. In the previous example, S need not be identical to List, rather List must be assignable to S. Similarly, S must satisfy its corresponding type constraint. We can formulate our type equations more precisely by using specific operators that we write as :≡ and :

S :≡ List         // List is assignable to S
S ∈ ~[]E          // S satisfies constraint ~[]E
E ∈ cmp.Ordered   // E satisfies constraint cmp.Ordered

Generally, we can say that type equations come in three forms: two types must be identical, one type must be assignable to the other type, or one type must satisfy a type constraint:

X ≡ Y             // X and Y must be identical
X :≡ Y            // Y is assignable to X
X ∈ Y             // X satisfies constraint Y

(Note: In the GopherCon talk we used the symbols A for :≡ and C for . We believe :≡ more clearly evokes an assignment relation; and directly expresses that the type represented by a type parameter must be an element of its constraint’s type set.)

Sources of type equations

In a generic function call we may have explicit type arguments, though most of the time we hope that they can be inferred. Typically we also have ordinary function arguments. Each explicit type argument contributes a (trivial) type equation: the type parameter must be identical to the type argument because the code says so. Each ordinary function argument contributes another type equation: the function argument must be assignable to its corresponding function parameter. And finally, each type constraint provides a type equation as well by constraining what types satisfy the constraint.

Altogether, this produces n type parameters and m type equations. In contrast to basic high school algebra, n and m don’t have to be the same for type equations to be solvable. For instance, the single equation below allows us to infer the type arguments for two type parameters:

map[K]V ≡ map[int]string  // K ➞ int, V ➞ string (n = 2, m = 1)

Let’s look at each of these sources of type equations in turn:

1. Type equations from type arguments

For each type parameter declaration

func f[…, P constraint, …]…

and explicitly provided type argument

f[…, A, …]…

we get the type equation

P ≡ A

We can trivially solve this for P: P must be A and we write P ➞ A. In other words, there is nothing to do here. We could still write down the respective type equation for completeness, but in this case, the Go compiler simply substitutes the type arguments for their type parameters throughout and then those type parameters are gone and we can forget about them.

2. Type equations from assignments

For each function argument x passed to a function parameter p

f(…, x, …)

where p or x contain type parameters, the type of x must be assignable to the type of the parameter p. We can express this with the equation

𝑻(p) :≡ 𝑻(x)

where 𝑻(x) means “the type of x”. If neither p nor x contains type parameters, there is no type variable to solve for: the equation is either true because the assignment is valid Go code, or false if the code is invalid. For this reason, type inference only considers types that contain type parameters of the involved function (or functions).

Starting with Go 1.21, an uninstantiated or partially instantiated function (but not a function call) may also be assigned to a function-typed variable, as in:

// From the slices package
// func Sort[S ~[]E, E cmp.Ordered](x S)
var intSort func([]int) = slices.Sort

Analogous to parameter passing, such assignments lead to a corresponding type equation. For this example it would be

𝑻(intSort) :≡ 𝑻(slices.Sort)

or simplified

func([]int) :≡ func(S)

together with equations for the constraints for S and E from slices.Sort (see below).

3. Type equations from constraints

Finally, for each type parameter P for which we want to infer a type argument, we can extract a type equation from its constraint because the type parameter must satisfy the constraint. Given the declaration

func f[…, P constraint, …]…

we can write down the equation

P ∈ constraint

Here, the means “must satisfy constraint” which is (almost) the same as being a type element of the constraint’s type set. We will see later that some constraints (such as any) are not useful or currently cannot be used due to limitations of the implementation. Inference simply ignores the respective equations in those cases.

Type parameters and equations may be from multiple functions

In Go 1.18, inferred type parameters had to all be from the same function. Specifically, it was not possible to pass a generic, uninstantiated or partially instantiated function as a function argument, or assign it to a (function-typed) variable.

As mentioned earlier, in Go 1.21 type inference also works in these cases. For instance, the generic function

func myEq[P comparable](x, y P) bool { return x == y }

can be assigned to a variable of function type

var strEq func(x, y string) bool = myEq  // same as using myEq[string]

without myEq being fully instantiated, and type inference will infer that the type argument for P must be string.

Furthermore, a generic function may be used uninstantiated or partially instantiated as an argument to another, possibly generic function:

// From the slices package
// func CompactFunc[S ~[]E, E any](s S, eq func(E, E) bool) S
type List []int
var list List
result := slices.CompactFunc(list, myEq)  // same as using slices.CompactFunc[List, int](list, myEq[int])

In this last example, type inference determines the type arguments for CompactFunc and myEq. More generally, type parameters from arbitrarily many functions may need to be inferred. With multiple functions involved, type equations may also be from or involve multiple functions. In the CompactFunc example we end up with three type parameters and five type equations:

Type parameters and constraints:
    S ~[]E
    E any
    P comparable
Explicit type arguments:
    none
Type equations:
    S :≡ List
    func(E, E) bool :≡ func(P, P) bool
    S ∈ ~[]E
    E ∈ any
    P ∈ comparable
Solution:
    S ➞ List
    E ➞ int
    P ➞ int

Bound vs free type parameters

At this point we have a clearer understanding of the various source of type equations, but we have not been very precise about which type parameters to solve the equations for. Let’s consider another example. In the code below, the function body of sortedPrint calls slices.Sort for the sorting part. sortedPrint and slices.Sort are generic functions as both declare type parameters.

// From the slices package
// func Sort[S ~[]E, E cmp.Ordered](x S)
// sortedPrint prints the elements of the provided list in sorted order.
func sortedPrint[F any](list []F) {
    slices.Sort(list)  // 𝑻(list) is []F
    …                  // print list
}

We want to infer the type argument for the slices.Sort call. Passing list to parameter x of slices.Sort gives rise to the equation

𝑻(x) :≡ 𝑻(list)

which is the same as

S :≡ []F

In this equation we have two type parameters, S and F. Which one do we need to solve the type equation for? Because the invoked function is Sort, we care about its type parameter S, not the type parameter F. We say that S is bound to Sort because it is declared by Sort. S is the relevant type variable in this equation. By contrast, F is bound to (declared by) sortedPrint. We say that F is free with respect to Sort. It has its own, already given type. That type is F, whatever that is (determined at instantiation time). In this equation, F is already given, it is a type constant.

When solving type equations we always solve for the type parameters bound to the function we are calling (or assigning in case of a generic function assignment).

Solving type equations

The missing piece, now that we have established how to collect the relevant type parameters and type equations, is of course the algorithm that allows us to solve the equations. After the various examples, it probably has become obvious that solving X ≡ Y simply means comparing the types X and Y recursively against each other, and in the process determining suitable type arguments for type parameters that may occur in X and Y. The goal is to make the types X and Y identical. This matching process is called unification.

The rules for type identity tell us how to compare types. Since bound type parameters play the role of type variables, we need to spe

Établi 1y | 9 oct. 2023 à 22:10:49


Connectez-vous pour ajouter un commentaire

Autres messages de ce groupe