Type Inference in JavaScript

25 Jan 2019

Everybody seems to be on the TypeScript bandwagon these days, and to me it seems obvious why: it takes the wild west of JavaScript and tries to smooth it out to look more like Java/C#/C++ that have much wider adoption. Being able to specify that a variable or field is of type number is extremely valuable since it helps to curb a lot of the problems inherent in the fast-and-loose JavaScript type system. There is also good support for things like union types (e.g. string | number, read "string or number") that more closely resemble the real way that JavaScript code is written. Unfortunately I think that while TypeScript implements some really cool types, I'm not sure it covers all of the types that you need to effectively handle JavaScript code without avoiding some of the more common pitfalls. I'm also a huge fan of type inference, so I set out on a hypothetical exercise in type inference.

What is Type Inference?

If you're not familiar with the phrase, you may still be familiar with what it means in practice. Type inference is when you don't have to specify a type explicitly because the compiler just looks at what's going on and says "I've got this". Here's a quick list of examples from popular languages:

In Java, if you're assigning to a variable or field with a known type, you don't need to specify generic parameters, Java can just figure it out for you:

List<Integer> integers = new ArrayList<>();

In C# when you specify a variable using var, the compiler makes it the same type as the expression that you're assigning.

var x = 3 + 4; // x is "int" because "3 + 4" is "int"

C++ does the same as C#, but uses the auto keyword instead of var (since auto was already reserved, but not a particularly useful keyword in its old usage)

auto x = 3 + 4; // x is "int" because "3 + 4" is "int"

Baby Steps

Take a function like this:

function identity(x) {
    return x;
}

What type should it be? In TypeScript you could decide to make it int => int or string => string or any => any, but we want to do type inference. So the first thing we want to do is take all the inputs and give them type parameters:

function identity<T1>(x: T1) {
    return x;
}

So, now that we have a type parameter, it's pretty easy, this function returns T1 making the function of type <T1> (x: T1) => T1. Well that was pretty easy, but it would have to since it's the first example :)

What if we wanted a function to get the last item out of an array:

function last(arr) {
    return arr[arr.length - 1];
}

Let's start with the same old trick: make a generic parameter for each of the arguments.

function last<T1>(arr: T1) {
    return arr[arr.length - 1];
}

Now this type is clearly wrong, because we wouldn't want you doing something crazy like last(5), but this type says it accepts anything, so that'd be valid. Now we need to take the things that we know about arr and put them in the type parameters. First off, we know that it is an object with a length property, let's put that in the type.

function last<T1 extends { length: infer T2 }>(arr: T1) {
    return arr[arr.length - 1];
}

Now, wait, that's not right. Why didn't I say it was a number, after all, we're dealing with an array? Well, we have to constrain ourselves to a set of rules that can happen without any human intervention. We could say that any property named "length" always has to be a number and that actually might get us further than you'd think, but we might not have to do that to get a good outcome. It's possible, though somewhat… yucky… that length might return " 12 ", which as we all know in JavaScript " 12 " - 1 is 11. (extra credit: there are much, much yuckier types that will work for length, look up JavaScript type coercion)

OK, so now let's see. The other thing that affects arr is the indexer, so let's add that to the type:

function last<T1 extends { length: infer T2, [T2 - 1]: infer T3  }>(arr: T1) {
    return arr[arr.length - 1];
}

HOLD YOUR HORSES. <T1 extends { length: infer T2, [T2 - 1]: infer T3 }> (arr: T1) => T3? What on Earth did I just put there? That's not a type, that's just a whole lot of nonsense. Well, it is and it isn't. You see, in a language like C# or Java, we'd just call the indexer something like [number]: infer T3 because that's what their type systems are all about, but in JavaScript land, we want a much more powerful type system, so we're going to have to add some new ideas. See, when I put in length: infer T2, you'd probably think that means T2 is just number, but while that can be the type of T2 (as long with some other yuck), sometimes I can tell you exactly which number T2 represents. Case in point: in last([1, 2, 3]) I can tell you that length is exactly of the type 3, and since there's only one value 3 that's of type 3, I can also tell you that T2 - 1 is exactly of type 2. This has probably either made you sick to your stomach, or you're getting butterflies (honestly, it's hard to tell sometimes). But it also means that I can tell you for sure that T3 is 3 since I know what the items in that array are. It actually makes a lot of things possible that are cooler than they might seem at first:

Now what's the point, who would want such a thing? Well, I'd argue that the fact that you can write such a function and it'll work means that we should want a type system that understands such things. All this and more is possible without ever typing out types yourself, if you can just get over the initial hurdles of representing types as a mix of "traditional" types and also constants.