In Python, don’t initialize local variables unnecessarily

A common pattern:

def foo():
    x = some value # but do we need this? (short answer: no)
    if something:
        # ... do stuff ...
        x = 'bla'
    else:
        x = 'blo'

The variable x is being initialized before the if/else, but the intention of the programmer is that its value will actually be determined by the if/else itself. If somebody later comes around and mistakenly removes one of the assignments (inside ‘if’ or ‘else’), no runtime error will occur and x will remain initialized to a probably wrong value.

Leaving out the initialization is better – in that case, forgetting to set x in one of the branches will cause an UnboundLocalError:

>>> def foo():
...     if False:
...         x = 0
...     return x
... 
>>> foo()
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 4, in foo
UnboundLocalError: local variable 'x' referenced before assignment

Errors are good! (when they flag buggy code)

Now, what if we also have an x declared in the global scope? Because of how Python handles variable scope, the error will still happen (which is good).

>>> x = 1
>>> def foo():
...     if False:
...         x = 0
...     return x
... 
>>> foo()
Traceback (most recent call last):
..
UnboundLocalError: local variable 'x' referenced before assignment

Summary: In Python, don’t initialize variables until you know what value to assign to them.

In Python, don’t initialize local variables unnecessarily

Beware of ‘var’ in for loops, JS programmers

Time and time again, I see people using ‘var’ in the initialization part of a for loop. Example from MDN (Mozilla Developer Network):

for (var i = 0; i < 9; i++) {
   console.log(i);
   // more statements
}

What’s wrong with var i = 0 above? The problem is that variables declared in a for initialization have function scope, just like any var declaration does. In other words, they affect the scope of the entire function. Consider the following:

function outer() {
    var x = 'outer';
    function inner() {
        x = 'inner';
        //
        // ... lots o' code
        //
        for (var x = 0; x < 1; x++) {
            // in for
        }
    }
    inner();
}

In the inner function, x shadows the outer variable throughout, not just inside the for loop. So also the initial statement x = 'inner' at the head of ‘inner’ affects only the locally scoped variable.

This is a classic example of var hoisting, which should qualify as one of JavaScript’s awful parts.

Don’t do it! Move all your ‘var’ statements to the head of each function, please.

Beware of ‘var’ in for loops, JS programmers

infernu news

In the past month, most of the work on infernu was to stabilize the type system, making decisions about some trade-offs. Here’s a short summary:

  • Colors! I refactored all the pretty printing code to use the ansi-wl-pprint package, which fixes indentation and formatting issues and also adds ansi colors to the output. One neat thing is that identical type variables have the same color:
    infernu-colors
  • Changed the way constructor functions are treated. Until now, constructor functions were treated at definition site as regular functions. The difference between constructors and non-constructors only happened at “new” calls, where the expected “this” value was forced to be a closed row type. Unfortunately this breaks if you call “new Foo()” inside the definition of “Foo”.
    To avoid this issue, functions with uppercase names are now treated specially and the “this” row-type is forced to be closed when the function name is added to the environment. This allows maximum flexibility while defining Foo, while ensuring Foo’s type is closed outside the constructor to prevent junk code like var x = new Foo(); x.myWrongProperty = 2;
  • Explored the idea of “Maybe” (or “optional”) types, including having common JS APIs use them for stronger safety. For example, array access should return a Maybe value.
    Unfortunately, since there is no syntax to construct a Maybe-typed value (no “Just”), all values can be implicitly “upgradeed” to Maybe. In other words, there is an ambiguity that break type inference. So for now, not implementing Maybe types (perhaps with explicit annotations they can come back).
  • Decided to disable generalization of row fields (object properties). This decision means that user-defined objects will by default not have polymorphic methods, although the object itself could be polymorphic (and thus different occurrences of the object in the program syntax, will allow instantiation to different types). The reason for this decision is that overly-polymorphic fields cause unexpected type errors, such as when passing objects that contain them as parameters to functions (contravariance can surprise you).
  • Started a document sketching out denotational semantics of JS, as infernu decides these should be, which helped clarify a few issues in the JS -> LC translator. The next step is to change all translations to preserve semantics, currently they only preserve types.
  • Bug fixes: polymorphic subsumption checking, unification of recursive types.
  • Increased compatibility: now using base-compat and a custom prelude to increase compatibility with different GHC versions (thanks to RyanGlScott for submitting a fix to use base-orphans which prompted me to do this).
infernu news

A simple problem with recursive types and subtyping

Here’s a simple example of recursive types interacting badly with subtyping:

T={ foo: T -> A}
U={ foo: U -> B}

Consider T <: U, therefore

(T -> A) <: (U -> B)

Which implies:

U <: T

So T <: U but also U <: T, which is true iff A <: B and B <: A.

In my case, the subtyping relation is polymorphic subsumption: T is subsumed by U iff U is “more polymorphic”, intuitively, it can be instantiated to all types that T can.

This situation arises in rather simple code, involving polymorphic vs. non-polymorphic row fields. For example, A is a row with a polymorphic method, whereas B is a row with a monomorphic (but compatible) method, such as:

A = { method: forall a. a -> () }
B = { method: String -> () }

In this case subsumption (the form of subtyping in play) fails.

One way around this is to avoid subsumption issues altogether by keeping things rank-1, and not using higher-rank row fields. Unfortunately, throwing away polymorphic methods is very bad: consider a non-polymorphic array.map (in JS).

A slightly better workaround is to push the foralls all the way out, keeping all types (including row types) rank-1. Every time an object method is accessed via the property syntax obj.method, we end up instantiating the object’s row type, and get a “fresh” type for the method. We get practically polymorphic methods. That’s the approach I’m investigating for infernu.

A simple problem with recursive types and subtyping

Identity Crisis

Compared to other tools adding static types to JavaScript, Infernu’s main strengths are full type inference and strong type safety. Here are a few examples.

Identity Function

Here is the simplest possible function:

function id(x) {
  return x;
}

TypeScript

TypeScript‘s compiler tsc can generate a .d.ts from that:

declare function id(x: any): any;

That’s no good! We can pass any type we want to our function, but the return type is not tied to the argument type – it is treated as anything, basically untyped. So we can do this:

var n = 'hi'; n = id(5);

And TypeScript will output the following .d.ts:

declare var n: string;

That seems wrong: n is assigned a number via id(5). But wait – there is a way to turn off inference of any types (with --noImplicitAny). If we try that on our identity function, we get:

id.ts(1,13): error TS7006: Parameter 'x' implicitly has an 'any' type.

Explicit Generics

Oops. Ok, but TypeScript has generics! Let’s try that: the TypeScript handbook gives exactly the example we need – we just write the type out explicitly, like so:

function identity<T>(arg: T): T {
    return arg;
}

Great! We got what we needed, but without type inference.

Flow

Facebook’s Flow has a type system that’s (slightly?) different from TypeScript’s, and apparently works differently. Let’s try it. We can use the flow suggest command to get suggested typing (I’m using version 0.7). Here’s what we get for a single file containing only the identity function above: nothing. It doesn’t suggest any type. Ok, let’s try using our id in a way that makes no sense, to induce an error (after all, type checkers are used to find errors). Here’s bad_id.js:

/* @flow */
function id(x) { return x;}
var n = 'hi'; n = id(5);
var z = n; // added so we can see what flow says the type of n is here.

(Note: The /* @flow */ header is used to tell flow that it should look at this file.)
Run flow suggest bad_id.js and you get a diff-style output. I’ve ‘applied’ it to make it easier to read – here’s what flow suggests:

function id(x: number) : number{ return x;}
var n: string | number = 'hi'; n = id(5);
var z: number = n;

Interesting! We managed to get something without reverting to explicit type annotations. But we didn’t get an error!

First, id was inferred to take and return number, apparently because that’s the only way we’ve used it. It would be interesting to see what happens when we use id several times with different types – we’ll try that soon.

Second, n was given a union type string | number, because it takes on both types during its lifetime. It may be a matter of taste, but I would rather not have the type checker deduce implicit union types in this case (n = 'hi'; n = 5;) – instead I would expect that to be an error.

The unique (and impressive) part is that flow is able to tell that z is only ever going to have number values, and so it’s safe to assign it that type. That’s probably a result of the flow analysis they do.

Now let’s try calling id several times, with different types:

/* @flow */
function id(x) { return x;}
id(5); id('hi');

Flow suggests:

function id(x: string | number) : string | number{ return x;}

Uh oh – does this means the argument and result types are no longer tied to each other? If I pass in a number, will the compiler check that I use the result only as a number (and not as a string)? Let’s try using it, doing var n = id(5), flow suggests:

var n: string | number = id(5);

Despite n only ever being assigned a number, it now has type string | number. So apparently, union types propagate implicitly, infecting everything on the way.

Explicit Generics

Fortunately, flow too has generics, and again straight out of the manual we find:

/* @flow */
function foo(x: X): X { return x; }

Great! We got what we needed, but without type inference.

Infernu

Let’s get down to business. Infernu says:

//       id : a.(b -> b)
function id(x) { return x; }

Cool! Without any help from us, Infernu figured out the most generic type. Take a type b, return the same type b. The magic of polymo…Wait, what’s that a. thing?

Well, JavaScript has this nice keyword called this which is dynamically scoped, meaning that this is bound to different things depending on how your function is invoked and not on how it’s defined. For example:

var obj = { hat: { type: 'top' }, getHatType: function() { return this.hat.type; } };
obj.getHatType(); // ok.
var f = obj.getHatType;
f(); // oops! TypeError: Cannot read property 'type' of undefined

Nasty stuff. Every JavaScript programmer should know this.

Fortunately, Infernu is here to save you. It infers not only what arguments and return types a function has, but also what this must be for the function call to work, and verifies that you use it correctly.

Infernu type signatures for functions have the following format (subject to change without notice, bla bla bla):

this.(arguments -> result)

So for our var f = obj.getHatType example, Infernu says:

//  f : {hat: {type: d, ..f}, ..e}.(() -> d)
var f = obj.getHatType;

Decomposing that type signature, we read that this is expected to be an object containing at least a property called ‘hat’ which is an object with at least a property called ‘type’ of some type d. The function takes no arguments (hence the empty ()) and returns the same type d that was taken from the hat.type property of this. (The ellipsis stuff ..f and ..e is due to row-type polymorphism, which will be elaborated upon in a future blog post.)

Back to our identity function, we examine the signature again: a.(b -> b) – the type of this is given an unconstrained type parameter a – so Infernu is telling us explicitly that this is allowed to be anything for our identity function. Yippy!

Summary

We saw that both TypeScript and Flow (and Google Closure too, which I haven’t shown) support generics that can express the identity function properly. They also offer weak forms of type inference that sometimes yields weakly-typed results. Infernu, on the other hand, will infer generic types automatically, and prefers to fail over giving weak typings.

There are many known discussions about subtyping (inheritance)-based type systems, represented here by TypeScript and Flow, vs. parametric polymorphism (being Infernu in this case). There are known pros and cons to both sides: but one important result is that type inference is just easier when there is no subtyping involved.

Infernu is designed to take advantage of that.

Identity Crisis

The wl-pprint package maintainer…

…is now me.

I’ve talked to the previous maintainer who is no longer interested. There was a small change required to support the new ghc (7.10), so I also released a new version. Since the new version only benefits people who need a new ghc, I also added the source-repository field in the cabal file, despite it breaking cabals older than 1.6.

According to hackage, wl-pprint was authored by Daan Leijen. The new github repo is up at https://github.com/sinelaw/wl-pprint.
I’ve taken the liberty of forking based on someone else’s patch for GHC 7.10 – thank you Alex Legg!

Incidentally, on hackage, there are several packages with the substring ‘wl-pprint’ in their names, some are newer and better maintained and have more features than others. This situation is rather confusing. wl-pprint itself is being used as a dependency by at least one package I need (language-ecmascript).

It would be nice if the community converged on a single Wadler-pretty-printer package. (Probably not wl-pprint – maybe ansi-wl-pprint?)

The wl-pprint package maintainer…

Type guards are not inferable

Type guards, or typecase expressions allow refining a binding’s type and are commonly used in dynamic languages such as JavaScript. Consider (adapted from Simple Unification-based Type Inference for GADTs):

f x y = typecase x of
            Int -> i + y
            other -> 0

The function can have either of these two types (among others):

a -> Int -> Int

And

a -> a -> Int

Neither is more general, so there is no principal type.

Unless the user tells us something (in a type annotation) we have no way of deciding which type should be inferred.

So for now, Infernu will probably not have support for type guards like Flow and TypeScript do (both of which don’t have full type inference). Another possibility is to allow type guards in trivial cases only.

Type guards are not inferable