The Road to TypeScript at Quip, Part Two

By Mihai Parparita

Our previous post described the motivation for Quip’s migration to TypeScript, as well as the “medium bangs” approach we would take. This one collects vignettes from what Quip’s Client Infra team encountered while doing that migration.

Communication and education

While the automated migration approach was aimed at minimizing complex transition periods, there was still the risk of an impact on productivity. We didn’t want engineers to come in after a flag day and either be surprised or not able to work because all of the best practices had changed. We employed the following strategies to mitigate this:

  • We wrote play-by-play documents describing what would happen before, during, and after a migration. We advertised these at an all-hands and via chat, and solicited feedback.
  • We carefully scheduled flag days to minimize disruption (for example, avoiding major launches and periods when large parts of the company would be out of office).
  • We created “cheat sheet” documents with equivalents of common idioms before and after the migration. We also took familiarity into account when deciding what code style the migration passes would output (for example, a preference for wildcard imports when dealing with “bag of functions” utility files, to mimic the previous namespace style).
  • The Client Infra team held office hours before and after the migration, and made sure that someone was available in our “Eng Help” channel to answer any questions.
  • We wrote new documentation about how to resolve common errors and debug generated code, and updated as much of the existing documentation as we could find.
  • About a month after the migration, we sent out a survey and wrote a retrospective document to make sure that we didn’t just walk away after the flag day. The biggest issues were around slow tooling, especially when run as a pre-commit check. (See below for how we handled TypeScript errors and prevented regressions; we ended up writing a persistent server to speed up these checks.)

Migrating from React.createClass

In addition to migrating from namespaced JavaScript and from types in JSDoc to TypeScript annotations, we also moved from React.createClass to ES6 class components. Though this was technically not required to switch to TypeScript, the older createClass approach is not fully type-checked by TypeScript, so we would have lost a lot of type safety if we hadn’t done this.

export const ThreadMention = React.createClass({
    mixins: [parts.mixins.Listener],

    propTypes: {
        objectId: React.PropTypes.string.isRequired,
        secretPath: React.PropTypes.string,
        ...
    },

    ...
});

React.createClass component

We had still been using createClass because we had code patterns that were heavily reliant on mixins. Some of the mixins were just collections of utility functions. We were able to convert those to plain functions, and stop relying on them. Others, however, relied on React lifecycle methods, and would result in much more boilerplate if we switched them to plain functions.

We therefore decided to keep supporting mixins by having runtime code do the appropriate method injection and composition. Though this migration was done while we were still using Closure Compiler, we also had a typesafe mixin plan (using decorators) for TypeScript, to make sure that this would not create problems for the next migration.

export class ThreadMention extends React.Component {
    /** @param {ThreadMention.Props} props */
    constructor(props) {
        super(props);
        ReactSupport.deprecatedBindMethodsToInstance(this);
    }

    ...
}

ThreadMention.propTypes = {
    objectId: React.PropTypes.string.isRequired,
    secretPath: React.PropTypes.string,
    ...
};

ReactSupport.mixin(ThreadMention, ListenerMixin);

ES6 Class component

type ThreadMentionProps = {
    objectId: string,
    secretPath?: string,
    ...
};

@mixin(ListenerMixin)
export class ThreadMention extends React.Component<ThreadMentionProps>
    implements MixedIn<ListenerMixin> {
    constructor(props: ThreadMentionProps) {
        super(props);
        deprecatedBindMethodsToInstance(this);
    }

    ...
}

TypeScript Class component
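
To make the "method injection and composition" idea concrete, here is a minimal sketch of what a runtime mixin helper could look like. It is not Quip's actual ReactSupport.mixin: the applyMixin name, the Mixin type, and the list of lifecycle methods are all assumptions made for illustration, and React itself is left out so that the example stays self-contained.

```typescript
// Hypothetical sketch: every name below is illustrative, not Quip's actual code.
type Mixin = { [name: string]: (this: any, ...args: any[]) => unknown };

// Lifecycle methods are composed rather than overwritten (assumed list).
const LIFECYCLE_METHODS = ["componentDidMount", "componentWillUnmount"];

function applyMixin(target: { prototype: any }, mixinDef: Mixin): void {
    Object.keys(mixinDef).forEach((name) => {
        const mixinMethod = mixinDef[name];
        const existing = target.prototype[name];
        if (typeof existing !== "function") {
            // Injection: the component does not define this method itself.
            target.prototype[name] = mixinMethod;
        } else if (LIFECYCLE_METHODS.indexOf(name) !== -1) {
            // Composition: run the mixin's method, then the component's own.
            target.prototype[name] = function (this: unknown, ...args: unknown[]) {
                mixinMethod.apply(this, args);
                return existing.apply(this, args);
            };
        }
        // Otherwise the component's own method wins.
    });
}

// Stand-ins for a mixin and a component (no React dependency here).
const ListenerMixin: Mixin = {
    componentDidMount() {
        this.events.push("mixin: start listening");
    },
    isListening() {
        return true;
    },
};

class ThreadMention {
    events: string[] = [];
    componentDidMount(): void {
        this.events.push("component: mounted");
    }
}

applyMixin(ThreadMention, ListenerMixin);
```

A decorator such as the @mixin(ListenerMixin) shown above could plausibly wrap a helper like this one, with the MixedIn<...> interface telling the type checker about the injected methods.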

Another difference between createClass and class components is that the former have autobinding. While this makes the use of methods as event handlers very convenient, it has performance overhead — especially because it’s done to all methods, even those that are not used as event handlers — and is too much “magic.” We wanted to discourage its use in new code, but we didn’t want to risk breaking existing code. We therefore chose to make it explicit as part of the transform, where all existing components would have a deprecatedBindMethodsToInstance call in their constructor. By both giving it an unappealing name and flagging new uses via a pre-commit check, we’ve driven down its usage over time.
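
As a rough illustration of what explicit binding involves, the sketch below walks an instance's prototype chain and binds every method it finds. This is an assumption about how a helper like deprecatedBindMethodsToInstance might work, not its actual implementation; the bindMethodsToInstance name and the Counter class are hypothetical.

```typescript
// Hypothetical sketch of an explicit "bind everything" helper.
function bindMethodsToInstance(instance: object): void {
    const target = instance as Record<string, unknown>;
    let proto: object | null = Object.getPrototypeOf(instance);
    // Walk from the most derived class upward, stopping at Object.prototype.
    while (proto !== null && proto !== Object.prototype) {
        const current: object = proto;
        Object.getOwnPropertyNames(current).forEach((name) => {
            if (name === "constructor") {
                return;
            }
            const descriptor = Object.getOwnPropertyDescriptor(current, name);
            const alreadyBound = Object.prototype.hasOwnProperty.call(target, name);
            if (descriptor && typeof descriptor.value === "function" && !alreadyBound) {
                // Shadow the prototype method with a bound copy on the instance.
                target[name] = descriptor.value.bind(instance);
            }
        });
        proto = Object.getPrototypeOf(current);
    }
}

class Counter {
    count = 0;
    constructor() {
        bindMethodsToInstance(this);
    }
    increment(): void {
        this.count += 1;
    }
}

const counter = new Counter();
const handler = counter.increment; // detached, as when passed as an event handler
handler();
```

The sketch also shows where the overhead comes from: every method on every instance gets its own bound copy, whether or not it is ever used as a handler.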

Hooks are the better long-term solution to both of these, but they appeared when we had already written a large part of our migration pipeline, and are also different enough that we could not automate a migration to them. It is something that we would like to start adopting for new code.

Bundling and developer mode serving

In addition to type checking, Closure Compiler also handled bundling and minification for us. We wanted to switch to a more “vanilla” toolchain, and after some experimentation, we settled on Rollup and Terser. This produced pretty good results; the biggest issue was that unused method removal is not supported, which made some classes much larger. The bulk of this was due to generated code from protocol buffers, and we ended up using runtime method generation instead to avoid the size hit.

However, for development-time serving this toolchain was too slow. Engineers would have had to wait for a minute or two to see changes. We therefore wrote our own custom development server that compiles and serves one file at a time as ES6 modules, relying on HTTP/2 and caching to keep the serving of thousands of files fast. The overall approach ends up being similar to the ones that Snowpack and Vite employ.

Having a custom dev server also enabled some nice tricks to improve the developer experience. One of the benefits of the old “namespaced” JS was that all symbols were accessible in the Dev Tools console for REPL exploration. When everything is in a module (or bundled into one file), it’s much harder to do such “exploratory” coding. We therefore augmented the dev server to also output a namespace-style $M object with all exported symbols grouped by directory:


Dev Tools console experience

Dev server-generated code to support it
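
The grouping itself can be sketched in a few lines. The version below is only an approximation of the idea: the real dev server generates code, whereas this hypothetical buildNamespace function builds the object at runtime from a map of module paths to exports, and the file paths are made up for the example.

```typescript
// Hypothetical sketch: group exported symbols by directory into one object.
type Exports = { [symbol: string]: unknown };
type Namespace = { [key: string]: unknown };

function buildNamespace(modules: { [path: string]: Exports }): Namespace {
    const root: Namespace = {};
    Object.keys(modules).forEach((path) => {
        // Drop the file name; only the directories become namespace levels.
        const directories = path.split("/").slice(0, -1);
        let node = root;
        directories.forEach((dir) => {
            if (typeof node[dir] !== "object" || node[dir] === null) {
                node[dir] = {};
            }
            node = node[dir] as Namespace;
        });
        // Merge this module's exports into its directory's namespace.
        const exported = modules[path];
        Object.keys(exported).forEach((symbol) => {
            node[symbol] = exported[symbol];
        });
    });
    return root;
}

// Illustrative module paths and exports (assumptions, not Quip's layout).
const $M = buildNamespace({
    "parts/mixins/listener.ts": { ListenerMixin: { name: "ListenerMixin" } },
    "parts/mixins/focus.ts": { FocusMixin: { name: "FocusMixin" } },
    "editor/thread_mention.tsx": { ThreadMention: "component" },
});
```

With an object shaped like this, typing $M.parts.mixins in the console lists every symbol exported from that directory, much like the old namespaces did.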

A long tail

While migrating the core application code took up the majority of the effort, there were many other ancillary parts of the client stack that had to be touched as part of the TypeScript migration:

  • We have a large collection of custom ESLint rules to catch common errors and automate the style guide. Many of those had to be updated to handle the TypeScript AST.
  • We use code generation for a few things: protocol buffers, emoji autocomplete data, and an automatic “asset/icon catalog.” All those had to be changed to output TypeScript too.
  • Our spreadsheet formula engine is written in JavaScript, so that we can run it on both the client and the server. The migration pipeline had to handle it too, and the server runtime had to handle generated code from TypeScript.
  • Our UI tests would sometimes have raw JavaScript in them (where interacting with the DOM was too tedious). We ended up replacing that with a protocol buffer-based API, so that they would be agnostic to what the generated code was like.
  • TypeScript’s enums (especially const enums) are not quite the same as Closure Compiler’s @enum type. We had to remove some runtime and reflection uses and introduce helpers for getting all of the values of enum types.
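
For the last point, a helper for listing enum values might look like the sketch below. The enumValues name and the ThreadType enum are hypothetical. The helper relies on the fact that a regular numeric enum compiles to an object with both name-to-value and value-to-name entries, so the reverse entries have to be filtered out. It cannot work for const enums, which are inlined and leave no runtime object behind.

```typescript
// Hypothetical enum, for illustration only.
enum ThreadType {
    DOCUMENT = 1,
    CHAT = 2,
    SPREADSHEET = 3,
}

// Returns the values of a regular (non-const) enum.
function enumValues<E extends object>(enumObject: E): Array<E[keyof E]> {
    const record = enumObject as unknown as Record<string, E[keyof E]>;
    return Object.keys(record)
        // Numeric enums also contain reverse mappings such as "1" -> "DOCUMENT";
        // keep only the keys that are member names.
        .filter((key) => isNaN(Number(key)))
        .map((key) => record[key]);
}

const allThreadTypes = enumValues(ThreadType);
```

Here allThreadTypes contains the three numeric values and none of the reverse-mapped names.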

Driving down TypeScript compile errors

Even using TypeScript’s most permissive compilation mode, we knew that there was little we could do to avoid arriving in TypeScript with a large number of errors. In the long run this was a benefit, as we wanted to have TypeScript catch more things at compilation time, instead of getting real bugs in production. However, in the short term, we would need to deal with this onslaught of errors. Our approach to this was two-fold:

  1. Segment all TypeScript error codes into two groups: those we felt potentially pointed to a run-time problem and everything else. We would enumerate all errors which fell into the first camp and fix them prior to running the transformation. Doing this helped us to find bugs in our conversion step, and it also resulted in us fixing some potential bugs in the Quip code while still in Closure-land. We even created a custom Closure annotation called @tstype which was used sparingly but allowed us to encode inline a specific TypeScript type when there wasn’t a suitable automated conversion.
  2. Create a mechanism to simply “live with” a large number of TypeScript errors once we committed the result of the transformation. We wanted this because:
    1. We didn’t want to pause general engineering on the product while we drove the error rate down.
    2. We wanted not to be rushed in driving the error rate down and instead feel at liberty to refactor code when it seemed warranted.
    3. We wanted to be able to progressively ratchet down the compilation mode by enabling progressively stricter type checking. This would be a cycle of driving the error rate down and having it spike up again for each new stricter type of checking we enabled.

To codify this, we ended up with a legacy-typescript-errors.txt file which would be checked into Git along with the source. It lists error codes by file, in the order they appear:

A Git pre-commit hook manages this file, allowing the removal of legacy errors but preventing the addition of new ones.
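
The ratchet logic can be sketched as a subsequence check. Everything here is an assumption for illustration: the one-line-per-file "path: CODE CODE" format, the function names, and the file paths are hypothetical, and the actual hook may work differently. The idea is that a file passes if its current error codes, in order, can be obtained by deleting entries from its legacy list.

```typescript
// Hypothetical sketch of a "removals allowed, additions rejected" check.
type ErrorsByFile = { [file: string]: string[] };

// Assumed format: one line per file, e.g. "editor/foo.tsx: TS2322 TS2345".
function parseLegacyErrors(text: string): ErrorsByFile {
    const result: ErrorsByFile = {};
    text.split("\n").forEach((line) => {
        const separator = line.indexOf(": ");
        if (separator === -1) {
            return; // skip blank or malformed lines
        }
        const file = line.slice(0, separator);
        const codes = line.slice(separator + 2).split(" ");
        result[file] = codes.filter((code) => code !== "");
    });
    return result;
}

// True if `current` can be obtained from `legacy` by only deleting entries.
function isSubsequence(current: string[], legacy: string[]): boolean {
    let matched = 0;
    for (let i = 0; i < legacy.length && matched < current.length; i++) {
        if (legacy[i] === current[matched]) {
            matched++;
        }
    }
    return matched === current.length;
}

// Returns the files whose current errors are not covered by the legacy list.
function findFilesWithNewErrors(current: ErrorsByFile, legacy: ErrorsByFile): string[] {
    return Object.keys(current).filter(
        (file) => !isSubsequence(current[file] || [], legacy[file] || [])
    );
}

const legacy = parseLegacyErrors(
    "editor/thread_mention.tsx: TS2322 TS2345 TS2322\n" +
    "folders/sidebar.tsx: TS7006\n"
);
const offenders = findFilesWithNewErrors(
    {
        "editor/thread_mention.tsx": ["TS2322", "TS2322"], // one error fixed
        "folders/sidebar.tsx": ["TS7006", "TS2531"],       // one error added
        "folders/new_file.tsx": ["TS2322"],                // not in the legacy list
    },
    legacy
);
```

In this example offenders holds the second and third files, so a hook built on it would reject the commit while still accepting the fix in the first file.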

Our hope was that the workflow for engineers not specifically focused on driving down the error rate would be relatively normal — in the sense that they would only get compile errors for changes related to local edits at the time of commit. At the same time, when working on their change (for example in VSCode), the presence of red squiggly lines would be some incentive for them to take some additional time to clear some legacy errors while they were there.

When we did the migration, we were at roughly 6,000 errors. With the help of some tooling, as well as encouraging engineers to use “fixit” time to resolve errors in parts of the code they were familiar with, we were able to bring this down to near zero in a few months. In the process, we identified 30 small-yet-real bugs that TypeScript flagged at compile time, which Closure Compiler had not been able to catch.

Encouraged by the success of this approach, we decided to extend the legacy errors concept beyond the initial set from the migration. We therefore enabled TypeScript’s --strict option and have since been working through those errors, too. These have mostly been around strict null and undefined checks (see the Visual Studio Code team’s experience for dealing with those).

Expected/legacy TypeScript errors over time

An Unexpected Outage

A couple of weeks after the successful release of our TypeScript-backed code to production, we noticed that some servers were becoming unresponsive and eventually getting restarted by the supervisor process that managed them. These servers were responsible for handling uploads from users, which usually means attachments to documents and messages, but also involves reports of client-side JavaScript exceptions that were encountered.

After some digging, it turned out to have to do with the style of JavaScript that the new build pipeline generated. See if you can spot the difference below:


Closure Compiler-generated JS

TypeScript/Rollup/Terser-generated JS

While both outputs are minified, Closure Compiler occasionally puts in a line break (every few hundred characters) but Terser does not (by default). This should make no difference to the browser, and indeed it does not. However, when we receive an exception report on the server, we want to turn its call stack back to the original (un-minified) code to make debugging it easier. This is done using a source map that also gets output as part of the build process.

It turned out that our source mapping code had O(N) behavior, where N is the length of the output line. When N was in the hundreds, this was not a problem, but when N became 1 million or more, this became much more CPU intensive. This was not immediately apparent after the migration because we have a low volume of errors. A few weeks later a harmless-yet-high-volume error was accidentally deployed, which effectively became a DDoS attack that we triggered on ourselves.

The fix was to change the mapping code to use a binary search, which made the process O(log N).
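
For illustration, a binary search over mapping segments could look like the sketch below. It assumes the decoded source map for a generated line is available as an array of segments sorted by generated column. The Segment shape and the findSegment name are hypothetical, not the actual server code.

```typescript
// Hypothetical shape of one decoded source map segment on a generated line.
interface Segment {
    generatedColumn: number;
    source: string;
    originalLine: number;
}

// Finds the last segment starting at or before `column`.
// Assumes `segments` is sorted by generatedColumn in ascending order.
function findSegment(segments: readonly Segment[], column: number): Segment | undefined {
    let low = 0;
    let high = segments.length - 1;
    let best: Segment | undefined;
    while (low <= high) {
        const mid = (low + high) >>> 1;
        const candidate = segments[mid];
        if (candidate.generatedColumn <= column) {
            best = candidate; // a valid match; look for a later one to the right
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }
    return best;
}

// Illustrative data: one very long minified line with four segments.
const lineSegments: Segment[] = [
    { generatedColumn: 0, source: "a.ts", originalLine: 1 },
    { generatedColumn: 120, source: "a.ts", originalLine: 7 },
    { generatedColumn: 4500, source: "b.ts", originalLine: 42 },
    { generatedColumn: 1000000, source: "c.ts", originalLine: 3 },
];
const hit = findSegment(lineSegments, 5000);
```

Here hit is the segment starting at column 4500 in b.ts. Each lookup inspects a logarithmic number of segments instead of scanning the whole line.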

The Road Continues

We hope you’ve enjoyed reliving our adventures on the road to TypeScript. We’ve already mentioned that having Lucidchart’s experiences was invaluable, and we were also interested to read about the path that Airbnb took. Mature codebases are all different, so different tradeoffs and technical solutions are going to be needed. The more these experiences are shared, the better other teams will be informed when attempting their migrations, whether it’s to TypeScript or any other new technology.