Proposal: Ecosystem: CoffeeScript in Prettier

From @GeoffreyBooth on 2017-11-25 07:12

An oft-requested improvement to the CoffeeScript ecosystem is support for the language in Prettier. Our own @lydell is also a maintainer of that project, so I asked him what would be required to make it happen. He boiled it down to two major tasks:

Produce a detailed abstract syntax tree (AST)

Something would need to be able to produce a JSON representation of the nodes of the abstract syntax tree (AST). An AST is a representation of all the parts of syntax of a program, like AssignmentExpression; the site astexplorer.net has great examples. You can see a simplified version of CoffeeScript’s AST by running coffee --nodes test.coffee. A fuller version can be seen by going to http://asaayers.github.io/clfiddle/ and clicking the AST tab, then one of the nodes in the tree.

Since the CoffeeScript compiler itself already has the --nodes option, it seems logical to me to extend it to produce this JSON-based output. Currently the Node API for the coffeescript module doesn’t support a nodes option, so we could add one, and have its output be plain JavaScript objects that could be JSON.stringify’ed.
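
As a rough sketch of what “plain JavaScript objects that could be JSON.stringify’ed” might look like, the TypeScript below converts a tiny node tree into plain objects. The classes, the `toPlain` method, and the field names are hypothetical stand-ins for illustration; they are not the compiler’s actual node classes or its Node API.

```typescript
// Hypothetical stand-ins for compiler nodes; not the compiler's real classes.
interface PlainNode {
  type: string;
  [key: string]: unknown;
}

interface Serializable {
  toPlain(): PlainNode;
}

class Identifier implements Serializable {
  name: string;
  constructor(name: string) {
    this.name = name;
  }
  toPlain(): PlainNode {
    return { type: "Identifier", name: this.name };
  }
}

class NumberLiteral implements Serializable {
  raw: string;
  constructor(raw: string) {
    this.raw = raw;
  }
  toPlain(): PlainNode {
    // Keep both the parsed value and the text exactly as it was written.
    return { type: "NumericLiteral", value: Number(this.raw), raw: this.raw };
  }
}

class Assign implements Serializable {
  left: Serializable;
  right: Serializable;
  constructor(left: Serializable, right: Serializable) {
    this.left = left;
    this.right = right;
  }
  toPlain(): PlainNode {
    return {
      type: "AssignmentExpression",
      left: this.left.toPlain(),
      right: this.right.toPlain(),
    };
  }
}

// Represents the CoffeeScript statement: answer = 0x2A
const tree = new Assign(new Identifier("answer"), new NumberLiteral("0x2A"));
const json = JSON.stringify(tree.toPlain(), null, 2);
console.log(json);
```

Because the result contains only plain objects, strings, and numbers, it survives a round trip through JSON without any custom serialization.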

That wouldn’t be the end of the job, however. We would also need to ensure that this AST output is complete, with the same amount of information as the original source code, such that you could reconstruct the original source using nothing but this AST. In the CoffeeScript compiler, some simplifications are made at the lexer stage, before the nodes get generated: numbers lose their original 0x, 0o or 0b prefix (if any), whitespace is lost in multiline strings, multiline regexes are turned into a RegExp() call, etc. These changes would need to be refactored to happen in nodes.coffee, or additional detail about the node would need to be saved as a property on the node (like we currently tack on the source map location data or comments). The goal is that this JSON representation of the source code could then be used to output new source code, formatted as Prettier deems it should be formatted. Which leads us to:

Write a CoffeeScript code generator

Once a JSON version of the AST is available, we’ll need some function that takes it as input and produces a string of CoffeeScript source code as output. You’ve probably seen one of these already: js2coffee takes an AST produced by a JavaScript parser and creates CoffeeScript source code from those nodes. The function that does this is called a code generator, and js2coffee’s is here. With dependencies, it’s over a thousand lines of code. There’s one other CoffeeScript code generator that I’m aware of, cscodegen produced by the CoffeeScriptRedux effort, but it hasn’t been updated since 2012.
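
To make “takes an AST as input and produces a string of CoffeeScript source code as output” concrete, here is a deliberately tiny generator sketch in TypeScript. The node shapes (`CsNode` and its `type` names) are assumptions made for illustration and do not match js2coffee, cscodegen, or the CoffeeScript compiler; a real generator also has to deal with indentation, operator precedence, and comments. Note that it can only reproduce a literal such as 0x1F because this hypothetical AST stores the original text in `raw`.

```typescript
// Hypothetical node shapes, for illustration only.
type CsNode =
  | { type: "Identifier"; name: string }
  | { type: "NumericLiteral"; raw: string }
  | { type: "AssignmentExpression"; left: CsNode; right: CsNode }
  | { type: "CallExpression"; callee: CsNode; args: CsNode[] };

// Recursively turn a node into CoffeeScript source text.
function generate(node: CsNode): string {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "NumericLiteral":
      // Emit the text as originally written, so prefixes like 0x survive.
      return node.raw;
    case "AssignmentExpression":
      return `${generate(node.left)} = ${generate(node.right)}`;
    case "CallExpression": {
      const printedArgs = node.args.map((arg) => generate(arg)).join(", ");
      return `${generate(node.callee)}(${printedArgs})`;
    }
  }
}

const ast: CsNode = {
  type: "AssignmentExpression",
  left: { type: "Identifier", name: "x" },
  right: {
    type: "CallExpression",
    callee: { type: "Identifier", name: "f" },
    args: [
      { type: "NumericLiteral", raw: "1" },
      { type: "NumericLiteral", raw: "0x1F" },
    ],
  },
};

const source = generate(ast);
console.log(source); // prints: x = f(1, 0x1F)
```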

Prettier is itself a code generator. If it were to support CoffeeScript, a new code generator would need to be written as part of Prettier itself. Within the Prettier codebase, the code generators for supported languages are in src/printer*.js. One code generator supports all of JavaScript plus TypeScript and Flow, and it’s plain printer.js. It’s 5,000 lines of code. Writing a similar generator for CoffeeScript might not be much simpler, but you would be able to use js2coffee and cscodegen’s codebases as reference (not to mention Prettier’s JavaScript code generator) so you’re not starting from scratch.

So . . .

I would be willing to tackle the first task, outputting a detailed JSON AST, if one or more volunteers were up for the second task. Does anyone desire CoffeeScript support in Prettier strongly enough to invest the time in writing a quality CoffeeScript code generator?

2 thoughts on “Proposal: Ecosystem: CoffeeScript in Prettier”

  1. A little progress update: with the help of #5079, I reached a nice milestone where I can reformat the whole CoffeeScript test suite (i.e. tests/*.coffee) with Prettier and the tests still pass. That should give you a sense of the coverage of language constructs as far as outputting usable AST nodes. And they’re formatted rather nicely (sections below ~~~~~~~~ lines are the reformatted versions).

    This is using my prettier-plugin-coffeescript repo and my CoffeeScript branch of Prettier.

    In order to achieve rules like “call parens are optional if the enclosing parent breaks”, I had to introduce a new formatting primitive to Prettier (opened PR). For example, this is ok:

    f(
      g h
      i
    )

    but here the call to g requires parens:

    f(g(h), i)

    With that primitive in place, I’m able to have a rather sophisticated awareness of e.g. when it’s ok to omit parens/braces, which I see as imperative for an opinionated CoffeeScript formatter.
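
The rule “call parens are optional if the enclosing parent breaks” can be illustrated with a simplified, standalone sketch in TypeScript. This is not the Prettier primitive mentioned above and does not use Prettier’s API: the `Expr` shape, the function names, and the `width` parameter are all hypothetical, and the only layout decision modeled is whether the outer call fits on one line.

```typescript
// Hypothetical, simplified expression shape.
type Expr =
  | { type: "Identifier"; name: string }
  | { type: "CallExpression"; callee: string; args: Expr[] };

// Single-line form: every call keeps its parentheses.
function printFlat(node: Expr): string {
  if (node.type === "Identifier") return node.name;
  return `${node.callee}(${node.args.map((a) => printFlat(a)).join(", ")})`;
}

// Implicit-call form (`g h` rather than `g(h)`), used only for an argument
// that sits on its own line. Calls without arguments must keep their parens.
function printImplicit(node: Expr): string {
  if (node.type === "Identifier" || node.args.length === 0) return printFlat(node);
  return `${node.callee} ${node.args.map((a) => printFlat(a)).join(", ")}`;
}

// If the call fits within `width`, print it flat with parens on nested calls.
// Otherwise break it, one argument per line, and drop the nested parens.
function printCall(node: Expr, width: number): string {
  const flat = printFlat(node);
  if (node.type === "Identifier" || flat.length <= width) return flat;
  const lines = node.args.map((arg) => "  " + printImplicit(arg));
  return [`${node.callee}(`, ...lines, ")"].join("\n");
}

const call: Expr = {
  type: "CallExpression",
  callee: "f",
  args: [
    { type: "CallExpression", callee: "g", args: [{ type: "Identifier", name: "h" }] },
    { type: "Identifier", name: "i" },
  ],
};

console.log(printCall(call, 80)); // fits on one line: f(g(h), i)
console.log(printCall(call, 5));  // too narrow: breaks, and g drops its parens
```

In a real formatter the decision has to be made by the layout engine, since whether the parent breaks is only known once line fitting has been resolved.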

    The biggest remaining chunk of both AST generation and Prettier formatting is comments.