Introducing sphinx-js, a better way to document large JavaScript projects

Until now, there has been no good tool for documenting large JavaScript projects. JSDoc, long the sole contender, has some nice properties:

  • A well-defined set of tags for describing common structures
  • Tooling like the Closure Compiler which hooks into those tags

But the output is always a mere alphabetical list of everything in your project. JSDoc scrambles up and flattens out your functions, leaving new users to infer their relationships and mentally sort them into comprehensible groups. While you can get away with this for tiny libraries, it fails badly for large ones like Fathom, which has complex new concepts to explain. What I wanted for Fathom’s manual was the ability to organize it logically, intersperse explanatory prose with extracted docs, and add entire sections which are nothing but conceptual overview and yet link into the rest of the work. [1]

The Python world has long favored Sphinx, a mature documentation tool with support for many languages and output formats, along with top-notch indexing, glossary generation, search, and cross-referencing. People have written entire books in it. Via plugins, it supports everything from Graphviz diagrams to YouTube videos. However, its JavaScript support has always lacked the ability to extract docs from code.

Now sphinx-js adds that ability, giving JavaScript developers the best of both worlds.

sphinx-js consumes standard JSDoc comments and tags—you don’t have to do anything weird to your source code. (In fact, it delegates the parsing and extraction to JSDoc itself, letting it weather future changes smoothly.) You just have Sphinx initialize a docs folder in the root of your project, activate sphinx-js as a plugin, and then write docs to your heart’s content using simple reStructuredText. When it comes time to call in some extracted documentation, you use one of sphinx-js’s special directives, modeled after the Python-centric autodoc’s mature example. The simplest looks like this:

.. autofunction:: linkDensity

That would go and find this function…

/**
 * Return the ratio of the inline text length of the links in an element to
 * the inline text length of the entire element.
 *
 * @param {Node} node - The node whose density to measure
 * @throws {EldritchHorrorError|BoredomError} If the expected laws of the
 *     universe change, raise EldritchHorrorError. If we're getting bored of
 *     said laws, raise BoredomError.
 * @returns {Number} A ratio of link length to overall text length: 0..1
 */
function linkDensity(node) {
  ...
}

…and spit out a nicely formatted block like this:

(the previous comment block, formatted nicely)

Sphinx begins to show its flexibility when you want to do something like adding a series of long examples. Rather than cluttering the source code around linkDensity, the additional material can live in the reStructuredText files that comprise your manual:

.. autofunction:: linkDensity
   
   Anything you type here will be appended to the function's description right
   after its return value. It's a great place for lengthy examples!

There is also a sphinx-js directive for classes, either the ECMAScript 2015 sugared variety or the classic functions-as-constructors kind decorated with @class. It can optionally iterate over class members, documenting as it goes. You can control ordering, turn private members on or off, or even include or exclude specific ones by name—all the well-thought-out corner cases Sphinx supports for Python code. Here’s a real-world example that shows a few truly public methods while hiding some framework-only “friend” ones:

.. autoclass:: Ruleset(rule[, rule, ...])
   :members: against, rules

(Ruleset class with extracted documentation, including member functions)
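To make the two class styles concrete, here is a minimal sketch of each. The names `Counter` and `LegacyCounter` are hypothetical, invented for this illustration; they are not part of Fathom or sphinx-js.

```javascript
/**
 * A counter written as an ECMAScript 2015 class. (Hypothetical example.)
 */
class Counter {
  /**
   * @param {Number} start - The value to begin counting from
   */
  constructor(start) {
    this.count = start;
  }

  /**
   * Add one to the count.
   *
   * @returns {Number} The new count
   */
  increment() {
    this.count += 1;
    return this.count;
  }
}

/**
 * The same counter as a classic constructor function. The @class tag tells
 * JSDoc to treat the function as a class. (Hypothetical example.)
 *
 * @class
 * @param {Number} start - The value to begin counting from
 */
function LegacyCounter(start) {
  this.count = start;
}

/**
 * Add one to the count.
 *
 * @returns {Number} The new count
 */
LegacyCounter.prototype.increment = function () {
  this.count += 1;
  return this.count;
};
```

Assuming both definitions live somewhere Sphinx is configured to look, an autoclass directive naming either one, with a members option, should pick up `increment` in the same way.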

Going beyond the well-established Python conventions, sphinx-js supports references to same-named JS entities that would otherwise collide: for example, one foo that is a static method on an object and another foo that is an instance method on the same. It does this using a variant of JSDoc’s namepaths. For example…

  • someObject#foo is the instance method.
  • someObject.foo is the static method.
  • And someObject~foo is an inner member, the third possible kind of overlapping thing.

Because JSDoc is still doing the analysis behind the scenes, we get to take advantage of its understanding of these JS intricacies.
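As a rough illustration of how three things named foo can coexist, consider the sketch below. `SomeObject` is a hypothetical constructor made up for this example, and the `innerResult` property exists only so the inner function can be exercised from outside.

```javascript
/**
 * A hypothetical constructor, invented to show the three kinds of member.
 *
 * @class
 */
function SomeObject() {
  /**
   * Inner member, namepath SomeObject~foo: a function local to the
   * constructor and invisible from outside it.
   */
  function foo() {
    return 'inner';
  }

  // Expose the inner function's result so the example can be checked.
  this.innerResult = foo();
}

/**
 * Instance method, namepath SomeObject#foo.
 */
SomeObject.prototype.foo = function () {
  return 'instance';
};

/**
 * Static method, namepath SomeObject.foo.
 */
SomeObject.foo = function () {
  return 'static';
};
```

All three functions share the short name foo, yet each has a distinct namepath, which is what lets a directive refer to exactly one of them.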

Of course, JS is a language of heavy nesting, so things can get deep and dark in a hurry. Who wants to type this full path in order to document innerMember?

some/file.SomeClass#someInstanceMethod.staticMethod~innerMember

Yuck! Fortunately, sphinx-js indexes all such object paths using a suffix tree, so you can use any suffix that unambiguously refers to an object. You could likely say just innerMember. Or, if there were two objects called “innerMember” in your codebase, you could disambiguate by saying staticMethod~innerMember and so on, moving to the left until you have a unique hit. This delivers brevity and, as a bonus, saves you having to touch your docs as things move around your codebase.
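The lookup behavior can be sketched in a few lines. This is not sphinx-js's actual implementation: for simplicity it uses a plain Map from every suffix to the full paths that end with it, rather than a real suffix tree, and it assumes the characters `.`, `#`, and `~` appear only as namepath delimiters. The function names and the second path in the sample data are hypothetical.

```javascript
// Split a namepath into segments, keeping each delimiter (., #, or ~)
// attached to the segment that follows it.
function segments(path) {
  return path.split(/(?=[.#~])/);
}

// Map every suffix of every path to the list of full paths ending with it.
function buildIndex(paths) {
  const index = new Map();
  for (const path of paths) {
    const segs = segments(path);
    for (let i = 0; i < segs.length; i++) {
      // Join the tail segments and drop the delimiter that leads them.
      const suffix = segs.slice(i).join('').replace(/^[.#~]/, '');
      if (!index.has(suffix)) {
        index.set(suffix, []);
      }
      index.get(suffix).push(path);
    }
  }
  return index;
}

// Return the single full path a suffix refers to, or throw if the suffix
// matches nothing or matches more than one path.
function resolve(index, suffix) {
  const hits = index.get(suffix) || [];
  if (hits.length === 0) {
    throw new Error(`No object matches "${suffix}"`);
  }
  if (hits.length > 1) {
    throw new Error(`"${suffix}" is ambiguous: ${hits.join(', ')}`);
  }
  return hits[0];
}

const index = buildIndex([
  'some/file.SomeClass#someInstanceMethod.staticMethod~innerMember',
  'other/file.OtherClass#otherMethod~innerMember', // hypothetical second path
]);
```

With these two paths indexed, `resolve(index, 'innerMember')` throws because the suffix is ambiguous, while `resolve(index, 'staticMethod~innerMember')` returns the first full path.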

With the maturity and power of Sphinx, backed by the ubiquitous syntactical conventions and proven analytic machinery of JSDoc, sphinx-js is an excellent way to document any large JS project. To get started, see the readme. Or, for a large-scale example, see the Fathom documentation. A particularly juicy page is the Rule and Ruleset Reference, which intersperses tutorial paragraphs with extracted class and function docs; its source code is available behind a link in its upper right, as for all such pages.

I look forward to your success stories and bug reports—and to the coming growth of rich, comprehensibly organized JS documentation!


[1] JSDoc has tutorials, but they are little more than single HTML pages. They have no particular ability to cross-link with the rest of the documentation nor to call in extracted comments.

About Erik Rose

Erik chips away at the barrier between human cognition and machine execution, through projects like DXR (search & static analysis on Mozilla codebases), Fathom (semantic extraction from web pages), parsers, new languages, and a whole mess of Python libraries.



15 comments

  1. Rudolf Olah

    Really excited about this. Always loved using Sphinx for Python projects and introduced it in almost every Django project because it’s just so damn useful. JSDoc left a lot to be desired. In the Ruby world using YARD has been pretty great but it isn’t Sphinx.

    One thing that’s great about Sphinx is the multiple output formats. I’ve had clients who wanted a PDF of the API, and being able to create an EPUB is pretty sweet too. The single page vs multiple page HTML builders/output formats are also immensely useful.

    …anyway, thanks for the great work!

    July 17th, 2017 at 08:57


    1. Carol Willing

      Thanks Erik and other Mozilla contributors. I just tried out sphinx-js on one of our Jupyter projects. Took about 15 minutes to get started. Thanks for the great documentation and Fathom example. :D

      July 17th, 2017 at 19:36


      1. Erik Rose

        Thanks for giving it a shot and letting us know! (And thanks for your PSF, CPython, and Jupyter work, while I’m at it!)

        July 18th, 2017 at 06:44


  2. Roy Sutton

    I was pretty excited when I started reading this. Then I got to this line:

    “In fact, it delegates the parsing and extraction to JSDoc itself, letting it weather future changes smoothly.”

    JSDoc has been dead in the water for quite a long time (unless something has changed recently) and doesn’t support modern JavaScript libraries. Have you looked at something like documentation.js as a parser back-end instead?

    July 17th, 2017 at 14:29


    1. Erik Rose

      I wrote Fathom in ES6, and JSDoc seems to do just fine: classes, generators, spread operators, destructuring assignment, getters and setters, etc. However, I’d be happy to swap it out for something that supports even more and is well-maintained. We did look into documentation.js just the other day (https://github.com/erikrose/sphinx-js/issues/18) and found that its author is having existential doubts (https://macwright.org/2017/06/06/documentation-js.html). The other contender, which was brought to my attention via the Hacker News thread (https://news.ycombinator.com/item?id=14786554) is esdoc (https://github.com/esdoc/esdoc). It seems mostly the work of a single person but may go even further with its support for modern syntactical features. Do you have any thoughts on it?

      [Edit: To be fair, jsdoc also looks to be largely the work of one person.]

      July 18th, 2017 at 06:50


      1. Tom MacWright

        Hi there (Tom of documentation.js here) – just to clarify, my existential doubts are about contributors, not the code in the project. I think the esdoc fellow has the same feeling, and I’m sure JSDoc does too – they have even more users than my project or esdoc, and not many contributors.

        No shade from here: I think sphinx-js is a great and original take on documentation generation. But I want to clarify that, well, this problem of maintainer burnout from their maintainer-bases not growing is pretty much the undercurrent of modern JavaScript, and the only way we’ll make progress or even survive is more folks contributing. So, well, if anyone wants to contribute or is thinking about helping, please do!

        July 20th, 2017 at 19:58


        1. Erik Rose

          Thanks for clarifying!

          As the author of several open-source projects, I empathize. A few of my projects have attracted “champion committers” who I’ve drafted as maintainers, like Bo Bayles on more-itertools. That’s such a pleasure. But most projects just go at the speed I can manage.

          I hesitate to venture advice, but this is an important enough subject that I’ll throw something at the wall for what it’s worth. A lot of us have this problem; it’s been a rising theme in many open-source communities. My recent strategy has been to improve my response time on ticket comments, get my own docs in order, and promptly hand over commit rights and authorship credit to the people whose involvement I see accelerating. Fast is better than comprehensive. That seems to help build a community of clueful committers who can mutually support. You covered some of that in your article.

          I suspect the problem is worse in JavaScript Land, which has a technical inducement to splintering. It lacks even the most basic affordances that one would find in a standard library. This leads everyone to make their own (a problem you also see in Lisps), leading to a profusion of even primitives. Two examples are the Map and Set data types (which are *still* only half baked, lacking things like intersection) and iteration (which is such a late addition to the language that the existing Array-assuming codebase will act as a brake on its use for years to come). Incompatible primitives lead to non-interoperability of higher-level libs and the ecosystem at large, with outworkings even into documentation systems, which are further plagued by competing attempts to mitigate JS’s morbid combination of silent failures and a weak type system. It would be great if somehow we could set up the sorts of “attractors” that occur in the Python world, like the requests package (which everyone uses for HTTP 1.x) or the protocols defined in the stdlib (iteration, maps, sequences). I’m just not sure how to go about that.

          July 21st, 2017 at 13:23


  3. Veera

    Looks promising.

    July 18th, 2017 at 21:54


  4. Shrike

    How about support for TypeScript?

    July 20th, 2017 at 02:47


    1. Erik Rose

      I’d be happy to entertain a patch for TypeScript support. Or if you merely know of a TypeScript metadata extractor that emits a tractable data format, that would be valuable to know. It would be trivial to make the extractor we use pluggable (see https://github.com/erikrose/sphinx-js/issues/17#issuecomment-316668901). Then supporting other sources would be as simple as writing an adapter that massages their output into JSDoc’s doclet format.

      July 20th, 2017 at 06:57


  5. yahiko

    I’m quite reluctant to install some Python stuff to document my JS sources… Can’t it be fully in JS/NodeJS?

    July 20th, 2017 at 04:02


    1. Erik Rose

      Easy to say, harder to do. Sphinx has 10 years of development behind it and is very mature. So I chose to write a short adapter rather than try to recreate 360,000 lines of Sphinx in JS. The Linux kernel people recently made the same decision, adopting Sphinx rather than writing their own tool in C. :-)

      July 20th, 2017 at 07:03


  6. Axel Rauschmayer

    I take it, Sphinx also dictates reStructuredText? I’d prefer Markdown.

    July 20th, 2017 at 11:42


    1. Erik Rose

      reStructuredText has several advantages over Markdown, like a single, well-defined syntax that allows for nesting of arbitrary constructs. It also has “directives” for expanding the syntax. There is http://www.sphinx-doc.org/en/stable/markdown.html, but I’m not sure if it supports an equivalent to ReST directives, which you’d need in order to call sphinx-js’s functionality (or make anything else beyond a simple formatted page).

      July 20th, 2017 at 12:06


    2. Erik Rose

      I dug in a bit, and you can use directives in recommonmark, but you have to wrap them in a bulky block, e.g.

      ```eval_rst
      .. autoclass:: recommonmark.transform.AutoStructify
      ```

      For more, see https://recommonmark.readthedocs.io/en/latest/auto_structify.html#embed-restructuredtext.

      July 20th, 2017 at 12:14

