Diffstat (limited to 'README.md')
-rw-r--r-- README.md 290
1 files changed, 290 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 00000000..77b7df66
--- /dev/null
+++ b/README.md
@@ -0,0 +1,290 @@
+> **Tl;dr** — I want to make UglifyJS2 faster, better, easier to maintain
+> and more useful than version 1. If you enjoy using UglifyJS v1, I can
+> promise you that you will love its successor.
+
+> Please help me make this happen by funding the development!
+
+> <a href='http://www.pledgie.com/campaigns/18110'><img alt='Click here to lend your support to: Funding development of UglifyJS 2.0 and make a donation at www.pledgie.com !' src='http://www.pledgie.com/campaigns/18110.png?skin_name=chrome' border='0' /></a>
+
+UglifyJS v2
+===========
+
+[UglifyJS](https://github.com/mishoo/UglifyJS) is a popular JavaScript
+parser/compressor/beautifier that is itself written in JavaScript. Version
+1 is battle-tested and used in many production systems. The parser is
+[included in WebKit](http://src.chromium.org/multivm/trunk/webkit/Source/WebCore/inspector/front-end/UglifyJS/parse-js.js).
+In two years UglifyJS got over 3000 stars on GitHub and hundreds of bugs
+have been identified and fixed, thanks to a great and expanding community.
+
+I'd say version 1 is rock stable. However, its architecture can't be
+stretched much further. Some features are hard to add, such as source maps
+or keeping comments in the compressed AST. I started work on version 2 in
+May, but I gave up quickly because I lacked time. What prompted me to
+resume it was investigating the difficulty of adding source maps (an
+[increasingly popular](https://github.com/mishoo/UglifyJS/issues/315)
+feature request).
+
+Status and goals
+----------------
+
+In short, the goals for v2 are:
+
+- better modularity, cleaner and more maintainable code; (✓ it's better already)
+- parser generates objects instead of arrays for nodes; (✓ done)
+- store location information in all nodes; (✓ done)
+- better scope representation and mangler; (✓ done)
+- better code generator; (✓ done)
+- compression options at least as good as in v1; (⌛ in progress)
+- support for generating source maps;
+- better regression tests; (⌛ in progress)
+- ability to keep certain comments;
+- command-line utility compatible with UglifyJS v1;
+- documentation for the `AST` node hierarchy and the API.
+
+Longer term goals—beyond compressing JavaScript:
+
+- provide a linter; (started)
+- feature to dump an AST in a simple JSON format, along with information
+ that could be useful for an editor (such as Emacs);
+- write a minor JS mode for Emacs to highlight obvious errors, locate symbol
+ definition or warn about accidental globals;
+- support type annotations like Closure does (though I'm thinking of a
+ syntax different from comments; no big plans for this yet).
+
+### Objects for nodes
+
+Version 1 uses arrays to represent AST nodes. This model worked well for
+most operations, but adding additional information in nodes could only be
+done with hacks I don't really like (you _can_ add properties to an array
+just as if it were an object, but that's just a dirty hack; also, such
+properties were not propagated in the compressor).
+
+In v2 I switched to a more “object oriented” approach. Nodes are objects
+and there's also an inheritance tree that aims to be useful in practice.
+For example in v1 in order to see if a node is an aborting statement, we
+might do something like this:
+
+    if (node[0] == "return"
+        || node[0] == "throw"
+        || node[0] == "break"
+        || node[0] == "continue") aborts();
+
+In v2 they all inherit from the base class `AST_Jump`, so I can say:
+
+    if (node instanceof AST_Jump) aborts();
+
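A minimal sketch of how such an inheritance tree could be laid out. Only `AST_Jump` is named in the text above; `AST_Node`, `AST_Statement`, `AST_Return`, `AST_Throw`, `AST_Break`, `AST_Continue`, `AST_If`, and the `defnode` and `is_aborting` helpers are illustrative assumptions, not necessarily the actual v2 definitions.

```javascript
// Illustrative sketch only: apart from AST_Jump, all names and the
// constructor shape are assumptions, not the real UglifyJS v2 definitions.
function AST_Node(props) {
    // Copy arbitrary properties (e.g. location info) onto the node.
    for (var key in props) this[key] = props[key];
}

// Create a constructor whose prototype chain goes through `base`.
function defnode(base) {
    function Ctor(props) { AST_Node.call(this, props); }
    Ctor.prototype = Object.create(base.prototype);
    Ctor.prototype.constructor = Ctor;
    return Ctor;
}

var AST_Statement = defnode(AST_Node);
var AST_Jump = defnode(AST_Statement);   // base class of all aborting statements
var AST_Return = defnode(AST_Jump);
var AST_Throw = defnode(AST_Jump);
var AST_Break = defnode(AST_Jump);
var AST_Continue = defnode(AST_Jump);
var AST_If = defnode(AST_Statement);     // a statement, but not a jump

// One instanceof check replaces the four string comparisons needed in v1.
function is_aborting(node) {
    return node instanceof AST_Jump;
}

var ret = new AST_Return({ start: { line: 1, col: 0 } });
console.log(is_aborting(ret));            // true
console.log(is_aborting(new AST_If({}))); // false
```

Because nodes are plain objects, extra information such as the `start` location can be attached as ordinary properties rather than bolted onto an array.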
+The parser was _heavily_ modified to support the new node types; however, you
+can still find the same code layout as in v1, and I trust it's just as
+stable. Except for the parser, all other parts of UglifyJS are rewritten
+from scratch.
+
+The parser itself got a bit slower (430ms instead of 330ms on my usual 650K
+test file).
+
+#### A word about Esprima
+
+[Esprima](http://esprima.org/) is a really nice JavaScript parser. It
+supports EcmaScript 5.1 and it claims to be “up to 3x faster than UglifyJS's
+parse-js”. I thought that's quite cool and I considered using Esprima in
+UglifyJS v2, but then I did some tests.
+
+On my 650K test file, UglifyJS v1's parser takes 330ms and Esprima about
+250ms. That's not exactly “3x faster” but very good indeed! However, I
+noticed that in the default configuration Esprima does not keep location
+information in the nodes. With that enabled, parse time grew to 680ms.
+
+Some would claim it's a fair
+[comparison](http://esprima.org/test/compare.html), because UglifyJS doesn't
+keep location information either, but that's not entirely accurate. It's
+true that the `parse()` function will not propagate location into the AST
+unless you set `embed_tokens`, but the lexer _always_ stores it in the
+tokens.
+
+Enabling `embed_tokens` makes UglifyJS do it in 400ms, which is still a lot
+better than Esprima's 680ms.
+
+In version 2 we always maintain location info and comments in the AST nodes,
+which is why the parser in v2 takes about 430ms on that file (some
+milliseconds get lost because it's more work to create object nodes than
+arrays). I might try to speed it up, though I'm not sure it's worth the
+trouble: parsing 650K in 430ms (on my rather outdated machine) to get an
+object-based AST with full location/range info and comments seems good
+enough for me.
+
+### The code generator, V2 vs. V1
+
+The code generator in v1 is a big function that takes a node and applies
+various walkers on it in order to generate code. The code was _returned_
+from each walker function, and finally assembled into a big string by
+concatenation or `array.join`, and returned further up. That design makes
+it impossible to know the current line/column of the output, which would be
+necessary for source maps. For the same reason, v1 required an additional
+step to split very long lines (that includes an additional run of the
+tokenizer). It's _slow_.
+
+The rules for inserting parentheses in v1 are an unholy mess; we know at
+least [one case](https://github.com/mishoo/UglifyJS/issues/368) where it
+inserts unnecessary parens (non-trivial to fix), and I just discovered one
+case where it generates invalid code—UglifyJS can properly parse the
+following (valid) statement:
+
+    for (var a = ("foo" in bar), i = 0; i < 5; ++i);
+
+however, the code generator in version 1 will break it by not including the
+parens (the `in` operator is not allowed in a `for` initializer, unless it's
+parenthesized).
+
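The parenthesization requirement can be checked in any JavaScript engine. This is a small sketch; the `parses` helper is a hypothetical name that relies on `new Function()` throwing a `SyntaxError` for source that does not parse. Nothing here uses UglifyJS itself.

```javascript
// Check whether a statement parses, without executing it:
// new Function() throws a SyntaxError for source that does not parse.
function parses(source) {
    try {
        new Function(source);
        return true;
    } catch (err) {
        if (err instanceof SyntaxError) return false;
        throw err;
    }
}

var with_parens = 'for (var a = ("foo" in bar), i = 0; i < 5; ++i);';
var without_parens = 'for (var a = "foo" in bar, i = 0; i < 5; ++i);';

console.log(parses(with_parens));    // true
console.log(parses(without_parens)); // false
```

Dropping the parens therefore turns a valid statement into one the engine rejects, so a code generator has to keep them in this position.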
+The codegen in V2 is a thing of beauty. Since I now use objects for AST
+nodes, I defined a "print" method on each object type. This method takes an
+object (an OutputStream) and instead of returning the source code for the
+node, it prints it in the output stream. The stream object keeps track of
+current line/column in the output and provides helper functions to insert
+semicolons, to indent etc. The code is somewhat bigger than the `gen_code`
+in v1, but it's much easier to understand, it's faster and does not require
+an additional pass for splitting long lines. Also the rules for inserting
+parens are nicely separated from the `print` method definitions.
+
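A rough sketch of the "print into a stream" idea, under stated assumptions: the text names `OutputStream` and a `print` method on nodes, but the stream methods used here (`print`, `newline`, `position`, `toString`) and the `AST_Number` and `AST_Binary` node types are made up for illustration and are not the actual v2 API.

```javascript
// Illustrative sketch: the stream's methods and the node classes are
// assumptions, not the real UglifyJS v2 OutputStream.
function OutputStream() {
    var chunks = [], line = 1, col = 0;
    return {
        // Append text and update the current line/column as we go.
        print: function (str) {
            chunks.push(str);
            for (var i = 0; i < str.length; i++) {
                if (str.charAt(i) === "\n") { line++; col = 0; }
                else col++;
            }
        },
        newline: function () { this.print("\n"); },
        position: function () { return { line: line, col: col }; },
        toString: function () { return chunks.join(""); }
    };
}

// Each node type prints itself into the stream instead of returning a string.
function AST_Number(value) { this.value = value; }
AST_Number.prototype.print = function (out) { out.print(String(this.value)); };

function AST_Binary(left, operator, right) {
    this.left = left; this.operator = operator; this.right = right;
}
AST_Binary.prototype.print = function (out) {
    this.left.print(out);
    out.print(this.operator);
    this.right.print(out);
};

var out = OutputStream();
new AST_Binary(new AST_Number(1), "+", new AST_Number(23)).print(out);
console.log(out.toString());  // 1+23
console.log(out.position());  // { line: 1, col: 4 }
```

Since the stream always knows its current position, a node's output location is available at the moment it is printed, which is the information a source map needs.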
+### More aggressive compressing
+
+As I
+[blogged](http://lisperator.net/blog/javascript-minification-is-it-worth-it/)
+a few days ago, it seems to me that the squeezer works really hard for not
+too much benefit. On my test file, passing `--no-squeeze` to UglifyJS v1
+adds only 500 bytes after `gzip`, that is 0.68% of the gzipped file size;
+every byte counts, but to be frank, that's not a very big deal either.
+
+Beyond doing what V1 does, I'd like to make it smarter in certain
+situations, for example:
+
+    function foo() {
+        var something = compute_something();
+        var something_else = compute_something_else(something);
+        return something_else;
+    }
+
+I sometimes write this kind of code because it's cleaner, it nests less and
+it avoids the need to add explanatory comments. It could _safely_ compress
+into:
+
+    function foo() {
+        return compute_something_else(compute_something());
+    }
+
+which makes it a single statement (further compressible into sequences,
+which allows dropping brackets in other cases) and avoids the `var`
+declarations. That's a tricky optimization to do in V1, but I feel that
+with the new architecture it's doable, at least for the simple cases.
+
+Currently the compressor in V2 is far from complete (where by “complete” I
+mean as good as V1), and I'll actually put it on hold to add support for
+generating source maps first. However the mangler is complete (seems to be
+working properly) as well as the code generator, so V2 is already usable for
+achieving pretty good compression.
+
+### Better regression test suite
+
+The existing test suite in UglifyJS v1 has been contributed (thanks!).
+Unfortunately it's not great because it employs all the compression
+techniques in each test. Eventually I'd like to port all existing tests to
+v2, but for now I started it from scratch.
+
+Tests broke many times for no good reason as I added new features; for
+example the feature that transforms consecutive simple statements into
+sequences:
+
+    INPUT  → function f(){ if (x) { foo(); bar(); baz(); }}
+    OUTPUT → function f(){ x && (foo(), bar(), baz()) }
+
+It's a useful technique; without merging consecutive statements into an
+`AST_Seq` we would have to keep the `if` and the brackets.
+
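Because the comma operator binds more loosely than `&&`, the whole sequence has to sit inside parentheses to stay equivalent to the `if`. The following self-contained check illustrates this; `run`, `original`, `as_sequence` and `ungrouped` are hypothetical helper names, and no UglifyJS code is involved.

```javascript
// Record which functions get called for a given value of x.
function run(body, x) {
    var calls = [];
    function foo() { calls.push("foo"); }
    function bar() { calls.push("bar"); }
    function baz() { calls.push("baz"); }
    body(x, foo, bar, baz);
    return calls.join(",");
}

function original(x, foo, bar, baz) { if (x) { foo(); bar(); baz(); } }
function as_sequence(x, foo, bar, baz) { x && (foo(), bar(), baz()); }
function ungrouped(x, foo, bar, baz) { x && foo(), bar(), baz(); }

console.log(run(original, false) === run(as_sequence, false)); // true
console.log(run(original, true) === run(as_sequence, true));   // true
console.log(run(ungrouped, false));                            // bar,baz
```

The ungrouped form still calls `bar` and `baz` when `x` is falsy, so it is not a safe replacement for the `if`.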
+Having a test only for this feature is fine; but if the feature is applied
+to all tests, then tests where the “expected” file contains consecutive
+statements will break, although the output is perfectly fine.
+
+In v2 I started a new test suite (I actually took the “test driven
+development” approach: I'm progressing on both compressor and test suite at
+once; for each new compressor option I add a test case). Tests look like
+this:
+
+    keep_debugger: {
+        options = {
+            drop_debugger: false
+        };
+        input: {
+            debugger;
+        }
+        expect: {
+            debugger;
+        }
+    }
+
+    drop_debugger: {
+        options = {
+            drop_debugger: true
+        };
+        input: {
+            debugger;
+            if (foo) debugger;
+        }
+        expect: {
+            if (foo);
+        }
+    }
+
+That might look funny, but it's syntactically valid JS. A test file
+consists of a sequence of labeled block statements. Each label names a test
+in that file. In each block you can assign to the `options` variable to
+override compressor options (for the purpose of running the tests, all
+compression options are turned off, so you just enable the stuff you test).
+Then you include two other labeled statements: `input` and `expect`. The
+compressor test suite simply parses these statements to get two ASTs. It
+applies the compressor on the `input` AST, then the `codegen` on the
+compressed AST. It applies the `codegen` to the `expect` AST (without
+compressing it). Then it compares the results and if they match, the test
+passes.
+
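That a test in this format really is syntactically valid JavaScript can be confirmed without UglifyJS. In this sketch `test_source` is simply the first test case as text, and `new Function()` is used only to compile it (it throws a `SyntaxError` on invalid source); the test is never executed.

```javascript
// A test in the format described above, as plain source text.
var test_source = [
    "keep_debugger: {",
    "    options = {",
    "        drop_debugger: false",
    "    };",
    "    input: {",
    "        debugger;",
    "    }",
    "    expect: {",
    "        debugger;",
    "    }",
    "}"
].join("\n");

// new Function() compiles the text without running it, and throws a
// SyntaxError if the source is not valid JavaScript.
var compiled = new Function(test_source);
console.log(typeof compiled); // function
```

The outer label, the `options` assignment and the nested `input` / `expect` labels are all ordinary statements, which is why a regular JavaScript parser can read the test files directly.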
+I expect this model to give far fewer false negatives, and it would work
+quite well for the name mangling too (no tests for that yet).
+
+For the code generator we'll need something more fine-tuned, since we care
+exactly what the output is going to look like. I don't yet have any plans
+for code generator tests.
+
+
+Play with it
+------------
+
+We don't yet have a nice command line utility, but there's a test script for
+NodeJS in tmp/test-node.js. To play with UglifyJS v2 just clone the
+repository anywhere you like and run `tmp/test-node.js script.js` (script.js
+being the script that you'd like to compress). Take a look at the source of
+`test-node.js` to see what the API looks like and how to enable/disable
+steps or compressor options.
+
+To run the existing tests, run `test/run-tests.js`.
+
+
+Status of UglifyJS v1
+---------------------
+
+We didn't have any significant new features in the last few months; most
+commits are about bug fixes. I plan to continue to fix show-stopper bugs in
+v1 for a while, depending on how time permits, but there won't be any new
+development.
+
+
+Help me complete the new version
+--------------------------------
+
+I've already put a lot of energy into this project and I think it's coming
+out nicely. It's based on all my previous experience from working on version 1
+and I'm working carefully, trying not to introduce bugs that were already
+fixed, trying to keep it fast and clean. If you'd like to help me dedicate
+more time to it, please consider making a donation!
+
+<a href='http://www.pledgie.com/campaigns/18110'><img alt='Click here to
+lend your support to: Funding development of UglifyJS 2.0 and make a
+donation at www.pledgie.com !'
+src='http://www.pledgie.com/campaigns/18110.png?skin_name=chrome' border='0'
+/></a>