diff options
author | Mihai Bazon <mihai@bazon.net> | 2012-10-03 12:22:59 +0300 |
---|---|---|
committer | Mihai Bazon <mihai@bazon.net> | 2012-10-03 12:22:59 +0300 |
commit | 7e8880be1cb24d628eed183c87479ff01f7b7848 (patch) | |
tree | af962c21451e58a188e2a5db56982b351f076877 | |
parent | 0678ae2076a9f817ecb8837279c24cf54ea819c2 (diff) | |
download | tracifyjs-7e8880be1cb24d628eed183c87479ff01f7b7848.tar.gz tracifyjs-7e8880be1cb24d628eed183c87479ff01f7b7848.zip |
document the CLI tool
-rw-r--r-- | README.md | 530 |
1 files changed, 239 insertions, 291 deletions
@@ -1,296 +1,244 @@ -> **Tl;dr** — I want to make UglifyJS2 faster, better, easier to maintain -> and more useful than version 1. If you enjoy using UglifyJS v1, I can -> promise you that you will love its successor. - -> Please help me make this happen by funding the development! - -> <a href='http://www.pledgie.com/campaigns/18110'><img alt='Click here to lend your support to: Funding development of UglifyJS 2.0 and make a donation at www.pledgie.com !' src='http://www.pledgie.com/campaigns/18110.png?skin_name=chrome' border='0' /></a> - -UglifyJS v2 -=========== - -[UglifyJS](https://github.com/mishoo/UglifyJS) is a popular JavaScript -parser/compressor/beautifier and it's itself written in JavaScript. Version -1 is battle-tested and used in many production systems. The parser is -[included in WebKit](http://src.chromium.org/multivm/trunk/webkit/Source/WebCore/inspector/front-end/UglifyJS/parse-js.js). -In two years UglifyJS got over 3000 stars at Github and hundreds of bugs -have been identified and fixed, thanks to a great and expanding community. - -I'd say version 1 is rock stable. However, its architecture can't be -stretched much further. Some features are hard to add, such as source maps -or keeping comments in the compressed AST. I started work on version 2 in -May, but I gave up quickly because I lacked time. What prompted me to -resume it was investigating the difficulty of adding source maps (an -[increasingly popular](https://github.com/mishoo/UglifyJS/issues/315) -feature request). - -Status and goals ----------------- - -In short, the goals for v2 are: - -- better modularity, cleaner and more maintainable code; (✓ it's better already) -- parser generates objects instead of arrays for nodes; (✓ done) -- store location information in all nodes; (✓ done) -- better scope representation and mangler; (✓ done) -- better code generator; (✓ done) -- compression options at least as good as in v1; (⌛ in progress) -- support for generating source maps; -- better regression tests; (⌛ in progress) -- ability to keep certain comments; -- command-line utility compatible with UglifyJS v1; -- documentation for the `AST` node hierarchy and the API. - -Longer term goals—beyond compressing JavaScript: - -- provide a linter; (started) -- feature to dump an AST in a simple JSON format, along with information - that could be useful for an editor (such as Emacs); -- write a minor JS mode for Emacs to highlight obvious errors, locate symbol - definition or warn about accidental globals; -- support type annotations like Closure does (though I'm thinking of a - syntax different from comments; no big plans for this yet). - -### Objects for nodes - -Version 1 uses arrays to represent AST nodes. This model worked well for -most operations, but adding additional information in nodes could only be -done with hacks I don't really like (you _can_ add properties to an array -just as if it were an object, but that's just a dirty hack; also, such -properties were not propagated in the compressor). - -In v2 I switched to a more “object oriented” approach. Nodes are objects -and there's also an inheritance tree that aims to be useful in practice. -For example in v1 in order to see if a node is an aborting statement, we -might do something like this: - - if (node[0] == "return" - || node[0] == "throw" - || node[0] == "break" - || node[0] == "continue") aborts(); - -In v2 they all inherit from the base class `AST_Jump`, so I can say: - - if (node instanceof AST_Jump) aborts(); - -The parser was _heavily_ modified to support the new node types, however you -can still find the same code layout as in v1, and I trust it's just as -stable. Except for the parser, all other parts of UglifyJS are rewritten -from scratch. - -The parser itself got a bit slower (430ms instead of 330ms on my usual 650K -test file). - -#### A word about Esprima - -**UPDATE**: A -[discussion in my commit](https://github.com/mishoo/UglifyJS2/commit/ce8e8d57c0d346dba9527b7a11b03364ce9ad1bb#commitcomment-1771586) -suggests that Esprima is not as slow as I thought even when requesting -location information. YMMV. In any case, we're going to keep the -battle-tested parser in UglifyJS. - -[Esprima](http://esprima.org/) is a really nice JavaScript parser. It -supports EcmaScript 5.1 and it claims to be “up to 3x faster than UglifyJS's -parse-js”. I thought that's quite cool and I considered using Esprima in -UglifyJS v2, but then I did some tests. - -On my 650K test file, UglifyJS v1's parser takes 330ms and Esprima about -250ms. That's not exactly “3x faster” but very good indeed! However, I -noticed that in the default configuration Esprima does not keep location -information in the nodes. Enabled that, and parse time grew to 680ms. - -Some would claim it's a fair -[comparison](http://esprima.org/test/compare.html), because UglifyJS doesn't -keep location information either, but that's not entirely accurate. It's -true that the `parse()` function will not propagate location into the AST -unless you set `embed_tokens`, but the lexer _always_ stores it in the -tokens. - -Enabling `embed_tokens` makes UglifyJS do it in 400ms, which is still a lot -better than Esprima's 680ms. - -In version 2 we always maintain location info and comments in the AST nodes, -which is why the parser in v2 takes about 430ms on that file (some -milliseconds get lost because it's more work to create object nodes than -arrays). I might try to speed it up, though I'm not sure it's worth the -trouble (parsing 650K in 430ms (on my rather outdated machine) to get an -objectual AST with full location/range info and comments seems good enough -for me). - -### The code generator, V2 vs. V1 - -The code generator in v1 is a big function that takes a node and applies -various walkers on it in order to generate code. The code was _returned_ -from each walker function, and finally assembled into a big string by -concatenation or array.join, and further returned. It is impossible there -to know what's the current line/column of the output, which would be -necessary for source maps. For the same reason, v1 required an additional -step to split very long lines (that includes an additional run of the -tokenizer). It's _slow_. - -The rules for inserting parentheses in v1 are an unholy mess; we know at -least [one case](https://github.com/mishoo/UglifyJS/issues/368) where it -inserts unnecessary parens (non-trivial to fix), and I just discovered one -case where it generates invalid code—UglifyJS can properly parse the -following (valid) statement: - - for (var a = ("foo" in bar), i = 0; i < 5; ++i); - -however, the code generator in version 1 will break it by not including the -parens (the `in` operator is not allowed in a `for` initializer, unless it's -parenthesized). - -The codegen in V2 is a thing of beauty. Since I now use objects for AST -nodes, I defined a "print" method on each object type. This method takes an -object (an OutputStream) and instead of returning the source code for the -node, it prints it in the output stream. The stream object keeps track of -current line/colum in the output and provides helper functions to insert -semicolons, to indent etc. The code is somewhat bigger than the `gen_code` -in v1, but it's much easier to understand, it's faster and does not require -an additional pass for splitting long lines. Also the rules for inserting -parens are nicely separated from the `print` method definitions. - -### More aggressive compressing - -As I -[blogged](http://lisperator.net/blog/javascript-minification-is-it-worth-it/) -a few days ago, it seems to me that the squeezer works really hard for not -too much benefit. On my test file, passing `--no-squeeze` to UglifyJS v1 -adds only 500 bytes after `gzip`, that is 0.68% of the gzipped file size; -every byte counts, but to be frank, that's not a very big deal either. - -Beyond doing what V1 does, I'd like to make it smarter in certain -situations, for example: - - function foo() { - var something = compute_something(); - var something_else = compute_something_else(something); - return something_else; +UglifyJS 2 +========== + +UglifyJS is a JavaScript parser, minifier, compressor or beautifier toolkit. + +For now this page documents the command line utility. More advanced +API documentation will be made available later. + +Usage +----- + + uglifyjs2 [input files] [options] + +UglifyJS2 can take multiple input files. It's recommended that you pass the +input files first, then pass the options. UglifyJS will parse input files +in sequence and apply any compression options. The files are parsed in the +same global scope, that is, a reference from a file to some +variable/function declared in another file will be matched properly. + +If you want to read from STDIN instead, pass a single dash instead of input +files. + +The available options are: + + --source-map Specify an output file where to generate source map. + --source-map-root The path to the original source to be included in the + source map. + --in-source-map Input source map, useful if you're compressing JS that was + generated from some other original code. + -p, --prefix Skip prefix for original filenames that appear in source + maps. For example -p 3 will drop 3 directories from file + names and ensure they are relative paths. + -o, --output Output file (default STDOUT). + -b, --beautify Beautify output/specify output options. [string] + -m, --mangle Mangle names/pass mangler options. [string] + -r, --reserved Reserved names to exclude from mangling. + -c, --compress Enable compressor/pass compressor options. Pass options + like -c hoist_vars=false,if_return=false. Use -c with no + argument to use the default compression options. [string] + -d, --define Global definitions [string] + --comments Preserve copyright comments in the output. By default this + works like Google Closure, keeping JSDoc-style comments + that contain "@license" or "@preserve". You can optionally + pass one of the following arguments to this flag: + - "all" to keep all comments + - a valid JS regexp (needs to start with a slash) to keep + only comments that match. + Note that currently not *all* comments can be kept when + compression is on, because of dead code removal or + cascading statements into sequences. [string] + --stats Display operations run time on STDERR. [boolean] + -v, --verbose Verbose [boolean] + +Specify `--output` (`-o`) to declare the output file. Otherwise the output +goes to STDOUT. + +### Source map options + +UglifyJS2 can generate a source map file, which is highly useful for +debugging your compressed JavaScript. To get a source map, pass +`--source-map output.js.map` (full path to the file where you want the +source map dumped). + +Additionally you might need `--source-map-root` to pass the URL where the +original files can be found. In case you are passing full paths to input +files to UglifyJS, you can use `--prefix` (`-p`) to specify the number of +directories to drop from the path prefix when declaring files in the source +map. + +For example: + + uglifyjs2 /home/doe/work/foo/src/js/file1.js \ + /home/doe/work/foo/src/js/file2.js \ + -o foo.min.js \ + --source-map foo.min.js.map \ + --source-map-root http://foo.com/src \ + -p 5 -c -m + +The above will compress and mangle `file1.js` and `file2.js`, will drop the +output in `foo.min.js` and the source map in `foo.min.js.map`. The source +mapping will refer to `http://foo.com/src/js/file1.js` and +`http://foo.com/src/js/file2.js` (in fact it will list `http://foo.com/src` +as the source map root, and the original files as `js/file1.js` and +`js/file2.js`). + +#### Composed source map + +When you're compressing JS code that was output by a compiler such as +CoffeeScript, mapping to the JS code won't be too helpful. Instead, you'd +like to map back to the original code (i.e. CoffeeScript). UglifyJS has an +option to take an input source map. Assuming you have a mapping from +CoffeeScript → compiled JS, UglifyJS can generate a map from CoffeeScript → +compressed JS by mapping every token in the compiled JS to its original +location. + +To use this feature you need to pass `--in-source-map +/path/to/input/source.map`. Normally the input source map should also point +to the file containing the generated JS, so if that's correct you can omit +input files from the command line. + +### Mangler options + +To enable the mangler you need to pass `--mangle` (`-m`). Optionally you +can pass `-m sort` (we'll possibly have other flags in the future) in order +to assign shorter names to most frequently used variables. This saves a few +hundred bytes on jQuery before gzip, but the output is _bigger_ after gzip +(and seems to happen for other libraries I tried it on) therefore it's not +enabled by default. + +When mangling is enabled but you want to prevent certain names from being +mangled, you can declare those names with `--reserved` (`-r`) — pass a +comma-separated list of names. For example: + + uglifyjs2 ... -m -r '$,require,exports' + +to prevent the `require`, `exports` and `$` names from being changed. + +### Compressor options + +You need to pass `--compress` (`-c`) to enable the compressor. Optionally +you can pass a comma-separated list of options. Options are in the form +`foo=bar`, or just `foo` (the latter implies a boolean option that you want +to set `true`; it's effectively a shortcut for `foo=true`). + +The defaults should be tuned for maximum compression on most code. Here are +the available options (all are `true` by default, except `hoist_vars`): + +- `sequences` -- join consecutive simple statements using the comma operator +- `properties` -- rewrite property access using the dot notation, for + example `foo["bar"] → foo.bar` +- `dead-code` -- remove unreachable code +- `drop-debugger` -- remove `debugger;` statements +- `unsafe` -- apply "unsafe" transformations (discussion below) +- `conditionals` -- apply optimizations for `if`-s and conditional + expressions +- `comparisons` -- apply certain optimizations to binary nodes, for example: + `!(a <= b) → a > b` (only when `unsafe`), attempts to negate binary nodes, + e.g. `a = !b && !c && !d && !e → a=!(b||c||d||e)` etc. +- `evaluate` -- attempt to evaluate constant expressions +- `booleans` -- various optimizations for boolean context, for example `!!a + ? b : c → a ? b : c` +- `loops` -- optimizations for `do`, `while` and `for` loops when we can + statically determine the condition +- `unused` -- drop unreferenced functions and variables +- `hoist-funs` -- hoist function declarations +- `hoist-vars` -- hoist `var` declarations (this is `false` by default + because it seems to increase the size of the output in general) +- `if-return` -- optimizations for if/return and if/continue +- `join-vars` -- join consecutive `var` statements +- `cascade` -- small optimization for sequences, transform `x, x` into `x` + and `x = something(), x` into `x = something()` +- `warnings` -- display warnings when dropping unreachable code or unused + declarations etc. + +### Conditional compilation + +You can use the `--define` (`-d`) switch in order to declare global +variables that UglifyJS will assume to be constants (unless defined in +scope). For example if you pass `--define DEBUG=false` then, coupled with +dead code removal UglifyJS will discard the following from the output: + + if (DEBUG) { + console.log("debug stuff"); } -I sometimes write this kind of code because it's cleaner, it nests less and -it avoids the need to add explanatory comments. It could _safely_ compress -into: - - function foo() { - return compute_something_else(compute_something()); - } - -which makes it a single statement (further compressable into sequences and -allowing to drop brackets in other cases) and it avoids the `var` -declarations. That's one tricky optimization to do in V1, but I feel with -the new architecture is doable, at least for the simple cases. - -Currently the compressor in V2 is far from complete (where by “complete” I -mean as good as V1), and I'll actually put it on hold to add support for -generating source maps first. However the mangler is complete (seems to be -working properly) as well as the code generator, so V2 is already usable for -achieving pretty good compression. - -### Better regression test suite - -The existing test suite in UglifyJS v1 has been contributed (thanks!). -Unfortunately it's not great because it employs all the compression -techniques in each test. Eventually I'd like to port all existing tests to -v2, but for now I started it from scratch. - -Tests broke many times for no good reason as I added new features; for -example the feature that transforms consecutive simple statements into -sequences: - - INPUT → function f(){ if (x) { foo(); bar(); baz(); }} - OUTPUT → function f(){ x && foo(), bar(), baz() } - -It's an useful technique; without meshing consecutive statements into an -`AST_Seq` we would have to keep the `if` and the brackets. - -Having a test only for this feature is fine; but if the feature is applied -to all tests, then tests where the “expected” file contains consecutive -statements will break, although the output is perfectly fine. - -In v2 I started a new test suite (I actually took the “test driven -development” approach: I'm progressing on both compressor and test suite at -once; for each new compressor option I add a test case). Tests look like -this: - - keep_debugger: { - options = { - drop_debugger: false - }; - input: { - debugger; - } - expect: { - debugger; - } - } - - drop_debugger: { - options = { - drop_debugger: true - }; - input: { - debugger; - if (foo) debugger; - } - expect: { - if (foo); - } +UglifyJS will warn about the condition being always false and about dropping +unreachable code; for now there is no option to turn off only this specific +warning, you can pass `warnings=false` to turn off *all* warnings. + +Another way of doing that is to declare your globals as constants in a +separate file and include it into the build. For example you can have a +`build/defines.js` file with the following: + + const DEBUG = false; + const PRODUCTION = true; + // etc. + +and build your code like this: + + uglifyjs2 build/defines.js js/foo.js js/bar.js... -c + +UglifyJS will notice the constants and, since they cannot be altered, it +will evaluate references to them to the value itself and drop unreachable +code as usual. The possible downside of this approach is that the build +will contain the `const` declarations. + +### Beautifier options + +The code generator tries to output shortest code possible by default. In +case you want beautified output, pass `--beautify` (`-b`). Optionally you +can pass additional arguments that control the code output: + +- `beautify` (default `true`) -- whether to actually beautify the output. + Passing `-b` will set this to true, but you might need to pass `-b` even + when you want to generate minified code, in order to specify additional + arguments, so you can use `-b beautify=false` to override it. +- `indent-level` (default 4) +- `indent-start` (default 0) -- prefix all lines by that many spaces +- `quote-keys` (default `false`) -- pass `true` to quote all keys in literal + objects +- `space-colon` (default `true`) -- insert a space after the colon signs +- `ascii-only` (default `false`) -- escape Unicode characters in strings and + regexps +- `inline-script` (default `false`) -- escape the slash in occurrences of + `</script` in strings +- `width` (default 80) -- only takes effect when beautification is on, this + specifies an (orientative) line width that the beautifier will try to + obey. It refers to the width of the line text (excluding indentation). + It doesn't work very well currently, but it does make the code generated + by UglifyJS more readable. +- `max-line-len` (default 32000) -- maximum line length (for uglified code) +- `ie-proof` (default `true`) -- generate “IE-proof” code (for now this + means add brackets around the do/while in code like this: `if (foo) do + something(); while (bar); else ...`. +- `bracketize` (default `false`) -- always insert brackets in `if`, `for`, + `do`, `while` or `with` statements, even if their body is a single + statement. + +### Keeping copyright notices or other comments + +You can pass `--comments` to retain certain comments in the output. By +default it will keep JSDoc-style comments that contain "@preserve" or +"@license". You can pass `--comments all` to keep all the comments, or a +valid JavaScript regexp to keep only comments that match this regexp. For +example `--comments '/foo|bar/'` will keep only comments that contain "foo" +or "bar". + +Note, however, that there might be situations where comments are lost. For +example: + + function f() { + /** @preserve Foo Bar */ + function g() { + // this function is never called + } + return something(); } -That might look funny, but it's syntactically valid JS. A test file -consists of a sequence of labeled block statements. Each label names a test -in that file. In each block you can assign to the `options` variable to -override compressor options (for the purpose of running the tests, all -compression options are turned off, so you just enable the stuff you test). -Then you include two other labeled statements: `input` and `expect`. The -compressor test suite simply parses these statements to get two AST-s. It -applies the compressor on the `input` AST, then the `codegen` on the -compressed AST. It applies the `codegen` to the `expect` AST (without -compressing it). Then it compares the results and if they match, the test -passes. - -I expect this model to give a lot less false negatives, and it would work -quite well for the name mangling too (no tests for that yet). - -For the code generator we'll need something more fine-tuned, since we care -exactly how the output is going to look like. I don't yet have any plans -about code generator tests. - - -Play with it ------------- - -We don't yet have a nice command line utility, but there's a test script for -NodeJS in tmp/test-node.js. To play with UglifyJS v2 just clone the -repository anywhere you like and run `tmp/test-node.js script.js` (script.js -being the script that you'd like to compress). Take a look at the source of -`test-node.js` to see how the API looks like, to enable/disable steps or -compressor options. - -To run the existing tests, run `test/run-tests.js` - - -Status of UglifyJS v1 ---------------------- - -We didn't have any significant new features in the last few months; most -commits are about bug fixes. I plan to continue to fix show-stopper bugs in -v1 for a while, depending on how time permits, but there won't be any new -development. - - -Help me complete the new version --------------------------------- - -I've put a lot of energy already into this project and I think it comes out -nicely. It's based on all my previous experience from working on version 1 -and I'm working carefully, trying not to introduce bugs that were already -fixed, trying to keep it fast and clean. If you'd like to help me dedicate -more time to it, please consider making a donation! +Even though it has "@preserve", the comment will be lost because the inner +function `g` (which is the AST node to which the comment is attached to) is +discarded by the compressor as not referenced. -<a href='http://www.pledgie.com/campaigns/18110'><img alt='Click here to -lend your support to: Funding development of UglifyJS 2.0 and make a -donation at www.pledgie.com !' -src='http://www.pledgie.com/campaigns/18110.png?skin_name=chrome' border='0' -/></a> +The safest comments where to place copyright information (or other info that +needs to me kept in the output) are comments attached to toplevel nodes. |