UglifyJS – a JavaScript parser/compressor/beautifier
1 UglifyJS – a JavaScript parser/compressor/beautifier
This package implements a general-purpose JavaScript
parser/compressor/beautifier toolkit. It is developed on NodeJS, but it
should work on any JavaScript platform supporting the CommonJS module system
(and if your platform of choice doesn't support CommonJS, you can easily
implement it, or discard the exports.* lines from UglifyJS sources).

The tokenizer/parser generates an abstract syntax tree from JS code. You
can then traverse the AST to learn more about the code, or do various
manipulations on it. This part is implemented in parse-js.js and it's a
port to JavaScript of the excellent parse-js Common Lisp library from Marijn
Haverbeke.

(See cl-uglify-js if you're looking for the Common Lisp version of
UglifyJS.)

The second part of this package, implemented in process.js, inspects and
manipulates the AST generated by the parser to provide the following:

- ability to re-generate JavaScript code from the AST, optionally
  indented. You can use this if you want to "beautify" a program that has
  been compressed, so that you can inspect the source. But you can also run
  our code generator to print out an AST without any whitespace, so you
  achieve compression as well.

- shorten variable names (usually to single characters). Our mangler will
  analyze the code and generate proper variable names, depending on scope
  and usage, and is smart enough to deal with globals defined elsewhere, or
  with eval() calls or with{} statements. In short, if eval() or with{}
  are used in some scope, then all variables in that scope and any
  variables in the parent scopes will remain unmangled, and any references
  to such variables remain unmangled as well.

- various small optimizations that may lead to faster code but certainly
  lead to smaller code. Where possible, we do the following:

  - foo["bar"] ==> foo.bar

  - remove block brackets {}

  - join consecutive var declarations:
    var a = 10; var b = 20; ==> var a=10,b=20;

  - resolve simple constant expressions: 1 +2 * 3 ==> 7. We only do the
    replacement if the result occupies fewer bytes; for example 1/3 would
    translate to 0.333333333333, so in this case we don't replace it.

  - consecutive statements in blocks are merged into a sequence; in many
    cases, this leaves blocks with a single statement, so then we can remove
    the block brackets.

  - various optimizations for IF statements:

    - if (foo) bar(); else baz(); ==> foo?bar():baz();
    - if (!foo) bar(); else baz(); ==> foo?baz():bar();
    - if (foo) bar(); ==> foo&&bar();
    - if (!foo) bar(); ==> foo||bar();
    - if (foo) return bar(); else return baz(); ==> return foo?bar():baz();
    - if (foo) return bar(); else something(); ==> {if(foo)return bar();something()}

- remove some unreachable code and warn about it (code that follows a
  return, throw, break or continue statement, except function/variable
  declarations).

- act as a limited version of a pre-processor (cf. the pre-processor of
  C/C++) to allow you to safely replace selected global symbols with
  specified values. When combined with the optimisations above this can
  make UglifyJS operate slightly more like a compilation process, in
  that when certain symbols are replaced by constant values, entire code
  blocks may be optimised away as unreachable.

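To make the rewrites listed above concrete, here is a minimal hand-written sketch (not actual UglifyJS output) that applies a few of them by hand and checks that behaviour is unchanged. The function names `original`, `squeezed` and `sameBehaviour` are made up for this illustration.

```javascript
// Hand-applied versions of a few rewrites from the list above.
// Illustration only: this is not output produced by UglifyJS.

// Original style: separate vars, if statements, bracket access, constant expression.
function original(obj, flag, log) {
    var a = 10;
    var b = 20;
    if (flag) log.push("on");
    if (!flag) log.push("off");
    if (obj["size"] > 1 + 2 * 3) { return a; } else { return b; }
}

// The same logic after applying the rules by hand.
function squeezed(obj, flag, log) {
    var a = 10, b = 20;            // joined var declarations
    flag && log.push("on");        // if (foo) bar();  ==> foo&&bar();
    flag || log.push("off");       // if (!foo) bar(); ==> foo||bar();
    return obj.size > 7 ? a : b;   // foo["bar"] ==> foo.bar, 1+2*3 ==> 7, if/else return ==> ternary
}

// Run both versions over a few inputs and compare results and side effects.
function sameBehaviour() {
    var inputs = [[{ size: 3 }, true], [{ size: 8 }, false], [{ size: 7 }, true]];
    return inputs.every(function (args) {
        var log1 = [], log2 = [];
        var r1 = original(args[0], args[1], log1);
        var r2 = squeezed(args[0], args[1], log2);
        return r1 === r2 && log1.join() === log2.join();
    });
}

console.log(sameBehaviour()); // true
```

The comparison covers both the return value and the logged side effects, since the `&&` and `||` forms only stay equivalent to the `if` forms when the call is made under exactly the same condition.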
1.1 Unsafe transformations
The following transformations can in theory break code, although they're
probably safe in most practical cases. To enable them you need to pass the
--unsafe flag.
1.1.1 Calls involving the global Array constructor
The following transformations occur:

    new Array(1, 2, 3, 4)  => [1,2,3,4]
    Array(a, b, c)         => [a,b,c]
    new Array(5)           => Array(5)
    new Array(a)           => Array(a)

These are all safe if the Array name isn't redefined. JavaScript does allow
one to globally redefine Array (and pretty much everything, in fact) but I
personally don't see why anyone would do that.

UglifyJS does handle the case where Array is redefined locally, or even
globally but with a function or var declaration. Therefore, in the
following cases UglifyJS doesn't touch calls or instantiations of Array:

    // case 1. globally declared variable
      var Array;
      new Array(1, 2, 3);
      Array(a, b);

      // or (can be declared later)
      new Array(1, 2, 3);
      var Array;

      // or (can be a function)
      new Array(1, 2, 3);
      function Array() { ... }

    // case 2. declared in a function
      (function(){
        a = new Array(1, 2, 3);
        b = Array(5, 6);
        var Array;
      })();

      // or
      (function(Array){
        return Array(5, 6, 7);
      })();

      // or
      (function(){
        return new Array(1, 2, 3, 4);
        function Array() { ... }
      })();

      // etc.

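The reasoning behind these rules can be checked in plain JavaScript. The sketch below uses only built-in behaviour; the function name `shadowed` is made up for illustration. It shows why the single-argument form is only shortened to Array(5) and why a redefined Array must be left alone.

```javascript
// With the built-in Array, the literal form is equivalent for two or more arguments.
var viaCtor = new Array(1, 2, 3, 4);
var viaLiteral = [1, 2, 3, 4];
console.log(viaCtor.join() === viaLiteral.join()); // true

// A single numeric argument is treated as a length, not as an element,
// so new Array(5) can be shortened to Array(5) but never to [5].
console.log(new Array(5).length, Array(5).length, [5].length); // 5 5 1

// If Array is redefined in scope, rewriting the call to a literal
// would change what the code returns.
function shadowed() {
    function Array() { return { custom: true }; }
    return Array(1, 2, 3);
}
console.log(shadowed().custom); // true (the result is not a real array)
```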
1.1.2 obj.toString() ==> obj+""
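A short sketch of why this rewrite counts as unsafe, using only standard JavaScript conversion rules. The object `tricky` is a contrived example invented for this illustration: string concatenation consults valueOf() before toString(), so the two forms can disagree.

```javascript
// obj + "" converts via valueOf() first when it returns a primitive,
// whereas obj.toString() always calls toString() directly.
var tricky = {
    valueOf: function () { return 42; },
    toString: function () { return "tricky"; }
};
console.log(tricky.toString()); // "tricky"
console.log(tricky + "");       // "42"

// For typical values the two forms agree, which is why the rewrite
// is usually harmless in practice.
var samples = [123, true, [1, 2], "text", { toString: function () { return "x"; } }];
var allAgree = samples.every(function (v) { return v.toString() === v + ""; });
console.log(allAgree); // true
```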
1.2 Install (NPM)

UglifyJS is now available through NPM: npm install uglify-js should do
the job.
1.3 Install latest code from GitHub
    ## clone the repository
    mkdir -p /where/you/wanna/put/it
    cd /where/you/wanna/put/it
    git clone git://github.com/mishoo/UglifyJS.git

    ## make the module available to Node
    mkdir -p ~/.node_libraries/
    cd ~/.node_libraries/
    ln -s /where/you/wanna/put/it/UglifyJS/uglify-js.js

    ## and if you want the CLI script too:
    mkdir -p ~/bin
    cd ~/bin
    ln -s /where/you/wanna/put/it/UglifyJS/bin/uglifyjs
      # (then add ~/bin to your $PATH if it's not there already)

1.4 Usage
There is a command-line tool that exposes the functionality of this library
for your shell-scripting needs:

    uglifyjs [ options... ] [ filename ]

filename should be the last argument and should name the file from which
to read the JavaScript code. If you don't specify it, it will read code
from STDIN.

Supported options:

- -b or --beautify: output indented code; when passed, additional
  options control the beautifier:

  - -i N or --indent N: indentation level (number of spaces)

  - -q or --quote-keys: quote keys in literal objects (by default,
    only keys that cannot be identifier names will be quoted).

- --ascii: pass this argument to encode non-ASCII characters as
  \uXXXX sequences. By default UglifyJS won't bother to do it and will
  output Unicode characters instead. (The output is always encoded in UTF8,
  but if you pass this option you'll only get ASCII.)

- -nm or --no-mangle: don't mangle names.

- -nmf or --no-mangle-functions: in case you want to mangle variable
  names, but not touch function names.

- -ns or --no-squeeze: don't call ast_squeeze() (which does various
  optimizations that result in smaller, less readable code).

- -mt or --mangle-toplevel: mangle names in the toplevel scope too
  (by default we don't do this).

- --no-seqs: when ast_squeeze() is called (thus, unless you pass
  --no-squeeze) it will reduce consecutive statements in blocks into a
  sequence. For example, "a = 10; b = 20; foo();" will be written as
  "a=10,b=20,foo();". On various occasions, this allows us to discard the
  block brackets (since the block becomes a single statement). This is ON
  by default because it seems safe and saves a few hundred bytes on some
  libs that I tested it on, but pass --no-seqs to disable it.

- --no-dead-code: by default, UglifyJS will remove code that is
  obviously unreachable (code that follows a return, throw, break or
  continue statement and is not a function/variable declaration). Pass
  this option to disable this optimization.

- -nc or --no-copyright: by default, uglifyjs will keep the initial
  comment tokens in the generated code (assumed to be copyright information
  etc.). If you pass this it will discard it.

- -o filename or --output filename: put the result in filename. If
  this isn't given, the result goes to standard output (or see next one).

- --overwrite: if the code is read from a file (not from STDIN) and you
  pass --overwrite then the output will be written in the same file.

- --ast: pass this if you want to get the Abstract Syntax Tree instead
  of JavaScript as output. Useful for debugging or learning more about the
  internals.

- -v or --verbose: output some notes on STDERR (for now just how long
  each operation takes).

- -d SYMBOL[=VALUE] or --define SYMBOL[=VALUE]: will replace
  all instances of the specified symbol where used as an identifier
  (except where the symbol has been properly declared by a var declaration,
  used as a function parameter or similar) with the specified value. This
  argument may be specified multiple times to define multiple
  symbols. If no value is specified the symbol will be replaced with
  the value true, or you can specify a numeric value (such as
  1024), a quoted string value (such as "object" or
  'https://github.com'), or the name of another symbol or keyword (such as
  null or document).
  This allows you, for example, to assign meaningful names to key
  constant values but discard the symbolic names in the uglified
  version for brevity/efficiency, or, when used with care, allows
  UglifyJS to operate as a form of conditional compilation
  whereby defining appropriate values may, by dint of the constant
  folding and dead code removal features above, remove entire
  superfluous code blocks (e.g. completely remove instrumentation or
  trace code for production use).
  Where string values are being defined, the handling of quotes is
  likely to be subject to the specifics of your command shell
  environment, so you may need to experiment with quoting styles
  depending on your platform, or you may find the option
  --define-from-module more suitable for use.

- --define-from-module SOMEMODULE: will load the named module (as
  per the NodeJS require() function) and iterate all the exported
  properties of the module, defining them as symbol names to be defined
  (as if by the --define option) per the name of each property
  (i.e. without the module name prefix) and given the value of the
  property. This is a much easier way to handle and document groups of
  symbols to be defined rather than a large number of --define
  options.

- --unsafe: enable other additional optimizations that are known to be
  unsafe in some contrived situations, but could still be generally useful.
  For now only these:

  - foo.toString() ==> foo+""
  - new Array(x,…) ==> [x,…]
  - new Array(x) ==> Array(x)

- --max-line-len (default 32K characters): add a newline after around
  32K characters. I've seen both FF and Chrome croak when all the code was
  on a single line of around 670K. Pass --max-line-len 0 to disable this
  safety feature.

- --reserved-names: some libraries rely on certain names to be used, as
  pointed out in issue #92 and #81, so this option allows you to exclude such
  names from the mangler. For example, to keep names require and $super
  intact you'd specify --reserved-names "require,$super".

- --inline-script: when you want to include the output literally in an
  HTML <script> tag you can use this option to prevent </script from
  showing up in the output.

- --lift-vars: when you pass this, UglifyJS will apply the following
  transformations (see the notes in API, ast_lift_variables):

  - put all var declarations at the start of the scope
  - make sure a variable is declared only once
  - discard unused function arguments
  - discard unused inner (named) functions
  - finally, try to merge assignments into that one var declaration, if
    possible.

1.4.1 API
To use the library from JavaScript, you'd do the following (example for
NodeJS):

    var jsp = require("uglify-js").parser;
    var pro = require("uglify-js").uglify;

    var orig_code = "... JS code here";
    var ast = jsp.parse(orig_code); // parse code and get the initial AST
    ast = pro.ast_mangle(ast); // get a new AST with mangled names
    ast = pro.ast_squeeze(ast); // get an AST with compression optimizations
    var final_code = pro.gen_code(ast); // compressed code here

The above performs the full compression that is possible right now. As you
can see, there is a sequence of steps which you can apply. For example if
you want compressed output but for some reason you don't want to mangle
variable names, you would simply skip the line that calls
pro.ast_mangle(ast).

Some of these functions take optional arguments. Here's a description:

- jsp.parse(code, strict_semicolons): parses JS code and returns an AST.
  strict_semicolons is optional and defaults to false. If you pass
  true then the parser will throw an error when it expects a semicolon and
  doesn't find it. For most JS code you don't want that, but it's useful
  if you want to strictly sanitize your code.

- pro.ast_lift_variables(ast): merge and move var declarations to the
  top of the scope; discard unused function arguments or variables; discard
  unused (named) inner functions. It also tries to merge assignments
  following the var declaration into it.

  If your code is very hand-optimized concerning var declarations,
  lifting variable declarations might actually increase size. For me it
  helps out. On jQuery it adds 865 bytes (243 after gzip). YMMV. Also
  note that (since it's not enabled by default) this operation isn't yet
  heavily tested (please report if you find issues!).

  Note that although it might increase the code size, it's technically
  more correct: in certain situations, dead code removal might drop
  variable declarations, which would not happen if the variables are
  lifted in advance.

  Here's an example of what it does:

      function f(a, b, c, d, e) {
          var q;
          var w;
          w = 10;
          q = 20;
          for (var i = 1; i < 10; ++i) {
              var boo = foo(a);
          }
          for (var i = 0; i < 1; ++i) {
              var boo = bar(c);
          }
          function foo(){ ... }
          function bar(){ ... }
          function baz(){ ... }
      }

      // transforms into ==>

      function f(a, b, c) {
          var i, boo, w = 10, q = 20;
          for (i = 1; i < 10; ++i) {
              boo = foo(a);
          }
          for (i = 0; i < 1; ++i) {
              boo = bar(c);
          }
          function foo() { ... }
          function bar() { ... }
      }

- pro.ast_mangle(ast, options): generates a new AST containing mangled
  (compressed) variable and function names. It supports the following
  options:

  - toplevel: mangle toplevel names (by default we don't touch them).
  - except: an array of names to exclude from compression.
  - defines: an object with properties named after symbols to
    replace (see the --define option for the script) and the values
    representing the AST replacement value.

- pro.ast_squeeze(ast, options): employs further optimizations designed
  to reduce the size of the code that gen_code would generate from the
  AST. Returns a new AST. options can be a hash; the supported options
  are:

  - make_seqs (default true) which will cause consecutive statements in a
    block to be merged using the "sequence" (comma) operator

  - dead_code (default true) which will remove unreachable code.

- pro.gen_code(ast, options): generates JS code from the AST. By
  default it's minified, but using the options argument you can get nicely
  formatted output. options is, well, optional :-) and if you pass it it
  must be an object and supports the following properties (below you can see
  the default values):

  - beautify: false (pass true if you want indented output)
  - indent_start: 0 (only applies when beautify is true): initial
    indentation in spaces
  - indent_level: 4 (only applies when beautify is true):
    indentation level, in spaces (pass an even number)
  - quote_keys: false (if you pass true it will quote all keys in
    literal objects)
  - space_colon: false (only applies when beautify is true): whether
    to put a space before the colon in object literals
  - ascii_only: false (pass true if you want to encode non-ASCII
    characters as \uXXXX)
  - inline_script: false (pass true to escape occurrences of
    </script in strings)

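The ast_lift_variables example above preserves behaviour because var declarations are scoped to the enclosing function regardless of where they appear. A minimal hand-written sketch of that equivalence follows; the functions `before` and `after` are invented for illustration and are not UglifyJS output.

```javascript
// var declarations are function-scoped, so declaring them inside a loop
// or at the top of the function gives the same result.
function before(items) {
    var total;
    total = 0;
    for (var i = 0; i < items.length; ++i) {
        var doubled = items[i] * 2;
        total += doubled;
    }
    // i and doubled are still visible here because var ignores block scope.
    return [total, i, doubled];
}

// Hand-lifted version: a single var statement, with the assignment merged in.
function after(items) {
    var i, doubled, total = 0;
    for (i = 0; i < items.length; ++i) {
        doubled = items[i] * 2;
        total += doubled;
    }
    return [total, i, doubled];
}

console.log(before([1, 2, 3]).join() === after([1, 2, 3]).join()); // true
```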
1.4.2 Beautifier shortcoming – no more comments
The beautifier can be used as a general purpose indentation tool. It's
useful when you want to make a minified file readable. One limitation,
though, is that it discards all comments, so you don't really want to use it
to reformat your code, unless you don't have, or don't care about, comments.

In fact it's not the beautifier that discards comments: they are dumped at
the parsing stage, when we build the initial AST. Comments don't really
make sense in the AST, and while we could add nodes for them, it would be
inconvenient because we'd have to add special rules to ignore them at all
the processing stages.

1.4.3 Use as a code pre-processor

The --define option can be used, particularly when combined with the
constant folding logic, as a form of pre-processor to enable or remove
particular constructions, such as might be used for instrumenting
development code, or to produce variations aimed at a specific
platform.

The code below illustrates the way this can be done, and how the
symbol replacement is performed.

    CLAUSE1: if (typeof DEVMODE === 'undefined') {
                 DEVMODE = true;
             }

    CLAUSE2: function init() {
                 if (DEVMODE) {
                     console.log("init() called");
                 }
                 ....
                 DEVMODE && console.log("init() complete");
             }

    CLAUSE3: function reportDeviceStatus(device) {
                 var DEVMODE = device.mode, DEVNAME = device.name;
                 if (DEVMODE === 'open') {
                     ....
                 }
             }

When the above code is normally executed, the undeclared global
variable DEVMODE will be assigned the value true (see CLAUSE1)
and so the init() function (CLAUSE2) will write messages to the
console log when executed, but in CLAUSE3 a locally declared
variable will mask access to the DEVMODE global symbol.

If the above code is processed by UglifyJS with an argument of
--define DEVMODE=false then UglifyJS will replace DEVMODE with the
boolean constant value false within CLAUSE1 and CLAUSE2, but it
will leave CLAUSE3 as it stands because there DEVMODE resolves to
a validly declared variable.

Moreover, the constant-folding features of UglifyJS will recognise
that the if condition of CLAUSE1 is thus always false, and so will
remove the test and body of CLAUSE1 altogether (including the
otherwise slightly problematic statement false = true; which it
will have formed by replacing DEVMODE in the body). Similarly,
within CLAUSE2 both calls to console.log() will be removed
altogether.

In this way you can mimic, to a limited degree, the functionality of
the C/C++ pre-processor to enable or completely remove blocks
depending on how certain symbols are defined, perhaps using UglifyJS
to generate different versions of source aimed at different
environments.

It is recommended (but not made mandatory) that symbols designed for
this purpose are given names consisting of UPPER_CASE_LETTERS to
distinguish them from other (normal) symbols and avoid the sort of
clash that CLAUSE3 above illustrates.
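As a rough illustration of the two steps described above (symbol replacement, then folding), here is a hand-written sketch. It is not actual UglifyJS output: the function names `initSubstituted` and `initFolded` are invented, the `log` array stands in for console.log so the effect can be observed, and the `"ready"` return value is a placeholder for the elided body of init().

```javascript
// Hand-written illustration, not actual UglifyJS output.
// "ready" is a hypothetical stand-in for the elided body of init().

// Step 1: init() after every DEVMODE reference is replaced by the constant false.
function initSubstituted(log) {
    if (false) {
        log.push("init() called");
    }
    false && log.push("init() complete");
    return "ready";
}

// Step 2: the same function once the always-false branches are folded away.
function initFolded(log) {
    return "ready";
}

// In CLAUSE3 the local var shadows the global symbol, so nothing is
// substituted and the comparison still uses the device's own mode.
function reportDeviceStatus(device) {
    var DEVMODE = device.mode;
    return DEVMODE === "open";
}

var logA = [], logB = [];
console.log(initSubstituted(logA) === initFolded(logB)); // true
console.log(logA.length, logB.length);                   // 0 0
console.log(reportDeviceStatus({ mode: "open" }));       // true
```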
1.5 Compression – how good is it?
Here are updated statistics. (I also updated my Google Closure and YUI
installations.)

We're still a lot better than YUI in terms of compression, though slightly
slower. We're still a lot faster than Closure, and compression after gzip
is comparable.

| File | UglifyJS | UglifyJS+gzip | Closure | Closure+gzip | YUI | YUI+gzip |
|---|---|---|---|---|---|---|
| jquery-1.6.2.js | 91001 (0:01.59) | 31896 | 90678 (0:07.40) | 31979 | 101527 (0:01.82) | 34646 |
| paper.js | 142023 (0:01.65) | 43334 | 134301 (0:07.42) | 42495 | 173383 (0:01.58) | 48785 |
| prototype.js | 88544 (0:01.09) | 26680 | 86955 (0:06.97) | 26326 | 92130 (0:00.79) | 28624 |
| thelib-full.js (DynarchLIB) | 251939 (0:02.55) | 72535 | 249911 (0:09.05) | 72696 | 258869 (0:01.94) | 76584 |

1.6 Bugs?
Unfortunately, for the time being there is no automated test suite. But I
ran the compressor manually on non-trivial code, and then I tested that the
generated code works as expected. A few hundred times.

DynarchLIB was started in times when there was no good JS minifier.
Therefore I was quite religious about trying to write short code manually,
and as such DL contains a lot of syntactic hacks such as "foo == bar ? a
= 10 : b = 20", though the more readable version would clearly be to use
"if/else".

Since the parser/compressor runs fine on DL and jQuery, I'm quite confident
that it's solid enough for production use. If you can identify any bugs,
I'd love to hear about them (use the Google Group or email me directly).

1.7 Links

- Twitter: @UglifyJS
- Project at GitHub: http://github.com/mishoo/UglifyJS
- Google Group: http://groups.google.com/group/uglifyjs
- Common Lisp JS parser: http://marijn.haverbeke.nl/parse-js/
- JS-to-Lisp compiler: http://github.com/marijnh/js
- Common Lisp JS uglifier: http://github.com/mishoo/cl-uglify-js

1.8 License
UglifyJS is released under the BSD license:

    Copyright 2010 (c) Mihai Bazon <mihai.bazon@gmail.com>
    Based on parse-js (http://marijn.haverbeke.nl/parse-js/).

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

        * Redistributions of source code must retain the above
          copyright notice, this list of conditions and the following
          disclaimer.

        * Redistributions in binary form must reproduce the above
          copyright notice, this list of conditions and the following
          disclaimer in the documentation and/or other materials
          provided with the distribution.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER “AS IS” AND ANY
    EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
    PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE
    LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
    OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
    PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
    TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
    THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
    SUCH DAMAGE.