  1. WGET(1) GNU Wget WGET(1)
  2.  
  3.  
  4.  
  5. NAME
  6. Wget - The non-interactive network downloader.
  7. 
  8. SYNOPSIS
  9. wget [option]... [URL]...
  10. 
  11. DESCRIPTION
  12. GNU Wget is a free utility for non-interactive download of files from
  13. the Web. It supports HTTP, HTTPS, and FTP protocols, as well as
  14. retrieval through HTTP proxies.
  15.  
  16. Wget is non-interactive, meaning that it can work in the background,
  17. while the user is not logged on. This allows you to start a retrieval
  18. and disconnect from the system, letting Wget finish the work. By
  19. contrast, most Web browsers require the user's constant presence,
  20. which can be a great hindrance when transferring a lot of data.
  21.  
  22. Wget can follow links in HTML, XHTML, and CSS pages, to create local
  23. versions of remote web sites, fully recreating the directory structure
  24. of the original site. This is sometimes referred to as "recursive
  25. downloading." While doing that, Wget respects the Robot Exclusion
  26. Standard (/robots.txt). Wget can be instructed to convert the links in
  27. downloaded files to point at the local files, for offline viewing.
  28.  
  29. Wget has been designed for robustness over slow or unstable network
  30. connections; if a download fails due to a network problem, it will keep
  31. retrying until the whole file has been retrieved. If the server
  32. supports regetting, it will instruct the server to continue the
  33. download from where it left off.
  34.  
  35. OPTIONS
  36. Option Syntax
  37. Since Wget uses GNU getopt to process command-line arguments, every
  38. option has a long form along with the short one. Long options are more
  39. convenient to remember, but take time to type. You may freely mix
  40. different option styles, or specify options after the command-line
  41. arguments. Thus you may write:
  42.  
  43. wget -r --tries=10 http://fly.srk.fer.hr/ -o log
  44.  
  45. The space between the option accepting an argument and the argument may
  46. be omitted. Instead of -o log you can write -olog.
  47.  
  48. You may put several options that do not require arguments together,
  49. like:
  50.  
  51. wget -drc <URL>
  52.  
  53. This is completely equivalent to:
  54.  
  55. wget -d -r -c <URL>
  56.  
  57. Since the options can be specified after the arguments, you may
  58. terminate them with --. So the following will try to download URL -x,
  59. reporting failure to log:
  60.  
  61. wget -o log -- -x
  62.  
  63. The options that accept comma-separated lists all respect the
  64. convention that specifying an empty list clears its value. This can be
  65. useful to clear the .wgetrc settings. For instance, if your .wgetrc
  66. sets "exclude_directories" to /cgi-bin, the following example will
  67. first reset it, and then set it to exclude /~nobody and /~somebody.
  68. You can also clear the lists in .wgetrc.
  69. 
  70. wget -X "" -X /~nobody,/~somebody
  71.  
  72. Most options that do not accept arguments are boolean options, so named
  73. because their state can be captured with a yes-or-no ("boolean")
  74. variable. For example, --follow-ftp tells Wget to follow FTP links
  75. from HTML files and, on the other hand, --no-glob tells it not to
  76. perform file globbing on FTP URLs. A boolean option is either
  77. affirmative or negative (beginning with --no). All such options share
  78. several properties.
  79.  
  80. Unless stated otherwise, it is assumed that the default behavior is the
  81. opposite of what the option accomplishes. For example, the documented
  82. existence of --follow-ftp assumes that the default is to not follow FTP
  83. links from HTML pages.
  84.  
  85. Affirmative options can be negated by prepending --no- to the
  86. option name; negative options can be negated by omitting the --no-
  87. prefix. This might seem superfluous: if the default for an
  88. affirmative option is to not do something, then why provide a way to
  89. explicitly turn it off? But the startup file may in fact change the
  90. default. For instance, using "follow_ftp = on" in .wgetrc makes Wget
  91. follow FTP links by default, and using --no-follow-ftp is the only way
  92. to restore the factory default from the command line.
  93.  
  94. Basic Startup Options
  95. -V
  96. --version
  97. Display the version of Wget.
  98. 
  99. -h
  100. --help
  101. Print a help message describing all of Wget's command-line options.
  102. 
  103. -b
  104. --background
  105. Go to background immediately after startup. If no output file is
  106. specified via -o, output is redirected to wget-log.
  107.  
  108. -e command
  109. --execute command
  110. Execute command as if it were a part of .wgetrc. A command thus
  111. invoked will be executed after the commands in .wgetrc, thus taking
  112. precedence over them. If you need to specify more than one wgetrc
  113. command, use multiple instances of -e.
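
As a rough illustration of how a startup-file default can be overridden for a single run, the sketch below writes a throwaway startup file and wraps two invocations in shell functions. The temporary file, the function names, and the choice of the "tries" command are assumptions made for this example; nothing is fetched until one of the functions is called with a URL.

```shell
# Hypothetical startup file that makes Wget follow FTP links by default.
rc_file=$(mktemp)
printf 'follow_ftp = on\n' > "$rc_file"

# Restore the factory default for one run with the negated boolean option.
fetch_no_ftp_links() {
    wget --config="$rc_file" --no-follow-ftp "$@"
}

# Same idea via -e: each -e command runs after the startup file,
# so it takes precedence over the settings read from it.
fetch_with_overrides() {
    wget --config="$rc_file" -e 'follow_ftp=off' -e 'tries=3' "$@"
}
```

Either function takes the URLs to fetch as arguments, for example fetch_no_ftp_links followed by a URL of your choice.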
  114.  
  115. Logging and Input File Options
  116. -o logfile
  117. --output-file=logfile
  118. Log all messages to logfile. The messages are normally reported to
  119. standard error.
  120. 
  121. -a logfile
  122. --append-output=logfile
  123. Append to logfile. This is the same as -o, only it appends to
  124. logfile instead of overwriting the old log file. If logfile does
  125. not exist, a new file is created.
  126. 
  127. -d
  128. --debug
  129. Turn on debug output, meaning various information important to the
  130. developers of Wget if it does not work properly. Your system
  131. administrator may have chosen to compile Wget without debug
  132. support, in which case -d will not work. Please note that
  133. compiling with debug support is always safe: Wget compiled with
  134. the debug support will not print any debug info unless requested
  135. with -d.
  136. 
  137. -q
  138. --quiet
  139. Turn off Wget's output.
  140. 
  141. -v
  142. --verbose
  143. Turn on verbose output, with all the available data. The default
  144. output is verbose.
  145. 
  146. -nv
  147. --no-verbose
  148. Turn off verbose without being completely quiet (use -q for that),
  149. which means that error messages and basic information still get
  150. printed.
  151. 
  152. --report-speed=type
  153. Output bandwidth as type. The only accepted value is bits.
  154.  
  155. -i file
  156. --input-file=file
  157. Read URLs from a local or external file. If - is specified as
  158. file, URLs are read from the standard input. (Use ./- to read from
  159. a file literally named -.)
  160. 
  161. If this option is used, no URLs need be present on the command
  162. line. If there are URLs both on the command line and in an input
  163. file, those on the command line will be the first ones to be
  164. retrieved. If --force-html is not specified, then file should
  165. consist of a series of URLs, one per line.
  166. 
  167. However, if you specify --force-html, the document will be regarded
  168. as html. In that case you may have problems with relative links,
  169. which you can solve either by adding "<base href="url">" to the
  170. documents or by specifying --base=url on the command line.
  171. 
  172. If the file is an external one, the document will be automatically
  173. treated as html if the Content-Type matches text/html.
  174. Furthermore, the file's location will be implicitly used as base
  175. href if none was specified.
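
The three ways of feeding URLs to -i described above might be wrapped as follows. This is only a sketch: the list file, the example.com URLs, and the function names are placeholders invented for the illustration, and no download happens until a function is called.

```shell
# Hypothetical URL list, one URL per line (the format expected without --force-html).
url_list=$(mktemp)
cat > "$url_list" <<'EOF'
https://example.com/a.txt
https://example.com/b.txt
EOF

# Read URLs from a plain list file.
fetch_from_list() {
    wget -i "$1"
}

# Read URLs from standard input by passing - as the file.
fetch_from_stdin() {
    wget -i -
}

# Treat a local HTML file as the input and resolve its relative links
# against the base URL given as the second argument.
fetch_from_html() {
    wget --force-html --base="$2" -i "$1"
}
```

For instance, fetch_from_stdin could be fed with the same list through a pipe or a redirection from "$url_list".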
  176.  
  177. --input-metalink=file
  178. Downloads files covered in the local Metalink file. Metalink
  179. versions 3 and 4 are supported.
  180. 
  181. --keep-badhash
  182. Keeps downloaded Metalink files that have a bad hash. It appends
  183. .badhash to the names of Metalink files which have a checksum
  184. mismatch, without overwriting existing files.
  185. 
  186. --metalink-over-http
  187. Issues an HTTP HEAD request instead of GET and extracts Metalink
  188. metadata from response headers. Then it switches to Metalink
  189. download. If no valid Metalink metadata is found, it falls back to
  190. ordinary HTTP download. Enables Content-Type:
  191. application/metalink4+xml files download/processing.
  192. 
  193. --metalink-index=number
  194. Set the Metalink application/metalink4+xml metaurl ordinal NUMBER.
  195. From 1 to the total number of "application/metalink4+xml"
  196. available. Specify 0 or inf to choose the first good one.
  197. Metaurls, such as those from --metalink-over-http, may have been
  198. sorted by priority key's value; keep this in mind to choose the
  199. right NUMBER.
  200. 
  201. --preferred-location
  202. Set preferred location for Metalink resources. This has an effect
  203. if multiple resources with the same priority are available.
  204.  
  205. -F
  206. --force-html
  207. When input is read from a file, force it to be treated as an HTML
  208. file. This enables you to retrieve relative links from existing
  209. HTML files on your local disk, by adding "<base href="url">" to
  210. HTML, or using the --base command-line option.
  211. 
  212. -B URL
  213. --base=URL
  214. Resolves relative links using URL as the point of reference, when
  215. reading links from an HTML file specified via the -i/--input-file
  216. option (together with --force-html, or when the input file was
  217. fetched remotely from a server describing it as HTML). This is
  218. equivalent to the presence of a "BASE" tag in the HTML input file,
  219. with URL as the value for the "href" attribute.
  220. 
  221. For instance, if you specify http://foo/bar/a.html for URL, and
  222. Wget reads ../baz/b.html from the input file, it would be resolved
  223. to http://foo/baz/b.html.
  224. 
  225. --config=FILE
  226. Specify the location of a startup file you wish to use.
  227. 
  228. --rejected-log=logfile
  229. Logs all URL rejections to logfile as comma-separated values. The
  230. values include the reason for rejection, the URL and the parent URL
  231. it was found in.
  232.  
  233. Download Options
  234. --bind-address=ADDRESS
  235. When making client TCP/IP connections, bind to ADDRESS on the local
  236. machine. ADDRESS may be specified as a hostname or IP address.
  237. This option can be useful if your machine is bound to multiple IPs.
  238. 
  239. --bind-dns-address=ADDRESS
  240. [libcares only] This address overrides the route for DNS requests.
  241. If you ever need to circumvent the standard settings from
  242. /etc/resolv.conf, this option together with --dns-servers is your
  243. friend. ADDRESS must be specified either as IPv4 or IPv6 address.
  244. Wget needs to be built with libcares for this option to be
  245. available.
  246. 
  247. --dns-servers=ADDRESSES
  248. [libcares only] The given address(es) override the standard
  249. nameserver addresses, e.g. as configured in /etc/resolv.conf.
  250. ADDRESSES may be specified as comma-separated IPv4 or IPv6
  251. addresses. Wget needs to be built with libcares for this option to
  252. be available.
  253. 
  254. -t number
  255. --tries=number
  256. Set number of tries to number. Specify 0 or inf for infinite
  257. retrying. The default is to retry 20 times, with the exception of
  258. fatal errors like "connection refused" or "not found" (404), which
  259. are not retried.
  260.  
  261. -O file
  262. --output-document=file
  263. The documents will not be written to the appropriate files, but all
  264. will be concatenated together and written to file. If - is used as
  265. file, documents will be printed to standard output, disabling link
  266. conversion. (Use ./- to print to a file literally named -.)
  267. 
  268. Use of -O is not intended to mean simply "use the name file instead
  269. of the one in the URL;" rather, it is analogous to shell
  270. redirection: wget -O file http://foo is intended to work like wget
  271. -O - http://foo > file; file will be truncated immediately, and all
  272. downloaded content will be written there.
  273. 
  274. For this reason, -N (for timestamp-checking) is not supported in
  275. combination with -O: since file is always newly created, it will
  276. always have a very new timestamp. A warning will be issued if this
  277. combination is used.
  278. 
  279. Similarly, using -r or -p with -O may not work as you expect: Wget
  280. won't just download the first file to file and then download the
  281. rest to their normal names: all downloaded content will be placed
  282. in file. This was disabled in version 1.11, but has been reinstated
  283. (with a warning) in 1.11.2, as there are some cases where this
  284. behavior can actually have some use.
  285. 
  286. A combination with -nc is only accepted if the given output file
  287. does not exist.
  288. 
  289. Note that a combination with -k is only permitted when downloading
  290. a single document, as in that case it will just convert all
  291. relative URIs to external ones; -k makes no sense for multiple URIs
  292. when they're all being downloaded to a single file; -k can be used
  293. only when the output is a regular file.
  294.  
  295. -nc
  296. --no-clobber
  297. If a file is downloaded more than once in the same directory,
  298. Wget's behavior depends on a few options, including -nc. In
  299. certain cases, the local file will be clobbered, or overwritten,
  300. upon repeated download. In other cases it will be preserved.
  301. 
  302. When running Wget without -N, -nc, -r, or -p, downloading the same
  303. file in the same directory will result in the original copy of file
  304. being preserved and the second copy being named file.1. If that
  305. file is downloaded yet again, the third copy will be named file.2,
  306. and so on. (This is also the behavior with -nd, even if -r or -p
  307. are in effect.) When -nc is specified, this behavior is
  308. suppressed, and Wget will refuse to download newer copies of file.
  309. Therefore, "no-clobber" is actually a misnomer in this
  310. mode: it's not clobbering that's prevented (as the numeric
  311. suffixes were already preventing clobbering), but rather the
  312. multiple version saving that's prevented.
  313. 
  314. When running Wget with -r or -p, but without -N, -nd, or -nc,
  315. re-downloading a file will result in the new copy simply overwriting
  316. the old. Adding -nc will prevent this behavior, instead causing
  317. the original version to be preserved and any newer copies on the
  318. server to be ignored.
  319. 
  320. When running Wget with -N, with or without -r or -p, the decision
  321. as to whether or not to download a newer copy of a file depends on
  322. the local and remote timestamp and size of the file. -nc may not
  323. be specified at the same time as -N.
  324. 
  325. A combination with -O/--output-document is only accepted if the
  326. given output file does not exist.
  327. 
  328. Note that when -nc is specified, files with the suffixes .html or
  329. .htm will be loaded from the local disk and parsed as if they had
  330. been retrieved from the Web.
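
The numeric-suffix naming used when none of -N, -nc, -r, or -p is given can be emulated with a few lines of shell. The helper below is a hypothetical illustration of the scheme described above (file, file.1, file.2, and so on), not code taken from Wget.

```shell
# next_copy_name FILE
# Print the name a further copy of FILE would get: FILE itself if it does
# not exist yet, otherwise FILE.N for the first N that is still unused.
next_copy_name() {
    ncn_name=$1
    if [ ! -e "$ncn_name" ]; then
        printf '%s\n' "$ncn_name"
        return 0
    fi
    ncn_n=1
    while [ -e "$ncn_name.$ncn_n" ]; do
        ncn_n=$((ncn_n + 1))
    done
    printf '%s\n' "$ncn_name.$ncn_n"
}
```

With -nc in effect there is no such search: an existing file simply causes the download to be skipped.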
  331.  
  332. --backups=backups
  333. Before (over)writing a file, back up an existing file by adding a
  334. .1 suffix (_1 on VMS) to the file name. Such backup files are
  335. rotated to .2, .3, and so on, up to backups (and lost beyond that).
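
The rotation described for --backups can be pictured with the following sketch. It is an assumption-laden model of the behavior (renaming with mv, a hypothetical helper name), intended only to show which copies survive, not how Wget implements it.

```shell
# rotate_backups FILE MAX
# Shift FILE.1 -> FILE.2 -> ... up to FILE.MAX (the oldest copy is lost),
# then move FILE itself to FILE.1 so that a fresh FILE can be written.
rotate_backups() {
    rb_file=$1
    rb_max=$2
    [ -e "$rb_file" ] || return 0      # nothing to back up
    [ "$rb_max" -ge 1 ] || return 0    # no backups requested
    rb_i=$rb_max
    while [ "$rb_i" -gt 1 ]; do
        rb_prev=$((rb_i - 1))
        if [ -e "$rb_file.$rb_prev" ]; then
            mv -f "$rb_file.$rb_prev" "$rb_file.$rb_i"
        fi
        rb_i=$rb_prev
    done
    mv -f "$rb_file" "$rb_file.1"
}
```

With MAX set to 2, a third rotation overwrites the oldest copy in FILE.2, which corresponds to the "lost beyond that" remark above.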
  336.  
  337. -c
  338. --continue
  339. Continue getting a partially-downloaded file. This is useful when
  340. you want to finish up a download started by a previous instance of
  341. Wget, or by another program. For instance:
  342. 
  343. wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z
  344. 
  345. If there is a file named ls-lR.Z in the current directory, Wget
  346. will assume that it is the first portion of the remote file, and
  347. will ask the server to continue the retrieval from an offset equal
  348. to the length of the local file.
  349. 
  350. Note that you don't need to specify this option if you just want
  351. the current invocation of Wget to retry downloading a file should
  352. the connection be lost midway through. This is the default
  353. behavior. -c only affects resumption of downloads started prior to
  354. this invocation of Wget, and whose local files are still sitting
  355. around.
  356. 
  357. Without -c, the previous example would just download the remote
  358. file to ls-lR.Z.1, leaving the truncated ls-lR.Z file alone.
  359. 
  360. If you use -c on a non-empty file, and the server does not support
  361. continued downloading, Wget will restart the download from scratch
  362. and overwrite the existing file entirely.
  363. 
  364. Beginning with Wget 1.7, if you use -c on a file which is the same
  365. size as the one on the server, Wget will refuse to download the
  366. file and print an explanatory message. The same happens when the
  367. file is smaller on the server than locally (presumably because it
  368. was changed on the server since your last download
  369. attempt); because "continuing" is not meaningful, no download
  370. occurs.
  371. 
  372. On the other side of the coin, while using -c, any file that's
  373. bigger on the server than locally will be considered an incomplete
  374. download and only "(length(remote) - length(local))" bytes will be
  375. downloaded and tacked onto the end of the local file. This
  376. behavior can be desirable in certain cases: for instance, you can
  377. use wget -c to download just the new portion that's been appended
  378. to a data collection or log file.
  379. 
  380. However, if the file is bigger on the server because it's been
  381. changed, as opposed to just appended to, you'll end up with a
  382. garbled file. Wget has no way of verifying that the local file is
  383. really a valid prefix of the remote file. You need to be
  384. especially careful of this when using -c in conjunction with -r,
  385. since every file will be considered as an "incomplete download"
  386. candidate.
  387. 
  388. Another instance where you'll get a garbled file if you try to use
  389. -c is if you have a broken HTTP proxy that inserts a "transfer
  390. interrupted" string into the local file. In the future a
  391. "rollback" option may be added to deal with this case.
  392. 
  393. Note that -c only works with FTP servers and with HTTP servers that
  394. support the "Range" header.
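
The size comparison that decides what -c does can be summarised in a small helper. This is a simplified model of the rules stated above, with invented names and output wording; it assumes the server supports continued downloads and that both sizes are known in bytes.

```shell
# continue_action LOCAL_SIZE REMOTE_SIZE
# Print what a -c run would do for the given sizes (in bytes).
continue_action() {
    ca_local=$1
    ca_remote=$2
    if [ "$ca_remote" -le "$ca_local" ]; then
        # Same size, or smaller on the server: continuing is not meaningful.
        echo "skip"
    else
        # Only length(remote) - length(local) bytes are requested.
        echo "fetch $((ca_remote - ca_local)) bytes from offset $ca_local"
    fi
}
```

The model says nothing about whether the local file really is a prefix of the remote one, which is exactly the caveat raised above.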
  395.  
  396. --start-pos=OFFSET
  397. Start downloading at zero-based position OFFSET. Offset may be
  398. expressed in bytes, kilobytes with the `k' suffix, or megabytes
  399. with the `m' suffix, etc.
  400. 
  401. --start-pos takes precedence over --continue. When
  402. --start-pos and --continue are both specified, wget will emit a
  403. warning then proceed as if --continue was absent.
  404. 
  405. Server support for continued download is required, otherwise
  406. --start-pos cannot help. See -c for details.
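
To make the size suffixes concrete, here is a hypothetical converter for whole-number values. It assumes binary multiples (1k = 1024 bytes, 1m = 1024 * 1024 bytes), which the text above does not spell out, and it does not handle decimal values or larger suffixes.

```shell
# to_bytes VALUE
# Convert a whole number with an optional k or m suffix to bytes.
# Assumption: k and m are binary multiples (1024 and 1024*1024).
to_bytes() {
    case $1 in
        *[kK]) echo $(( ${1%?} * 1024 )) ;;
        *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}
```

Under that assumption, to_bytes 10k gives the byte offset that --start-pos=10k would refer to.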
  407.  
  408. --progress=type
  409. Select the type of the progress indicator you wish to use. Legal
  410. indicators are "dot" and "bar".
  411. 
  412. The "bar" indicator is used by default. It draws an ASCII progress
  413. bar (a.k.a. "thermometer" display) indicating the status of
  414. retrieval. If the output is not a TTY, the "dot" indicator will be
  415. used by default.
  416. 
  417. Use --progress=dot to switch to the "dot" display. It traces the
  418. retrieval by printing dots on the screen, each dot representing a
  419. fixed amount of downloaded data.
  420. 
  421. The progress type can also take one or more parameters. The
  422. parameters vary based on the type selected. Parameters to type are
  423. passed by appending them to the type separated by a colon (:) like
  424. this: --progress=type:parameter1:parameter2.
  425. 
  426. When using the dotted retrieval, you may set the style by
  427. specifying the type as dot:style. Different styles assign
  428. different meaning to one dot. With the "default" style each dot
  429. represents 1K, there are ten dots in a cluster and 50 dots in a
  430. line. The "binary" style has a more "computer"-like
  431. orientation: 8K dots, 16-dot clusters and 48 dots per line (which
  432. makes for 384K lines). The "mega" style is suitable for
  433. downloading large files: each dot represents 64K retrieved, there
  434. are eight dots in a cluster, and 48 dots on each line (so each line
  435. contains 3M). If "mega" is not enough then you can use the "giga"
  436. style: each dot represents 1M retrieved, there are eight dots in a
  437. cluster, and 32 dots on each line (so each line contains 32M).
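
The per-line totals quoted for the dot styles follow from multiplying the size of one dot by the number of dots per line. The sketch below redoes that arithmetic in KiB; the table layout and function name are made up for the example, and the figures are the ones given in the paragraph above.

```shell
# Columns: style, KiB per dot, dots per cluster, dots per line.
dot_styles='default 1 10 50
binary 8 16 48
mega 64 8 48
giga 1024 8 32'

# Print "STYLE KIB", the amount of data one full line of dots stands for.
dot_line_sizes() {
    printf '%s\n' "$dot_styles" | while read -r style kib cluster per_line; do
        printf '%s %s\n' "$style" "$((kib * per_line))"
    done
}
```

Calling dot_line_sizes lists one line per style; 3072 KiB for "mega" is the 3M mentioned above, and 32768 KiB for "giga" is the 32M.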
  438.  
  439. With --progress=bar, there are currently two possible parameters,
  440. force and noscroll.
  441. 
  442. When the output is not a TTY, the progress bar always falls back to
  443. "dot", even if --progress=bar was passed to Wget during invocation.
  444. This behaviour can be overridden and the "bar" output forced by
  445. using the "force" parameter as --progress=bar:force.
  446. 
  447. By default, the bar style progress bar scrolls the name of the file
  448. from left to right for the file being downloaded if the filename
  449. exceeds the maximum length allotted for its display. In certain
  450. cases, such as with --progress=bar:force, one may not want the
  451. scrolling filename in the progress bar. By passing the "noscroll"
  452. parameter, Wget can be forced to display as much of the filename as
  453. possible without scrolling through it.
  454. 
  455. Note that you can set the default style using the "progress"
  456. command in .wgetrc. That setting may be overridden from the
  457. command line. For example, to force the bar output without
  458. scrolling, use --progress=bar:force:noscroll.
  459.  
  460. --show-progress
  461. Force wget to display the progress bar in any verbosity.
  462. 
  463. By default, wget only displays the progress bar in verbose mode.
  464. One may, however, want wget to display the progress bar on screen in
  465. conjunction with any other verbosity modes like --no-verbose or
  466. --quiet. This is often a desired property when invoking wget to
  467. download several small/large files. In such a case, wget could
  468. simply be invoked with this parameter to get a much cleaner output
  469. on the screen.
  470. 
  471. This option will also force the progress bar to be printed to
  472. stderr when used alongside the --output-file option.
  473.  
  474. -N
  475. --timestamping
  476. Turn on time-stamping.
  477. 
  478. --no-if-modified-since
  479. Do not send If-Modified-Since header in -N mode. Send preliminary
  480. HEAD request instead. This only has an effect in -N mode.
  481. 
  482. --no-use-server-timestamps
  483. Don't set the local file's timestamp by the one on the server.
  484. 
  485. By default, when a file is downloaded, its timestamps are set to
  486. match those from the remote file. This allows the use of
  487. --timestamping on subsequent invocations of wget. However, it is
  488. sometimes useful to base the local file's timestamp on when it was
  489. actually downloaded; for that purpose, the
  490. --no-use-server-timestamps option has been provided.
  491. 
  492. -S
  493. --server-response
  494. Print the headers sent by HTTP servers and responses sent by FTP
  495. servers.
  496.  
  497. --spider
  498. When invoked with this option, Wget will behave as a Web spider,
  499. which means that it will not download the pages, just check that
  500. they are there. For example, you can use Wget to check your
  501. bookmarks:
  502.  
  503. wget --spider --force-html -i bookmarks.html
  504.  
  505. This feature needs much more work for Wget to get close to the
  506. functionality of real web spiders.
  507.  
  508. -T seconds
  509. --timeout=seconds
  510. Set the network timeout to seconds seconds. This is equivalent to
  511. specifying --dns-timeout, --connect-timeout, and --read-timeout,
  512. all at the same time.
  513. 
  514. When interacting with the network, Wget can check for timeout and
  515. abort the operation if it takes too long. This prevents anomalies
  516. like hanging reads and infinite connects. The only timeout enabled
  517. by default is a 900-second read timeout. Setting a timeout to 0
  518. disables it altogether. Unless you know what you are doing, it is
  519. best not to change the default timeout settings.
  520. 
  521. All timeout-related options accept decimal values, as well as
  522. subsecond values. For example, 0.1 seconds is a legal (though
  523. unwise) choice of timeout. Subsecond timeouts are useful for
  524. checking server response times or for testing network latency.
  525. 
  526. --dns-timeout=seconds
  527. Set the DNS lookup timeout to seconds seconds. DNS lookups that
  528. don't complete within the specified time will fail. By default,
  529. there is no timeout on DNS lookups, other than that implemented by
  530. system libraries.
  531. 
  532. --connect-timeout=seconds
  533. Set the connect timeout to seconds seconds. TCP connections that
  534. take longer to establish will be aborted. By default, there is no
  535. connect timeout, other than that implemented by system libraries.
  536. 
  537. --read-timeout=seconds
  538. Set the read (and write) timeout to seconds seconds. The "time" of
  539. this timeout refers to idle time: if, at any point in the download,
  540. no data is received for more than the specified number of seconds,
  541. reading fails and the download is restarted. This option does not
  542. directly affect the duration of the entire download.
  543.  
  544. Of course, the remote server may choose to terminate the connection
  545. sooner than this option requires. The default read timeout is 900
  546. seconds.
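
Since -T is described as shorthand for setting the three individual timeouts at once, the expansion can be written out explicitly. The helper name below is invented for this sketch; it only prints options and does not run Wget.

```shell
# timeout_args SECONDS
# Print the three options that -T SECONDS is documented to be equivalent to.
timeout_args() {
    printf '%s%s %s%s %s%s\n' \
        '--dns-timeout=' "$1" \
        '--connect-timeout=' "$1" \
        '--read-timeout=' "$1"
}
```

The printed options could be pasted into a command line, or adjusted individually when only one of the three timeouts should differ from the others.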
  547.  
  548. --limit-rate=amount
  549. Limit the download speed to amount bytes per second. Amount may be
  550. expressed in bytes, kilobytes with the k suffix, or megabytes with
  551. the m suffix. For example, --limit-rate=20k will limit the
  552. retrieval rate to 20KB/s. This is useful when, for whatever
  553. reason, you don't want Wget to consume the entire available
  554. bandwidth.
  555. 
  556. This option allows the use of decimal numbers, usually in
  557. conjunction with power suffixes; for example, --limit-rate=2.5k is
  558. a legal value.
  559.  
  560. Note that Wget implements the limiting by sleeping the appropriate
  561. amount of time after a network read that took less time than
  562. specified by the rate. Eventually this strategy causes the TCP
  563. transfer to slow down to approximately the specified rate.
  564. However, it may take some time for this balance to be achieved, so
  565. don't be surprised if limiting the rate doesn't work well with very
  566. small files.
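
The sleeping strategy can be modelled as follows: a read of B bytes "should" take B divided by the rate, and any time saved is slept away. This is a simplified sketch of the idea described above, not Wget's actual code, and the function name is hypothetical; awk is used because the shell has no floating-point arithmetic.

```shell
# throttle_sleep BYTES ELAPSED RATE
# BYTES read in ELAPSED seconds, with a limit of RATE bytes per second.
# Print how long to sleep (seconds, 3 decimals); 0 if the read was slow enough.
throttle_sleep() {
    awk -v b="$1" -v t="$2" -v r="$3" 'BEGIN {
        expected = b / r          # time the read should have taken
        pause = expected - t      # time still owed to the limit
        if (pause < 0) pause = 0
        printf "%.3f\n", pause
    }'
}
```

For a very small file there may be only one or two reads, so the pauses have little opportunity to even out the rate, which matches the caveat above.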
  567.  
  568. -w seconds
  569. --wait=seconds
  570. Wait the specified number of seconds between the retrievals. Use
  571. of this option is recommended, as it lightens the server load by
  572. making the requests less frequent. Instead of in seconds, the time
  573. can be specified in minutes using the "m" suffix, in hours using
  574. "h" suffix, or in days using "d" suffix.
  575.  
  576. Specifying a large value for this option is useful if the network
  577. or the destination host is down, so that Wget can wait long enough
  578. to reasonably expect the network error to be fixed before the
  579. retry. The waiting interval specified by this option is
  580. influenced by --random-wait (see below).
  581.  
  582. --waitretry=seconds
  583. If you don't want Wget to wait between every retrieval, but only
  584. between retries of failed downloads, you can use this option. Wget
  585. will use linear backoff, waiting 1 second after the first failure
  586. on a given file, then waiting 2 seconds after the second failure on
  587. that file, up to the maximum number of seconds you specify.
  588.  
  589. By default, Wget will assume a value of 10 seconds.
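
The linear backoff can be listed explicitly: the wait after the n-th failure is n seconds, capped at the --waitretry value. The helper below is an illustrative model with an invented name, not part of Wget.

```shell
# waitretry_delays MAX FAILURES
# Print the wait (in seconds) after each of FAILURES consecutive failures,
# growing by one second per failure and capped at MAX.
waitretry_delays() {
    wr_max=$1
    wr_failures=$2
    wr_n=1
    while [ "$wr_n" -le "$wr_failures" ]; do
        if [ "$wr_n" -lt "$wr_max" ]; then
            echo "$wr_n"
        else
            echo "$wr_max"
        fi
        wr_n=$((wr_n + 1))
    done
}
```

With the default of 10 seconds, the waits after successive failures on one file would grow from 1 up to 10 seconds and then stay at 10.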
  590.  
  591. --random-wait
  592. Some web sites may perform log analysis to identify retrieval
  593. programs such as Wget by looking for statistically significant
  594. similarities in the time between requests. This option causes the
  595. time between requests to vary between 0.5 and 1.5 * wait seconds,
  596. where wait was specified using the --wait option, in order to mask
  597. Wget's presence from such analysis.
  598.  
  599. A 2001 article in a publication devoted to development on a popular
  600. consumer platform provided code to perform this analysis on the
  601. fly. Its author suggested blocking at the class C address level to
  602. ensure automated retrieval programs were blocked despite changing
  603. DHCP-supplied addresses.
  604.  
  605. The --random-wait option was inspired by this ill-advised
  606. recommendation to block many unrelated users from a web site due to
  607. the actions of one.
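
The randomised interval can be sketched as (0.5 + r) * wait with r drawn uniformly from [0, 1), which yields a value between 0.5 and 1.5 times the --wait setting. The function name and the optional seed argument are assumptions added for reproducibility in this example.

```shell
# random_wait WAIT [SEED]
# Print a pause between 0.5 * WAIT and 1.5 * WAIT seconds (3 decimals).
# SEED is optional; it defaults to the shell's process ID.
random_wait() {
    awk -v w="$1" -v seed="${2:-$$}" 'BEGIN {
        srand(seed)
        printf "%.3f\n", (0.5 + rand()) * w
    }'
}
```

Over many requests the pauses average out to roughly the plain --wait value while removing the fixed spacing that log analysis would look for.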
  608.  
  609. --no-proxy
  610. Don't use proxies, even if the appropriate *_proxy environment
  611. variable is defined.
  612.  
  613. -Q quota
  614. --quota=quota
  615. Specify download quota for automatic retrievals. The value can be
  616. specified in bytes (default), kilobytes (with k suffix), or
  617. megabytes (with m suffix).
  618. 
  619. Note that quota will never affect downloading a single file. So if
  620. you specify wget -Q10k https://example.com/ls-lR.gz, all of the
  621. ls-lR.gz will be downloaded. The same goes even when several URLs
  622. are specified on the command-line. However, quota is respected
  623. when retrieving either recursively, or from an input file. Thus
  624. you may safely type wget -Q2m -i sites; the download will be
  625. aborted when the quota is exceeded.
  626. 
  627. Setting quota to 0 or to inf removes the download quota.
  628.  
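The suffix handling for -Q can be illustrated with a small helper. The
name quota_to_bytes is hypothetical, and the 1024-based multipliers are
an assumption of this sketch: the text above only says "kilobytes" and
"megabytes".

```shell
# quota_to_bytes VALUE
# Convert a -Q/--quota argument to a byte count.
# Assumption: k means 1024 bytes and m means 1024 * 1024 bytes.
quota_to_bytes() {
    case $1 in
        0|inf) echo unlimited ;;                     # no quota at all
        *k)    echo $(( ${1%k} * 1024 )) ;;
        *m)    echo $(( ${1%m} * 1024 * 1024 )) ;;
        *)     echo "$1" ;;                          # plain byte count
    esac
}

quota_to_bytes 10k    # prints: 10240
quota_to_bytes 2m     # prints: 2097152
quota_to_bytes inf    # prints: unlimited
```
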
--no-dns-cache
Turn off caching of DNS lookups. Normally, Wget remembers the IP
addresses it looked up from DNS so it doesn't have to repeatedly
contact the DNS server for the same (typically small) set of hosts
it retrieves from. This cache exists in memory only; a new Wget
run will contact DNS again.

However, it has been reported that in some situations it is not
desirable to cache host names, even for the duration of a
short-running application like Wget. With this option Wget issues a new
DNS lookup (more precisely, a new call to "gethostbyname" or
"getaddrinfo") each time it makes a new connection. Please note
that this option will not affect caching that might be performed by
the resolving library or by an external caching layer, such as
NSCD.

If you don't understand exactly what this option does, you probably
won't need it.

--restrict-file-names=modes
Change which characters found in remote URLs must be escaped during
generation of local filenames. Characters that are restricted by
this option are escaped, i.e. replaced with %HH, where HH is the
hexadecimal number that corresponds to the restricted character.
This option may also be used to force all alphabetical cases to be
either lower- or uppercase.

By default, Wget escapes the characters that are not valid or safe
as part of file names on your operating system, as well as control
characters that are typically unprintable. This option is useful
for changing these defaults, perhaps because you are downloading to
a non-native partition, or because you want to disable escaping of
the control characters, or you want to further restrict characters
to only those in the ASCII range of values.

The modes are a comma-separated set of text values. The acceptable
values are unix, windows, nocontrol, ascii, lowercase, and
uppercase. The values unix and windows are mutually exclusive (one
will override the other), as are lowercase and uppercase. Those
last are special cases, as they do not change the set of characters
that would be escaped, but rather force local file paths to be
converted either to lower- or uppercase.

When "unix" is specified, Wget escapes the character / and the
control characters in the ranges 0-31 and 128-159. This is the
default on Unix-like operating systems.

When "windows" is given, Wget escapes the characters \, |, /, :, ?,
", *, <, >, and the control characters in the ranges 0-31 and
128-159. In addition to this, Wget in Windows mode uses + instead
of : to separate host and port in local file names, and uses @
instead of ? to separate the query portion of the file name from
the rest. Therefore, a URL that would be saved as
www.xemacs.org:4300/search.pl?input=blah in Unix mode would be
saved as www.xemacs.org+4300/search.pl@input=blah in Windows mode.
This mode is the default on Windows.

If you specify nocontrol, then the escaping of the control
characters is also switched off. This option may make sense when
you are downloading URLs whose names contain UTF-8 characters, on a
system which can save and display filenames in UTF-8 (some possible
byte values used in UTF-8 byte sequences fall in the range of
values designated by Wget as "controls").

The ascii mode is used to specify that any bytes whose values are
outside the range of ASCII characters (that is, greater than 127)
shall be escaped. This can be useful when saving filenames whose
encoding does not match the one used locally.

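The host/port and query separator substitution described for "windows"
mode can be reproduced with tr. This sketch covers only those two
substitutions, not the %HH escaping of the other restricted characters,
and the function name windows_separators is invented for the example.

```shell
# windows_separators NAME
# Apply the two separator substitutions of "windows" mode:
# ":" (host/port) becomes "+", and "?" (query) becomes "@".
windows_separators() {
    printf '%s\n' "$1" | tr ':?' '+@'
}

windows_separators 'www.xemacs.org:4300/search.pl?input=blah'
# prints: www.xemacs.org+4300/search.pl@input=blah
```

Since the modes are comma-separated, a value such as
--restrict-file-names=windows,lowercase combines a character set with a
case conversion.
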
-4
--inet4-only
-6
--inet6-only
Force connecting to IPv4 or IPv6 addresses. With --inet4-only or
-4, Wget will only connect to IPv4 hosts, ignoring AAAA records in
DNS, and refusing to connect to IPv6 addresses specified in URLs.
Conversely, with --inet6-only or -6, Wget will only connect to IPv6
hosts and ignore A records and IPv4 addresses.

Neither option should be needed normally. By default, an
IPv6-aware Wget will use the address family specified by the host's
DNS record. If the DNS responds with both IPv4 and IPv6 addresses,
Wget will try them in sequence until it finds one it can connect
to. (Also see the "--prefer-family" option described below.)

These options can be used to deliberately force the use of IPv4 or
IPv6 address families on dual family systems, usually to aid
debugging or to deal with broken network configuration. Only one
of --inet6-only and --inet4-only may be specified at the same time.
Neither option is available in Wget compiled without IPv6 support.

--prefer-family=none/IPv4/IPv6
When given a choice of several addresses, connect to the addresses
with the specified address family first. The address order returned
by DNS is used without change by default.

This avoids spurious errors and connect attempts when accessing
hosts that resolve to both IPv6 and IPv4 addresses from IPv4
networks. For example, www.kame.net resolves to
2001:200:0:8002:203:47ff:fea5:3085 and to 203.178.141.194. When
the preferred family is "IPv4", the IPv4 address is used first;
when the preferred family is "IPv6", the IPv6 address is used
first; if the specified value is "none", the address order returned
by DNS is used without change.

Unlike -4 and -6, this option doesn't inhibit access to any address
family, it only changes the order in which the addresses are
accessed. Also note that the reordering performed by this option
is stable: it doesn't affect the order of addresses of the same
family. That is, the relative order of all IPv4 addresses and of
all IPv6 addresses remains intact in all cases.

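The stable reordering described for --prefer-family amounts to a stable
partition of the address list. The sketch below shows the idea; the
function name prefer_family is hypothetical, and telling IPv6 from IPv4
by the presence of a colon is a simplification made for the example.

```shell
# prefer_family FAMILY ADDRESS...
# Print the addresses one per line with the preferred family first.
# The relative order inside each family is left untouched.
prefer_family() {
    family=$1
    shift
    printf '%s\n' "$@" | awk -v fam="$family" '
        {
            v6 = (index($0, ":") > 0)                 # crude IPv6 test
            preferred = (fam == "IPv6") ? v6 : !v6
        }
        fam == "none" || preferred { print; next }    # keep original position
        { later[++n] = $0 }                           # defer the other family
        END { for (i = 1; i <= n; i++) print later[i] }'
}

prefer_family IPv4 2001:200:0:8002:203:47ff:fea5:3085 203.178.141.194
# prints:
# 203.178.141.194
# 2001:200:0:8002:203:47ff:fea5:3085
```
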
--retry-connrefused
Consider "connection refused" a transient error and try again.
Normally Wget gives up on a URL when it is unable to connect to the
site because failure to connect is taken as a sign that the server
is not running at all and that retries would not help. This option
is for mirroring unreliable sites whose servers tend to disappear
for short periods of time.

--user=user
--password=password
Specify the username user and password password for both FTP and
HTTP file retrieval. These parameters can be overridden using the
--ftp-user and --ftp-password options for FTP connections and the
--http-user and --http-password options for HTTP connections.

--ask-password
Prompt for a password for each connection established. Cannot be
specified when --password is being used, because they are mutually
exclusive.

--use-askpass=command
Prompt for a user and password using the specified command. If no
command is specified then the command in the environment variable
WGET_ASKPASS is used. If WGET_ASKPASS is not set then the command
in the environment variable SSH_ASKPASS is used.

You can set the default command for use-askpass in .wgetrc.
That setting may be overridden from the command line.

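The lookup order for the askpass helper can be written out as a short
function. This mirrors only the order given in the text; the function
name resolve_askpass and the helper paths are made up for illustration.

```shell
# resolve_askpass [COMMAND]
# Print the helper that would be used, following the order in the text:
# the explicit command, then $WGET_ASKPASS, then $SSH_ASKPASS.
# Returns non-zero when none of them is available.
resolve_askpass() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${WGET_ASKPASS:-}" ]; then
        printf '%s\n' "$WGET_ASKPASS"
    elif [ -n "${SSH_ASKPASS:-}" ]; then
        printf '%s\n' "$SSH_ASKPASS"
    else
        return 1
    fi
}

WGET_ASKPASS=/usr/local/bin/my-askpass    # hypothetical helper path
SSH_ASKPASS=/usr/bin/ssh-askpass          # hypothetical helper path
resolve_askpass                 # prints: /usr/local/bin/my-askpass
resolve_askpass /opt/ask.sh     # prints: /opt/ask.sh
```
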
--no-iri
Turn off internationalized URI (IRI) support. Use --iri to turn it
on. IRI support is activated by default.

You can set the default state of IRI support using the "iri"
command in .wgetrc. That setting may be overridden from the command
line.

--local-encoding=encoding
Force Wget to use encoding as the default system encoding. That
affects how Wget converts URLs specified as arguments from locale
to UTF-8 for IRI support.

Wget uses the function "nl_langinfo()" and then the "CHARSET"
environment variable to get the locale. If it fails, ASCII is used.

You can set the default local encoding using the "local_encoding"
command in .wgetrc. That setting may be overridden from the command
line.

--remote-encoding=encoding
Force Wget to use encoding as the default remote server encoding.
That affects how Wget converts URIs found in files from remote
encoding to UTF-8 during a recursive fetch. This option is only
useful for IRI support, for the interpretation of non-ASCII
characters.

For HTTP, remote encoding can be found in the HTTP "Content-Type"
header and in the HTML "Content-Type http-equiv" meta tag.

You can set the default encoding using the "remoteencoding" command
in .wgetrc. That setting may be overridden from the command line.

--unlink
Force Wget to unlink an existing file instead of clobbering it. This
option is useful when downloading to a directory with hard links.

Directory Options
-nd
--no-directories
Do not create a hierarchy of directories when retrieving
recursively. With this option turned on, all files will get saved
to the current directory, without clobbering (if a name shows up
more than once, the filenames will get extensions .n).

-x
--force-directories
The opposite of -nd: create a hierarchy of directories, even if
one would not have been created otherwise. E.g. wget -x
http://fly.srk.fer.hr/robots.txt will save the downloaded file to
fly.srk.fer.hr/robots.txt.

-nH
--no-host-directories
Disable generation of host-prefixed directories. By default,
invoking Wget with -r http://fly.srk.fer.hr/ will create a
structure of directories beginning with fly.srk.fer.hr/. This
option disables such behavior.

--protocol-directories
Use the protocol name as a directory component of local file names.
For example, with this option, wget -r http://host will save to
http/host/... rather than just to host/....

--cut-dirs=number
Ignore number directory components. This is useful for getting a
fine-grained control over the directory where recursive retrieval
will be saved.

Take, for example, the directory at
ftp://ftp.xemacs.org/pub/xemacs/. If you retrieve it with -r, it
will be saved locally under ftp.xemacs.org/pub/xemacs/. While the
-nH option can remove the ftp.xemacs.org/ part, you are still stuck
with pub/xemacs. This is where --cut-dirs comes in handy; it makes
Wget not "see" number remote directory components. Here are
several examples of how the --cut-dirs option works.

    No options        -> ftp.xemacs.org/pub/xemacs/
    -nH               -> pub/xemacs/
    -nH --cut-dirs=1  -> xemacs/
    -nH --cut-dirs=2  -> .

    --cut-dirs=1      -> ftp.xemacs.org/xemacs/
    ...

If you just want to get rid of the directory structure, this option
is similar to a combination of -nd and -P. However, unlike -nd,
--cut-dirs does not discard subdirectories: for instance, with
-nH --cut-dirs=1, a beta/ subdirectory will be placed in
xemacs/beta, as one would expect.

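The table for --cut-dirs can be reproduced with a few lines of shell.
The function below is a sketch of the mapping shown in the table, not
Wget's implementation; the name local_dir and its argument order are
invented for the example.

```shell
# local_dir HOST REMOTE_DIR CUT [nH]
# Print the local directory for a recursive download of REMOTE_DIR on
# HOST, after dropping CUT leading directory components (--cut-dirs)
# and, if the fourth argument is "nH", the host directory (-nH).
local_dir() {
    host=$1
    dir=${2#/}        # remote path without the leading slash
    dir=${dir%/}      # ... and without the trailing slash
    cut=$3
    while [ "$cut" -gt 0 ] && [ -n "$dir" ]; do
        case $dir in
            */*) dir=${dir#*/} ;;   # drop the first component
            *)   dir= ;;            # last component: nothing left
        esac
        cut=$((cut - 1))
    done
    if [ "${4:-}" != nH ]; then
        dir=$host${dir:+/$dir}
    fi
    if [ -n "$dir" ]; then printf '%s/\n' "$dir"; else printf '.\n'; fi
}

local_dir ftp.xemacs.org /pub/xemacs/ 0       # prints: ftp.xemacs.org/pub/xemacs/
local_dir ftp.xemacs.org /pub/xemacs/ 0 nH    # prints: pub/xemacs/
local_dir ftp.xemacs.org /pub/xemacs/ 1 nH    # prints: xemacs/
local_dir ftp.xemacs.org /pub/xemacs/ 2 nH    # prints: .
local_dir ftp.xemacs.org /pub/xemacs/ 1       # prints: ftp.xemacs.org/xemacs/
```
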
-P prefix
--directory-prefix=prefix
Set directory prefix to prefix. The directory prefix is the
directory where all other files and subdirectories will be saved
to, i.e. the top of the retrieval tree. The default is . (the
current directory).

HTTP Options
--default-page=name
Use name as the default file name when it isn't known (i.e., for
URLs that end in a slash), instead of index.html.

-E
--adjust-extension
If a file of type application/xhtml+xml or text/html is downloaded
and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this
option will cause the suffix .html to be appended to the local
filename. This is useful, for instance, when you're mirroring a
remote site that uses .asp pages, but you want the mirrored pages
to be viewable on your stock Apache server. Another good use for
this is when you're downloading CGI-generated materials. A URL
like http://site.com/article.cgi?25 will be saved as
article.cgi?25.html.

Note that filenames changed in this way will be re-downloaded every
time you re-mirror a site, because Wget can't tell that the local
X.html file corresponds to remote URL X (since it doesn't yet know
that the URL produces output of type text/html or
application/xhtml+xml).

As of version 1.12, Wget will also ensure that any downloaded files
of type text/css end in the suffix .css, and the option was renamed
from --html-extension, to better reflect its new behavior. The old
option name is still acceptable, but should now be considered
deprecated.

At some point in the future, this option may well be expanded to
include suffixes for other types of content, including content
types that are not parsed by Wget.

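The renaming rule for HTML documents under -E can be expressed as a
shell pattern match. This sketch assumes the content type has already
been determined to be text/html or application/xhtml+xml, and the
function name adjusted_name is made up for the example.

```shell
# adjusted_name NAME
# Print the local name for an HTML document: unchanged if NAME already
# ends in .htm or .html (in any letter case), otherwise with ".html"
# appended. The glob patterns mirror the regexp \.[Hh][Tt][Mm][Ll]?
adjusted_name() {
    case $1 in
        *.[Hh][Tt][Mm] | *.[Hh][Tt][Mm][Ll]) printf '%s\n' "$1" ;;
        *)                                   printf '%s\n' "$1.html" ;;
    esac
}

adjusted_name 'article.cgi?25'    # prints: article.cgi?25.html
adjusted_name 'index.HTML'        # prints: index.HTML
```
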
--http-user=user
--http-password=password
Specify the username user and password password on an HTTP server.
According to the type of the challenge, Wget will encode them using
either the "basic" (insecure), the "digest", or the Windows "NTLM"
authentication scheme.

Another way to specify username and password is in the URL itself.
Either method reveals your password to anyone who bothers to run
"ps". To prevent the passwords from being seen, use the
--use-askpass option or store them in .wgetrc or .netrc, and make
sure to protect those files from other users with "chmod". If the
passwords are really important, do not leave them lying in those
files either; edit the files and delete them after Wget has
started the download.

--no-http-keep-alive
Turn off the "keep-alive" feature for HTTP downloads. Normally,
Wget asks the server to keep the connection open so that, when you
download more than one document from the same server, they get
transferred over the same TCP connection. This saves time and at
the same time reduces the load on the server.

This option is useful when, for some reason, persistent
(keep-alive) connections don't work for you, for example due to a server
bug or due to the inability of server-side scripts to cope with the
connections.

--no-cache
Disable server-side cache. In this case, Wget will send the remote
server an appropriate directive (Pragma: no-cache) to get the file
from the remote service, rather than returning the cached version.
This is especially useful for retrieving and flushing out-of-date
documents on proxy servers.

Caching is allowed by default.

--no-cookies
Disable the use of cookies. Cookies are a mechanism for
maintaining server-side state. The server sends the client a
cookie using the "Set-Cookie" header, and the client responds with
the same cookie upon further requests. Since cookies allow the
server owners to keep track of visitors and for sites to exchange
this information, some consider them a breach of privacy. The
default is to use cookies; however, storing cookies is not on by
default.

--load-cookies file
Load cookies from file before the first HTTP retrieval. file is a
textual file in the format originally used by Netscape's
cookies.txt file.

You will typically use this option when mirroring sites that
require that you be logged in to access some or all of their
content. The login process typically works by the web server
issuing an HTTP cookie upon receiving and verifying your
credentials. The cookie is then resent by the browser when
accessing that part of the site, and so proves your identity.

Mirroring such a site requires Wget to send the same cookies your
browser sends when communicating with the site. This is achieved
by --load-cookies: simply point Wget to the location of the
cookies.txt file, and it will send the same cookies your browser
would send in the same situation. Different browsers keep textual
cookie files in different locations:

"Netscape 4.x."
The cookies are in ~/.netscape/cookies.txt.

"Mozilla and Netscape 6.x."
Mozilla's cookie file is also named cookies.txt, located
somewhere under ~/.mozilla, in the directory of your profile.
The full path usually ends up looking somewhat like
~/.mozilla/default/some-weird-string/cookies.txt.

"Internet Explorer."
You can produce a cookie file Wget can use by using the File
menu, Import and Export, Export Cookies. This has been tested
with Internet Explorer 5; it is not guaranteed to work with
earlier versions.

"Other browsers."
If you are using a different browser to create your cookies,
--load-cookies will only work if you can locate or produce a
cookie file in the Netscape format that Wget expects.

If you cannot use --load-cookies, there might still be an
alternative. If your browser supports a "cookie manager", you can
use it to view the cookies used when accessing the site you're
mirroring. Write down the name and value of the cookie, and
manually instruct Wget to send those cookies, bypassing the
"official" cookie support:

    wget --no-cookies --header "Cookie: <name>=<value>"

--save-cookies file
Save cookies to file before exiting. This will not save cookies
that have expired or that have no expiry time (so-called "session
cookies"), but also see --keep-session-cookies.

--keep-session-cookies
When specified, causes --save-cookies to also save session cookies.
Session cookies are normally not saved because they are meant to be
kept in memory and forgotten when you exit the browser. Saving
them is useful on sites that require you to log in or to visit the
home page before you can access some pages. With this option,
multiple Wget runs are considered a single browser session as far
as the site is concerned.

Since the cookie file format does not normally carry session
cookies, Wget marks them with an expiry timestamp of 0. Wget's
--load-cookies recognizes those as session cookies, but it might
confuse other browsers. Also note that cookies so loaded will be
treated as other session cookies, which means that if you want
--save-cookies to preserve them again, you must use
--keep-session-cookies again.

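To see which entries in a saved cookie file are session cookies, one
can look for the expiry timestamp of 0 mentioned above. The sketch
below assumes the usual Netscape cookies.txt layout of seven
tab-separated fields (domain, subdomain flag, path, secure flag,
expiry, name, value); the function name session_cookies, the host
example.com and the cookie names are made up for illustration.

```shell
# session_cookies FILE
# List the names of cookies in a Netscape-format cookie file whose
# expiry field (field 5) is 0. Comment lines are skipped.
session_cookies() {
    awk -F '\t' '!/^#/ && NF >= 7 && $5 == 0 { print $6 }' "$1"
}

# Build a small sample file to try it on.
cookies=$(mktemp)
printf '# HTTP cookie file.\n' > "$cookies"
printf 'example.com\tFALSE\t/\tFALSE\t0\tsessionid\tabc123\n' >> "$cookies"
printf 'example.com\tFALSE\t/\tFALSE\t2147483647\tlang\ten\n' >> "$cookies"

session_cookies "$cookies"    # prints: sessionid
rm -f "$cookies"
```
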
--ignore-length
Unfortunately, some HTTP servers (CGI programs, to be more precise)
send out bogus "Content-Length" headers, which makes Wget go wild,
as it thinks not all the document was retrieved. You can spot this
syndrome if Wget retries getting the same document again and again,
each time claiming that the (otherwise normal) connection has
closed on the very same byte.

With this option, Wget will ignore the "Content-Length" header, as
if it never existed.

--header=header-line
Send header-line along with the rest of the headers in each HTTP
request. The supplied header is sent as-is, which means it must
contain a name and a value separated by a colon, and must not
contain newlines.

You may define more than one additional header by specifying
--header more than once.

    wget --header='Accept-Charset: iso-8859-2' \
         --header='Accept-Language: hr' \
         http://fly.srk.fer.hr/

Specification of an empty string as the header value will clear all
previous user-defined headers.

As of Wget 1.10, this option can be used to override headers
otherwise generated automatically. This example instructs Wget to
connect to localhost, but to specify foo.bar in the "Host" header:

    wget --header="Host: foo.bar" http://localhost/

In versions of Wget prior to 1.10 such use of --header caused
sending of duplicate headers.

--max-redirect=number
Specifies the maximum number of redirections to follow for a
resource. The default is 20, which is usually far more than
necessary. However, on those occasions where you want to allow more
(or fewer), this is the option to use.

--proxy-user=user
--proxy-password=password
Specify the username user and password password for authentication
on a proxy server. Wget will encode them using the "basic"
authentication scheme.

Security considerations similar to those with --http-password
pertain here as well.

--referer=url
Include a "Referer: url" header in the HTTP request. Useful for
retrieving documents with server-side processing that assume they
are always being retrieved by interactive web browsers and only
come out properly when Referer is set to one of the pages that
point to them.

--save-headers
Save the headers sent by the HTTP server to the file, preceding the
actual contents, with an empty line as the separator.

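A file written with --save-headers can be split back into its two
parts at the first empty line. The awk one-liners below are a sketch;
the function names headers_of and body_of are invented, and tolerating
a carriage return on the separator line is an assumption made in case
the saved headers keep their CRLF line endings.

```shell
# headers_of FILE: print the header block (before the first empty line).
# body_of FILE:    print everything after the first empty line.
headers_of() {
    awk '/^\r?$/ { exit } { sub(/\r$/, ""); print }' "$1"
}
body_of() {
    awk 'seen { print; next } /^\r?$/ { seen = 1 }' "$1"
}

# Try it on a hand-made sample.
saved=$(mktemp)
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\n' > "$saved"
headers_of "$saved"    # prints the two header lines
body_of "$saved"       # prints: hello
rm -f "$saved"
```
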
-U agent-string
--user-agent=agent-string
Identify as agent-string to the HTTP server.

The HTTP protocol allows the clients to identify themselves using a
"User-Agent" header field. This enables distinguishing the WWW
software, usually for statistical purposes or for tracing of
protocol violations. Wget normally identifies as Wget/version,
version being the current version number of Wget.

However, some sites have been known to impose the policy of
tailoring the output according to the "User-Agent"-supplied
information. While this is not such a bad idea in theory, it has
been abused by servers denying information to clients other than
(historically) Netscape or, more frequently, Microsoft Internet
Explorer. This option allows you to change the "User-Agent" line
issued by Wget. Use of this option is discouraged, unless you
really know what you are doing.

Specifying an empty user agent with --user-agent="" instructs Wget
not to send the "User-Agent" header in HTTP requests.

--post-data=string
--post-file=file
Use POST as the method for all HTTP requests and send the specified
data in the request body. --post-data sends string as data,
whereas --post-file sends the contents of file. Other than that,
they work in exactly the same way. In particular, they both expect
content of the form "key1=value1&key2=value2", with
percent-encoding for special characters; the only difference is that one
expects its content as a command-line parameter and the other
accepts its content from a file. In particular, --post-file is not
for transmitting files as form attachments: those must appear as
"key=value" data (with appropriate percent-coding) just like
everything else. Wget does not currently support
"multipart/form-data" for transmitting POST data; only
"application/x-www-form-urlencoded". Only one of --post-data and
--post-file should be specified.

Please note that wget does not require the content to be of the
form "key1=value1&key2=value2", and neither does it test for it.
Wget will simply transmit whatever data is provided to it. Most
servers however expect the POST data to be in the above format when
processing HTML Forms.

When sending a POST request using the --post-file option, Wget
treats the file as a binary file and will send every character in
the POST request without stripping trailing newline or formfeed
characters. Any other control characters in the text will also be
sent as-is in the POST request.

Please be aware that Wget needs to know the size of the POST data
in advance. Therefore the argument to "--post-file" must be a
regular file; specifying a FIFO or something like /dev/stdin won't
work. It's not quite clear how to work around this limitation
inherent in HTTP/1.0. Although HTTP/1.1 introduces chunked
transfer that doesn't require knowing the request length in
advance, a client can't use chunked unless it knows it's talking to
an HTTP/1.1 server. And it can't know that until it receives a
response, which in turn requires the request to have been completed:
a chicken-and-egg problem.

Note: As of version 1.15 if Wget is redirected after the POST
request is completed, its behaviour will depend on the response
code returned by the server. In case of a 301 Moved Permanently,
302 Moved Temporarily, or 307 Temporary Redirect, Wget will, in
accordance with RFC2616, continue to send a POST request. In case
a server wants the client to change the Request method upon
redirection, it should send a 303 See Other response code.

This example shows how to log in to a server using POST and then
proceed to download the desired pages, presumably only accessible
to authorized users:

    # Log in to the server. This can be done only once.
    wget --save-cookies cookies.txt \
         --post-data 'user=foo&password=bar' \
         http://example.com/auth.php

    # Now grab the page or pages we care about.
    wget --load-cookies cookies.txt \
         -p http://example.com/interesting/article.php

If the server is using session cookies to track user
authentication, the above will not work because --save-cookies will
not save them (and neither will browsers) and the cookies.txt file
will be empty. In that case use --keep-session-cookies along with
--save-cookies to force saving of session cookies.

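Because Wget sends the POST data exactly as given, any
percent-encoding has to be done before the data reaches --post-data.
The helper below is a minimal sketch of that step for ASCII input only;
the function name urlencode and the sample credentials are made up for
the example.

```shell
# urlencode STRING
# Percent-encode STRING for use as a key or value in form data.
# Handles ASCII input only; letters, digits and "-._~" pass through.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}              # everything after the first character
        c=${s%"$rest"}           # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

# Build a body of the form key1=value1&key2=value2.
body="user=$(urlencode 'foo bar')&password=$(urlencode 'p&ss=w0rd')"
printf '%s\n' "$body"    # prints: user=foo%20bar&password=p%26ss%3Dw0rd
```

The resulting string could then be passed on the command line as
--post-data "$body", or written to a regular file for --post-file.
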
--method=HTTP-Method
For the purpose of RESTful scripting, Wget allows sending of other
HTTP Methods without the need to explicitly set them using
--header=Header-Line. Wget will use whatever string is passed to
it after --method as the HTTP Method to the server.

  1175. ----bbooddyy--ddaattaa==_D_a_t_a_-_S_t_r_i_n_g
  1176. ----bbooddyy--ffiillee==_D_a_t_a_-_F_i_l_e
  1177. Must be set when additional data needs to be sent to the server
  1178. along with the Method specified using ----mmeetthhoodd. ----bbooddyy--ddaattaa sends
  1179. _s_t_r_i_n_g as data, whereas ----bbooddyy--ffiillee sends the contents of _f_i_l_e.
  1180. Other than that, they work in exactly the same way.
  1181.  
  1182. Currently, ----bbooddyy--ffiillee is _n_o_t for transmitting files as a whole.
  1183. Wget does not currently support "multipart/form-data" for
  1184. transmitting data; only "application/x-www-form-urlencoded". In the
  1185. future, this may be changed so that wget sends the ----bbooddyy--ffiillee as a
  1186. complete file instead of sending its contents to the server. Please
  1187. be aware that Wget needs to know the contents of BODY Data in
  1188. advance, and hence the argument to ----bbooddyy--ffiillee should be a regular
  1189. file. See ----ppoosstt--ffiillee for a more detailed explanation. Only one of
  1190. ----bbooddyy--ddaattaa and ----bbooddyy--ffiillee should be specified.
  1191.  
  1192. If Wget is redirected after the request is completed, Wget will
  1193. suspend the current method and send a GET request until the
  1194. redirection is completed. This is true for all redirection
  1195. response codes except 307 Temporary Redirect, which is used to
  1196. explicitly specify that the request method should not change.
  1197. Another exception is when the method is set to "POST", in which
  1198. case the redirection rules specified under --post-data are
  1199. followed.
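
The redirection rule above can be pictured as a small decision function. This is only a sketch of the rule as described here, not Wget's own code; the function name is hypothetical, and the POST case is left unmodelled because its rules are given under --post-data.

```python
def method_after_redirect(method, status):
    """Return the method used for the follow-up request after a redirect.

    Returns None for POST, whose handling follows the rules described
    under --post-data and is not modelled in this sketch.
    """
    method = method.upper()
    if method == "POST":
        return None      # governed by the --post-data redirection rules
    if status == 307:
        return method    # 307 Temporary Redirect: the method must not change
    return "GET"         # any other redirect: Wget falls back to GET


for m, s in [("PUT", 301), ("DELETE", 307), ("POST", 302)]:
    print(m, s, "->", method_after_redirect(m, s))
```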
  1200.  
  1201. --content-disposition
  1202. If this is set to on, experimental (not fully-functional) support
  1203. for "Content-Disposition" headers is enabled. This can currently
  1204. result in extra round-trips to the server for a "HEAD" request, and
  1205. is known to suffer from a few bugs, which is why it is not
  1206. currently enabled by default.
  1207.  
  1208. This option is useful for some file-downloading CGI programs that
  1209. use "Content-Disposition" headers to describe what the name of a
  1210. downloaded file should be.
  1211.  
  1212. When combined with --metalink-over-http and --trust-server-names, a
  1213. Content-Type: application/metalink4+xml file is named using the
  1214. "Content-Disposition" filename field, if available.
  1215.  
  1216. --content-on-error
  1217. If this is set to on, wget will not skip the content when the
  1218. server responds with an HTTP status code that indicates an error.
  1219.  
  1220. --trust-server-names
  1221. If this is set, on a redirect, the local file name will be based on
  1222. the redirection URL. By default the local file name is based on
  1223. the original URL. When doing recursive retrieving this can be
  1224. helpful because in many web sites redirected URLs correspond to an
  1225. underlying file structure, while link URLs do not.
  1226.  
  1227. --auth-no-challenge
  1228. If this option is given, Wget will send Basic HTTP authentication
  1229. information (plaintext username and password) for all requests,
  1230. just like Wget 1.10.2 and prior did by default.
  1231.  
  1232. Use of this option is not recommended, and is intended only to
  1233. support a few obscure servers, which never send HTTP
  1234. authentication challenges, but accept unsolicited auth info, say,
  1235. in addition to form-based authentication.
  1236.  
  1237. --retry-on-http-error=code[,code,...]
  1238. Consider given HTTP response codes as non-fatal, transient errors.
  1239. Supply a comma-separated list of 3-digit HTTP response codes as
  1240. argument. Useful to work around special circumstances where retries
  1241. are required, but the server responds with an error code normally
  1242. not retried by Wget. Such errors might be 503 (Service Unavailable)
  1243. and 429 (Too Many Requests). Retries enabled by this option are
  1244. performed subject to the normal retry timing and retry count
  1245. limitations of Wget.
  1246.  
  1247. Using this option is intended to support special use cases only and
  1248. is generally not recommended, as it can force retries even in cases
  1249. where the server is actually trying to decrease its load. Please
  1250. use wisely and only if you know what you are doing.
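
As a rough illustration of the argument format and of the interaction with the retry count, the sketch below parses a comma-separated list of 3-digit codes and decides whether another attempt is allowed. The function names and the way the retry limit is passed are hypothetical and do not reflect Wget's internals.

```python
def parse_retry_codes(arg):
    """Parse a comma-separated list of 3-digit HTTP status codes."""
    codes = set()
    for item in arg.split(","):
        item = item.strip()
        if len(item) != 3 or not (item.isascii() and item.isdigit()):
            raise ValueError(f"not a 3-digit HTTP status code: {item!r}")
        codes.add(int(item))
    return codes


def should_retry(status, retry_codes, tries_done, max_tries):
    """Retry only listed codes, and only while the retry count allows it."""
    return status in retry_codes and tries_done < max_tries


codes = parse_retry_codes("503,429")
print(should_retry(503, codes, tries_done=1, max_tries=20))
print(should_retry(404, codes, tries_done=1, max_tries=20))
```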
  1251.  
  1252. HTTPS (SSL/TLS) Options
  1253. To support encrypted HTTP (HTTPS) downloads, Wget must be compiled with
  1254. an external SSL library. The current default is GnuTLS. In addition,
  1255. Wget also supports HSTS (HTTP Strict Transport Security). If Wget is
  1256. compiled without SSL support, none of these options are available.
  1257.  
  1258. --secure-protocol=protocol
  1259. Choose the secure protocol to be used. Legal values are auto,
  1260. SSLv2, SSLv3, TLSv1, TLSv1_1, TLSv1_2 and PFS. If auto is used,
  1261. the SSL library is given the liberty of choosing the appropriate
  1262. protocol automatically, which is achieved by sending a TLSv1
  1263. greeting. This is the default.
  1264.  
  1265. Specifying SSLv2, SSLv3, TLSv1, TLSv1_1 or TLSv1_2 forces the use
  1266. of the corresponding protocol. This is useful when talking to old
  1267. and buggy SSL server implementations that make it hard for the
  1268. underlying SSL library to choose the correct protocol version.
  1269. Fortunately, such servers are quite rare.
  1270.  
  1271. Specifying PFS enforces the use of the so-called Perfect Forward
  1272. Secrecy cipher suites. In short, PFS adds security by creating a
  1273. one-time key for each SSL connection. It has a bit more CPU impact
  1274. on client and server. Wget uses ciphers known to be secure (e.g. no
  1275. MD4) and the TLS protocol.
  1276.  
  1277. --https-only
  1278. When in recursive mode, only HTTPS links are followed.
  1279.  
  1280. --no-check-certificate
  1281. Don't check the server certificate against the available
  1282. certificate authorities. Also don't require the URL host name to
  1283. match the common name presented by the certificate.
  1284.  
  1285. As of Wget 1.10, the default is to verify the server's certificate
  1286. against the recognized certificate authorities, breaking the SSL
  1287. handshake and aborting the download if the verification fails.
  1288. Although this provides more secure downloads, it does break
  1289. interoperability with some sites that worked with previous Wget
  1290. versions, particularly those using self-signed, expired, or
  1291. otherwise invalid certificates. This option forces an "insecure"
  1292. mode of operation that turns the certificate verification errors
  1293. into warnings and allows you to proceed.
  1294.  
  1295. If you encounter "certificate verification" errors or ones saying
  1296. that "common name doesn't match requested host name", you can use
  1297. this option to bypass the verification and proceed with the
  1298. download. Only use this option if you are otherwise convinced of
  1299. the site's authenticity, or if you really don't care about the
  1300. validity of its certificate. It is almost always a bad idea not to
  1301. check the certificates when transmitting confidential or important
  1302. data. For self-signed/internal certificates, you should download
  1303. the certificate and verify against that instead of forcing this
  1304. insecure mode. If you are really sure of not desiring any
  1305. certificate verification, you can specify --check-certificate=quiet
  1306. to tell wget to not print any warning about invalid certificates,
  1307. albeit in most cases this is the wrong thing to do.
  1308.  
  1309. --certificate=file
  1310. Use the client certificate stored in file. This is needed for
  1311. servers that are configured to require certificates from the
  1312. clients that connect to them. Normally a certificate is not
  1313. required and this switch is optional.
  1314.  
  1315. --certificate-type=type
  1316. Specify the type of the client certificate. Legal values are PEM
  1317. (assumed by default) and DER, also known as ASN1.
  1318.  
  1319. --private-key=file
  1320. Read the private key from file. This allows you to provide the
  1321. private key in a file separate from the certificate.
  1322.  
  1323. --private-key-type=type
  1324. Specify the type of the private key. Accepted values are PEM (the
  1325. default) and DER.
  1326.  
  1327. --ca-certificate=file
  1328. Use file as the file with the bundle of certificate authorities
  1329. ("CA") to verify the peers. The certificates must be in PEM
  1330. format.
  1331.  
  1332. Without this option Wget looks for CA certificates at the system-
  1333. specified locations, chosen at OpenSSL installation time.
  1334.  
  1335. --ca-directory=directory
  1336. Specifies directory containing CA certificates in PEM format. Each
  1337. file contains one CA certificate, and the file name is based on a
  1338. hash value derived from the certificate. This is achieved by
  1339. processing a certificate directory with the "c_rehash" utility
  1340. supplied with OpenSSL. Using --ca-directory is more efficient than
  1341. --ca-certificate when many certificates are installed because it
  1342. allows Wget to fetch certificates on demand.
  1343.  
  1344. Without this option Wget looks for CA certificates at the system-
  1345. specified locations, chosen at OpenSSL installation time.
  1346.  
  1347. --crl-file=file
  1348. Specifies a CRL file in file. This is needed for certificates that
  1349. have been revoked by the CAs.
  1350.  
  1351. --pinnedpubkey=file/hashes
  1352. Tells wget to use the specified public key file (or hashes) to
  1353. verify the peer. This can be a path to a file which contains a
  1354. single public key in PEM or DER format, or any number of base64
  1355. encoded sha256 hashes preceded by "sha256//" and separated by ";".
  1356.  
  1357. When negotiating a TLS or SSL connection, the server sends a
  1358. certificate indicating its identity. A public key is extracted from
  1359. this certificate and if it does not exactly match the public key(s)
  1360. provided to this option, wget will abort the connection before
  1361. sending or receiving any data.
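
The hash form of the argument can be sketched as follows. This assumes that each hash is the SHA-256 digest of the DER-encoded public key, which the text above does not state explicitly; the function names and the sample key bytes are hypothetical placeholders, not real keys.

```python
import base64
import hashlib


def pin_for_key(der_public_key):
    """Build one 'sha256//<base64 digest>' pin from DER public key bytes."""
    digest = hashlib.sha256(der_public_key).digest()
    return "sha256//" + base64.b64encode(digest).decode("ascii")


def pinnedpubkey_value(der_public_keys):
    """Join several pins with ';' as the option expects."""
    return ";".join(pin_for_key(key) for key in der_public_keys)


def key_is_pinned(value, der_public_key):
    """True if the key's pin exactly matches one of the pins in value."""
    return pin_for_key(der_public_key) in value.split(";")


# Placeholder byte strings stand in for real DER-encoded public keys.
value = pinnedpubkey_value([b"primary-key-bytes", b"backup-key-bytes"])
print(value)
print(key_is_pinned(value, b"backup-key-bytes"))
```

Listing more than one hash is what allows a backup key to be accepted alongside the primary one.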
  1362.  
  1363. --random-file=file
  1364. [OpenSSL and LibreSSL only] Use file as the source of random data
  1365. for seeding the pseudo-random number generator on systems without
  1366. /dev/urandom.
  1367.  
  1368. On such systems the SSL library needs an external source of
  1369. randomness to initialize. Randomness may be provided by EGD (see
  1370. --egd-file below) or read from an external source specified by the
  1371. user. If this option is not specified, Wget looks for random data
  1372. in $RANDFILE or, if that is unset, in $HOME/.rnd.
  1373.  
  1374. If you're getting the "Could not seed OpenSSL PRNG; disabling SSL."
  1375. error, you should provide random data using some of the methods
  1376. described above.
  1377.  
  1378. --egd-file=file
  1379. [OpenSSL only] Use file as the EGD socket. EGD stands for Entropy
  1380. Gathering Daemon, a user-space program that collects data from
  1381. various unpredictable system sources and makes it available to
  1382. other programs that might need it. Encryption software, such as
  1383. the SSL library, needs sources of non-repeating randomness to seed
  1384. the random number generator used to produce cryptographically
  1385. strong keys.
  1386.  
  1387. OpenSSL allows the user to specify their own source of entropy using
  1388. the "RANDFILE" environment variable. If this variable is unset,
  1389. or if the specified file does not produce enough randomness,
  1390. OpenSSL will read random data from the EGD socket specified using
  1391. this option.
  1392.  
  1393. If this option is not specified (and the equivalent startup command
  1394. is not used), EGD is never contacted. EGD is not needed on modern
  1395. Unix systems that support /dev/urandom.
  1396.  
  1397. --no-hsts
  1398. Wget supports HSTS (HTTP Strict Transport Security, RFC 6797) by
  1399. default. Use --no-hsts to make Wget act as a non-HSTS-compliant
  1400. UA. As a consequence, Wget would ignore all the
  1401. "Strict-Transport-Security" headers, and would not enforce any
  1402. existing HSTS policy.
  1403.  
  1404. --hsts-file=file
  1405. By default, Wget stores its HSTS database in ~/.wget-hsts. You can
  1406. use --hsts-file to override this. Wget will use the supplied file
  1407. as the HSTS database. Such a file must conform to the correct HSTS
  1408. database format used by Wget. If Wget cannot parse the provided
  1409. file, the behaviour is unspecified.
  1410.  
  1411. Wget's HSTS database is a plain text file. Each line contains
  1412. an HSTS entry (i.e., a site that has issued a
  1413. "Strict-Transport-Security" header and that therefore has specified
  1414. a concrete HSTS policy to be applied). Lines starting with a hash
  1415. ("#") are ignored by Wget. Please note that in spite of this
  1416. convenient human-readability, hand-hacking the HSTS database is
  1417. generally not a good idea.
  1418.  
  1419. An HSTS entry line consists of several fields separated by one or
  1420. more whitespace characters:
  1421.  
  1422. "<hostname> SP [<port>] SP <include subdomains> SP <created> SP
  1423. <max-age>"
  1424.  
  1425. The hostname and port fields indicate the hostname and port to
  1426. which the given HSTS policy applies. The port field may be zero,
  1427. and in most cases it will be. That means that the port number
  1428. will not be taken into account when deciding whether such HSTS
  1429. policy should be applied on a given request (only the hostname will
  1430. be evaluated). When port is nonzero, both the target
  1431. hostname and the port will be evaluated and the HSTS policy will
  1432. only be applied if both of them match. This feature has been
  1433. included for testing/development purposes only. The Wget testsuite
  1434. (in testenv/) creates HSTS databases with explicit ports with the
  1435. purpose of ensuring Wget's correct behaviour. Applying HSTS
  1436. policies to ports other than the default ones is discouraged by RFC
  1437. 6797 (see Appendix B "Differences between HSTS Policy and
  1438. Same-Origin Policy"). Thus, this functionality should not be used in
  1439. production environments and port will typically be zero. The last
  1440. three fields do what they are expected to. The field
  1441. include_subdomains can either be 1 or 0 and it signals whether the
  1442. subdomains of the target domain should be part of the given HSTS
  1443. policy as well. The created and max-age fields hold the timestamp
  1444. values of when such entry was created (first seen by Wget) and the
  1445. HSTS-defined value 'max-age', which states how long that HSTS
  1446. policy should remain active, measured in seconds elapsed since the
  1447. timestamp stored in created. Once that time has passed, that HSTS
  1448. policy will no longer be valid and will eventually be removed from
  1449. the database.
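
To make the field layout concrete, here is a minimal sketch of a reader for one entry line. It is not Wget's parser: the class and function names are hypothetical, the port field is assumed to be always present, and the subdomain test is a simple suffix check.

```python
from dataclasses import dataclass


@dataclass
class HstsEntry:
    hostname: str
    port: int                 # 0 means "ignore the port when matching"
    include_subdomains: bool
    created: int              # timestamp of when the entry was first seen
    max_age: int              # lifetime in seconds, counted from created

    def expired(self, now):
        """True once max_age seconds have elapsed since created."""
        return now > self.created + self.max_age

    def applies_to(self, host, port):
        """Match on hostname, and on port too when the port is nonzero."""
        if self.port != 0 and self.port != port:
            return False
        if host == self.hostname:
            return True
        return self.include_subdomains and host.endswith("." + self.hostname)


def parse_line(line):
    """Parse one database line; return None for comments and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()     # fields are separated by whitespace
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    hostname, port, subdomains, created, max_age = fields
    return HstsEntry(hostname, int(port), subdomains == "1",
                     int(created), int(max_age))


# Made-up entry: port 0, subdomains included, one-year max-age.
entry = parse_line("example.com 0 1 1494633600 31536000")
print(entry.applies_to("www.example.com", 443))
print(entry.expired(now=1494633600 + 31536001))
```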
  1450.  
  1451. If you supply your own HSTS database via --hsts-file, be aware that
  1452. Wget may modify the provided file if any change occurs between the
  1453. HSTS policies requested by the remote servers and those in the
  1454. file. When Wget exits, it effectively updates the HSTS database by
  1455. rewriting the database file with the new entries.
  1456.  
  1457. If the supplied file does not exist, Wget will create one. This
  1458. file will contain the new HSTS entries. If no HSTS entries were
  1459. generated (no "Strict-Transport-Security" headers were sent by any
  1460. of the servers) then no file will be created, not even an empty
  1461. one. This behaviour applies to the default database file
  1462. (~/.wget-hsts) as well: it will not be created until some server
  1463. enforces an HSTS policy.
  1464.  
  1465. Care is taken not to override possible changes made by other Wget
  1466. processes at the same time over the HSTS database. Before dumping
  1467. the updated HSTS entries on the file, Wget will re-read it and
  1468. merge the changes.
  1469.  
  1470. Using a custom HSTS database and/or modifying an existing one is
  1471. discouraged. For more information about the potential security
  1472. threats arising from such practice, see section 14 "Security
  1473. Considerations" of RFC 6797, especially section 14.9 "Creative
  1474. Manipulation of HSTS Policy Store".
  1475.  
  1476. --warc-file=file
  1477. Use file as the destination WARC file.
  1478.  
  1479. --warc-header=string
  1480. Insert string into the warcinfo record.
  1481.  
  1482. --warc-max-size=size
  1483. Set the maximum size of the WARC files to size.
  1484.  
  1485. --warc-cdx
  1486. Write CDX index files.
  1487.  
  1488. --warc-dedup=file
  1489. Do not store records listed in this CDX file.
  1490.  
  1491. --no-warc-compression
  1492. Do not compress WARC files with GZIP.
  1493.  
  1494. --no-warc-digests
  1495. Do not calculate SHA1 digests.
  1496.  
  1497. --no-warc-keep-log
  1498. Do not store the log file in a WARC record.
  1499.  
  1500. --warc-tempdir=dir
  1501. Specify the location for temporary files created by the WARC
  1502. writer.
  1503.  
  1504. FTP Options
  1505. --ftp-user=user
  1506. --ftp-password=password
  1507. Specify the username user and password password on an FTP server.
  1508. Without this, or the corresponding startup option, the password
  1509. defaults to -wget@, normally used for anonymous FTP.
  1510.  
  1511. Another way to specify username and password is in the URL itself.
  1512. Either method reveals your password to anyone who bothers to run
  1513. "ps". To prevent the passwords from being seen, store them in
  1514. .wgetrc or .netrc, and make sure to protect those files from other
  1515. users with "chmod". If the passwords are really important, do not
  1516. leave them lying in those files either: edit the files and delete
  1517. them after Wget has started the download.
  1518.  
  1519. --no-remove-listing
  1520. Don't remove the temporary .listing files generated by FTP
  1521. retrievals. Normally, these files contain the raw directory
  1522. listings received from FTP servers. Not removing them can be
  1523. useful for debugging purposes, or when you want to be able to
  1524. easily check on the contents of remote server directories (e.g. to
  1525. verify that a mirror you're running is complete).
  1526.  
  1527. Note that even though Wget writes to a known filename for this
  1528. file, this is not a security hole in the scenario of a user making
  1529. .listing a symbolic link to /etc/passwd or something and asking
  1530. "root" to run Wget in his or her directory. Depending on the
  1531. options used, either Wget will refuse to write to .listing, making
  1532. the globbing/recursion/time-stamping operation fail, or the
  1533. symbolic link will be deleted and replaced with the actual .listing
  1534. file, or the listing will be written to a .listing.number file.
  1535.  
  1536. Even though this situation isn't a problem, "root" should
  1537. never run Wget in a non-trusted user's directory. A user could do
  1538. something as simple as linking index.html to /etc/passwd and asking
  1539. "root" to run Wget with -N or -r so the file will be overwritten.
  1540.  
  1541. --no-glob
  1542. Turn off FTP globbing. Globbing refers to the use of shell-like
  1543. special characters (wildcards), like *, ?, [ and ] to retrieve more
  1544. than one file from the same directory at once, like:
  1545.  
  1546. wget ftp://gnjilux.srk.fer.hr/*.msg
  1547.  
  1548. By default, globbing will be turned on if the URL contains a
  1549. globbing character. This option may be used to turn globbing on or
  1550. off permanently.
  1551.  
  1552. You may have to quote the URL to protect it from being expanded by
  1553. your shell. Globbing makes Wget look for a directory listing,
  1554. which is system-specific. This is why it currently works only with
  1555. Unix FTP servers (and the ones emulating Unix "ls" output).
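
The idea of matching a wildcard pattern against the names in a directory listing can be sketched with Python's fnmatch module. This only approximates shell-style matching and is not how Wget is implemented; the listing below is invented for illustration.

```python
from fnmatch import fnmatchcase

GLOB_CHARS = "*?[]"


def has_glob(url_path):
    """True if the path contains any of the globbing characters."""
    return any(ch in url_path for ch in GLOB_CHARS)


def select(listing, pattern):
    """Return the names from a directory listing that match the pattern."""
    return [name for name in listing if fnmatchcase(name, pattern)]


# Invented directory listing for illustration.
listing = ["a.msg", "b.txt", "c.msg"]
print(has_glob("/*.msg"))
print(select(listing, "*.msg"))
```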
  1556.  
  1557. --no-passive-ftp
  1558. Disable the use of the passive FTP transfer mode. Passive FTP
  1559. mandates that the client connect to the server to establish the
  1560. data connection rather than the other way around.
  1561.  
  1562. If the machine is connected to the Internet directly, both passive
  1563. and active FTP should work equally well. Behind most firewall and
  1564. NAT configurations passive FTP has a better chance of working.
  1565. However, in some rare firewall configurations, active FTP actually
  1566. works when passive FTP doesn't. If you suspect this to be the
  1567. case, use this option, or set "passive_ftp=off" in your init file.
  1568.  
  1569. --preserve-permissions
  1570. Preserve remote file permissions instead of permissions set by
  1571. umask.
  1572.  
  1573. --retr-symlinks
  1574. By default, when retrieving FTP directories recursively and a
  1575. symbolic link is encountered, the symbolic link is traversed and
  1576. the pointed-to files are retrieved. Currently, Wget does not
  1577. traverse symbolic links to directories to download them
  1578. recursively, though this feature may be added in the future.
  1579.  
  1580. When --retr-symlinks=no is specified, the linked-to file is not
  1581. downloaded. Instead, a matching symbolic link is created on the
  1582. local filesystem. The pointed-to file will not be retrieved unless
  1583. this recursive retrieval would have encountered it separately and
  1584. downloaded it anyway. This option poses a security risk where a
  1585. malicious FTP Server may cause Wget to write to files outside of
  1586. the intended directories through a specially crafted .LISTING file.
  1587.  
  1588. Note that when retrieving a file (not a directory) because it was
  1589. specified on the command-line, rather than because it was recursed
  1590. to, this option has no effect. Symbolic links are always traversed
  1591. in this case.
  1592.  
  1593. FTPS Options
  1594. --ftps-implicit
  1595. This option tells Wget to use FTPS implicitly. Implicit FTPS
  1596. consists of initializing SSL/TLS from the very beginning of the
  1597. control connection. This option does not send an "AUTH TLS"
  1598. command: it assumes the server speaks FTPS and directly starts an
  1599. SSL/TLS connection. If the attempt is successful, the session
  1600. continues just like regular FTPS ("PBSZ" and "PROT" are sent,
  1601. etc.). Implicit FTPS is no longer a requirement for FTPS
  1602. implementations, and thus many servers may not support it. If
  1603. --ftps-implicit is passed and no explicit port number specified,
  1604. the default port for implicit FTPS, 990, will be used, instead of
  1605. the default port for the "normal" (explicit) FTPS which is the same
  1606. as that of FTP, 21.
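
The port selection described above amounts to the following small rule, written here as a hypothetical helper rather than anything taken from Wget's source.

```python
FTP_PORT = 21             # also used by "normal" (explicit) FTPS
IMPLICIT_FTPS_PORT = 990  # default when --ftps-implicit is given


def control_port(implicit, explicit_port=None):
    """Pick the control-connection port for an FTPS URL."""
    if explicit_port is not None:
        return explicit_port          # a port given in the URL always wins
    return IMPLICIT_FTPS_PORT if implicit else FTP_PORT


print(control_port(implicit=True))
print(control_port(implicit=False))
print(control_port(implicit=True, explicit_port=2121))
```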
  1607.  
  1608. --no-ftps-resume-ssl
  1609. Do not resume the SSL/TLS session in the data channel. When
  1610. starting a data connection, Wget tries to resume the SSL/TLS
  1611. session previously started in the control connection. SSL/TLS
  1612. session resumption avoids performing an entirely new handshake by
  1613. reusing the SSL/TLS parameters of a previous session. Typically,
  1614. the FTPS servers want it that way, so Wget does this by default.
  1615. Under rare circumstances however, one might want to start an
  1616. entirely new SSL/TLS session in every data connection. This is
  1617. what --no-ftps-resume-ssl is for.
  1618.  
  1619. --ftps-clear-data-connection
  1620. All the data connections will be in plain text. Only the control
  1621. connection will be under SSL/TLS. Wget will send a "PROT C" command
  1622. to achieve this, which must be approved by the server.
  1623.  
  1624. --ftps-fallback-to-ftp
  1625. Fall back to FTP if FTPS is not supported by the target server. For
  1626. security reasons, this option is not asserted by default. The
  1627. default behaviour is to exit with an error. If a server does not
  1628. successfully reply to the initial "AUTH TLS" command, or in the
  1629. case of implicit FTPS, if the initial SSL/TLS connection attempt is
  1630. rejected, it is considered that such server does not support FTPS.
  1631.  
  1632. Recursive Retrieval Options
  1633. -r
  1634. --recursive
  1635. Turn on recursive retrieving. The default maximum depth is 5.
  1636.  
  1637. -l depth
  1638. --level=depth
  1639. Specify recursion maximum depth level depth.
  1640.  
  1641. --delete-after
  1642. This option tells Wget to delete every single file it downloads,
  1643. after having done so. It is useful for pre-fetching popular pages
  1644. through a proxy, e.g.:
  1645.  
  1646. wget -r -nd --delete-after http://whatever.com/~popular/page/
  1647.  
  1648. The -r option is to retrieve recursively, and -nd to not create
  1649. directories.
  1650.  
  1651. Note that --delete-after deletes files on the local machine. It
  1652. does not issue the DELE command to remote FTP sites, for instance.
  1653. Also note that when --delete-after is specified, --convert-links is
  1654. ignored, so .orig files are simply not created in the first place.
  1655.  
  1656. -k
  1657. --convert-links
  1658. After the download is complete, convert the links in the document
  1659. to make them suitable for local viewing. This affects not only the
  1660. visible hyperlinks, but any part of the document that links to
  1661. external content, such as embedded images, links to style sheets,
  1662. hyperlinks to non-HTML content, etc.
  1663.  
  1664. Each link will be changed in one of two ways:
  1665.  
  1666. * The links to files that have been downloaded by Wget will be
  1667. changed to refer to the file they point to as a relative link.
  1668.  
  1669. Example: if the downloaded file /foo/doc.html links to
  1670. /bar/img.gif, also downloaded, then the link in doc.html will
  1671. be modified to point to ../bar/img.gif. This kind of
  1672. transformation works reliably for arbitrary combinations of
  1673. directories.
  1674.  
  1675. * The links to files that have not been downloaded by Wget will
  1676. be changed to include host name and absolute path of the
  1677. location they point to.
  1678.  
  1679. Example: if the downloaded file /foo/doc.html links to
  1680. /bar/img.gif (or to ../bar/img.gif), then the link in doc.html
  1681. will be modified to point to http://hostname/bar/img.gif.
  1682.  
  1683. Because of this, local browsing works reliably: if a linked file
  1684. was downloaded, the link will refer to its local name; if it was
  1685. not downloaded, the link will refer to its full Internet address
  1686. rather than presenting a broken link. The fact that the former
  1687. links are converted to relative links ensures that you can move the
  1688. downloaded hierarchy to another directory.
  1689.  
  1690. Note that only at the end of the download can Wget know which links
  1691. have been downloaded. Because of that, the work done by -k will be
  1692. performed at the end of all the downloads.
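
The two conversion rules can be sketched with plain path arithmetic. This is a simplified illustration, not Wget's implementation: the function name is hypothetical, the http scheme is hard-coded as in the examples above, and query strings and other hosts are ignored.

```python
import posixpath


def convert_link(doc_path, link, downloaded, host):
    """Rewrite one link found in the document stored at doc_path.

    downloaded is the set of absolute server paths that were retrieved.
    """
    doc_dir = posixpath.dirname(doc_path)
    # Resolve the link against the document's directory.
    target = posixpath.normpath(posixpath.join(doc_dir, link))
    if target in downloaded:
        # Downloaded: point at the local copy with a relative link.
        return posixpath.relpath(target, start=doc_dir)
    # Not downloaded: point at the full address on the original host.
    return f"http://{host}{target}"


print(convert_link("/foo/doc.html", "/bar/img.gif", {"/bar/img.gif"}, "hostname"))
print(convert_link("/foo/doc.html", "../bar/img.gif", set(), "hostname"))
```

Because the first form is relative, the downloaded hierarchy can be moved to another directory without breaking the links.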
  1693.  
  1694. --convert-file-only
  1695. This option converts only the filename part of the URLs, leaving
  1696. the rest of the URLs untouched. This filename part is sometimes
  1697. referred to as the "basename", although we avoid that term here in
  1698. order not to cause confusion.
  1699.  
  1700. It works particularly well in conjunction with --adjust-extension,
  1701. although this coupling is not enforced. It proves useful to
  1702. populate Internet caches with files downloaded from different
  1703. hosts.
  1704.  
  1705. Example: if some link points to //foo.com/bar.cgi?xyz with
  1706. --adjust-extension asserted and its local destination is intended
  1707. to be ./foo.com/bar.cgi?xyz.css, then the link would be converted
  1708. to //foo.com/bar.cgi?xyz.css. Note that only the filename part has
  1709. been modified. The rest of the URL has been left untouched,
  1710. including the net path ("//") which would otherwise be processed by
  1711. Wget and converted to the effective scheme (i.e., "http://").
  1712.  
  1713. -K
  1714. --backup-converted
  1715. When converting a file, back up the original version with a .orig
  1716. suffix. Affects the behavior of -N.
  1717.  
  1718. -m
  1719. --mirror
  1720. Turn on options suitable for mirroring. This option turns on
  1721. recursion and time-stamping, sets infinite recursion depth and
  1722. keeps FTP directory listings. It is currently equivalent to -r -N
  1723. -l inf --no-remove-listing.
  1724.  
  1725. -p
  1726. --page-requisites
  1727. This option causes Wget to download all the files that are
  1728. necessary to properly display a given HTML page. This includes
  1729. such things as inlined images, sounds, and referenced stylesheets.
  1730.  
  1731. Ordinarily, when downloading a single HTML page, any requisite
  1732. documents that may be needed to display it properly are not
  1733. downloaded. Using -r together with -l can help, but since Wget
  1734. does not ordinarily distinguish between external and inlined
  1735. documents, one is generally left with "leaf documents" that are
  1736. missing their requisites.
  1737.  
  1738. For instance, say document 1.html contains an "<IMG>" tag
  1739. referencing 1.gif and an "<A>" tag pointing to external document
  1740. 2.html. Say that 2.html is similar but that its image is 2.gif and
  1741. it links to 3.html. Say this continues up to some arbitrarily high
  1742. number.
  1743.  
  1744. If one executes the command:
  1745.  
  1746. wget -r -l 2 http://<site>/1.html
  1747.  
  1748. then 1.html, 1.gif, 2.html, 2.gif, and 3.html will be downloaded.
  1749. As you can see, 3.html is without its requisite 3.gif because Wget
  1750. is simply counting the number of hops (up to 2) away from 1.html in
  1751. order to determine where to stop the recursion. However, with this
  1752. command:
  1753.  
  1754. wget -r -l 2 -p http://<site>/1.html
  1755.  
  1756. all the above files and 3.html's requisite 3.gif will be
  1757. downloaded. Similarly,
  1758.  
  1759. wget -r -l 1 -p http://<site>/1.html
  1760.  
  1761. will cause 1.html, 1.gif, 2.html, and 2.gif to be downloaded. One
  1762. might think that:
  1763.  
  1764. wget -r -l 0 -p http://<site>/1.html
  1765.  
  1766. would download just 1.html and 1.gif, but unfortunately this is not
  1767. the case, because -l 0 is equivalent to -l inf, that is, infinite
  1768. recursion. To download a single HTML page (or a handful of them,
  1769. all specified on the command-line or in a -i URL input file) and
  1770. its (or their) requisites, simply leave off -r and -l:
  1771.  
  1772. wget -p http://<site>/1.html
  1773.  
  1774. Note that Wget will behave as if --rr had been specified, but only
  1775. that single page and its requisites will be downloaded. Links from
  1776. that page to external documents will not be followed. Actually, to
  1777. download a single page and all its requisites (even if they exist
  1778. on separate websites), and make sure the lot displays properly
  1779. locally, this author likes to use a few options in addition to --pp:
  1780.  
  1781. wget -E -H -k -K -p http://<site>/<document>
  1782.  
  1783. To finish off this topic, it's worth knowing that Wget's idea of an
  1784. external document link is any URL specified in an "<A>" tag, an
  1785. "<AREA>" tag, or a "<LINK>" tag other than "<LINK
  1786. REL="stylesheet">".

--strict-comments
Turn on strict parsing of HTML comments. The default is to
terminate comments at the first occurrence of -->.

According to specifications, HTML comments are expressed as SGML
declarations. A declaration is special markup that begins with <!
and ends with >, such as <!DOCTYPE ...>, and that may contain
comments between a pair of -- delimiters. HTML comments are "empty
declarations", SGML declarations without any non-comment text.
Therefore, <!--foo--> is a valid comment, and so is <!--one--
--two-->, but <!--1--2--> is not.

On the other hand, most HTML writers don't perceive comments as
anything other than text delimited with <!-- and -->, which is not
quite the same. For example, something like <!------------> works
as a valid comment as long as the number of dashes is a multiple of
four (!). If not, the comment technically lasts until the next --,
which may be at the other end of the document. Because of this,
many popular browsers completely ignore the specification and
implement what users have come to expect: comments delimited with
<!-- and -->.

Until version 1.9, Wget interpreted comments strictly, which
resulted in missing links in many web pages that displayed fine in
browsers, but had the misfortune of containing non-compliant
comments. Beginning with version 1.9, Wget has joined the ranks of
clients that implement "naive" comments, terminating each comment
at the first occurrence of -->.

If, for whatever reason, you want strict comment parsing, use this
option to turn it on.
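As a rough illustration of the default ("naive") rule described above, the following shell sketch cuts a comment at the first occurrence of -->. The variable names and the sample markup are made up for this example, and the parameter expansions only mimic the rule; this is not how Wget's parser is implemented.

```shell
# Hypothetical sample: a non-compliant comment followed by a link.
html='<!--1--2--><a href="page.html">link</a>'

# Naive rule: the comment ends at the first "-->".
after_open=${html#*'<!--'}        # drop everything through the opener
comment=${after_open%%'-->'*}     # comment text up to the first "-->"
rest=${after_open#*'-->'}         # markup that follows the comment

printf 'comment: %s\n' "$comment"
printf 'rest:    %s\n' "$rest"
```

Under the naive rule the link after the comment stays visible to the link scanner. Under strict parsing, <!--1--2--> is not a valid comment, which is the kind of markup that led to missing links before version 1.9.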

Recursive Accept/Reject Options
-A acclist --accept acclist
-R rejlist --reject rejlist
Specify comma-separated lists of file name suffixes or patterns to
accept or reject. Note that if any of the wildcard characters, *,
?, [ or ], appear in an element of acclist or rejlist, it will be
treated as a pattern, rather than a suffix. In this case, you have
to enclose the pattern in quotes to prevent your shell from
expanding it, as in -A "*.mp3" or -A '*.mp3'.
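The suffix-versus-pattern rule can be sketched in shell. The function name classify_element is hypothetical, and the check simply mirrors the wording above (presence of *, ?, [ or ]); it does not call into Wget.

```shell
# classify_element ELEMENT
# Prints "pattern" if ELEMENT contains *, ?, [ or ], otherwise "suffix".
classify_element() {
    case $1 in
        *\**|*\?*|*\[*|*\]*) echo pattern ;;
        *)                   echo suffix ;;
    esac
}

classify_element 'mp3'        # prints: suffix
classify_element '*.mp3'      # prints: pattern
classify_element 'img[0-9]'   # prints: pattern
```

The single quotes in the calls matter for the same reason they matter on a wget command line: without them the shell may expand the wildcard against local file names before the program ever sees it.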

--accept-regex urlregex
--reject-regex urlregex
Specify a regular expression to accept or reject the complete URL.

--regex-type regextype
Specify the regular expression type. Possible types are posix or
pcre. Note that to be able to use the pcre type, wget has to be
compiled with libpcre support.
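One way to preview which URLs an expression would accept is to run candidate URLs through grep -E. This is only an approximation, under the assumption that a posix-type expression is written in POSIX extended syntax and matched anywhere in the URL; the URLs and the function name preview_accept are invented for the example, and Wget's own matching may differ in details.

```shell
# preview_accept REGEX
# Reads URLs on standard input and prints those that match REGEX.
preview_accept() {
    grep -E -- "$1"
}

# Made-up candidate URLs; only the two under /docs/ should be printed.
printf '%s\n' \
    'http://example.com/docs/index.html' \
    'http://example.com/img/logo.png' \
    'http://example.com/docs/guide.pdf' |
    preview_accept '/docs/.*\.(html|pdf)$'
```

The same expression could then be passed to Wget in single quotes, for example with --accept-regex '/docs/.*\.(html|pdf)$' on a recursive retrieval.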

-D domain-list
--domains=domain-list
Set domains to be followed. domain-list is a comma-separated list
of domains. Note that it does not turn on -H.

--exclude-domains domain-list
Specify the domains that are not to be followed.

--follow-ftp
Follow FTP links from HTML documents. Without this option, Wget
will ignore all the FTP links.

--follow-tags=list
Wget has an internal table of HTML tag / attribute pairs that it
considers when looking for linked documents during a recursive
retrieval. If a user wants only a subset of those tags to be
considered, however, he or she should specify such tags in a
comma-separated list with this option.

--ignore-tags=list
This is the opposite of the --follow-tags option. To skip certain
HTML tags when recursively looking for documents to download,
specify them in a comma-separated list.

In the past, this option was the best bet for downloading a single
page and its requisites, using a command-line like:

    wget --ignore-tags=a,area -H -k -K -r http://<site>/<document>

However, the author of this option came across a page with tags
like "<LINK REL="home" HREF="/">" and came to the realization that
specifying tags to ignore was not enough. One can't just tell Wget
to ignore "<LINK>", because then stylesheets will not be
downloaded. Now the best bet for downloading a single page and its
requisites is the dedicated --page-requisites option.

--ignore-case
Ignore case when matching files and directories. This influences
the behavior of the -R, -A, -I, and -X options, as well as globbing
implemented when downloading from FTP sites. For example, with
this option, -A "*.txt" will match file1.txt, but also file2.TXT,
file3.TxT, and so on. The quotes in the example are to prevent the
shell from expanding the pattern.

-H
--span-hosts
Enable spanning across hosts when doing recursive retrieving.

-L
--relative
Follow relative links only. Useful for retrieving a specific home
page without any distractions, not even those from the same hosts.

-I list
--include-directories=list
Specify a comma-separated list of directories you wish to follow
when downloading. Elements of list may contain wildcards.

-X list
--exclude-directories=list
Specify a comma-separated list of directories you wish to exclude
from download. Elements of list may contain wildcards.

-np
--no-parent
Do not ever ascend to the parent directory when retrieving
recursively. This is a useful option, since it guarantees that
only the files below a certain hierarchy will be downloaded.

ENVIRONMENT
Wget supports proxies for both HTTP and FTP retrievals. The standard
way to specify proxy location, which Wget recognizes, is using the
following environment variables:

http_proxy
https_proxy
If set, the http_proxy and https_proxy variables should contain the
URLs of the proxies for HTTP and HTTPS connections respectively.

ftp_proxy
This variable should contain the URL of the proxy for FTP
connections. It is quite common that http_proxy and ftp_proxy are
set to the same URL.

no_proxy
This variable should contain a comma-separated list of domain
extensions for which the proxy should not be used. For instance, if
the value of no_proxy is .mit.edu, the proxy will not be used to
retrieve documents from MIT.
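A minimal sketch of setting these variables in a POSIX shell follows. The proxy address proxy.example.com:3128 and the second no_proxy entry are placeholders, and the helper host_bypasses_proxy is a hypothetical illustration of the "domain extension" idea as a plain suffix comparison; Wget's actual matching rules may be more involved.

```shell
# Placeholder proxy URL; substitute the address of a real proxy.
export http_proxy='http://proxy.example.com:3128/'
export https_proxy="$http_proxy"
export ftp_proxy="$http_proxy"
export no_proxy='.mit.edu,.example.org'

# host_bypasses_proxy HOST LIST
# Succeeds if HOST ends with one of the comma-separated suffixes in LIST.
host_bypasses_proxy() {
    host=$1
    saved_ifs=$IFS
    IFS=,
    for suffix in $2; do
        [ -n "$suffix" ] || continue      # skip empty list entries
        case $host in
            *"$suffix") IFS=$saved_ifs; return 0 ;;
        esac
    done
    IFS=$saved_ifs
    return 1
}

if host_bypasses_proxy web.mit.edu "$no_proxy"; then
    echo 'web.mit.edu: direct connection'
fi
```

Exporting the variables makes them visible to any wget process started from the same shell afterwards.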

EXIT STATUS
Wget may return one of several error codes if it encounters problems.

0 No problems occurred.

1 Generic error code.

2 Parse error, for instance when parsing command-line options, the
.wgetrc or .netrc...

3 File I/O error.

4 Network failure.

5 SSL verification failure.

6 Username/password authentication failure.

7 Protocol errors.

8 Server issued an error response.

With the exceptions of 0 and 1, the lower-numbered exit codes take
precedence over higher-numbered ones when multiple types of errors are
encountered.

In versions of Wget prior to 1.12, Wget's exit status tended to be
unhelpful and inconsistent. Recursive downloads would virtually always
return 0 (success), regardless of any issues encountered, and
non-recursive fetches only returned the status corresponding to the
most recently attempted download.
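In scripts, the exit status can be turned into a readable message. The sketch below maps the codes listed above to short descriptions; the function name describe_wget_status is hypothetical, and the message texts follow this manual page.

```shell
# describe_wget_status CODE
# Prints the meaning of a Wget exit status as listed in this section.
describe_wget_status() {
    case $1 in
        0) echo 'No problems occurred' ;;
        1) echo 'Generic error code' ;;
        2) echo 'Parse error' ;;
        3) echo 'File I/O error' ;;
        4) echo 'Network failure' ;;
        5) echo 'SSL verification failure' ;;
        6) echo 'Username/password authentication failure' ;;
        7) echo 'Protocol errors' ;;
        8) echo 'Server issued an error response' ;;
        *) echo "Unknown status: $1" ;;
    esac
}

describe_wget_status 4    # prints: Network failure
```

A typical use would be to run wget and pass $? to the function immediately afterwards, before any other command overwrites the status.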

FILES
/usr/local/etc/wgetrc
Default location of the global startup file.

.wgetrc
User startup file.

BUGS
You are welcome to submit bug reports via the GNU Wget bug tracker (see
<https://savannah.gnu.org/bugs/?func=additem&group=wget>).

Before actually submitting a bug report, please try to follow a few
simple guidelines.

1. Please try to ascertain that the behavior you see really is a bug.
If Wget crashes, it's a bug. If Wget does not behave as
documented, it's a bug. If things work strangely, but you are not
sure about the way they are supposed to work, it might well be a
bug, but you might want to double-check the documentation and the
mailing lists.

2. Try to repeat the bug in as simple circumstances as possible. E.g.
if Wget crashes while downloading "wget -rl0 -kKE -t5 --no-proxy
http://example.com -o /tmp/log", you should try to see if the crash
is repeatable, and if it will occur with a simpler set of options.
You might even try to start the download at the page where the
crash occurred to see if that page somehow triggered the crash.

Also, while I will probably be interested to know the contents of
your .wgetrc file, just dumping it into the debug message is
probably a bad idea. Instead, you should first try to see if the
bug repeats with .wgetrc moved out of the way. Only if it turns
out that .wgetrc settings affect the bug, mail me the relevant
parts of the file.

3. Please start Wget with the -d option and send us the resulting
output (or relevant parts thereof). If Wget was compiled without debug
support, recompile it; it is much easier to trace bugs with debug
support on.

Note: please make sure to remove any potentially sensitive
information from the debug log before sending it to the bug
address. The "-d" option won't go out of its way to collect sensitive
information, but the log will contain a fairly complete transcript
of Wget's communication with the server, which may include
passwords and pieces of downloaded data. Since the bug address is
publicly archived, you may assume that all bug reports are
visible to the public.

4. If Wget has crashed, try to run it in a debugger, e.g. "gdb `which
wget` core" and type "where" to get the backtrace. This may not
work if the system administrator has disabled core files, but it is
safe to try.
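For the note under guideline 3, a debug log can be given a first pass with sed before it is sent. The sketch below is only a starting point under stated assumptions: it assumes that credential-bearing headers appear at the start of a line in the log, the function name redact_log and the sample input are made up, and it does not catch passwords embedded in URLs or in downloaded data, so the log still needs a manual review.

```shell
# redact_log
# Reads a debug log on standard input and masks the values of
# headers that commonly carry credentials.
redact_log() {
    sed -E 's/^((Proxy-)?Authorization|Cookie|Set-Cookie):.*/\1: [REDACTED]/'
}

# Made-up sample input standing in for part of a "wget -d" log.
printf '%s\n' \
    'GET /private/ HTTP/1.1' \
    'Authorization: Basic dXNlcjpwYXNz' \
    'Host: example.com' |
    redact_log
```

With a real log, the filter would be applied by redirecting the log file into redact_log and writing the result to a new file that is reviewed before sending.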

SEE ALSO
This is not the complete manual for GNU Wget. For more complete
information, including more detailed explanations of some of the
options, and a number of commands available for use with .wgetrc files
and the -e option, see the GNU Info entry for wget.

AUTHOR
Originally written by Hrvoje Nikšić <hniksic@xemacs.org>.

COPYRIGHT
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
2005, 2006, 2007, 2008, 2009, 2010, 2011, 2015 Free Software
Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts. A copy of the license is included in the section entitled "GNU
Free Documentation License".

GNU Wget 1.19.1 2017-05-13 WGET(1)