- When I run a command that should download every PDF, wget only fetches the index page and then deletes it.
- Mitchells-MacBook-Pro:TV SITE mitch$ wget --no-directories --content-disposition -H -e robots=off -A.pdf -r https://sites.google.com/site/tvwriting/
- --2019-09-16 07:05:07-- https://sites.google.com/site/tvwriting/
- Resolving sites.google.com (sites.google.com)... 172.217.1.14
- Connecting to sites.google.com (sites.google.com)|172.217.1.14|:443... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘index.html.tmp’
- index.html.tmp [ <=> ] 534.55K 2.10MB/s in 0.2s
- 2019-09-16 07:05:07 (2.10 MB/s) - ‘index.html.tmp’ saved [547375]
- Removing index.html.tmp since it should be rejected.
- FINISHED --2019-09-16 07:05:07--
- Total wall clock time: 0.7s
- Downloaded: 1 files, 535K in 0.2s (2.10 MB/s)
- _______
- Or, if I try just a basic mirror:
- Mitchells-MacBook-Pro:TV SITE mitch$ wget -m https://sites.google.com/site/tvwriting
- --2019-09-16 07:11:49-- https://sites.google.com/site/tvwriting
- Resolving sites.google.com (sites.google.com)... 172.217.1.14
- Connecting to sites.google.com (sites.google.com)|172.217.1.14|:443... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- sites.google.com/site/tvwriting: Is a directory
- Cannot write to ‘sites.google.com/site/tvwriting’ (Success).
- Mitchells-MacBook-Pro:TV SITE mitch$ wget -m https://sites.google.com/site/tvwriting/
- --2019-09-16 07:12:03-- https://sites.google.com/site/tvwriting/
- Resolving sites.google.com (sites.google.com)... 172.217.1.14
- Connecting to sites.google.com (sites.google.com)|172.217.1.14|:443... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/index.html’
- sites.google.com/site/tvwriti [ <=> ] 534.55K 1.46MB/s in 0.4s
- Last-modified header missing -- time-stamps turned off.
- 2019-09-16 07:12:04 (1.46 MB/s) - ‘sites.google.com/site/tvwriting/index.html’ saved [547376]
- Loading robots.txt; please ignore errors.
- --2019-09-16 07:12:04-- https://sites.google.com/robots.txt
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/plain]
- Saving to: ‘sites.google.com/robots.txt’
- sites.google.com/robots.txt [ <=> ] 65 --.-KB/s in 0s
- 2019-09-16 07:12:04 (2.14 MB/s) - ‘sites.google.com/robots.txt’ saved [65]
- --2019-09-16 07:12:04-- https://sites.google.com/site/tvwriting/home
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/home’
- sites.google.com/site/tvwriti [ <=> ] 534.57K 2.09MB/s in 0.3s
- Last-modified header missing -- time-stamps turned off.
- 2019-09-16 07:12:04 (2.09 MB/s) - ‘sites.google.com/site/tvwriting/home’ saved [547399]
- --2019-09-16 07:12:04-- https://sites.google.com/site/tvwriting/uk-drama
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/uk-drama’
- sites.google.com/site/tvwriti [ <=> ] 427.13K 2.19MB/s in 0.2s
- Last-modified header missing -- time-stamps turned off.
- 2019-09-16 07:12:04 (2.19 MB/s) - ‘sites.google.com/site/tvwriting/uk-drama’ saved [437382]
- --2019-09-16 07:12:04-- https://sites.google.com/site/tvwriting/uk-drama/pilot-scripts
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/uk-drama/pilot-scripts’
- sites.google.com/site/tvwriti [ <=> ] 367.85K 1.32MB/s in 0.3s
- Last-modified header missing -- time-stamps turned off.
- 2019-09-16 07:12:05 (1.32 MB/s) - ‘sites.google.com/site/tvwriting/uk-drama/pilot-scripts’ saved [376674]
- --2019-09-16 07:12:05-- https://sites.google.com/site/tvwriting/uk-drama/and-then-there-were-none
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/uk-drama/and-then-there-were-none’
- sites.google.com/site/tvwriti [ <=> ] 357.05K 1.62MB/s in 0.2s
- Last-modified header missing -- time-stamps turned off.
- 2019-09-16 07:12:05 (1.62 MB/s) - ‘sites.google.com/site/tvwriting/uk-drama/and-then-there-were-none’ saved [365618]
- --2019-09-16 07:12:05-- https://sites.google.com/site/tvwriting/uk-drama/ashes-to-ashes
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/uk-drama/ashes-to-ashes’
- sites.google.com/site/tvwriti [ <=> ] 359.95K 1.63MB/s in 0.2s
- Last-modified header missing -- time-stamps turned off.
- 2019-09-16 07:12:06 (1.63 MB/s) - ‘sites.google.com/site/tvwriting/uk-drama/ashes-to-ashes’ saved [368588]
- --2019-09-16 07:12:06-- https://sites.google.com/site/tvwriting/uk-drama/a-very-english-scandal
- Reusing existing connection to sites.google.com:443.
- HTTP request sent, awaiting response... 200 OK
- Length: unspecified [text/html]
- Saving to: ‘sites.google.com/site/tvwriting/uk-drama/a-very-english-scandal’
- sites.google.com/site/tvwriti [ <=> ] 357.06K 1.54MB/s in 0.2s
- That gets me the folder structure of the site, but where the PDFs should be I only have this: https://imgur.com/a/0b8luIP
- Those are just blank files. I'm assuming this has something to do with the unusual way the site owner links to files through Google.
- What do you think?
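
One way to test the linking guess is to look at what the PDF links in the mirrored pages actually look like. The sketch below is only a diagnostic, not a fix: `list_pdf_links` is a hypothetical helper name, and it assumes the mirror was written to the `sites.google.com` directory, as in the log above. If it prints nothing, the links may not be present in the static HTML at all.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original session): print every href
# in the mirrored files that mentions ".pdf", de-duplicated, so the real
# shape of the links (host, path, query string) is visible.
list_pdf_links() {
    # $1: directory that `wget -m` wrote into (assumed: sites.google.com)
    grep -rhoE 'href="[^"]*\.pdf[^"]*"' "$1" | sort -u
}

# Example usage, run from the directory where the mirror was started:
#   list_pdf_links sites.google.com
```

Comparing the printed URLs against the `-A.pdf` filter and the hosts wget was allowed to visit should show whether the links ever had a chance of matching.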