Untitled

a guest
Feb 27th, 2024
Bash 0.69 KB | Science
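#!/bin/bash
# Download every PDF linked from the archive.org items listed in a published Google Sheet.

# Fetch the published spreadsheet, which serves as the index of archive.org item pages.
wget "https://docs.google.com/spreadsheets/d/e/2PACX-1vTyDVQFiC-qN4Ryp8H4GlGkKYbMOdN0KL7ygCu2cbiYLMwqmPcLDvEp-wOeVCg0s0AOl20rUjY3p5XK/pubhtml" -O indice.html

# Extract the archive.org URLs that appear as link text (-P requires GNU grep).
grep -Po '>\Khttps://archive.org.*?(?=</a>)' indice.html > listaEnlaces.txt

while IFS= read -r url; do
    echo "Downloading $url"

    # Fetch the item page (the file name means "year index").
    wget "$url" -O indiceAnio.html

    # Collect the relative /download/ paths that end in .pdf, stopping at the closing quote.
    grep -Po '/download/[^"]*\.pdf(?=")' indiceAnio.html > pdfs.txt

    while IFS= read -r urlPDF; do
        echo "Downloading PDF: $urlPDF"

        wget "https://archive.org$urlPDF"

        sleep 1   # pause between files to go easy on the server
    done < pdfs.txt

    rm -f indiceAnio.html pdfs.txt

    sleep 2   # pause between item pages

done < listaEnlaces.txt

rm -f listaEnlaces.txt indice.html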
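
The PDF-link extraction step can be checked offline before running the full download. The following is a minimal sketch, assuming GNU grep with `-P` support. It uses a pattern that stops at the closing quote and requires a literal `.pdf`. The function name `extract_pdf_paths` and the sample HTML are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: print every /download/...pdf path found in the HTML on stdin.
extract_pdf_paths() {
    grep -Po '/download/[^"]*\.pdf(?=")'
}

# Made-up sample markup, for illustration only.
sample='<a href="/download/item-1990/issue01.pdf">PDF</a> <a href="/download/item-1990/cover.jpg">JPG</a>'

# Only the link ending in .pdf should be printed.
printf '%s\n' "$sample" | extract_pdf_paths
```

Running the helper against a saved copy of a real item page (for example `extract_pdf_paths < indiceAnio.html`) shows which files the loop would fetch, without downloading anything.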