Untitled

a guest
Sep 1st, 2015
Start with a set of gzipped access logs, named like "logfile-*.gz".

First, uncompress to stdout:

gunzip --to-stdout logfile-*.gz

Reduce the output to just the 404s (the status code, which is the 6th field of an entry in Common Log Format):

| grep " 404 "

Extract the request path (the 7th space-separated field):

| cut -d " " -f 7

Sort lexicographically, so that identical requests end up on adjacent lines:

| sort

Collapse each run of duplicate lines into one line, prefixed with its count:

| uniq -c

Sort again, numerically and in reverse order, to list the unique 404 paths from most to least frequent:

| sort --numeric-sort --reverse

And finally, redirect to a file:

> unique-404s.txt

All together then:

gunzip --to-stdout logfile-*.gz | grep " 404 " | cut -d " " -f 7 | sort | uniq -c | sort --numeric-sort --reverse > unique-404s.txt
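
One caveat with grep " 404 ": it matches " 404 " anywhere on the line, so a successful request whose response size happens to be 404 bytes could be counted too. The sketch below is one possible variant, not part of the original pipeline. It assumes Combined Log Format, where the status is the 9th space-separated field and the path is the 7th, and it swaps grep and cut for a single awk step. The sample log lines, the temporary directory and the count_404s function name are all made up for illustration.

```shell
# Hypothetical sample data, assuming Combined Log Format.
workdir=$(mktemp -d)
cat > "$workdir/sample.log" <<'EOF'
10.0.0.1 - - [01/Sep/2015:10:00:00 +0000] "GET /missing HTTP/1.1" 404 512 "-" "curl/7.43"
10.0.0.2 - - [01/Sep/2015:10:00:01 +0000] "GET /missing HTTP/1.1" 404 512 "-" "curl/7.43"
10.0.0.3 - - [01/Sep/2015:10:00:02 +0000] "GET /gone HTTP/1.1" 404 512 "-" "curl/7.43"
10.0.0.4 - - [01/Sep/2015:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 404 "-" "curl/7.43"
EOF
gzip "$workdir/sample.log"

# count_404s (hypothetical helper): same pipeline as above, but awk tests
# the status field itself and prints the path, replacing grep and cut.
count_404s() {
  gunzip --to-stdout "$@" \
    | awk '$9 == 404 { print $7 }' \
    | sort \
    | uniq -c \
    | sort --numeric-sort --reverse
}

# The last sample line has status 200 and size 404; grep " 404 " would
# keep it, while the field test skips it.
count_404s "$workdir"/sample.log.gz
```

With real logs, the call would be something like: count_404s logfile-*.gz > unique-404s.txt. If the logs use a different format, the field numbers in the awk step need adjusting.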