metalx1000

Search PDF files for Social Security Numbers

Oct 20th, 2022 (edited)
#!/usr/bin/env bash
# Search all PDF files under the current directory for 9-digit numbers.
# Matches 9 digits run together or grouped 3-2-4 with optional dashes.

output="ssn"
pattern='[0-9]\{3\}-\{0,1\}[0-9]\{2\}-\{0,1\}[0-9]\{4\}'

# -print0 with read -d '' keeps file names containing spaces or newlines intact.
find . -iname "*.pdf" -print0 | while IFS= read -r -d '' pdf
do
  match="$(pdftotext -layout "$pdf" - 2>/dev/null | grep "$pattern")"
  [[ "$match" ]] && echo "$pdf - $match" | tr -s " "
done | tee "$output" | grep --color=auto "$pattern"
# The final grep above is just for highlighting the matches.

# To narrow the search to SSNs formatted with dashes, use this.
grep "[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]" "$output"
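
To see what the pattern does and does not catch, here is a minimal sketch that runs it over a few made-up lines. The helper name `extract_candidates` and the sample text are assumptions for illustration only, and `grep -o` is an extension (available in GNU and BSD grep) rather than part of the original script.

```shell
#!/usr/bin/env bash
# Same basic-regex pattern as the script above: 3-2-4 digits, dashes optional.
pattern='[0-9]\{3\}-\{0,1\}[0-9]\{2\}-\{0,1\}[0-9]\{4\}'

# Hypothetical helper: read text on stdin, print only the matching fragments.
extract_candidates() {
  grep -o "$pattern"
}

# Made-up sample lines (not real data).
printf '%s\n' \
  'SSN: 123-45-6789' \
  'ID 123456789 on file' \
  'Phone 555-1234' \
  'Account 12345678901' | extract_candidates
# Prints:
#   123-45-6789
#   123456789
#   123456789   (the first nine digits of the 11-digit account number)
```

The last line of output shows that the pattern is not anchored, so any longer run of digits also produces a hit. Treat the results as candidates to review by hand, not as confirmed Social Security Numbers.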
Tags: pdf