xopsuei

"threaded" pdf to txt

May 25th, 2014
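#script.sh
#!/bin/bash
# Convert every PDF in the current directory, running at most 4 conversions at a time.
echo "starting"
count=0
for f in *.pdf
do
    # Skip the literal "*.pdf" pattern when no PDFs are present.
    [ -e "$f" ] || continue
    # After launching a batch of 4, wait for all of them before starting more.
    if [ "$count" -eq 4 ]; then
        wait
        count=0
    fi
    ./pdfToTxt.sh "$f" &
    count=$((count+1))
    echo "$count"
done
# Wait for the final (possibly partial) batch before reporting completion.
wait
echo "done"

#pdfToTxt.sh
#!/bin/bash
# Usage: pdfToTxt.sh FILE.pdf  (writes FILE.pdf.txt next to the input)
echo "converting $1"
# Render the PDF to a Group 4 TIFF at 720 dpi; -g6120x7920 is 8.5in x 11in at that resolution.
gs -o "/tmp/$1.tif" -sDEVICE=tiffg4 -r720x720 -g6120x7920 "$1" > /dev/null
# OCR the TIFF; tesseract appends .txt to the output base name.
tesseract "/tmp/$1.tif" "$1" > /dev/null
rm "/tmp/$1.tif"
echo "converted"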
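
The loop in script.sh works in batches: it starts four conversions, then waits for all four before starting any more, so one slow PDF holds up the next batch. A possible variation, sketched below, starts a new conversion as soon as any running one finishes. This is not part of the original paste. The function name run_limited and its max_jobs parameter are made up for illustration, and the sketch assumes bash 4.3 or newer for `wait -n`.

```bash
#!/bin/bash
# Hypothetical helper (not in the original paste): run CMD once per item,
# keeping at most MAX_JOBS copies running at the same time.
# Assumes bash >= 4.3, which provides `wait -n`.
# Usage: run_limited MAX_JOBS CMD ITEM...
run_limited() {
    local max_jobs=$1 cmd=$2
    shift 2
    local running=0 item
    for item in "$@"; do
        # All slots taken: block until any one background job exits.
        if [ "$running" -ge "$max_jobs" ]; then
            wait -n
            running=$((running - 1))
        fi
        "$cmd" "$item" &
        running=$((running + 1))
    done
    # Wait for whatever is still running.
    wait
}

# Example call, matching the limit of 4 used in script.sh:
# run_limited 4 ./pdfToTxt.sh *.pdf
```

The counter only tracks how many jobs have been started and not yet accounted for, so it can overestimate the number actually running, but the limit is never exceeded.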