gabalese

epubcount.sh

Aug 14th, 2012
#!/usr/bin/env sh
# usage: ./epubcount.sh <epubfile>
# remember to `chmod +x epubcount.sh`

[ -n "$1" ] || { echo "usage: $0 <epubfile>" >&2; exit 1; }
tmpdir=$(mktemp -d) || exit 1 # use a private temporary directory instead of ./temporary/
trap 'rm -rf "$tmpdir"' EXIT # trash the directory on exit, even if a step fails
# unzip all xhtml/html files of the epub archive; quoting "$1" keeps file names
# with spaces intact, and quoting '*html' leaves the pattern matching to unzip
unzip -jq "$1" '*html' -d "$tmpdir" || exit 1
# concatenate all html and count the words, excluding tags;
# [^>]* stops at the first '>' so text containing a literal '>' is not swallowed
count=$(cat "$tmpdir"/*html | sed 's/<[^>]*>/ /g' | wc -w)
echo "$1:" $count # $count is left unquoted to drop the padding some wc versions add
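
The tag-stripping step can be tried on its own, without an epub file. The sketch below is not part of the original script: `count_words` is a hypothetical helper name, and it uses a sed substitution similar to the one in the script to turn each tag into a space before counting the remaining words.

```shell
#!/usr/bin/env sh
# count_words is a hypothetical helper, introduced here only for illustration.
# It reads HTML on stdin, replaces every tag with a space, and prints the
# number of words that are left.
count_words() {
    # tr removes the padding spaces that some wc implementations print
    sed 's/<[^>]*>/ /g' | wc -w | tr -d ' '
}

printf '<p>Hello <b>brave</b> new world</p>\n' | count_words   # prints 4
```

Because sed works line by line, a tag that is split across two lines would not be removed by this substitution, so the count should be read as an estimate.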