hakonhagland

gawk-sed-3

Jan 13th, 2014
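#!/bin/bash

# Slurp the whole file (RS="^$"), squeeze runs of blank lines, then normalize with sed.
gawk '{gsub(/\n\n+/,"\n\n"); printf "%s", $0}' RS="^$" file | sed '
# Exactly one space after punctuation, none before it.
s/\s*\([.,;!?]\)\s*/\1 /g
# Collapse runs of whitespace into a single space.
s/\s\+/ /g
# Lowercase the whole line (GNU sed).
s/^.*$/\L&/
# Capitalize the first character after sentence-ending punctuation.
s/\([.;!?]\s*\)\(.\)/\1\u\2/g
# Capitalize the first character of the line.
s/^./\u&/
# Append a period to non-empty lines that do not end in punctuation.
/\(^$\)\|\([!?;.,]\s*$\)/! s/\s*$/.&/
'
# RS="" is paragraph mode: each blank-line-separated block is one record.
gawk 'END{print "Number of paragraphs: "NR}' RS="" file
# Split the slurped file on sentence terminators; fields minus one = terminators.
gawk -F'[.?;!]' 'END {print "Number of sentences: "NF-1}' RS='^$' file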