Untitled

By: a guest on Apr 29th, 2012
import re

t_pattern = r'''(?x)            # set flag to allow verbose regexps
      (?:[A-Z]\.)+              # abbreviations, e.g. U.S.A.
    | \w+'\w+                   # words with an internal apostrophe, e.g. isn't
    | \w+(?:-\w+)*              # words with optional internal hyphens
    | \$?\d+(?:\.\d+)?%?        # currency and percentages, e.g. $12.40, 82%
'''
tokenizer_re = re.compile(t_pattern)
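
A minimal usage sketch, assuming the compiled pattern is meant to be used with re.findall to split text into tokens. The paste does not show how tokenizer_re is called, so the tokenize helper and the sample sentence are hypothetical. The pattern is repeated here with non-capturing groups, (?:...), because re.findall returns the group contents rather than the whole match when a pattern contains capturing groups.

```python
import re

# Same alternatives as the paste, written with non-capturing groups
# so that findall() yields whole tokens instead of tuples of groups.
t_pattern = r'''(?x)            # set flag to allow verbose regexps
      (?:[A-Z]\.)+              # abbreviations, e.g. U.S.A.
    | \w+'\w+                   # words with an internal apostrophe
    | \w+(?:-\w+)*              # words with optional internal hyphens
    | \$?\d+(?:\.\d+)?%?        # currency and percentages
'''
tokenizer_re = re.compile(t_pattern)


def tokenize(text):
    """Hypothetical helper: return every non-overlapping match as a token."""
    return tokenizer_re.findall(text)


sample = "The U.S.A. isn't a low-cost place: $12.40 for lunch."
print(tokenize(sample))
# ['The', 'U.S.A.', "isn't", 'a', 'low-cost', 'place', '$12.40', 'for', 'lunch']
```

One thing to watch: alternatives are tried left to right, and the word branch sits before the number branch. A bare number therefore matches as a word first, so tokenize("82%") gives ['82'] and tokenize("12.40") gives ['12', '40']. Only input starting with "$" reaches the currency branch.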