pronoiac

a guest
Dec 16th, 2009
#!/usr/bin/perl
use strict;
use warnings;

# Usage: ./parse-tags.pl tagdata_mefi.txt   (or a glob such as tagdata_*)
# Reads tab-separated tag data and prints every record whose tag name
# contains a double quote or a character outside printable ASCII.

my $debug = 0;

while (<>) {
  chomp;
  # Fields, tab-separated: tag_id, link_id, link_date, tag_name.
  # \S+ keeps only the leading non-whitespace run of the tag name.
  if (/^(.+?)\t(.+?)\t(.+?)\t(\S+)/) {
    my ($tag_id, $link_id, $link_date, $tag_name) = ($1, $2, $3, $4);
    print "parsed: $tag_id, $link_id, $link_date, $tag_name\n" if $debug;
    if ($tag_name =~ /[\x00-\x1F\x7F-\xFF"]/) {
      # Double quotes, plus characters outside POSIX [:print:] [\x20-\x7E]
      # (control characters, DEL, and high-bit bytes)
      print "tag_name: $tag_id, $link_id, $link_date, $tag_name\n";
    } # end tag_name check
  }
} # end <>