Untitled

a guest
Aug 2nd, 2015
#!/usr/bin/perl
# accScrapeOnePage.pl
# Extracts each "Item N - date" heading and the description paragraph that
# follows it from one page of HTML, and prints them as CSV rows.

use strict;
use warnings;

# slurp the whole input at once so the regex can match across line breaks
local $/ = undef;
my $data = <DATA>;
# my $data = <>;

# $1 is the item heading, $2 is its description;
# the /g flag resumes each search where the previous match ended
while ($data =~ /<h3 class="edims".*?(Item .*?)<\/h3>\s*<p class="edims">(.*?)<\/p>/sg) {

    my $refId    = wrap($1);
    my $refDescr = wrap($2);

    print $refId . "," . $refDescr . "\n";

} # end while ($data =~ ...)


#---------------------
sub wrap {

    my $field = shift;

    # if the field contains a comma, a double quote, or a line break,
    # it must be wrapped in double quotes
    if ($field =~ /[",\r\n]/) {

        # since the field is being wrapped in quotes,
        # any existing quotes must be escaped with an extra quote
        # so, for any quotes that exist, replace with 2 quotes
        $field =~ s/"/""/g;

        # now, wrap field with quotes
        $field = '"' . $field . '"';

    } # end if ($field =~ /[",\r\n]/)

    return $field;

} # end sub wrap


__DATA__
<!-- bla bla bla html stuff -->

<p class="edims"> </p>
<hr class="edims" /><a class="edims" name="002" id="002"></a>
<h3 class="edims" style="color: #585858;">Item 2 - April 2, 2015</h3>
<p class="edims">Authorize execution of an interlocal agreement with Caldwell County for the installation, maintenance, and repair of Caldwell County's wireless communications equipment for a 12-month term for an estimated amount not to exceed $5,000 payable to the City and with annual automatic renewal terms in the same estimated amount per renewal.</p>
<p class="edims"> </p>
<h4 class="edims"><b>Work Papers and Other Backup Documentation</b></h4>

<!-- bla bla bla still more html stuff -->

<h3 class="edims" style="color: #585858;">Item 3 - April 2, 2015</h3>
<p class="edims">Authorize award and execution of a construction contract with SMITH CONTRACTING COMPANY, INC. for the 3rd Street Reconstruction Phase 4 Guadalupe Street to Nueces Street Project in the amount of $3,895,282.50 plus a $389,528.25 contingency, for a total contract amount not to exceed $4,284,810.75.</p>
<p class="edims"> </p>
...
<h3 class="edims" style="color: #585858;">Item 501 - April 2, 2015</h3>
<p class="edims">Example with "quoted reference". Yet another contract, here. </p>
<p class="edims"> </p>
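
The CSV quoting rule can be checked on its own, apart from the scraping loop. The sketch below is a standalone variant, not part of the original paste: the name csv_field is hypothetical, and it assumes the usual CSV convention that a field is quoted when it contains a comma, a double quote, or a line break, with embedded double quotes doubled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# csv_field is a hypothetical helper name used only for this sketch.
# It quotes a field when it contains a comma, a double quote, or a line
# break, and doubles any embedded double quotes.
sub csv_field {
    my $field = shift;
    if ($field =~ /[",\r\n]/) {
        $field =~ s/"/""/g;
        $field = '"' . $field . '"';
    }
    return $field;
}

# The first sample mirrors a heading from the sample page; the other two
# are made-up values chosen to exercise the unquoted and quote-only cases.
my @samples = (
    'Item 2 - April 2, 2015',
    'plain text',
    'say "hi"',
);

print csv_field($_), "\n" for @samples;
```

Under those assumptions the three samples should print as "Item 2 - April 2, 2015" (quoted because of the comma), plain text (unchanged), and "say ""hi""" (quoted, with the inner quotes doubled). For anything beyond a one-off script, a dedicated CSV module is likely the safer choice than hand-rolled quoting.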