HTML Matching

By: a guest on Aug 27th, 2010  |  syntax: Python  |  size: 0.71 KB  |  views: 34  |  expires: Never
import re

s = '''<dt>
   <a href="#profile-experience" >Past</a>
</dt>
<dd>
   <ul class="past">
       <li>
           President, CEO &amp; Founder <span class="at">at</span> China Connection
       </li>
       <li>
           Professional Speaker and Trainer <span class="at">at</span> Edgemont Enterprises
       </li>
       <li>
           Nurse &amp; Clinic Manager <span class="at">at</span> <span>USAF</span>
       </li>
   </ul>
</dd>'''

# Capture the run of <li> elements inside the "Past" <ul class="past"> list.
# re.DOTALL lets '.' match newlines so the pattern can span several lines.
ul = re.findall(r'<dt>.*?Past.*?</dt>.*?<dd>.*?<ul class="past">.*?((<li>.*?</li>\s*)+).*?</ul>.*?</dd>', s, re.DOTALL)
# ul[0][0] is the outer group: all <li>...</li> blocks as one string.
rs = re.findall(r'<li>.*?</li>', ul[0][0], re.DOTALL)

for li in rs:
    print(li)
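
The regular expressions in this paste depend on the exact tag order and spacing of the markup. As a possible alternative, here is a minimal sketch that pulls the same list items out with the standard library's `html.parser` (Python 3). The names `PastItemParser`, `extract_past_items`, and `sample` are hypothetical and not part of the original paste, and `sample` is a shortened copy of the markup. It assumes the list is not nested.

```python
from html.parser import HTMLParser


class PastItemParser(HTMLParser):
    """Collect the visible text of each <li> inside <ul class="past">."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.in_past_list = False
        self.in_item = False
        self.items = []
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "ul" and ("class", "past") in attrs:
            self.in_past_list = True
        elif tag == "li" and self.in_past_list:
            self.in_item = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "li" and self.in_item:
            # Join the text fragments and collapse runs of whitespace.
            self.items.append(" ".join("".join(self._parts).split()))
            self.in_item = False
        elif tag == "ul":
            self.in_past_list = False

    def handle_data(self, data):
        if self.in_item:
            self._parts.append(data)


def extract_past_items(html):
    """Return the text of every list item in the 'past' list."""
    parser = PastItemParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Shortened, assumed copy of the markup from the paste.
sample = '''<ul class="past">
    <li>
        President, CEO &amp; Founder <span class="at">at</span> China Connection
    </li>
    <li>
        Nurse &amp; Clinic Manager <span class="at">at</span> <span>USAF</span>
    </li>
</ul>'''

for item in extract_past_items(sample):
    print(item)
```

Unlike the regex version, this returns plain text with the inner `<span>` tags stripped and `&amp;` decoded, which may or may not be what is wanted.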