- /usr/lib/python2.6/site-packages/bs4/builder/_htmlparser.py:149: RuntimeWarning: Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.
- "Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help."))
- Traceback (most recent call last):
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/hubs/poll.py", line 97, in wait
- readers.get(fileno, noop).cb(fileno)
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/greenthread.py", line 192, in main
- result = function(*args, **kwargs)
- File "crawl.py", line 29, in retrieve_links
- b = BeautifulSoup(src)
- File "/usr/lib/python2.6/site-packages/bs4/__init__.py", line 172, in __init__
- self._feed()
- File "/usr/lib/python2.6/site-packages/bs4/__init__.py", line 185, in _feed
- self.builder.feed(self.markup)
- File "/usr/lib/python2.6/site-packages/bs4/builder/_htmlparser.py", line 150, in feed
- raise e
- HTMLParseError: bad end tag: u'</a";\n\t\t\t\t}\n\t\t\t\tadNode += "</div>', at line 130, column 151
- Removing descriptor: 3
- Traceback (most recent call last):
- File "crawl.py", line 78, in <module>
- begin_crawling()
- File "crawl.py", line 71, in begin_crawling
- data = crawl()
- File "crawl.py", line 56, in crawl
- for link in green_pool.imap(retrieve_links, crawl_queue):
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/greenpool.py", line 232, in next
- val = self.waiters.get().wait()
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/greenthread.py", line 166, in wait
- return self._exit_event.wait()
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/event.py", line 116, in wait
- return hubs.get_hub().switch()
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/hubs/hub.py", line 177, in switch
- return self.greenlet.switch()
- File "/usr/lib/python2.6/site-packages/eventlet-0.9.16-py2.6.egg/eventlet/greenthread.py", line 192, in main
- result = function(*args, **kwargs)
- File "crawl.py", line 29, in retrieve_links
- b = BeautifulSoup(src)
- File "/usr/lib/python2.6/site-packages/bs4/__init__.py", line 172, in __init__
- self._feed()
- File "/usr/lib/python2.6/site-packages/bs4/__init__.py", line 185, in _feed
- self.builder.feed(self.markup)
- File "/usr/lib/python2.6/site-packages/bs4/builder/_htmlparser.py", line 150, in feed
- raise e
- HTMLParser.HTMLParseError: bad end tag: u'</a";\n\t\t\t\t}\n\t\t\t\tadNode += "</div>', at line 130, column 151
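
The RuntimeWarning at the top of this log suggests installing an external parser (lxml or html5lib) and passing it to Beautiful Soup explicitly. Below is a minimal sketch of that idea, not the actual contents of crawl.py. It assumes Python 3 and a current bs4; the helper names `pick_parser` and `extract_links` are hypothetical, and `src` stands for the page markup that `retrieve_links` passes to `BeautifulSoup` in the traceback.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first external parser that is installed.

    Falls back to "html.parser", the parser built into Python.
    """
    for name in preferred:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"


def extract_links(src):
    """Parse markup with the chosen parser and return all href values.

    Hypothetical stand-in for the link extraction done in crawl.py.
    """
    # Imported here so this module still loads when bs4 is absent.
    from bs4 import BeautifulSoup

    # Name the parser explicitly instead of relying on the default.
    soup = BeautifulSoup(src, pick_parser())
    return [a["href"] for a in soup.find_all("a", href=True)]
```

With lxml or html5lib installed, `extract_links(src)` would hand the markup to that parser; both are generally more tolerant of markup such as the `</a";` fragment reported in the error, though the result for this specific page is not verified here.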