stackexchange-gilles

http://meta.travel.stackexchange.com/questions/1214/suggeste

Feb 14th, 2013
#!/usr/bin/env python3
## Usage: run the SEDE query
##     http://data.stackexchange.com/travel%20answers/query/97659/wikitravel-mentions-for-offline-postprocessing
## then download the results to QueryResults.csv and run
##     csv2markdown <QueryResults.csv
import csv, re, sys
from urllib.parse import unquote

# Maps each Wikitravel page name to (display name, set of references to it).
hits = {}
for row in csv.reader(sys.stdin):
    (n, parent, ty, html, tagname) = row
    links = re.findall(r'http://wikitravel\.org/en/[^"#<>? ]*', html)
    for link in links:
        place = link.split('/')[-1]
        pretty_place = unquote(place.replace('_', ' '))
        # If the row has a parent, link to the parent with this row's id as the fragment.
        slug = n if parent == '' else '%s/#%s' % (parent, n)
        # Rows with a tag name are cited as a Markdown link to the tag info page.
        post_url = 'http://travel.stackexchange.com/q/' + slug if tagname == '' else \
                   r'\[[%s](http://travel.stackexchange.com/tags/%s/info)\]' % (tagname, tagname)
        if place not in hits: hits[place] = (pretty_place, set())
        hits[place][1].add(post_url)
# One Markdown line per place; the two trailing spaces force a line break.
for place in sorted(hits):
    (pretty_place, locations) = hits[place]
    leader = '%s ([travel](http://wikitravel.org/en/%s), [voyage](http://en.wikivoyage.org/wiki/%s))' % (pretty_place, place, place)
    print(' &mdash; '.join([leader] + sorted(locations)) + "  ")
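
The link-extraction step can be tried on its own without running the SEDE query. The following is a minimal sketch for Python 3: the function name `extract_places` and the `SAMPLE_HTML` string are made up for illustration and do not come from real query results. It applies the script's pattern (with the dots escaped) and the same underscore and percent-decoding cleanup.

```python
import re
from urllib.parse import unquote

# Hypothetical sample input, not taken from real query results.
SAMPLE_HTML = (
    '<p>See <a href="http://wikitravel.org/en/New_York_City#Get_in">this</a> '
    'and <a href="http://wikitravel.org/en/S%C3%A3o_Paulo">this</a>.</p>'
)

def extract_places(html):
    """Return (page name, display name) pairs for each Wikitravel link in html."""
    links = re.findall(r'http://wikitravel\.org/en/[^"#<>? ]*', html)
    pairs = []
    for link in links:
        # Keep only the last path component, as the script does.
        place = link.split('/')[-1]
        # Underscores become spaces, then percent-escapes are decoded.
        pairs.append((place, unquote(place.replace('_', ' '))))
    return pairs

print(extract_places(SAMPLE_HTML))
```

The pattern stops at `#`, so the fragment `#Get_in` is dropped. Because only the last path component is kept, a link to a subpage is recorded under the subpage name alone.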