Untitled

a guest
Oct 10th, 2017
The following tricks I find pretty useful in my daily Python work. I also added a few I stumbled upon lately.

1. Use collections

This really makes your code more elegant and less verbose. A few examples I absorbed this week:

Named tuples:

    >>> import collections
    >>> Point = collections.namedtuple('Point', ['x', 'y'])
    >>> p = Point(x=1.0, y=2.0)
    >>> p
    Point(x=1.0, y=2.0)

Now you can access fields by name, which is much more readable than indexing into the tuple by position:

    >>> p.x
    1.0
    >>> p.y
    2.0

Elegantly used when looping through a csv:

    import csv
    from collections import namedtuple

    with open('stock.csv') as f:
        f_csv = csv.reader(f)
        headings = next(f_csv)
        Row = namedtuple('Row', headings)
        for r in f_csv:
            row = Row(*r)  # note the star unpacking
            # ... process row ...
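To try the csv loop without a real stock.csv, here is a minimal sketch that reads from an in-memory file instead. The column names and values are made up for illustration. Note that namedtuple needs the headings to be valid Python identifiers, so this only works for csv files with "clean" header names.

```python
import csv
import io
from collections import namedtuple

# Hypothetical in-memory stand-in for stock.csv (the sample data is made up).
sample = io.StringIO("symbol,price,shares\nAA,39.48,100\nIBM,91.10,50\n")

reader = csv.reader(sample)
Row = namedtuple('Row', next(reader))  # the header row supplies the field names
rows = [Row(*r) for r in reader]       # each row list is unpacked into the fields

# csv.reader yields strings, so convert before doing arithmetic.
total_shares = sum(int(row.shares) for row in rows)
print(rows[0].symbol, total_shares)
```

Accessing row.shares instead of row[2] is what makes the processing code readable.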

I like the star unpacking feature (Python 3) to throw away useless fields:

    >>> line = 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'
    >>> uname, *fields, homedir, sh = line.split(':')
    >>> uname
    'nobody'
    >>> homedir
    '/var/empty'
    >>> sh
    '/usr/bin/false'

Superconvenient: the defaultdict:

    from collections import defaultdict

    rows_by_date = defaultdict(list)
    for row in rows:
        rows_by_date[row['date']].append(row)

Before, I would initialize the list by hand each time, which leads to needless code:

    if row['date'] not in rows_by_date:
        rows_by_date[row['date']] = []
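The defaultdict grouping can be tried out like this. It is only a sketch: the rows are made-up dicts, and the only assumption is that each one has a 'date' key.

```python
from collections import defaultdict

# Made-up sample rows; only the 'date' key matters for the grouping.
rows = [
    {'date': '2017-10-01', 'item': 'a'},
    {'date': '2017-10-02', 'item': 'b'},
    {'date': '2017-10-01', 'item': 'c'},
]

rows_by_date = defaultdict(list)
for row in rows:
    # A missing key is created with an empty list on first access.
    rows_by_date[row['date']].append(row)

items_first_day = [r['item'] for r in rows_by_date['2017-10-01']]
print(items_first_day)  # ['a', 'c']
```

One thing to keep in mind: merely reading a missing key also creates it, so use `in` or `.get()` when you only want to check.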

You can use OrderedDict to preserve the order in which keys were inserted (since Python 3.7 a plain dict keeps insertion order as well):

    >>> import collections
    >>> d = collections.OrderedDict()
    >>> d['a'] = 'A'
    >>> d['b'] = 'B'
    >>> d['c'] = 'C'
    >>> d['d'] = 'D'
    >>> d['e'] = 'E'
    >>> for k, v in d.items():
    ...     print(k, v)
    ...
    a A
    b B
    c C
    d D
    e E

Another nice one is Counter:

    from collections import Counter

    words = [
        'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
        'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
        'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
        'my', 'eyes', "you're", 'under'
    ]
    word_counts = Counter(words)
    top_three = word_counts.most_common(3)
    print(top_three)
    # Outputs [('eyes', 8), ('the', 5), ('look', 4)]

Again, before I would write most_common manually. Not necessary, this is all done already somewhere in the stdlib :)

2. sorted() accepts a key arg which you can use to sort on something else

Here for example we sort on surname:

    >>> names = ['David Beazley', 'Brian Jones', 'Raymond Hettinger', 'Ned Batchelder']
    >>> sorted(names, key=lambda name: name.split()[-1].lower())
    ['Ned Batchelder', 'David Beazley', 'Raymond Hettinger', 'Brian Jones']

3. Create XML from dict

Creating XML tags manually is usually a bad idea. I bookmarked this simple dict_to_xml helper:

    from xml.etree.ElementTree import Element

    def dict_to_xml(tag, d):
        '''
        Turn a simple dict of key/value pairs into XML
        '''
        elem = Element(tag)
        for key, val in d.items():
            child = Element(key)
            child.text = str(val)
            elem.append(child)
        return elem
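To see what the helper produces, you can serialize the element with tostring from the same module. A small sketch, with a made-up sample dict; the helper is repeated so the snippet runs on its own:

```python
from xml.etree.ElementTree import Element, tostring

def dict_to_xml(tag, d):
    """Turn a simple dict of key/value pairs into an XML element."""
    elem = Element(tag)
    for key, val in d.items():
        child = Element(key)
        child.text = str(val)
        elem.append(child)
    return elem

stock = {'name': 'GOOG', 'shares': 100}  # made-up sample data
xml_bytes = tostring(dict_to_xml('stock', stock))  # bytes, not str, by default
print(xml_bytes)
```

The nice side effect of going through ElementTree is that special characters such as < and & in the values get escaped for you.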

4. Oneliner to see if there are any python files in a particular directory

Sometimes ‘any’ is pretty useful:

    import os

    files = os.listdir('dirname')
    if any(name.endswith('.py') for name in files):
        ...  # at least one .py file is present

5. Use set operations to match common items in lists

    >>> a = [1, 2, 3, 'a']
    >>> b = ['a', 'b', 'c', 3, 4, 5]
    >>> set(a).intersection(b)
    {3, 'a'}

6. Use re.compile

If you are going to check a regular expression in a loop, don’t do this:

    for i in longlist:
        if re.match(r'^...', i):
            ...

Instead, compile the regex once and reuse the pattern:

    import re

    p = re.compile(r'^...')
    for i in longlist:
        if p.match(i):
            ...
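A concrete version of the compiled-pattern loop, as a sketch. The pattern and the input list are made up for illustration, since the original leaves the regex elided.

```python
import re

# Hypothetical pattern and data: keep only strings that start with three digits.
pattern = re.compile(r'^\d{3}')
longlist = ['123abc', 'abc123', '456', '78x']

# The compiled pattern object is reused for every item.
matches = [s for s in longlist if pattern.match(s)]
print(matches)  # ['123abc', '456']
```

The re module caches recently used patterns internally, so the speedup is usually modest. Still, giving the pattern a name once makes the loop easier to read.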

7. Printing files with potential bad (Unicode) characters

To print filenames of unknown origin, the book suggests this convention to avoid errors:

    def bad_filename(filename):
        return repr(filename)[1:-1]

    try:
        print(filename)
    except UnicodeEncodeError:
        print(bad_filename(filename))

Handling unicode chars in files can be nasty because they can blow up your script. However, the logic behind it is not that hard to grasp. A good snippet to bookmark is the encoding / decoding of Unicode:

    >>> import unicodedata
    >>> a
    'pýtĥöñ is awesome\n'
    >>> b = unicodedata.normalize('NFD', a)
    >>> b.encode('ascii', 'ignore').decode('ascii')
    'python is awesome\n'
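The normalize / encode / decode steps can be wrapped in a small function. A sketch; the name strip_accents is my own, not from the book:

```python
import unicodedata

def strip_accents(text):
    """Drop accents by decomposing characters and discarding non-ASCII parts."""
    # NFD splits 'ý' into 'y' plus a separate combining accent character.
    decomposed = unicodedata.normalize('NFD', text)
    # The combining accents are not ASCII, so 'ignore' silently drops them.
    return decomposed.encode('ascii', 'ignore').decode('ascii')

cleaned = strip_accents('pýtĥöñ is awesome\n')
print(cleaned)
```

Be aware that characters without an ASCII base letter (for example 'ß') are dropped entirely, so this is lossy by design.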

O’Reilly has a course on Working with Unicode in Python.

8. Print is pretty cool (Python 3)

I am probably not the only one writing this kind of join operation:

    >>> row = ["1", "bob", "developer", "python"]
    >>> print(','.join(str(x) for x in row))
    1,bob,developer,python

Turns out you can just write it like this:

    >>> print(*row, sep=',')
    1,bob,developer,python

Note again the * unpacking.

9. Functions like sum() accept generators / use the right variable type

I wrote this at a conference to earn me a coffee mug ;)

    total = 0  # not named 'sum', which would shadow the built-in
    for i in range(1300):
        if i % 3 == 0 or i % 5 == 0:
            total += i
    print(total)

This prints 394118. While handing it in I realized it could be written much shorter and more efficiently:

    >>> sum(i for i in range(1300) if i % 3 == 0 or i % 5 == 0)
    394118

A generator:

    lines = (line.strip() for line in f)

is more memory efficient than:

    lines = [line.strip() for line in f]  # loads whole list into memory at once

And concatenating strings is inefficient:

    s = "line1\n"
    s += "line2\n"
    s += "line3\n"
    print(s)

Better to build up a list and join when printing:

    lines = []
    lines.append("line1")
    lines.append("line2")
    lines.append("line3")
    print("\n".join(lines))

Another one I liked from the cookbook:

    portfolio = [
        {'name': 'GOOG', 'shares': 50},
        {'name': 'YHOO', 'shares': 75},
        {'name': 'AOL', 'shares': 20},
        {'name': 'SCOX', 'shares': 65}
    ]
    min_shares = min(s['shares'] for s in portfolio)

One line to get the min of a numeric value in a nested data structure.

10. Enumerate lines in for loop

You can number lines (or whatever you are looping over) and start with 1 (2nd arg). This is a nice debugging technique:

    for lineno, line in enumerate(lines, 1):  # start counting at 1
        fields = line.split()
        try:
            count = int(fields[1])
            ...
        except ValueError as e:
            print('Line {}: Parse error: {}'.format(lineno, e))
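Here is a runnable variant of the enumerate loop, a sketch with made-up input lines, that collects the error messages instead of printing them so you can see which line numbers get reported:

```python
lines = ['apples 3', 'pears two', 'plums 5']  # made-up sample input

counts, errors = [], []
for lineno, line in enumerate(lines, 1):  # numbering starts at 1
    fields = line.split()
    try:
        counts.append(int(fields[1]))
    except ValueError as e:
        errors.append('Line {}: Parse error: {}'.format(lineno, e))

print(counts)
print(errors)
```

The bad value sits on the second line, and because of the start argument the message says "Line 2" rather than "Line 1".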

11. Pandas

Import pandas and numpy:

    import pandas as pd
    import numpy as np

12. Make random dataframe with three columns:

    df = pd.DataFrame(np.random.rand(10, 3), columns=list('ABC'))

Select:

    # Boolean indexing (remember the parentheses)
    df[(df.A < 0.5) & (df.B > 0.5)]
    # Alternative, using query (which uses numexpr when it is installed)
    df.query('A < 0.5 & B > 0.5')

Project:

    # One column
    df.A
    # Multiple columns
    df.loc[:, list('AB')]
    # a shorter equivalent
    df[['A', 'B']]

Often used snippets

Dates

13. Difference (in days) between two dates:

    from datetime import date

    d1 = date(2013, 1, 1)
    d2 = date(2013, 9, 13)
    abs(d2 - d1).days

directory-of-script snippet

    import os

    os.path.dirname(os.path.realpath(__file__))
    # combine with
    os.path.join(os.path.dirname(os.path.realpath(__file__)), 'foo', 'bar', 'baz.txt')

14. PostgreSQL-connect-query snippet

    import psycopg2

    conn = psycopg2.connect("host='localhost' user='xxx' password='yyy' dbname='zzz'")
    cur = conn.cursor()
    cur.execute("""SELECT * from foo;""")
    rows = cur.fetchall()
    for row in rows:
        print(" ", row[0])
    conn.close()

Input parsing functions

15. Expand input-file args:

    import glob

    # input_data: e.g. 'file.txt' or '*.txt' or 'foo/file.txt' 'bar/file.txt'
    filenames = [glob.glob(pathexpr) for pathexpr in input_data]
    filenames = [item for sublist in filenames for item in sublist]
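The expand-then-flatten steps can also be done in one go with itertools.chain.from_iterable. This is just a sketch of an alternative: the function name expand_paths is hypothetical, and the demo creates its own throwaway files so it does not depend on what is on your disk.

```python
import glob
import os
import tempfile
from itertools import chain

def expand_paths(input_data):
    """Expand each path expression and flatten the results into one list."""
    return list(chain.from_iterable(glob.glob(p) for p in input_data))

# Demo in a temporary directory with three empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.txt', 'b.txt', 'c.log'):
        open(os.path.join(tmp, name), 'w').close()
    found = expand_paths([os.path.join(tmp, '*.txt'), os.path.join(tmp, 'c.log')])
    names = sorted(os.path.basename(f) for f in found)

print(names)  # ['a.txt', 'b.txt', 'c.log']
```

Note that glob.glob returns an empty list for a pattern that matches nothing, so a mistyped filename disappears silently.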

15. Parse key-value pair strings like ‘x=42.0,y=1’:

    kvp = lambda elem, t, i: t(elem.split('=')[i])
    parse_kvp_str = lambda args: dict([(kvp(elem, str, 0), kvp(elem, float, 1)) for elem in args.split(',')])
    parse_kvp_str('x=42.0,y=1')
    # {'x': 42.0, 'y': 1.0}
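If the two lambdas feel too dense, the same parsing can be written as a plain function. A sketch that assumes, like the original, that every value is a float:

```python
def parse_kvp_str(args):
    """Parse 'x=42.0,y=1' into {'x': 42.0, 'y': 1.0}."""
    result = {}
    for elem in args.split(','):
        # partition splits on the first '=' only
        key, _, value = elem.partition('=')
        result[key.strip()] = float(value)
    return result

parsed = parse_kvp_str('x=42.0,y=1')
print(parsed)  # {'x': 42.0, 'y': 1.0}
```

A named function also gives you a place for a docstring and shows up with a proper name in tracebacks.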

Postgres database functions

16. Upper case in Python (just for example):

    -- create extension plpythonu;
    CREATE OR REPLACE FUNCTION python_upper
    (
        input text
    ) RETURNS text AS
    $$
    return input.upper()
    $$ LANGUAGE plpythonu STRICT;

17. Convert IP address from text to integer:

    CREATE FUNCTION ip2int(input text) RETURNS integer
    LANGUAGE plpythonu
    AS $$
    if 'struct' in SD:
        struct = SD['struct']
    else:
        import struct
        SD['struct'] = struct
    if 'socket' in SD:
        socket = SD['socket']
    else:
        import socket
        SD['socket'] = socket
    return struct.unpack("!I", socket.inet_aton(input))[0]
    $$;

Convert IP address from integer to text:

    CREATE FUNCTION int2ip(input integer) RETURNS text
    LANGUAGE plpythonu
    AS $$
    if 'struct' in SD:
        struct = SD['struct']
    else:
        import struct
        SD['struct'] = struct
    if 'socket' in SD:
        socket = SD['socket']
    else:
        import socket
        SD['socket'] = socket
    return socket.inet_ntoa(struct.pack("!I", input))
    $$;

18. Commandline options

optparse-commandline-options snippet

    from optparse import OptionParser

    usage = "usage: %prog [options] arg "
    parser = OptionParser(usage=usage)
    parser.add_option("-x", "--some-option-x", dest="x", default=42.0, type="float",
                      help="a floating point option")
    (options, args) = parser.parse_args()
    print(options.x)
    print(args[0])
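optparse is deprecated in favour of argparse, so here is a sketch of roughly the same parser written with argparse. The positional argument name "arg" and the sample command line are my own choices for the demo.

```python
import argparse

parser = argparse.ArgumentParser(description="argparse version of the optparse snippet")
parser.add_argument("-x", "--some-option-x", dest="x", default=42.0, type=float,
                    help="a floating point option")
parser.add_argument("arg", help="positional argument")

# Passing an explicit list instead of reading sys.argv keeps the demo self-contained.
options = parser.parse_args(["-x", "1.5", "input.txt"])
print(options.x, options.arg)
```

Unlike optparse, argparse handles positional arguments itself, so there is no separate args list to index into.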

19. print-in-place (progress bar) snippet

    import time
    import sys

    for progress in range(100):
        time.sleep(0.1)
        sys.stdout.write("Download progress: %d%% \r" % (progress))
        sys.stdout.flush()

Packaging snippets

20. poor-mans-python-executable trick

Learned this trick from voidspace. The trick uses two files (__main__.py and hashbang.txt):

__main__.py:

    print 'Hello world'

hashbang.txt (adding a newline after ‘python2.6’ is important):

    #!/usr/bin/env python2.6

Build an “executable”:

    zip main.zip __main__.py
    cat hashbang.txt main.zip > hello
    rm main.zip
    chmod u+x hello

Run “executable”:

    $ ./hello
    Hello world

21. import-class-from-file trick

Import class MyClass from a module file (adapted from stackoverflow):

    # Note: imp was deprecated in Python 3.4 and removed in Python 3.12
    import imp

    mod = imp.load_source('name.of.module', 'path/to/module.py')
    obj = mod.MyClass()
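On recent Python versions, where imp is gone, importlib.util can do the same job. A sketch: the helper name load_module_from_path is hypothetical, and the demo writes a tiny module to a temporary directory so it runs on its own.

```python
import importlib.util
import os
import tempfile

def load_module_from_path(name, path):
    """Load a module object from an explicit file path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # actually runs the module's code
    return module

# Demo: create a throwaway module file and import MyClass from it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'module.py')
    with open(path, 'w') as f:
        f.write("class MyClass:\n    greeting = 'hello'\n")
    mod = load_module_from_path('demo_module', path)
    obj = mod.MyClass()

print(obj.greeting)
```

The module is not registered in sys.modules here, which is usually what you want for a one-off load.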

22. Occasional-usage snippets

Extract words from string

    words = lambda text: ''.join(c if c.isalnum() else ' ' for c in text).split()
    words('Johnny.Appleseed!is:a*good&farmer')
    # ['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']

23. IP address to integer and back

    import struct
    import socket

    def ip2int(addr):
        return struct.unpack("!I", socket.inet_aton(addr))[0]

    def int2ip(addr):
        return socket.inet_ntoa(struct.pack("!I", addr))
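On Python 3 the stdlib ipaddress module can do the same conversion without struct and socket. A sketch of that alternative (IPv4 only, like the snippet it mirrors):

```python
import ipaddress

def ip2int(addr):
    """Dotted-quad string to integer."""
    return int(ipaddress.IPv4Address(addr))

def int2ip(value):
    """Integer back to dotted-quad string."""
    return str(ipaddress.IPv4Address(value))

n = ip2int('192.168.0.1')
print(n, int2ip(n))
```

As a bonus, ipaddress raises a clear AddressValueError on malformed input.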

24. Fluent Python Interface

Copied from riaanvddool.

    # Fluent Interface Definition
    class sql:
        class select:
            def __init__(self, dbcolumn, context=None):
                self.dbcolumn = dbcolumn
                self.context = context
            def select(self, dbcolumn):
                return self.__class__(dbcolumn, self)

    # Demo
    q = sql.select('foo').select('bar')
    print(q.dbcolumn)  # bar
    print(q.context.dbcolumn)  # foo

Flatten a nested list

    def flatten(elems):
        """
        [['a'], ['b','c',['d'],'e',['f','g']]]
        """
        stack = [elems]
        top = stack.pop()
        while top:
            head, tail = top[0], top[1:]
            if tail: stack.append(tail)
            if not isinstance(head, list): yield head
            else: stack.append(head)
            if stack: top = stack.pop()
            else: break
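For comparison, a recursive sketch of the same idea using yield from (Python 3.3+). This is my own alternative, not the original author's version; it also copes with empty sublists, and it only treats list instances as nested, like the stack-based one.

```python
def flatten_recursive(elems):
    """Yield the leaves of arbitrarily nested lists, left to right."""
    for item in elems:
        if isinstance(item, list):
            # Delegate to a sub-generator for the nested list.
            yield from flatten_recursive(item)
        else:
            yield item

nested = [['a'], ['b', 'c', ['d'], 'e', ['f', 'g']]]
flat = list(flatten_recursive(nested))
print(flat)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```

The stack-based version avoids Python's recursion limit on very deep nesting; the recursive one is shorter to read.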

snap rounding

    import math

    EPSILON = 0.000001
    snap_ceil = lambda x: math.ceil(x) if abs(x - round(x)) > EPSILON else round(x)
    snap_floor = lambda x: math.floor(x) if abs(x - round(x)) > EPSILON else round(x)

merge-two-dictionaries snippet

    x = {'a': 42}
    y = {'b': 127}
    # Python 2 allowed dict(x.items() + y.items()); on Python 3.5+ use unpacking
    z = {**x, **y}
    # z = {'a': 42, 'b': 127}

25. anonymous-object snippet

Adapted from stackoverflow:

    class Anon(object):
        def __new__(cls, **attrs):
            result = object.__new__(cls)
            result.__dict__ = attrs
            return result

26. Alternative:

    class Anon(object):
        def __init__(self, **kwargs):
            self.__dict__.update(kwargs)
        def __repr__(self):
            return self.__str__()
        def __str__(self):
            return ", ".join(["%s=%s" % (key, value) for key, value in self.__dict__.items()])

27. generate-random-word snippet

Function that returns a random word (could also use random.choice with a list of words):

    import string, random

    # string.letters only exists on Python 2; Python 3 uses string.ascii_letters
    randword = lambda n: "".join([random.choice(string.ascii_letters) for i in range(n)])

setdefault tricks

Increment (and initialize) value:

    d = {}
    d[2] = d.setdefault(2, 39) + 1
    d[2] = d.setdefault(2, 39) + 1
    d[2] = d.setdefault(2, 39) + 1
    d[2]  # value is 42

29. Append value to (possibly uninitialized) list stored under a key in dictionary:

    d = {}
    d.setdefault(2, []).append(42)
    d.setdefault(2, []).append(127)
    d[2]  # value is [42, 127]

Binary tricks

30. swap-integers-using-XOR snippet

Swap two integer variables using the XOR swap algorithm:

    x = 42
    y = 127
    x = x ^ y
    y = y ^ x
    x = x ^ y
    x  # value is 127
    y  # value is 42
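The XOR swap is a fun bit trick, but in everyday Python the usual way to swap two variables is tuple unpacking. It works for any type, not just integers:

```python
x, y = 42, 127

# The right-hand side is evaluated first, then unpacked into x and y.
x, y = y, x

print(x, y)  # 127 42
```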

I know that most of it has been mentioned already, but I think you should find some new tricks as well.