Advertisement
rdrewd

Letter about Problem 22

Mar 19th, 2013
57
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 7.26 KB | None | 0 0
  1. Drew Davis <r.drew.davis@gmail.com>
  2. 1/19/12
  3.  
  4. to Timothy
  5. Well this has been fun! Attached is my improved version of problem22.
  6. The first big change was to add a NameScorer class. When you instantiate
  7. it, you tell it the alphabet of symbols it'll be asked to process.
  8. The first symbol is counted as value 1, the 2nd symbol is given a value
  9. of 2, etc. The __init__ routine of NameScorer builds the valuedict,
  10. which when you come down to it, is just an implementation detail of
  11. how we compute the score for a name. Assuming ASCII and using the
  12. ord(c)-ord('A') approach is another way to do it. The new approach
  13. removes the valuedict from main(). That looked out of place in main()
  14. and I either had to make it global (which is considered bad form) or
  15. do what my previous version did: pass the valuedict in to the nameScore
  16. routine each time it was called!. Happy to be rid of that.
  17.  
  18. What I learned as I wrote my first class is that a self. qualifier is
  19. very important when trying to refer to a symbol that is part of the
  20. instance of the class. multiple times the tests aborted with a message
  21. that there's no such global symbol as valdict. I'd have to correct
  22. the code to say self.valdict and hit my forehead while exclaiming Doh!
  23. I'm going to have to look for a short cut like import has so you don't
  24. have to repeatedly qualify a symbol reference with the module name.
  25.  
  26. The next big change didn't actually survive to the final version. I had
  27. "for" loops that iterated over the alphabet or the name list, but I also
  28. had a counter that had to tick along with each iteration. That meant an
  29. initialization statement before the "for" and an incrementing statement in
  30. the body of the loop. There isn't wonderful documentation of the syntax
  31. in the Python books I have here, but you can say for (i, name) in <list
  32. of tuples> and thus have both the names and the counter automatically
  33. handled by the "for". That left almost nothing inside the "for" and it
  34. was then that I realized, the initialization of the value dictionary,
  35. could be handled by a "list comprehension" generating the tuples that
  36. then just become an argument to dict(). And the totaling of the weighted name
  37. scores in main could be done by invoking sum(i*ns.score(name) for (i, name) in
  38. zip(range(1,len(nameList)+1), namelist))
  39.  
  40. The books I have here say that current Python implementations implement
  41. list comprehensions like that by actually generating the list of tuples
  42. and iterating over that. Apparently the intent is that some day the
  43. tuples will be generated as needed. But heck. Memory is plentiful.
  44.  
  45. zip. by the way is a function that given 2 lists, generates tuples by
  46. pairing together the i-th element in one list with the i-th element in
  47. the other list. It only generates as many tuples as the shorter of the
  48. 2 lists, but both my lists were the same length.
  49.  
  50. Somewhat invisible in the code is that while making these changes,
  51. I learned a few more basic tricks with the git version control system.
  52. I checked in yesterday's problem22.py to git. That version was implicitly
  53. the master branch.
  54.  
  55. Then I created an experimental branch (yeah, this is overkill for a one
  56. person operation, but remember I'm doing this to learn). Each time I
  57. had a working change to problem22.py, I'd check it into the experimental
  58. branch. My last incremental change was in recognition of the ease of
  59. getting old versions thanks to git and frequent well commented logs of
  60. what changed. So I decided that my old habit of leaving test prints,
  61. etc. embedded into the code, albeit commented out, was just making it harder
  62. to read the program. So I weeded that commented-out scaffolding out of
  63. the final version.
  64.  
  65. One disapointment was I was hoping the list comprehensions would get
  66. interpreted faster than the for loops, albeit at the price of using
  67. more memory. But it runs no faster, maybe even 10ms slower, but that's
  68. in the noise range, so more tests would be needed to pin down what is
  69. actually happening. I did accidently give myself one clue. I added
  70. print statements to show the type of the grammar variable and of the
  71. grammar.parsefile(...) result that I had to convert to a vanilla list
  72. before I could sort the list. I was lazy, so rather than tear apart
  73. the return(list(grammar.parsefile(...))) so I'd have an intermediate
  74. result that I could pass to type() to see what it was, I left the return
  75. statement alone and preceded it with a print statement that changed
  76. list(...) to type(...). That meant I parsed the file twice. The total
  77. CPU time nearly doubled, so without resorting to grand profiling tools
  78. (which I confess I don't yet know anyhow - and profiling a 320ms run is not
  79. such a great thing to do!), but we have some evidence that the vast majority
  80. of the run time is going into pyparsing as it reads and parses the 5000
  81. or so names in the input file. So hoping that tweeking the other loops and
  82. argument passing would improve CPU time was naive on my part. Oops.
  83.  
  84. Oh, and the type of the grammar is "pyparsing.And", which agrees withe the
  85. top level of delimited list generating an expression using the overloaded
  86. + operator. I'm quite certain that the type of "grammar" is a subclass
  87. of parser element, so a grander grammer could be built up with this as
  88. just one part of it, Mercifully, I didn't need anything grander.
  89.  
  90. The type of grammar.parseFile(...) is 'pyparsing.ParseResults". I didn't
  91. dig deeper to see why the thing looks like a list, but won't sort.
  92. It occurs to me that sort depends heavily on the comparison operators
  93. and presumably copy operators to swap values around. Perhaps pyparsing
  94. overloads those operators for its ParseResults type. If and when future
  95. problems cause me to revisit pyparsing, I'll take a closer look.
  96.  
  97. Attached is a PDF file of the listing of the "final" program as produced
  98. by the Eric IDE (in color!). The entire listing now handily fits on a
  99. single page.
  100.  
  101. I ran the file through sloccount and it says 15 non-comment, non-blank
  102. lines of source code. It also estimated that creating 15 lines of Python
  103. code like this would take about 0.65 months, which I figure is 2-3 weeks,
  104. which is scarily close to right. But if I wasn't taking the time to
  105. learn a new language and its modules and git, all at the same time,
  106. I'd like to believe I could knock out 15 lines of code in much less time.
  107.  
  108. (Of course, working on my own, my time is not getting sapped into
  109. Affirmative Action meetings, design reviews, questions from the peanut
  110. gallery and all those other things that motivated me to work at home when
  111. I was doing real design or real coding. So maybe in a real corporate
  112. environment, that 2-3 week estimate still would apply, even for a seasoned
  113. Pythonista, which I don't claim to be yet!)
  114.  
  115. My one regret still is that I'm still not doing test-driven-development.
  116. My programs don't have built-in self-testing capability to watch out for
  117. any regressions as changes are made. I'll have to force myself to try
  118. that stuff in some future problems. At this point, I'm not even sure
  119. where the test code would go. A separate file in place of "__main__".
  120. And, just to keep life interesting, Python has multiple unit-test modules
  121. available, so picking one intelligently will take some work. One of the
  122. unit-test packages is called "nose". If I pick that one will you go "Ewww!"
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement