Advertisement
Guest User

phase-exec.py

a guest
Jul 19th, 2017
251
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 5.98 KB | None | 0 0
  1.  
  2. """
  3. Rules:
  4.  
  5. Cycle may contain any number of independent:
  6.  
  7. * Virtual (takes no time but affects dependecy graph)
  8. alloca
  9. * Loads. So can't group any of:
  10. %arrayidx = getelementptr inbounds i8*, i8** %argv, i32 %i
  11. %0 = load i8*, i8** %arrayidx
  12. %1 = load i8, i8* %0
  13. * Compute.
  14. binops: .add, .sub, .mul, .div, .rem, *sh*, and, or, xor
  15. other: getelementptr
  16. cross-phase: select (implies pick), br-cmp (implies compute and branch)
  17. Example Group:
  18. %conv26.i = sext i8 %9 to i32
  19. %cmp27.i = icmp sgt i8 %9, 57
  20. But not:
  21. %sub31.sub.v.i = select i1 %cmp27.i
  22. * Conversion
  23. ie: *trunc, *ext, *to*, *cast,
  24. * Pick
  25. vector: [insert|extractelement], shufflevector
  26. * Store:
  27. store i64 %4, i64* %5
  28. store %struct.list_head_s* %incdec.ptr, %struct.list_head_s** %next
  29. * Branch
  30. terminators: ret, br, switch
  31. (indirectbr, invoke, resume, catchswitch, catchret, cleanupret, unreachable - not present in benchmark)
  32. other: call,
  33.  
  34.  
  35. Store may depend on Compute and Load. Compute may depend on Load. Due to phased
  36. execution. Also, Branch may depend on anything.
  37.  
  38. There's some ways Compute can depend on Compute, like fused-multiply-add. But
  39. it's complitated, needs reordering. Never-mind for now.
  40.  
  41. [exract|insert]value aren't in benchmark, luckily; No idea how to handle
  42. alloca barely appears.
  43. fence does not appear, and Mill is fence-free besides
  44. cmpxchg, atomic* does not appear - would be complicated
  45. phi - DNA
  46. call malloc, putchar, etc - few in benchmark, treat as branch
  47. va_arg,
  48. ignore multicycle effects (add|div) - assume magic scheduler
  49.  
  50.  
  51. TODO: histogram instruction categories
  52. loop unrolling?
  53. reorder, hoist?
  54. work backwords to spec #pipes, FUs, etc
  55. PGO the code to remove unmodeled branch-prediction effects
  56. Handle switch fairly for Mill and Intel -- one-line switch reduces input program by 10%
  57.  
  58. Features:
  59. * AIR - Apparently Infinite Registers
  60. * PEX - Phased Execution
  61. * EFU - Enough Functional Units
  62. * EDW - Enough Decode bandWidth
  63.  
  64. Notes:
  65. * If no instructions fuse, IPC=1 by definition
  66. * Switches are single-cycle -- EFU
  67.  
  68. Algorithm:
  69. * Load instruction list
  70. * Reverse the list
  71. * For each instruction, get operation, phase, operands, result
  72. * If any operand appears in the result-list for this phase or a subsequent phase emit the old and start a new fused instruction
  73. * Otherwise, add this instruction to the old fused instruction
  74.  
  75. Output is a reversed list of fused instructions
  76.  
  77. IPC = InstructionsIn / FusedInstructionsOut
  78.  
  79. Fans guess Mill will have IPC>10. Skepics guess IPC>2 is hard; IPC>3 is
  80. impossible. Hoping to dissapoint everyone.
  81.  
  82. """
  83.  
  84. from sys import stdin
  85. from re import *
  86. from os import abort
  87.  
  88. match_instr = compile(r"\s*(%\S+)?\s*=?\s*(\S+)\s*(.*)").match
  89. findall_operands = compile(r"(%[^, \(\)\[\]]+)").findall
  90.  
  91. # PHASES
  92. LOAD = 2
  93. COMPUTE = 3
  94. CONVERT = 4
  95. PICK = 5
  96. STORE = 6
  97. BRANCH = 7
  98. # FINAL phase for sizing tables, not a real phase
  99. FINAL = 7
  100. # Alloca, not a real instruction
  101. VIRTUAL = 1
  102.  
  103. def M(op):
  104. return compile(op).match
  105.  
  106. # TODO: special case branch: goes in COMPUTE phase if conditional, BRANCH phase if unconditional
  107. op_phase = { M('alloca'): VIRTUAL,
  108. M('load'): LOAD,
  109. M('store'): STORE,
  110. M('insertelement'): STORE,
  111. M('.*(add|sub|mul|div|rem|sh|and|or|xor|cmp).*'): COMPUTE,
  112. # switch is COMPUTE because its a mix of COMPUTE and BRANCH and COMPUTE is earlier than BRANCH
  113. M('switch'): COMPUTE,
  114. M('getelementptr'): COMPUTE,
  115. M('select'): COMPUTE,
  116. M('.*(trunc|ext|to|cast).*'): CONVERT,
  117. M('br'): BRANCH,
  118. M('call|ret|tail'): BRANCH
  119. }
  120.  
  121. def which_phase(op, operands):
  122. """returns phase for op
  123.  
  124. Examines operands for branch; goes in COMPUTE phase when
  125. conditional"""
  126.  
  127. if op == 'br':
  128. if(len(operands) > 1):
  129. return COMPUTE
  130.  
  131. for matches, phs in op_phase.iteritems():
  132. if matches(op):
  133. return phs
  134. # XXX raise
  135. print "ERROR: which_phase", op, operands
  136. exit(1)
  137.  
  138. def parse(inst):
  139. """returns (op, phase, operands, result)
  140.  
  141. result may be None
  142. """
  143. res = match_instr(inst)
  144. if res is not None:
  145. (result, op, tail) = res.groups()
  146. else:
  147. print "parse('%s') failed" % inst
  148. exit(1)
  149.  
  150. operands = findall_operands(tail)
  151. phase = which_phase(op, operands)
  152. return (op, phase, operands, result)
  153.  
  154. split_instructions = compile(r"\s*About to interpret:\s+",
  155. MULTILINE | DOTALL).split
  156.  
  157. def join_lines(ls):
  158. return ls.replace("\n", "")
  159.  
  160. match_blank = compile(r"^\s*$").match
  161.  
  162. def not_blank(s):
  163. return match_blank(s) is None
  164.  
  165. def conflicts(phops, phase, result):
  166. """return True when the result is an operand of this or any previous
  167. phase"""
  168. for phs in range(phase + 1):
  169. if result in phops[phs]:
  170. # print "CONFLICT: ", result, "used in phase", phs, "supplied in phase", phase
  171. return True
  172. return False
  173.  
  174. def empty_phased_operands():
  175. r = []
  176. for i in range(FINAL + 1):
  177. r.append(set())
  178. return r
  179.  
  180. # Read all input, split by instruction and reverse
  181. # (Most instructions are 1-per line, but switch is multi-line, damnit)
  182. def go():
  183. insts = reversed(
  184. filter(not_blank,
  185. map(join_lines,
  186. split_instructions(stdin.read()))))
  187.  
  188. fused = []
  189. # phased_operands[phase] = [operands]
  190. phased_operands = empty_phased_operands()
  191.  
  192. for inst in insts:
  193. # print inst
  194. op, phase, operands, result = parse(inst)
  195. if conflicts(phased_operands, phase, result):
  196. print len(fused), result, fused
  197. fused = []
  198. phased_operands = empty_phased_operands()
  199. fused.append(inst)
  200. phased_operands[phase] |= set(operands)
  201. # print "PHASE", phase, "OPNDS", phased_operands
  202.  
  203. # Last instruction out
  204. if fused:
  205. print fused
  206.  
  207. go()
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement