Advertisement
Guest User

Virtual Machine How To

a guest
Jul 20th, 2019
61
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
Python 9.69 KB | None | 0 0
  1. # bitfields
  2. Packing and extracting bit fields in Python integers.
  3. This is the first in a series of projects in which we will build and program a simulated computer.
  4.  
  5. ## Context: Why we need bitfields
  6.  
  7. Executable *machine code* instructions for a computer are stored in memory in binary, in a series of memory *words*.  Words in the DM2018S consist of 32 binary digits (bits), or 4 bytes.   Memory is uniform and untyped:  The same memory word could be interpreted as an integer or as an instruction or a few characters of text or part of a floating point number.  Treating it as an instruction is simply a matter of how we use it.
  8.  
  9. Each instruction for the computer needs to specify several things: An operation to be performed, operand values or the locations of operand values, where to put the result of the operation, and sometimes some additional information. It would be inefficient to keep each part of that information in a separate memory word.  Instead, all modern computers *pack* this information into *fields* in a smaller number of words.  Many computers following the *RISC* design style pack all the fields of an instruction into a single memory word.  The processor in your phone is likely based on RISC design principles. The DM2018S is also a RISC machine, based loosely on the popular ARM processor used in many cell phones and in the Raspberry Pi.
  10.  
  11. Each of the fields of an instruction (e.g., the operation code, and the index of the register that should receive the result of the operation) can be represented in a few binary digits.  For example, if there are fewer than 32 (2^5) operation codes, then they can be represented by a 5-bit binary number.  If the number of bits required to represent each field sum to 32 or less, then we can *pack* all of them into a 32-bit memory word.  That is what we will do with DM2018S instructions.
  12.  
  13. ## BitField objects
  14.  
  15. A BitField object is a tool for inserting a field into an integer or extracting a field from an integer.
  16.  
  17. For example, suppose we wanted to keep some values in the first four bits (bits 0..3) and other values in the second four bits (bits 4..7) of integers.  We would define two BitField objects:
  18.  
  19. ```
  20. low4 = BitField(0,3)
  21. mid4 = BitField(4,7)
  22. ```
  23.  
  24. Then we could pack two small numbers, each between 0 and 15 inclusive, into one integer:
  25.  
  26. ```
  27. packed = 0
  28. packed = low4.insert(7, packed)
  29. packed = mid4.insert(8, packed)
  30. ```
  31.  
  32. At this point the value of packed is 135, but let's not think about it in decimal.  Think in hex, where each hexadecimal digit represents 4 binary bits.  Now the value is clear:
  33.  
  34. ```
  35. >>> hex(packed)
  36. '0x87'
  37. ```
  38. There is our 7, in the low digit, and our 8, in the high digit.  
  39.  
  40. We can also use the BitField object to extract the individual fields:
  41.  
  42. ```
  43. >>> low4.extract(packed)
  44. 7
  45. >>> mid4.extract(packed)
  46. 8
  47. ```
  48.  
  49. ## Putting BitFields to Work
  50.  
  51. The arithmetic logic unit (ALU) is the part of the central processing unit (CPU) that actually performs arithmetic operations.  There is more to executing a CPU instruction than performing that operations (e.g., we need a way of sequencing operations, and we need a way of moving data between the CPU and main memory); we will build those parts in the next project.  For now, the ALU is enough to put BitField objects to work.
  52.  
  53. We can see how DM2018S instructions are packed into memory words in instr_format.py:
  54.  
  55. ```
  56. reserved = BitField(31,31)
  57. instr_field = BitField(26, 30)
  58. cond_field = BitField(22, 25)
  59. reg_target_field = BitField(18, 21)
  60. reg_src1_field = BitField(14, 17)
  61. reg_src2_field = BitField(10, 13)
  62. offset_field = BitField(0, 9)
  63. ```
  64.  
  65. ```instr_field``` will hold an instruction code, which is defined by a Python enumeration class:
  66.  
  67. ```
  68. class OpCode(Enum):
  69.    """The operation codes specify what the CPU and ALU should do."""
  70.    # CPU control (beyond ALU)
  71.    HALT = 0    # Stop the computer simulation (in Duck Machine project)
  72.    LOAD = 1    # Transfer from memory to register
  73.    STORE = 2   # Transfer from register to memory
  74.    # ALU operations
  75.    ADD = 3     # Addition
  76.    SUB = 5     # Subtraction
  77.    MUL = 6     # Multiplication
  78.    DIV = 7     # Integer division (like // in Python)
  79. ```
  80. Since there are only 8 operation codes, we could have reserved as few as 3 binary digits to hold an operation code, but we have allocated 5 bits so that up to 32 operation codes could be accommodated in the future.
  81.  
  82. The DM2018S has 16 *registers*, which are like very fast memory cells built into the CPU itself (in contrast to main memory, which is a separate hardware component).   The ALU does not access registers directly, but the values fed into the ALU come from registers, and the result value produced by the ALU is then stored in a register.  Since there are 16 registers, we need 4 binary digits to specify each of the two source registers and the target register:
  83.  
  84. ```
  85. reg_target_field = BitField(18, 21)
  86. reg_src1_field = BitField(14, 17)
  87. reg_src2_field = BitField(10, 13)
  88. ```
  89.  
  90. In addition, the instruction holds a 10 bit integer called the *offset*, which is added to the second source register before it is fed to the ALU.  While the other fields are interpreted as *unsigned* integers, the offset field is a *signed* integer, i.e., it may be positive or negative.  Thus its highest digit, bit 9, is 1 if the number is negative and 0 if the number is positive.  When we extract this number from an instruction, it will be necessary to *sign extend* negative values to create valid negative Python integers.
  91.  
  92. One bit of the DM2018S instruction word is reserved for future expansion.  There is also a *condition* field, which will be important in the next project for controlling loops in DM2018S machine code programs, but we will ignore it for now.
  93.  
  94. Since our ALU is not yet part of a complete CPU, we have a very small driver program that executes a very short DM2018S program.  Here's the program:
  95.  
  96. ```
  97. 264503311
  98. 264782848
  99. 399278085
  100. 466403330
  101. ```
  102. Really, that's the program.  Each of those 32-bit integers contains the fields of a DM2018S instruction, packed into bit fields.  We need to unpack or *decode* them, one at a time, to execute the program.  In lieu of a whole CPU to do that for us, our program ```example_tiny_run.py``` drives the process:
  103.  
  104. ```
  105.    for word in words:
  106.        instr = instr_format.decode(word)
  107.        op = instr.op
  108.        to_reg = instr.reg_target
  109.        src_1 = instr.reg_src1
  110.        src_2 = instr.reg_src2
  111.        offset = instr.offset
  112.        result, condition = alu.exec(op,
  113.            regs[src_1], regs[src_2] + offset)
  114.        regs[to_reg] = result
  115. ```
  116.  
  117. ```instr_format.decode``` uses bitfield objects to extract each of the instruction fields from the word and form an Instruction object, which guides how we call ```alu.exec```.      
  118.  
  119. But what does this program mean?  Here is how it is created by another tiny program, ```example_tiny_create.py```, from a sequence of Instruction objects:
  120.  
  121. ```
  122. def gen(s: str) -> int:
  123.    """Create the integer encoding of one instruction from a string"""
  124.    instr = instruction_from_string(s)
  125.    encoding = instr.encode()
  126.    return encoding
  127.  
  128. def make_sample() -> List[int]:
  129.    """Create a small, arbitrary little sequence of
  130.    instructions encoded as 32-bit binary integers.
  131.    """
  132.    return  [
  133.        gen("ADD  ALWAYS r1  r0 r0  15"), # Initializes r1 to 15
  134.        gen("ADD  ALWAYS r2  r1 r1  0"),  # r2 = r1 + r1
  135.        gen("SUB  ALWAYS r3  r2 r0  5"),  # r3 = r2 - 5
  136.        gen("MUL  ALWAYS r3  r3 r0  2")   # r3 = r3 * 2for
  137.    ]
  138. ```
  139.  
  140. We expect ```example_tiny_run.py``` to essentially reverse the encoding process, recovering and executing these instructions.  If it works correctly, we will see the following log output from ```example_tiny_run.py```:
  141.  
  142. ```
  143. INFO:__main__:Executing ADD      r1,r0,r0[15]
  144. INFO:__main__:Computed value 15
  145. INFO:__main__:Executing ADD      r2,r1,r1[0]
  146. INFO:__main__:Computed value 30
  147. INFO:__main__:Executing SUB      r3,r2,r0[5]
  148. INFO:__main__:Computed value 25
  149. INFO:__main__:Executing MUL      r3,r3,r0[2]
  150. INFO:__main__:Computed value 50
  151. ```
  152.  
  153. It is not a very exciting program, just computing some values in registers, but it's about the best we can do with just an arithmetic logic unit.
  154.  
  155. ## Project Steps
  156.  
  157. This is a project that requires more thinking than coding.  Be sure you really understand how each part works.  If some parts don't make sense, ask us about them!
  158.  
  159. Start by completing BitFields class.  There is a test program, test_bitfields.py, to help you determine whether your BitFields class is working.  The test program may also help you clarify questions about exactly how BitFields objects are used and their expected behavior.
  160.  
  161. Completing BitFields project will require very little code.  However, it is code that can require very careful thinking, and needs to be well-documented.  For example, my implementation of method BitField.insert has only four lines of code, three lines of comments (excluding the docstring), and a six-line docstring including an example.  That's a high ratio of documentation to code!
  162.  
  163. When the BitFields class is working, then you can move on to finishing the ALU class.  In hardware, the ALU would use [multiplexer](https://en.wikipedia.org/wiki/Multiplexer) and demultiplexer circuits to direct its operands through a circult determined by the operation code.  A software analogue is to look up the function to apply in a table of functions.  This is not very complex, and it isn't much code, but it may be something you are not very familiar with, so I have provided part of the table and left part of it for you to fill in.
  164.  
  165. There is a test program for the ALU.  When you have completed the ALU and can pass those tests, you should also be able to run ```tiny_example_run.py```.
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement