tthtlc

Untitled

Mar 24th, 2012
  1. DIVISION OF ENGINEERING AND APPLIED SCIENCES
  2. HARVARD UNIVERSITY
  3. CS 161. Operating Systems
  4.  
  5. Matt Welsh
  6. Spring 2005
  7. MIPS r2000/r3000 Architecture
  8. Architecture/assembler summary
  9.  
  12. (This is not intended to be either a comprehensive reference or a tutorial. More information is available from www.mips.com.)
  13.  
  14. Registers
  15. Instructions
  16. Synthetic instructions
  17. Delay slots
  18. Exceptions
  19. Segments
  20. Registers
  21.  
  22. There are 32 general-purpose registers and 3 special registers on the MIPS r2k itself. There are also up to 32 registers each on up to four coprocessors. For CS161 purposes, there is only one coprocessor, coprocessor 0, which is the "system coprocessor"; it takes care of exceptions and virtual memory issues.
  23. Register  Symbolic name  Saved by  Description
  26. General registers
  27. $0 z0, ZERO N/A Always contains 0, no matter what's written to it.
  28. $1 AT caller Assembler temporary. See below.
  29. $2 v0 caller Value 0. Used for computations; function return value is placed here. Also holds the system call number on syscall entry.
  30. $3 v1 caller Value 1. Used for computations; upper word of 64-bit return value is placed here.
  31. $4 a0 caller Argument 0. First function argument goes here.
  32. $5 a1 caller Argument 1. Second function argument goes here.
  33. $6 a2 caller Argument 2. Third function argument goes here.
  34. $7 a3 caller Argument 3. Fourth function argument goes here. Also used as a flag value on system call return.
  35. $8 t0 caller General-purpose temporary register.
  36. $9 t1 caller General-purpose temporary register.
  37. $10 t2 caller General-purpose temporary register.
  38. $11 t3 caller General-purpose temporary register.
  39. $12 t4 caller General-purpose temporary register.
  40. $13 t5 caller General-purpose temporary register.
  41. $14 t6 caller General-purpose temporary register.
  42. $15 t7 caller General-purpose temporary register.
  43. $16 s0 callee General-purpose saved register.
  44. $17 s1 callee General-purpose saved register.
  45. $18 s2 callee General-purpose saved register.
  46. $19 s3 callee General-purpose saved register.
  47. $20 s4 callee General-purpose saved register.
  48. $21 s5 callee General-purpose saved register.
  49. $22 s6 callee General-purpose saved register.
  50. $23 s7 callee General-purpose saved register.
  51. $24 t8 caller General-purpose temporary register.
  52. $25 t9 caller General-purpose temporary register.
  53. $26 k0 nobody Kernel scratch register.
  54. $27 k1 nobody Kernel scratch register.
  55. $28 gp global Global pointer. Constant for any given process.
  56. $29 sp N/A Stack pointer.
  57. $30 s8 callee Saved register #8 - conventionally, but not always, a frame pointer.
  58. $31 ra caller Return address of function.
  59. Special registers
  60. HI - caller High-order word of 64-bit multiply result, or remainder of divide result.
  61. LO - caller Low-order word of 64-bit multiply result, or quotient of divide result.
  62. PC - N/A Program counter.
  63. Coprocessor 0
  64. cop0 $0 c0_index N/A TLB entry index register.
  65. cop0 $1 c0_random N/A TLB randomized access register.
  66. cop0 $2 c0_entrylo N/A Low-order word of "current" TLB entry.
  67. cop0 $4 c0_context N/A Page-table lookup address.
  68. cop0 $8 c0_vaddr N/A Virtual address associated with certain exceptions.
  69. cop0 $10 c0_entryhi N/A High-order word of "current" TLB entry.
  70. cop0 $12 c0_status N/A Processor status register.
  71. cop0 $13 c0_cause N/A Exception cause register.
  72. cop0 $14 c0_epc N/A PC at which exception occurred.
  73. Any of the 32 general-purpose registers can be used in any instruction that takes register operands. The special registers are accessed using special instructions; the coprocessor registers can be accessed by using special coprocessor instructions to move their values to general registers and back.
  74. Register $31 is the "link register". Most of the instructions for calling subroutines are hardwired to store the return address into this register. (The jalr instruction is, for some reason, an exception.)
  75.  
  76. The coprocessor 0 registers have various bit fields in them. These are:
  77.  
  78. c0_index
  79. Bits Name Description
  80. 31 P Set by the tlbp instruction if the probe fails.
  81. 14-30 unused
  82. 8-13 Index TLB entry number for tlbwi, tlbr, and tlbp.
  83. 0-7 unused
  84. c0_random
  85. Bits Name Description
  86. 14-31 unused
  87. 8-13 Random Semi-random TLB entry number used by tlbwr. Updated by processor. Never takes a value in the range 0-7.
  88. 0-7 unused
  89. c0_entrylo
  90. Bits Name Description
  91. 12-31 PFN Physical page number (bits 12-31 of address) for VM mapping.
  92. 11 N Non-cacheable; if set, RAM cache is disabled accessing this page.
  93. 10 D Dirty; if set, page may be written to.
  94. 9 V Valid; if set, page may be accessed.
  95. 8 G Global; if set, valid in every address space.
  96. 0-7 unused
  97. c0_context
  98. Bits Name Description
  99. 21-31 PTEBase Base address of page table. Untouched by hardware; maintained by software.
  100. 2-20 BadVPN Offset into page table for a kuseg fault (bits 12-30 of c0_vaddr), set by hardware. Bits 0-1 are unused and read as zero.
  101. c0_vaddr
  102. Bits Name Description
  103. 0-31 vaddr Failing virtual address; set by certain exceptions.
  104. c0_entryhi
  105. Bits Name Description
  106. 12-31 VPN Virtual page number (bits 12-31 of address) for VM mapping.
  107. 6-11 PID ID of address space in which virtual address exists.
  108. 0-5 unused
  109. c0_status
  110. Bits Name Description
  111. 28-31 CU If these bits are set, the corresponding coprocessors are usable. If clear, use of said coprocessors will generate a coprocessor unusable exception.
  112. 23-27 unused
  113. 22 BEV If set the "bootstrap" exception handler addresses are used.
  114. 21 TS If set to 1, the processor is dead in the water and needs to be reset.
  115. 20 PE Set to 1 if a cache parity error occurs. Clear by writing 1.
  116. 19 CM Set to 1 if the most recent data cache load missed, but only if IsC is set.
  117. 18 PZ If set to 1, uses space parity for outgoing data.
  118. 17 SwC If set, the cache control lines affect the instruction cache rather than the data cache.
  119. 16 IsC If set, the data cache is detached from main memory. (For flushing.)
  120. 8-15 IntMask Interrupt mask. An interrupt can cause an interrupt exception only while its bit here is set (and IEc is set); while the bit is clear, that interrupt is masked.
  121. 6-7 unused
  122. 5 KUo Old kernel/user mode bit (1 = user mode)
  123. 4 IEo Old interrupt enable bit (0 = mask all interrupts)
  124. 3 KUp Previous kernel/user mode bit (1 = user mode)
  125. 2 IEp Previous interrupt enable bit (0 = mask all interrupts)
  126. 1 KUc Current kernel/user mode bit (1 = user mode)
  127. 0 IEc Current interrupt enable bit (0 = mask all interrupts)
  128. c0_cause
  129. Bits Name Description
  130. 31 BD Set if last exception occurred in a branch delay slot.
  131. 30 unused
  132. 28-29 CE Coprocessor number resulting from a coprocessor unusable exception.
  133. 16-27 unused
  134. 10-15 IP Bits reflecting the state of the external hardware interrupt lines. Bit 10 is irq 0.
  135. 8-9 Sw Software interrupts. Like IP, but controlled by software.
  136. 6-7 unused
  137. 2-5 ExcCode An exception code, from the list below.
  138. 0-1 unused
  139. c0_epc
  140. Bits Name Description
  141. 0-31 epc Program counter for restarting after exception.
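
The bit tables above translate directly into shift-and-mask helpers. The following is a minimal sketch in C of how an exception handler might pull fields out of c0_cause and c0_status; the function names are made up for illustration and are not the macros OS/161 itself defines.

```c
#include <stdint.h>

/* Illustrative field extractors; names are hypothetical. */

/* c0_cause */
static inline unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0xfu; }   /* bits 2-5   */
static inline unsigned cause_sw(uint32_t cause)      { return (cause >> 8) & 0x3u; }   /* bits 8-9   */
static inline unsigned cause_ip(uint32_t cause)      { return (cause >> 10) & 0x3fu; } /* bits 10-15 */
static inline unsigned cause_ce(uint32_t cause)      { return (cause >> 28) & 0x3u; }  /* bits 28-29 */
static inline unsigned cause_bd(uint32_t cause)      { return (cause >> 31) & 0x1u; }  /* bit 31     */

/* c0_status, "current" pair only */
static inline unsigned status_iec(uint32_t status)   { return status & 0x1u; }         /* bit 0 */
static inline unsigned status_kuc(uint32_t status)   { return (status >> 1) & 0x1u; }  /* bit 1, 1 = user mode */
```

For example, a c0_cause value of 0x80000020 decodes as ExcCode 8 with BD set.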
  142.  
  143. Instructions
  144.  
  145. This table uses the following symbols:
  146. RD, RS, RT Up to three general registers ($0-$31)
  147. HI, LO The special "hi" and "lo" registers
  148. HI:LO "hi" and "lo" as a single 64-bit value
  149. C0_REG A coprocessor 0 register
  150. signed-IMM Immediate value IMM, sign-extended to 32 bits
  151. unsigned-IMM Immediate value IMM, zero-extended to 32 bits
  152. offset Branch or memory-access offset (always signed)
  153. signed- Value is interpreted as signed
  154. unsigned- Value is interpreted as unsigned
  155. address Immediate address for jump
  156. These are the instructions (there are a few not listed, including all the floating-point operations, but this should include anything we'll see in CS161.)
  157. In the opcode names, "u" means "unsigned"; "i" means immediate; the "al" in some jump instructions means "and link", meaning "function call".
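
As a rough illustration of the signed-IMM and unsigned-IMM symbols, here is a small C sketch of the two ways a 16-bit immediate is widened to 32 bits, plus the two immediate comparisons that depend on them. The function names are invented for this sketch; the sltiu behavior follows the description given in the table (sign-extend first, then compare as unsigned).

```c
#include <stdint.h>

/* signed-IMM: copy bit 15 into bits 16-31. */
static inline uint32_t sign_extend16(uint16_t imm)
{
    return (imm & 0x8000u) ? (0xffff0000u | imm) : (uint32_t)imm;
}

/* unsigned-IMM: bits 16-31 are zero. */
static inline uint32_t zero_extend16(uint16_t imm)
{
    return (uint32_t)imm;
}

/* slti: signed comparison. Flipping the sign bit of both operands turns
   a signed comparison into an unsigned one without relying on casts. */
static inline uint32_t slti(uint32_t rs, uint16_t imm)
{
    return (rs ^ 0x80000000u) < (sign_extend16(imm) ^ 0x80000000u);
}

/* sltiu: the immediate is still sign-extended, then compared as unsigned. */
static inline uint32_t sltiu(uint32_t rs, uint16_t imm)
{
    return rs < sign_extend16(imm);
}
```

With an immediate of 0xffff, slti compares against -1 while sltiu compares against 0xffffffff, so a register holding 5 yields 0 for slti and 1 for sltiu.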
  158.  
  159. Instruction Operation Notes
  160. add RD, RS, RT RD = RS + RT; exception on overflow
  161. addi RT, RS, IMM RT = RS + signed-IMM; exception on overflow
  162. addiu RT, RS, IMM RT = RS + signed-IMM
  163. addu RD, RS, RT RD = RS + RT
  164. and RD, RS, RT RD = RS & RT
  165. andi RT, RS, IMM RT = RS & unsigned-IMM
  166. beq RS, RT, branch-offset if (RS == RT) NEXTPC += (branch-offset << 2)
  167. bgez RS, branch-offset if (signed-RS >= 0) NEXTPC += (branch-offset << 2)
  168. bgezal RS, branch-offset $31 = NEXTPC; if (signed-RS >= 0) NEXTPC += (branch-offset << 2)
  169. bgtz RS, branch-offset if (signed-RS > 0) NEXTPC += (branch-offset << 2)
  170. blez RS, branch-offset if (signed-RS <= 0) NEXTPC += (branch-offset << 2)
  171. bltz RS, branch-offset if (signed-RS < 0) NEXTPC += (branch-offset << 2)
  172. bltzal RS, branch-offset $31 = NEXTPC; if (signed-RS < 0) NEXTPC += (branch-offset << 2)
  173. bne RS, RT, branch-offset if (RS != RT) NEXTPC += (branch-offset << 2)
  174. break breakpoint (immediate breakpoint exception) with no delay slot
  175. div RS, RT LO = signed-RS / signed-RT; HI = signed-RS % signed-RT
  176. divu RS, RT LO = unsigned-RS / unsigned-RT; HI = unsigned-RS % unsigned-RT
  177. j address NEXTPC = (NEXTPC & 0xf0000000) | (address << 2)
  178. jal address $31 = NEXTPC; NEXTPC = (NEXTPC & 0xf0000000) | (address << 2)
  179. jalr RD, RS RD = NEXTPC; NEXTPC = RS. RD is normally $31.
  180. jr RS NEXTPC = RS
  181. lb RT, offset(RS) RT = signed-8-memory[RS + offset]
  182. lbu RT, offset(RS) RT = unsigned-8-memory[RS + offset]
  183. lh RT, offset(RS) RT = signed-16-memory[RS + offset]
  184. lhu RT, offset(RS) RT = unsigned-16-memory[RS + offset]
  185. lui RT, IMM RT = unsigned-IMM << 16
  186. lw RT, offset(RS) RT = 32-memory[RS + offset]
  187. lwl RT, offset(RS) RT = unaligned-32-memory[RS + offset] 1
  188. lwr RT, offset(RS) RT = unaligned-32-memory[RS + offset] 1
  189. mfc0 RT, C0_REG RT = C0_REG
  190. mfhi RD RD = HI
  191. mflo RD RD = LO
  192. mtc0 RT, C0_REG C0_REG = RT
  193. mthi RS HI = RS
  194. mtlo RS LO = RS
  195. mult RS, RT HI:LO = signed-RS * signed-RT
  196. multu RS, RT HI:LO = unsigned-RS * unsigned-RT
  197. nor RD, RS, RT RD = ~(RS | RT)
  198. or RD, RS, RT RD = RS | RT
  199. ori RT, RS, IMM RT = RS | unsigned-IMM
  200. rfe return from exception 2
  201. sb RT, offset(RS) 8-memory[RS + offset] = RT
  202. sh RT, offset(RS) 16-memory[RS + offset] = RT
  203. sll RD, RT, IMM RD = RT << unsigned-IMM
  204. sllv RD, RT, RS RD = RT << RS
  205. slt RD, RS, RT RD = signed-RS < signed-RT
  206. slti RT, RS, IMM RT = signed-RS < signed-IMM
  207. sltiu RT, RS, IMM RT = unsigned-RS < unsigned-signed-IMM
  208. Yes, according to my reference it actually takes the 16-bit immediate, sign-extends it to 32 bits, and then reinterprets the result as an unsigned value. Don't ask me.
  209. sltu RD, RS, RT RD = unsigned-RS < unsigned-RT
  210. sra RD, RT, IMM RD = signed-RT >> unsigned-IMM
  211. srav RD, RT, RS RD = signed-RT >> RS
  212. srl RD, RT, IMM RD = unsigned-RT >> unsigned-IMM
  213. srlv RD, RT, RS RD = unsigned-RT >> RS
  214. sub RD, RS, RT RD = RS - RT; exception on overflow
  215. subu RD, RS, RT RD = RS - RT
  216. sw RT, offset(RS) 32-memory[RS + offset] = RT
  217. swl RT, offset(RS) unaligned-32-memory[RS + offset] = RT 1
  218. swr RT, offset(RS) unaligned-32-memory[RS + offset] = RT 1
  219. syscall make system call; immediate syscall exception with no delay slot
  220. tlbp probe tlb: search TLB for entry matching c0_entryhi; set probe-failed bit and index field in c0_index. 3
  221. tlbr read tlb entry: load the TLB entry named by the index field of c0_index into c0_entryhi and c0_entrylo. 3
  222. tlbwi write tlb entry indexed: store c0_entryhi and c0_entrylo into the TLB entry named by the index field of c0_index. 3
  223. tlbwr write tlb entry "random": store c0_entryhi and c0_entrylo into the TLB entry named by the index field of c0_random. 3
  224. xor RD, RS, RT RD = RS ^ RT
  225. xori RT, RS, IMM RT = RS ^ unsigned-IMM
  226. Notes:
  227. Note 1: lwl/lwr and swl/swr are for accessing unaligned words in memory. The actual specification is complicated, but what it boils down to is that, on a big-endian MIPS,
  228. lwl RT, offset(RS)
  229. lwr RT, (offset+3)(RS)
  230. loads the 32-bit value starting at RS+offset, no matter what the alignment of that address is. swl/swr behave analogously.
  231.  
  232. Note 2: RFE shifts the lower six bits of the status register right by two, leaving the top two of those bits (the "old" state) unchanged, so the "previous" interrupt/usermode state becomes the current state and the "old" state is copied into the "previous" state. This undoes what happens on an exception. RFE is normally found in the delay slot of a jump instruction of some kind.
  233.  
  234. Note 3: For an explanation of the TLB instructions, see the comments in src/kern/arch/mips/include/tlb.h.
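
To make the net effect of the lwl/lwr and swl/swr pairs concrete, here is a hedged C sketch that assembles and scatters a 32-bit word byte by byte. It assumes big-endian byte order (the order for which the offset / offset+3 pairing shown in the notes applies) and models memory as a plain byte array; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* What "lwl RT, offset(RS); lwr RT, offset+3(RS)" ends up loading on a
   big-endian machine: the four bytes starting at addr, most significant
   byte first, with no alignment requirement on addr. */
static inline uint32_t load_unaligned_be32(const uint8_t *mem, size_t addr)
{
    return ((uint32_t)mem[addr]     << 24) |
           ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] <<  8) |
            (uint32_t)mem[addr + 3];
}

/* The analogous net effect of the swl/swr pair. */
static inline void store_unaligned_be32(uint8_t *mem, size_t addr, uint32_t value)
{
    mem[addr]     = (uint8_t)(value >> 24);
    mem[addr + 1] = (uint8_t)(value >> 16);
    mem[addr + 2] = (uint8_t)(value >>  8);
    mem[addr + 3] = (uint8_t)value;
}
```

This models only the combined result of each pair; it does not model what a single lwl or lwr does to the destination register on its own.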
  235. Synthetic instructions
  236.  
  237. Because all instructions are exactly 32 bits wide, it's not possible to perform certain logical operations in a single instruction. The assembler will cover for these by emitting multiple actual instructions as needed.
  238. For instance, the "li" (load immediate) and "la" (load address) instructions, both of which can load 32-bit constants, will be expanded by the assembler into a "lui" instruction to load the upper half of the word, and then usually an "ori" or "addiu" to set the lower half of the word.
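
The split into an upper and a lower half has one subtlety worth spelling out: ori zero-extends its immediate, but addiu sign-extends it, so when bit 15 of the lower half is set the upper half has to be bumped by one to compensate. The C sketch below shows both splits and checks them by recomputing the register value; the struct and function names are invented for illustration and do not describe what any particular assembler does internally.

```c
#include <stdint.h>

/* Upper and lower 16-bit immediates for a two-instruction constant load. */
struct imm_pair {
    uint16_t upper;  /* immediate for lui */
    uint16_t lower;  /* immediate for ori or addiu */
};

/* lui + ori: ori zero-extends, so the halves are used as-is. */
static inline struct imm_pair split_for_ori(uint32_t value)
{
    struct imm_pair p = { (uint16_t)(value >> 16), (uint16_t)(value & 0xffffu) };
    return p;
}

/* lui + addiu: addiu adds a sign-extended immediate, so add one to the
   upper half whenever bit 15 of the lower half is set. */
static inline struct imm_pair split_for_addiu(uint32_t value)
{
    struct imm_pair p;
    p.lower = (uint16_t)(value & 0xffffu);
    p.upper = (uint16_t)((value >> 16) + (p.lower >> 15));
    return p;
}

/* Register value after "lui r, upper; ori r, r, lower". */
static inline uint32_t apply_lui_ori(struct imm_pair p)
{
    return ((uint32_t)p.upper << 16) | p.lower;
}

/* Register value after "lui r, upper; addiu r, r, lower". */
static inline uint32_t apply_lui_addiu(struct imm_pair p)
{
    uint32_t ext = (p.lower & 0x8000u) ? (0xffff0000u | p.lower) : (uint32_t)p.lower;
    return ((uint32_t)p.upper << 16) + ext;
}
```

For the constant 0x12348000, the addiu split uses an upper half of 0x1235 rather than 0x1234.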
  239.  
  240. Some of these combinations require an extra register to hold intermediate values. Register $1 is reserved for this purpose. You can prevent the assembler from using $1 by putting ".set noat" in the assembler source.
  241.  
  242. Delay slots
  243.  
  244. The MIPS is a pipelined architecture, and certain aspects of the pipeline are exposed to the programmer. In general, "slow" instructions are not finished until the instruction *two* spaces after them is being fetched. The instruction in between is referred to as a "delay slot".
  245. There is no pipeline stall logic; the delay slots must be filled out appropriately in the machine code. If they aren't, the behavior is undefined.
  246.  
  247. The assembler will attempt to fill delay slots for you; however, it isn't very bright about it and usually inserts nops. Also, in some cases it cannot tell what you mean and can silently mangle code that you thought was using delay slots efficiently. For this reason, when coding OS/161, I turned off this behavior with ".set noreorder".
  248.  
  249. Delay slots apply chiefly to two classes of instructions:
  250.  
  251. Loads and stores involving memory.
  252. lw $9, 0($8) # load value into $9
  253. nop # $9 won't be ready here
  254. addiu $10, $9, 1 # now we can use $9
  255. Branches and jumps.
  256. jal myfunc # call function
  257. move a0, s0 # executes BEFORE jump happens
  258. addu s0, s0, v0 # executes AFTER function returns
  259. The interaction between branch delay slots and exception handling is extremely unpleasant and you'll be happier if you don't think about it.
  260. Exceptions
  261.  
  262. When an exception occurs, information about the exception is recorded in some of the coprocessor 0 registers and execution continues from a known hardwired address.
  263. The following registers are updated on exception:
  264.  
  265. c0_cause: the BD, CE, and ExcCode fields are updated.
  266. c0_context: the BadVPN field is updated in the same cases c0_vaddr is updated.
  267. c0_vaddr: updated on some exceptions (see list).
  268. c0_status: the lower six bits are shifted left by two bits, shifting in zeros for the bottom two bits. This disables interrupts and puts the processor in kernel mode.
  269. c0_epc: set to suitable PC for restarting the instruction that failed.
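
The c0_status update can be pictured as a three-deep stack of (kernel/user, interrupt-enable) pairs held in the lower six bits. Below is a minimal C sketch of the push that happens on exception entry and the matching pop done by RFE. The function names are hypothetical, and the RFE half assumes the R3000 behavior in which the "old" pair is left in place rather than cleared.

```c
#include <stdint.h>

/* Exception entry: current -> previous, previous -> old, and the new
   current pair becomes 00 (kernel mode, interrupts disabled). */
static inline uint32_t status_on_exception(uint32_t status)
{
    uint32_t stack = status & 0x3fu;
    return (status & ~0x3fu) | ((stack << 2) & 0x3fu);
}

/* RFE: previous -> current, old -> previous; the old pair is unchanged. */
static inline uint32_t status_on_rfe(uint32_t status)
{
    uint32_t stack = status & 0x3fu;
    return (status & ~0x3fu) | (stack & 0x30u) | (stack >> 2);
}
```

Starting from user mode with interrupts enabled (low bits 0x03), an exception leaves 0x0c, and RFE restores 0x03; bits 6 and above are untouched in both directions.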
  270. Execution continues at a hardwired address, one of the following:
  271. Address Description
  272. 0x80000000 UTLB miss exception
  273. 0x80000080 Other exceptions
  274. 0xbfc00000 Processor reset
  275. 0xbfc00100 UTLB miss exception, if BEV is set in c0_status
  276. 0xbfc00180 Other exceptions, if BEV is set in c0_status
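
As a small illustration of the address table above, this C sketch picks the handler address from the BEV bit of c0_status and from whether the exception is a UTLB miss. The function and macro names are invented for the example; processor reset is left out because it always starts at 0xbfc00000.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_BEV (1u << 22)  /* bit 22 of c0_status */

/* Address at which execution continues after an exception. */
static inline uint32_t exception_vector(uint32_t status, bool utlb_miss)
{
    if (status & STATUS_BEV) {
        return utlb_miss ? 0xbfc00100u : 0xbfc00180u;
    }
    return utlb_miss ? 0x80000000u : 0x80000080u;
}
```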
  277. The exceptions are:
  278. Code  Sets c0_vaddr?  Description
  280. 0 no Interrupt (hardware or software)
  281. 1 yes TLB protection fault ("modification request")
  282. 2 yes TLB miss or UTLB miss on load or instruction fetch.
  283. 3 yes TLB miss or UTLB miss on store.
  284. 4 yes Address error on load or instruction fetch.
  285. 5 yes Address error on store.
  286. 6 no External bus error on instruction fetch
  287. 7 no External bus error on data load or store
  288. 8 no SYSCALL instruction
  289. 9 no BREAK instruction
  290. 10 no Reserved (illegal) instruction
  291. 11 no Coprocessor unusable
  292. 12 no Arithmetic overflow
  293. An address error results from either use of an inadequately aligned pointer (an N-bit quantity must be aligned on an N-bit address boundary, unless the lwl/lwr/swl/swr instructions are used) or an attempt to access kernel memory from user mode.
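
The two address-error conditions can be written down as a short predicate. This is only a sketch under stated assumptions: the access size is given in bytes and is 1, 2, or 4, "kernel memory" is taken to mean any address at or above 0x80000000, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an ordinary (aligned-form) load or store of size_bytes at vaddr
   would raise an address error. size_bytes must be 1, 2, or 4. */
static inline bool is_address_error(uint32_t vaddr, unsigned size_bytes, bool user_mode)
{
    if (vaddr & (size_bytes - 1u)) {
        return true;   /* not aligned to the access size */
    }
    if (user_mode && vaddr >= 0x80000000u) {
        return true;   /* kernel address touched from user mode */
    }
    return false;
}
```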
  294. A TLB entry is "matching" if its VPN field is the same as the page number portion of the virtual address being looked up, and either the G (global) bit is set or the PID field matches the PID field in c0_entryhi.
  295.  
  296. If no matching TLB entry is found, a TLB miss exception occurs, unless the address is in the user address range (kuseg, below 0x80000000), in which case a UTLB miss exception occurs. If a matching entry is found, but it is not marked valid (the V bit is clear), a TLB miss exception (never a UTLB miss exception) occurs. Otherwise, if the access is a write and the dirty (D) bit is not set, a TLB protection fault occurs.
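
The matching and fault rules above can be sketched as a software model of a TLB lookup. This is an illustration only: the struct, enum, and function names are made up, the TLB is modeled as a plain array searched in order, and the field positions are taken from the c0_entryhi and c0_entrylo tables earlier in this document.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_MASK 0xfffff000u  /* VPN / PFN, bits 12-31 */
#define PID_MASK  0x00000fc0u  /* entryhi bits 6-11     */
#define LO_DIRTY  (1u << 10)
#define LO_VALID  (1u << 9)
#define LO_GLOBAL (1u << 8)

enum tlb_result { TLB_HIT, TLB_MISS, UTLB_MISS, TLB_MOD };

struct tlb_entry { uint32_t hi; uint32_t lo; };

/* Look up vaddr for the address space named by the PID field of entryhi.
   On TLB_HIT, *paddr receives the translated physical address. */
static inline enum tlb_result tlb_lookup(const struct tlb_entry *tlb, size_t n,
                                         uint32_t entryhi, uint32_t vaddr,
                                         bool is_write, uint32_t *paddr)
{
    for (size_t i = 0; i < n; i++) {
        bool vpn_ok = (tlb[i].hi & PAGE_MASK) == (vaddr & PAGE_MASK);
        bool pid_ok = (tlb[i].lo & LO_GLOBAL) ||
                      (tlb[i].hi & PID_MASK) == (entryhi & PID_MASK);
        if (!vpn_ok || !pid_ok) {
            continue;                      /* not a matching entry */
        }
        if (!(tlb[i].lo & LO_VALID)) {
            return TLB_MISS;               /* matched but invalid: never UTLB */
        }
        if (is_write && !(tlb[i].lo & LO_DIRTY)) {
            return TLB_MOD;                /* TLB protection fault */
        }
        *paddr = (tlb[i].lo & PAGE_MASK) | (vaddr & ~PAGE_MASK);
        return TLB_HIT;
    }
    /* No matching entry: the miss type depends only on the address. */
    return (vaddr < 0x80000000u) ? UTLB_MISS : TLB_MISS;
}
```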
  297.  
  298. A UTLB miss exception uses (potentially) different exception handling code from a TLB miss exception, but is otherwise the same. The purpose, in conjunction with the c0_context register, is to enable fast-path TLB refill handling. Note that the UTLB exception applies to user addresses, not user mode: if the miss address is below 0x80000000, a UTLB exception occurs whether the miss was generated in kernel or user mode.
  299.  
  300. Segments
  301.  
  302. The MIPS divides its address space into several regions that have hardwired properties. These are:
  303. kseg2, TLB-mapped cacheable kernel space
  304. kseg1, direct-mapped uncached kernel space
  305. kseg0, direct-mapped cached kernel space
  306. kuseg, TLB-mapped cacheable user space
  307. Both direct-mapped segments map to the first 512 megabytes of the physical address space.
  308. The top of kuseg is 0x80000000. The top of kseg0 is 0xa0000000, and the top of kseg1 is 0xc0000000.
  309.  
  310. The memory map thus looks like this:
  311.  
  312. Address Segment Special properties
  313. 0xffffffff kseg2
  314. 0xc0000000
  315. 0xbfffffff kseg1
  316. 0xbfc00180 Exception address if BEV set.
  317. 0xbfc00100 UTLB exception address if BEV set.
  318. 0xbfc00000 Execution begins here after processor reset.
  319. 0xa0000000
  320. 0x9fffffff kseg0
  321. 0x80000080 Exception address if BEV not set.
  322. 0x80000000 UTLB exception address if BEV not set.
  323. 0x7fffffff kuseg
  324. 0x00000000
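
The memory map above can be summarized in a few lines of C. This sketch classifies a virtual address by segment and, for the two direct-mapped segments, computes the physical address by subtracting the segment base; the enum and function names are hypothetical and chosen only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

enum segment { KUSEG, KSEG0, KSEG1, KSEG2 };

/* Which hardwired region a virtual address falls in. */
static inline enum segment segment_of(uint32_t vaddr)
{
    if (vaddr < 0x80000000u) return KUSEG;
    if (vaddr < 0xa0000000u) return KSEG0;
    if (vaddr < 0xc0000000u) return KSEG1;
    return KSEG2;
}

/* For kseg0 and kseg1, store the physical address and return true.
   kuseg and kseg2 go through the TLB, so return false for them. */
static inline bool direct_mapped_paddr(uint32_t vaddr, uint32_t *paddr)
{
    switch (segment_of(vaddr)) {
    case KSEG0:
        *paddr = vaddr - 0x80000000u;
        return true;
    case KSEG1:
        *paddr = vaddr - 0xa0000000u;
        return true;
    default:
        return false;
    }
}
```

For instance, the reset address 0xbfc00000 lies in kseg1 and corresponds to physical address 0x1fc00000, inside the first 512 megabytes.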