Advertisement
Guest User

Untitled

a guest
Mar 26th, 2019
82
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 23.72 KB | None | 0 0
  1. CSCE 212H: Introduction to Computer Architecture
  2. ================================================
  3.  
  4. ## 2018-01-18
  5. * Number representation
  6. * Unsigned: represented in traditional binary
  7. * On n-bit architecture, \(2^n\) total
  8. * \(Umax=2^n-1\) (minus one because of zero)
  9. * All operations are technically modulo \(2^n\)
  10. * Ints are not integers!
  11. * Signed magnitude: represents signed numbers by using first bit as sign
  12. * Represents \(-(2^{n-1}-1) \textrm{ to } 2^{n-1}-1\) with a duplicate 0
  13. * Does not work well with arithmetic
  14. * Two's Complement: alternate to signed magnitude good for arithmetic
  15. * Operation and representation
  16. * Represents negatives by flipping all bits and adding 1
  17. * First bit is sign bit - 1 represents negative
  18. * Can be passed through arithmetic with no further modification
  19. * Ex. 24 = 0..011000 -> 1..100111 (1's Comp) -> 1..101000 (2's Comp) = -24
  20. * Ex. -24 = 1..101000 -> 0..010111 (1's Comp) -> 0..011000 (2's Comp) = 24
  21. * Ex. 6 + (-5) = 0..0110 + 1..1011 = 0..0001 = 1
  22. * Ex. (-7) + 3 = 1..1001 + 0..0011 = 1..1100 = -4
  23. * Alternative method: flip all bits to the left of the rightmost 1
  24. * On n-bit architecture, \(Tmax=2^(n-1)-1, Tmin=-2^(n-1)\)
  25. * Special case \(Tmin = -2^{63} = 10..0b: 2sComp(Tmin) = 2sComp(10..0) = 10..0 = Tmin\)
  26. * So 2sComp(Tmin) = Tmin or -Tmin = Tmin (because the magnitude of Tmin is greater than Tmax)
  27. * Floats (32 bits)
  28. * IEEE 754 Standard
  29. * (Sign (1b)) ( Exponent (8b) ) ( Fraction (23b) )
  30. * Exponent sign is part of 8-bits by adding a bias to the field value mod 2^31
  31. * Exponent base is 2 not 10
  32. * Floats are not reals!
  33. * Doubles are just double-precision floats so 64-bits
  34. * Funky arithmetic
  35. * Example 1: Is \(x^2 >= 0\) always?
  36. * For unsigned ints, yes because they are unsigned
  37. * For signed ints, no because arithmetic is mod n which can be negative
  38. * For floats, yes
  39. * Example 2: Is \((x+y)+z = x+(y+z)\) always?
  40. * For ints (signed and unsigned), yes because modular arithmetic is associative
  41. * For floats, no because float addition varies with order of magnitude
  42. * \((1e20 + -1e20) + 3.14 \rightarrow 3.14\)
  43. * \(1e20 + (-1e20 + 3.14) \rightarrow 0\)
  44. * Computer arithmetic
  45. * Cannot generate random values - always follows strict mathematical rules
  46. * Not all usual mathematical properties hold for computer data types
  47. * Int operations follow "ring" properties
  48. * Floating point operations follow "ordering" properties (not field though)
  49. * GR#1 **Ints are not Integers, Floats are not Reals**
  50. * GR#2 **You have to Assembly language**
  51. * Probably won't be used after this course because compilers are better than you
  52. * Textbook uses variant of x86 called y86-64
  53.  
  54. ## 2018-01-23
  55. * GR#3 Memory matters
  56. * Not infinite
  57. * Must be allocated and managed
  58. * Pointer errors are very hard to debug
  59. * GR#4 **There's more to performance than asymptotic complexity**
  60. * Ex. row major order iteration of 2D array is much faster than column major
  61. * Fortran is column major, but most other languages are row major
  62. * More integer representation details
  63. * Unsigned: \
  64. \(\begin{align}
  65. & X[n] & * & X[n-1] & * & ... & * & X[2] & * & X[1] & * & X[0] \\
  66. & 2^n & * & 2^{n-1} & * & ... & * & 2^2 & * & 2^1 & * & 2^0
  67. \end{align}\)
  68. * Signed:
  69. * In 2's complement, positive values are the same as their unsigned counterparts
  70. * The magnitude of 2's complement can be found by doing 2's complement operation
  71. * 2's complement value: \(VT(x) = -X[w-1] * 2^{w-1} + \sum_{i=0}^{w-2}{X[i] * 2^{i}}\)
  72. * GR#5 **Computers do more than execute programs**
  73. * Bits vs. Byte
  74. * No intrinsic reason that a byte must be 8 bits but it was decided by Fred Brook
  75. * Used to be 6 bits per byte (because it was enough to store a char)
  76. * All modern computers have Byte Addressable Memory (each byte accessed individually)
  77. * Word is usually 4 bytes (grouped by address last two digits), short usually 2 bytes
  78. * By grouping into words and shorts, some memory can address an entire chunk of 4 bytes
  79. * Half byte is a nibble (each nibble is a hex digit)
  80.  
  81. ## 2018-01-25
  82. * Little Endian vs Big Endian
  83. * For some value 0xFF4292A1,
  84. * FF is MSB=Most Significant Byte, A1 is LSB=Least Significant Byte
  85. * Little Endian starts with LSB in memory
  86. * Big Endian starts with MSB in memory
  87. * Fractional binary numbers
  88. * Ex: \(1011.101b = 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 + 1*2^{-1} + 0*2^{-2} + 1*2^{-3} \\
  89. = 8 + 2 + 1 + \frac{1}{2} + \frac{1}{8} = 11 + 0.5 + 0.125 = 11.625\)
  90. * Numbers of form \(0.111111...\) are just below 1 (like $0.999...$ is just below 1 in decimal)
  91. * Represented \(1.0 - \varepsilon\)
  92. * Floating point representation
  93. * \((-1)^s M 2^E\) where $s$ is sign bit, $E$ is exponent of 2, $M$ is fraction field
  94. * Single Precision: 32 bits = (1 s) + (8 E) + (23 M)
  95. * Double Precision: 64 bits = (1 s) + (11 E) + (52 M)
  96. * Extended Precision: 80 bits (Intel only) = (1 s) + (15 E) + (63 or 64 M)
  97. * "Normalized" Values
  98. * If E=00..0 or E=11..1, value is "special"
  99. * Exponent coded as *biased* value \(E=Exp-Bias\)
  100.  
  101. ## 2018-01-30
  102. * Examples
  103. * 357 = 0000000101100101 (16-bit 2's complement)
  104. * -357 = 1111111010011011 (16-bit 2's complement)
  105. * 357.0 = 0 10000111 011001000..0 (IEEE 754 standard float)
  106. * Sign = 0 because positive
  107. * Exp = 8 + Bias = 8 + 127 = 135
  108. * Frac = After decimal of 1.00100101 (with trailing 0s)
  109. * Floating point special values
  110. * Denormalized values
  111. * When Exp = 000..0
  112. * Exponent value: E = 1 - Bias (instead of E = 0 - Bias as would be normal for Exp=0)
  113. * 0 before decimal point, not 1 like usual
  114. * Cases:
  115. * Exp=000..0, Frac=000..0
  116. * Represents zero (positive and negative are distinct)
  117. * Exp=000..0, Frac $\neq$ 000..0
  118. * Numbers closest to zero, equispaced
  119. * Smallest normalized value (Exp = 000..01)
  120. * E = Exp - Bias = 1 - 127 = -126
  121. * Value = \(1-2^{126}\)
  122. * Cannot be normalized
  123. * Infinity
  124. * When Exp=111..1, frac = 000..0
  125. * represents $\pm\infty$
  126. * 1.0/0.0 = -1.0/-0.0 = $+\infty$, 1.0/-0.0 = $-\infty$
  127. * Not a Number (NaN)
  128. * When Exp=111..1, frac $\neq$ 000..0
  129. * More Examples
  130. * Gap size in large floats:
  131. * \[\begin{align}
  132. & 0 \ 11111110 \ 111..11 \\
  133. - & 0 \ 11111110 \ 111..10 \\
  134. \hline
  135. & 0 \ 11111110 \ 000..01 \\
  136. = & 0.000..01 * 2^{127} \textrm{ (Must be normalized)} \\
  137. = & 1.000..00 * 2^{127-23} = 1.0 * 104 \\
  138. \approx & 2*10^{31} \textrm{ (HUGE!)}
  139. \end{align}\]
  140. * Float Rounding
  141. * Basic idea:
  142. * First computer exact result
  143. * Make fit into desired precision
  144. * Possibly overflow if exponent is too large
  145. * Possibly round to fit in Frac
  146. * Float Multiplication
  147. * Exact result:
  148. * Sign bit \(s_1 \oplus s_2\)
  149. * Significand \(M_1 \times M_2\)
  150. * Exponent \(E_1 + E_2\)
  151. * Then fix result to fit correctly in float
  152. * Float Addition
  153. * Must first get binary points aligned:
  154. * \((-1)^{s_1}\cdot M_1\cdot 2^{E_1} + (-1)^{s_2}\cdot M_2\cdot 2^{E_2}\)
  155. * Then fix correclty to float
  156. * Mathematical properties of FP addition
  157. * Closed under addition?
  158. * Yes but may produce infinity or NaN
  159. * Commutative?
  160. * Yes
  161. * Associative?
  162. No: (in case of numbers of different magnitudes as seen earlier)
  163. * 0 is Additive Identity?
  164. * Yes
  165. * Every element has additive inverse?
  166. * Almost: everything except infinities and NaNs
  167. * Monotonicity (\(a \geq b \Rightarrow a+c \geq b+c\))
  168. * Almost: except for infinities and NaNs
  169.  
  170. ## 2018-02-01
  171. * Mathematical properties of FP multiplication
  172. * Compare to Commutative ring
  173. * Closed under multiplication?
  174. * Yes (but may generate infinities and NaNs)
  175. * Commutative?
  176. * Yes
  177. * Associative?
  178. * No: possibility of overflow and inexactness of rounding (as in addition)
  179. * 1 is Multiplicative Identity?
  180. * Yes
  181. * Multiplication distributive over addition?
  182. * No: possibility of overflow and inexactness of rounding (as in associativity)
  183. * Monotonicity?
  184. * Almost: all but infinities and NaNs
  185. * Floating point specifications in C
  186. * C guarantees two levels of precision
  187. * float: 32 bits (1 + 8 + 25)
  188. * double: 64 bits (1 + 11 + 52)
  189. * Some machines offer a long double
  190. * Conversions/Casting
  191. * double/float $\rightarrow$ int
  192. * truncates fractional part (like rounding toward zero): \(\lfloor frac \rfloor\)
  193. * Not defined when out of range or NaN (usually sets to TMin)
  194. * int $\rightarrow$ double
  195. * Exact conversion as long as int has $\leq$ 53 bit word size
  196. * FP Puzzles
  197. * Assume `int x = _; float f = _; double d = _;` and neither d nor f is NaN.
  198. * `x == (int) (float) x`?
  199. * Not true for large numbers because large x cannot be represented in float without loss of precision.
  200. * `x == (int) (double) x`?
  201. * True because 32-bit int largest number can be represented in $\leq$ 53 bits without loss of precision.
  202. * `f == (float) (double) f`?
  203. * True because conversion from float to double can be done without loss of precision and conversion back should always work because number is within range of float.
  204. * `d == (double) (float) d`?
  205. * Not true because large doubles would lose precision when converted to float.
  206. * `f == -(-f)`?
  207. * True because negation only affects the sign bit which can be easily reverted with another negation.
  208. * `2/3 == 2/3.0`?
  209. * Not true because `2/3` does integer division where `2/3.0` does floating point division.
  210. * `d < 0.0` $\rightarrow$ `d*2 < 0.0`?
  211. * True because multiplying by 2 does not affect sign bit.
  212. * `d > f` $\rightarrow$ `-f < -d`?
  213. * In comparing float to double, the float is promoted to double so this should always apply.
  214. * `d * d >= 0.0`?
  215. * True because sign bit is XORed so the resulting sign will always be positive.
  216. * `(d+f)-d == f`?
  217. * Not true because FP multiplication is not associative.
  218. * C Primer - Highlights
  219. * Preprocessor
  220. * `#include <X.h>` or `#include "X.h"` includes header X.h
  221. * `#define X Y` replaces all instances of Y with X
  222. * Command line arguments
  223. * Arrays and structures
  224. * Pointers and dynamic memory
  225. * malloc
  226. * free
  227. * Intel x86 Processors
  228. * Dominate laptop/desktop/server industry
  229. * Backwards compatible back to 8086 (1978)
  230. * Complex instruction set computer (CISC)
  231. * Many different instructions with many different formats
  232. * Only a few are regularly used
  233. * Hard to match performance of Reduced Instruction Set Computers (RISC)
  234. * But Intel has done just that.
  235. * Intel's Competitor: Advanced Micro Devices (AMD)
  236. * Historically follows behind Intel: a little bit slower, a lot cheaper
  237. * Built Opteron: tough competitor to Pentium 4
  238. * Developed x86_64 successfully after Intel failed with its Itanium
  239. * Most processors support 64-bit mode these days, but many programs operate in only 32 bits.
  240. * Definitions
  241. * Architecture (also Instruction Set Architecture - ISA):
  242. * Only details of processor design that an assembly code writer must understand
  243. * Assembly/Machine Code View
  244. * CPU:
  245. * Program Counter (PC)
  246. * Registers
  247. * Condition Codes (what each instruction does)
  248. * Arithmetic Logic Unit (ALU)
  249. * Memory:
  250. * Code
  251. * Data
  252. * Stack
  253. * Bus connection:
  254. * Addresses
  255. * Data
  256. * Instructions
  257. ## 2018-02-06
  258. * Turning C code into object code
  259. * C program (x.c) -> (Compiler `gcc -S`)
  260. * Asm program (GAS syntax) (x.s) -> (Assembler `gcc` or `as`)
  261. * Object program (x.o) -> (Linker `gcc` or `ld`) <~ Static Libraries (\*.a)
  262. * Executable program (x) <~ Dynamic Libraries (\*.so)
  263. * Example of C -> Asm
  264. * C:
  265. ```C
  266. long plus(long x, long y);
  267.  
  268. void sumstore(long x, long y, long *dest) {
  269. long t = plus(x, y);
  270. *dest = t; //*
  271. }
  272. ```
  273. * Assembly:
  274. ```
  275. sumstore:
  276. pushq %rbx
  277. movq %rdx, %rbx
  278. call plus
  279. movq %rax, (%rbx)
  280. popq %rbx
  281. ret
  282. ```
  283. * Machine Code:
  284. ```
  285. 0x0400595:
  286. 0x53
  287. 0x48
  288. ...
  289. 0x5B
  290. 0xC3
  291. ```
  292. * Missing everything except links between functions and things (provided by linker file)
  293. * Assembly Characteristics:
  294. * Data Types
  295. * "Integer" data of 1, 2, 4, or 8 bytes
  296. * Data values
  297. * Addresses (untyped pointers)
  298. * Floating point data of 4, 8, or 10 bytes
  299. * Code: byte sequences encoding series of instructions
  300. * No aggregate types such as arrays or structures
  301. * Just continuously allocated bytes in memory
  302. * Operations
  303. * Perform arithmetic operations (ALU)
  304. * Move data between memory locations and registers
  305. * Conditional and unconditional branching
  306. * Disassembler:
  307. * `objdump -d <executable>` converts machine code to more readable form - assembly
  308. * Also possible using `gdb`
  309. * **Note: Reverse engineering is forbidden by Microsoft End User License Agreement**
  310. * x86-64 Integer Registers
  311. * 8-byte examples
  312. * %rax
  313. * %rbx
  314. * %rcx
  315. * %rdx
  316. * %rsi
  317. * %rdi
  318. * %rsp -> stack pointer
  319. * %rbp -> "frame" pointer
  320. * %r8
  321. * ...
  322. * %r15
  323. * %rax (8-byte) -> %eax (4-byte) -> %ax (2-byte) -> %ah/%al (1-byte)
  324. * Moving Data
  325. * `movq` moves quad word (4 words = 8 bytes)
  326. * Operand Types
  327. * Constant integer data
  328. * Prefixed with $ (i.e. $0x400, $-533)
  329. * Encoded with 1, 2, or 4 bytes
  330. * Register
  331. * %rax and others seen previously
  332. * %rsp and sometimes %rbp reserved
  333. * Others sometimes have special purposes
  334. * Memory Address
  335. * Simplest example: (%rax) gives address stored in %rax
  336. * Details of movq (source -> dest):
  337. \[
  338. movq
  339. \begin{cases}
  340. Immediate \quad
  341. \begin{cases}
  342. Register \ \\
  343. Memory \\
  344. \end{cases} \\
  345. Register \quad\quad
  346. \begin{cases}
  347. Register \\
  348. Memory \quad\quad \\
  349. \end{cases} \\
  350. Memory
  351. \end{cases}
  352. \]
  353. ## 2018-02-08
  354. * `leaq`
  355. * load effective address quad
  356. * equivalent to `p = &x[i];`
  357. * Cool optimization:
  358. * C code:
  359. ```
  360. long m12(long x) {
  361. return x*12;
  362. }
  363. ```
  364. * Assembly:
  365. ```
  366. leaq (%rdi,%rdi,2), %rax # t <- x + x*2 (equivalent to t = 3*x)
  367. salq $2, %rax # return t << 2 (equivalent to return 4*t)
  368. ```
  369. * Control flow
  370. * Temporary data `%rax`
  371. * Location of runtime stack `%rsp`
  372. * Location of current code control point `%rip`
  373. * Status of recent tests:
  374. * CF: carry flag - carry out from addition is set
  375. * ZF: zero flag - if t == 0
  376. * SF: sign flag (signed) - if t < 0
  377. * OF: overflow flag (signed) - sign bit is not what it should be (i.e. adding two positive numbers gives negative)
  378. * Control flags set by ALU ops but not by `leaq`
  379. * Jumps
  380. | jX | Condition | Description |
  381. |-----|---------------|----------------------|
  382. | jmp | 1 | Unconditional |
  383. | je | ZF | Equal / Zero |
  384. | jne | ~ZF | Not Equal / Not Zero |
  385. | js | SF | Negative |
  386. | jns | ~SF | Nonnegative |
  387. | jg | ~(SF^OF)\&~ZF | > (signed) |
  388. | jge | ~(SF^OF) | >= (signed) |
  389. | jl | (SF^OF) | < (signed) |
  390. | jle | (SF^OF)|ZF | <= (signed) |
  391. | ja | ~CF\&~ZF | Above (unsigned) |
  392. | jb | CF | Below (unsigned) |
  393. * Bad cases for conditional move
  394. * Expensive computations - `val = Test(x) ? Hard1(x) : Hard2(x);`
  395. * Risky computations - `val = p ? *p : 0;`
  396. * Computations with side effects - `val = x > 0 ? x*=7 : x+=3;`
  397. ## 2018-02-13
  398. * Jump Tables (switch statement)
  399. ```
  400. .section .rodata
  401. .align
  402. .L4:
  403. .quad .L2
  404. .quad .L1
  405. .quad .L3
  406. .quad .L5
  407. ```
  408. * Stack
  409. * %rsp points to top of stack
  410. * Stack is 'upside down' because stack bottom is at highest address and stack grows down with %rsp at lowest address element
  411. * Heap is usually below stack
  412. * Parameters to procedures
  413. * First 6 parameters are stored in %rdi, %rsi, %rdx, %rcx, %r8, %r9
  414. * Remaining parameters are pushed onto the stack if necessary with arg 7 on top of stack
  415. * Return of function is stored in %rax
  416. * Other registers should be pushed and popped unless it is known that the register can be clobbered without issue
  417. * Recursive procedures
  418. * Any language supporting recursion must implement a stack
  419. ## 2018-02-15
  420. * Parameters pneumonic
  421. * `rdi` - Dolly's
  422. * `rsi` - Special
  423. * `rdx` - Dress
  424. * `rcx` - Costs
  425. * `r8` - $89
  426. * `r9` - ^
  427. * For stack parameters you do not pop them off, just address `(%rbp)` for param 7 and `8(%rbp)` for param 8 if param 7 is 8 bytes long
  428. * Array allocation
  429. * `T A[l]` is array length l of elements type T named A
  430. * In memeory, this is just a strip of memory with `l*sizeof(T)` bytes which can be accessed using `A[2]` or just `*(A + 2*sizeof(T))`.
  431.  
  432. ## 2018-02-20
  433. * Arrays
  434. * C:
  435. ```C
  436. #define ZLEN 5
  437. typedef zip_dig char[];
  438.  
  439. void zincr(zip_dig z) {
  440. size_t i;
  441. for (i = 0; i < ZLEN; i++) {
  442. z[i]++;
  443. }
  444. }
  445. ```
  446. * ASM:
  447. ```asm
  448. # %rdi = z
  449. movl $0, %eax # i = 0
  450. jmp .L3 # goto middle
  451. .L4: # loop:
  452. addl $1, (%rdi,%rax,4) # z[i]++
  453. addq $1, %rax # i++
  454. .L3: # middle:
  455. cmpq $4, %rax # test i == 4
  456. jbe .L4 # if <=, goto loop
  457. rep; ret
  458. ```
  459. ## 2018-02-22
  460. * Nothing to see here.
  461.  
  462. ## 2018-03-05
  463. * Hardware Control Language (HCL)
  464. * Similar to Hardware Description Language (HDL)
  465. * Most prevalently Verilog and VHDL
  466. * Conventions
  467. * Single bits are low caps, word-level structures are caps
  468. * Decoders
  469. * \(n \times 2^n\) Decoder takes $n$ inputs and has $2^n$ outputs where output i is 1 and all others are 0 when the inputs represent i in binary
  470. * Used to do correct action based on opcode section of instruction
  471. * Multiplexers (MUX)
  472. * Has $n$ selector inputs and $2^n$ data inputs with 1 output where the output is equal to the $i^{th}$ data input when the selector inputs represent i
  473. * Used in selecting which register to use
  474. * HCL 1-bit multiplexer: `bool out = (s && a) || (!s && b);`
  475. * HCL Word-Level Examples
  476. * Minimum of 3
  477. ```VCL
  478. int Min3 = [
  479. A < B && A < C : A;
  480. B < A && B < C : B;
  481. 1 : C;
  482. ];
  483. ```
  484. * 4-Way Multiplexer
  485. ```VCL
  486. int Out4 = [
  487. !s1 && !s0 : D0;
  488. !s1 : D1;
  489. !s2 : D2;
  490. 1 : D3;
  491. ];
  492. ```
  493. * Arithmetic Logic Unit (ALU)
  494. * Uses a multiplexer to output the appropriate operator on inputs
  495. * Sets flags OF, ZF, CF, SF in the process
  496. * Flip Flops
  497. * RS Flip Flop
  498. * Two NOR gates with back flow to opposite gate
  499. * Usually has a clock input to prevent race conditions
  500. * input R,S: 01 -> set to 1, 10 -> reset to 0, 00 -> no action (for reading)
  501. * 2 outputs Q, Q' (or Q+, Q-) where Q is value of
  502. * Latching
  503. * Transparent 1-Bit Latch
  504. <img src="img/transparent-1-bit-latch.png" alt>
  505. * Random Access Memory (RAM)
  506. * From CPU to Memory, there are three lines:
  507. * 1-way CPU to Memory: control bus (read or write, a few other things)
  508. * 1-way CPU to Memory: address bus (may be chunked)
  509. * 2-way: data bus carries values to and from memory
  510. * HCL Overview
  511. * Data Types
  512. * bool : Boolean
  513. * a, b, c, ...
  514. * int : words (usually thinking 64 bit words)
  515. * A, B, C, ...
  516. * Operations
  517. * Logic operations: `&&, ||, !`
  518. * Word Comparisons: `==, !=, <, <=, >, >=`
  519. * Set Membership: `A in { B, C, D }`
  520. * Word Expressions:
  521. * Case Expressions:
  522. `int Z = [ (case 1) : (value); (case 2) : (value); 1 : (default case); ];`
  523.  
  524. ## 2018-03-08
  525. * Y86-64 Instruction Set
  526. * Instructions
  527. * halt: 00
  528. * nop: 10
  529. * cmovXX rA, rB : 2(fn) (rA) (rB)
  530. * irmovq V, rB : 30 (F) (rB) (--V--) // Value into register
  531. * rmmovq rA, D(rB) : 40 (rA) (rB) (--D--) // Reg to Mem
  532. * mrmovq D(rA) rB : 50 (rA) (rB) (--D--) // Mem to reg
  533. * OPq rA rB : 6(fn) rA rB
  534. * jXX Dest : 7(fn) (Dest)
  535. * call Dest : 80 (Dest)
  536. * ret : 90
  537. * pushq rA : A0 (rA) (F)
  538. * popq rA : B0 (rA) (F)
  539. * Functions for OPq
  540. * addq : 60
  541. * subq : 61
  542. * andq : 62
  543. * xorq : 63
  544. * Jumps for jXX
  545. * jmp : 70
  546. * jle : 71
  547. * jl : 72
  548. * je : 73
  549. * jne : 74
  550. * jqe : 75
  551. * jg : 76
  552. * Sequential Hardware Structure
  553. * Fetch: get value pointed by program counter (PC) (aka instruction pointer, IP) and move to instruction register (IR); increment PC
  554. * Decode: Figure out what the command is
  555. * Execute: Execute the command (in the case of ALU stuff)
  556. * Memory: Move store memory stuff
  557. * Write Back: write to registers
  558. * PC: Start back from top
  559. * Pipelining is when multiple stages are occuring simultaneously (why the process is split into 5 subprocedures)
  560. * Instruction example
  561. * For instruction `OPq rA, rB : 6(fn) (rA) (rB)`
  562. * Fetch: read two bytes
  563. * Decode: read operand registers
  564. * Execute: perform ALU operation, set condition codes
  565. * Memory: Do nothing
  566. * Write Back: Update register
  567. * PC Update: Increment PC by 2 (2 because the OPq has an extra byte of registers)
  568. * Exactly what happens in `OPq rA, rB : 6(fn) (rA) (rB)`
  569. * Fetch
  570. * \(\textrm{icode:ifun} \leftarrow M_1[PC]\) &emsp; // Read instruction byte
  571. * \(rA:rB \leftarrow M_1[PC+1]\) &emsp; // Read register byte
  572. * \(valP \leftarrow PC+2\) &emsp; // Computer next PC
  573. * Decode
  574. * \(valA \leftarrow R[rA]\) &emsp; // Read operand A
  575. * \(valB \leftarrow R[rB]\) &emsp; // Read operand B
  576. * Execute
  577. * \(valE \leftarrow valB\textrm{ OP }valA\) &emsp; // Perform ALU operation
  578. * \(\textrm{set CC}\) &emsp; // Set condition code register
  579. * Memory
  580. &emsp; // NOP
  581. * Write Back
  582. * \(R[rB] \leftarrow valE\) &emsp; // Write back result
  583. * PC Update
  584. * \(PC \leftarrow valP\) &emsp; // Update PC
  585. * Notes on other opcodes
  586. * For ops like rmmovq which have large data parts (i.e. D is 8 bytes), use \(M_8\) instead of \(M_1\)
  587. * For stack operations, one register represents no register (represented F above and usually 0xF in 15-register system)
  588. * In popq Decode:
  589. * \(valA \leftarrow R[\%rsp]\) &emsp; // valA is used to put value at old %rsp into rA
  590. * \(valb \leftarrow R[\%rsp]\) &emsp; // valB is used to calculate %rsp + 8 for new %rsp
  591. * cmovXX moves rA into rB only when the condition is true
  592. * If condition is false, 0xF is used as destination
  593.  
  594. ## 2018-03-22
  595. * Y86-64 Instruction Mapping
  596. * `0x1000: call 0x12300`
  597. * Fetch: \(icode:ifun \leftarrow M_1[PC] \\ valC \leftarrow M_8[PC+1] \\ valP \leftarrow PC + 9\)
  598. * Decode: \(valA \leftarrow R[rsp] \\ valB \leftarrow R[rsp]\)
  599. * Execute: \(valE \leftarrow valB + (-8)\)
  600. * Memory: \(M_8[valE] \leftarrow valP\)
  601. * Write Back: \(R[rsp] \leftarrow valE\)
  602. * PC Update: \(PC \leftarrow valC\)
  603. * `ret` (0x90)
  604. * Fetch: \(icode:ifun \leftarrow M_1[PC]\)
  605. * Decode: \(valA \leftarrow R[rsp] \\ valB \leftarrow R[rsp]\)
  606. * Execute: \(valE \leftarrow valB + 8\)
  607. * Memory: \(valM \leftarrow M_8[valA]\)
  608. * Write Back: \(R[rsp] \leftarrow valE\)
  609. * PC Update: \(PC \leftarrow valM\)
  610.  
  611. ## 2018-03-29
  612. * (Insert stuff here)
  613.  
  614. ## 2018-04-03
  615. * iaddq V, rB
  616. * Fetch: \(icode:ifun \leftarrow M_1[PC] \\ rA:rB \leftarrow M_1[PC+1] \\ valC \leftarrow M_8[PC+2] \\ valP \leftarrow PC + 10\)
  617. * Decode: \(valB \leftarrow R[rB]\)
  618. * Execute: \(valE \leftarrow valB + valC\)
  619. * Memory:
  620. * Write Back: \(R[rB] \leftarrow valE\)
  621. * PC Update: \(PC \leftarrow valP\)
  622.  
  623. ## 2018-04-05
  624. * HW Convert to assembly:
  625. ```c
  626. void swappair(long a[], n) {
  627. // Assumes n is length of a and n is even
  628. int i;
  629. long swap;
  630. for (i = 0; i < n; i = i + 2) {
  631. swap = a[i];
  632. a[i] = a[i+1];
  633. a[i+1] = swap;
  634. }
  635. }
  636. ```
  637.  
  638. ## 2018-04-24
  639. * N/A
  640.  
  641. ## 2018-04-26
  642. * #TODO
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement