- THIS DOCUMENT GOES INTO DETAIL ABOUT IMPLEMENTING PRACTICALLY ALL
- CHIP-8 AND SUPERCHIP OPCODES, WITH PERSONALIZED NOTES FROM EXPERIENCE.
- First things first, let's review the arrays and data types and their
- initialization. Please bear in mind that every one of them is an unsigned
- integer, and over/under flow can and *will* occur.
- 'I' is your index register.
- It has a size of 2 bytes (but effectively only needs 12 bits) (up to 0xFFFF)
- Initialized at value: 0
- 'PC' is your program counter.
- It has a size of 2 bytes (but effectively only needs 12 bits) (up to 0xFFFF)
- Initialized at value: 512 (0x0200)
- 'SP' is your stack index pointer.
- Its type is irrelevant, but at least 1 byte is needed. It's tied to the STACK array mentioned below.
- Initialized at value: 0
- 'STACK' represents your routine stack array.
- It is made up of 16 levels (STACK[0x0]..STACK[0xF])
- Each slot has a size of 2 bytes (up to 0xFFFF)
- Initialized at value: 0
- 'V' represents your V registers array.
- There are a total of 16 registers (V[0x0]..V[0xF])
- Each register has a size of 1 byte (up to 0xFF)
- Initialized at value: 0
- 'MEM' represents your memory array.
- There are a total of 4096 slots (MEM[0x000]..MEM[0xFFF])
- -- on XO-CHIP, it has 65536 slots (MEM[0x0000]..MEM[0xFFFF])
- Each slot has a size of 1 byte (up to 0xFF)
- Initialized at value: 0
- 'KEY' represents your keypad array.
- There are a total of 16 keys (KEY[0x0]..KEY[0xF])
- Each key has a boolean value (either 1 or 0)
- Initialized at value: false
- -- Note that it's also valid to implement the keys as a single bitwise value, so long as you adjust your code accordingly.
- 'RPL' represents your persistent registers array. They are unique for each rom.
- There are a total of 16 registers (RPL[0x0]..RPL[0xF])
- -- the original hardware only allowed use of the first 8 registers, but modern implementations and roms may require all 16. Their use is extremely rare.
- Each register has a size of 1 byte (up to 0xFF)
- Initialized at value: 0
- 'VRAM' represents your display pixel array.
- There are a total of 128 pixels horizontally, and 64 vertically, for a total of 8192.
- It should be noted that the system always begins by only allowing the top-left quadrant of the total VRAM area to be seen. This can be changed by instructions.
- The exact implementation of this depends on your preference and ability:
- A) could be a 2D boolean array: VRAM[0][0]..VRAM[127][63]
- B) could be a 1D boolean array: VRAM[0]..VRAM[8191]
- C) or it could be a 1D byte array: VRAM[0]..VRAM[1023]
- D) or even a 2D byte array: VRAM[0][0]..VRAM[15][63]
- -- I will only be providing instruction examples for method A.
- Initialized at value: 0
- 'DELAY_TIMER' is what the name implies. Counts down once per frame if the value is not 0.
- It has a size of 1 byte (up to 0xFF)
- Initialized at value: 0
- 'SOUND_TIMER' is what the name implies. Counts down once per frame if the value is not 0. The buzzer is active for as long as the timer value is not 0 too.
- It has a size of 1 byte (up to 0xFF)
- Initialized at value: 0
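- The state listed above can be sketched as a single container. This is only an illustration in Python: the field names follow the list above, while the class name Chip8State is made up for this example. Python has no fixed-width integers, so the masking mentioned later in these notes applies everywhere.

```python
from dataclasses import dataclass, field


@dataclass
class Chip8State:
    """Hypothetical container for the machine state described above."""
    I: int = 0                 # index register (2 bytes)
    PC: int = 0x200            # program counter, starts at 512
    SP: int = 0                # stack index pointer
    STACK: list = field(default_factory=lambda: [0] * 16)
    V: list = field(default_factory=lambda: [0] * 16)
    MEM: bytearray = field(default_factory=lambda: bytearray(4096))
    KEY: list = field(default_factory=lambda: [False] * 16)
    RPL: list = field(default_factory=lambda: [0] * 16)
    # Method A: a 2D array indexed as VRAM[x][y], 128 columns by 64 rows.
    VRAM: list = field(
        default_factory=lambda: [[0] * 64 for _ in range(128)]
    )
    DELAY_TIMER: int = 0
    SOUND_TIMER: int = 0
```

- Every array is built through a factory so that two instances never share the same list.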
- Having said that, let's establish some platform basics. The system has a screen refresh rate of 60 Hz, and the two timers it has also count down by 1 for every frame in tandem if their value is non-zero.
- This does not, however, imply that the "cpu" of the system runs at the same rate. Some people prefer to have a separate timing system to fine-tune how many instructions they run per second (IPS). This approach, while valid, is also more complicated to pull off. If you're aiming for simplicity, what I would recommend is to run a fixed number of instructions per frame (IPF). This means that you only need to bother with a single timer implementation to control your 60 Hz loop and nothing more.
- The typical chip-8 emulation speed is around 540-660 IPS, and for super-chip it's around 1800 IPS. To match that in IPF, divide the target IPS by 60 and pick whichever whole number is closest or feels best for you. Do keep in mind though that these speeds concern old roms for these two platforms. There exist newer chip-8 and super-chip roms that may require a much higher IPS to run at proper speed, thus your IPF should be a configurable variable.
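- As a rough sketch of the IPF approach in Python: the helper names below (ips_to_ipf, tick_timers, run_frame) are invented for this example, and ticking the timers after the instructions rather than before is an assumption, not something these notes prescribe.

```python
def ips_to_ipf(ips, fps=60):
    """Convert a target instructions-per-second into whole instructions-per-frame."""
    return max(1, round(ips / fps))


def tick_timers(timers):
    """Count both timers down by 1, but only while they are non-zero."""
    for name in ("DELAY_TIMER", "SOUND_TIMER"):
        if timers[name] > 0:
            timers[name] -= 1


def run_frame(step, timers, ipf):
    """One 60 Hz frame: run `ipf` instructions, then tick the timers once."""
    for _ in range(ipf):
        step()
    tick_timers(timers)
```

- With these, 600 IPS works out to 10 IPF and 1800 IPS to 30 IPF. The 60 Hz pacing itself (sleeping or vsync) is left out on purpose.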
- Now that you're a bit more informed on the topic of emulation speed, back to the details. Starting off with the timers, as mentioned, both count down at the same rate as the screen refreshes. The sound timer specifically though will produce a (usually) square wave tone when its value is non-zero. Given that some roms may have obnoxiously long buzz sequences, I'd recommend that the tone you pick isn't ear-piercing, nor too loud.
- As is usually the start to any emulation journey, you'll first need to have some data at the ready for you to play around with, and this means loading a rom into memory. The implementation for that part will be up to you, but here's the general guideline on how to arrange things:
- MEM[0]..MEM[79] = FONT DATA (chip-8)
- MEM[80]..MEM[239] = BIG FONT DATA (super-chip)
- MEM[512]..onwards = ROM DATA
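- The layout above could be applied with something like the following Python sketch. The function name and its parameters are hypothetical, and the font bytes themselves are not given in these notes, so they are taken as arguments rather than spelled out.

```python
def load_into_memory(mem, rom, font=b"", big_font=b""):
    """Copy font data and a ROM image into `mem` following the layout above."""
    if len(font) > 80:
        raise ValueError("small font must fit in MEM[0]..MEM[79]")
    if len(big_font) > 160:
        raise ValueError("big font must fit in MEM[80]..MEM[239]")
    if len(rom) > len(mem) - 512:
        raise ValueError("ROM does not fit in memory")
    mem[0:len(font)] = font                  # chip-8 font
    mem[80:80 + len(big_font)] = big_font    # super-chip big font
    mem[512:512 + len(rom)] = rom            # program starts at 0x200
```

- Rejecting an oversized rom up front avoids silently growing the array, which a bytearray slice assignment would otherwise allow.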
- If you followed the notes on initialization values for the rest earlier on, then that's about it for this segment. Different extensions such as Chip-8 HiRes or Chip-8X have different init values for certain variables, or extra data to tango with. I will not be detailing these here.
- ///////////////////////////////////////////////////////////////////////////
- // This next segment will go over some basic implementations for the //
- // instructions themselves, extra notes about them and any applicable //
- // quirks, and generally spoilers. You have been warned. //
- ///////////////////////////////////////////////////////////////////////////
- As described previously, we want to be running multiple instructions per frame, so bear in mind that the following process (which ideally should be a function of its own for ease of use) will be running many times each frame.
- The very first thing we must do is, of course, to fetch some data from memory and assemble our instruction (also known as opcode). In this system, instructions are always 2 bytes long -- but make no mistake, it does not imply that they will always be aligned in memory to start from an even index. A rom may take such routes that the program counter would start assembling an opcode from an odd index too. Anyway, here's how we'll start:
- OPCODE = (MEM[PC] << 8) | MEM[PC+1]
- Simple enough! I'd like to take a moment to note here that most instructions will bump the program counter (PC) up by 2 at their end. Rather than risk making some mistake in this process, it's easier and prudent to play it safe and simply increment the PC as the very next step:
- PC += 2
- There we go. Before we dive deeper into the instructions themselves, I like to set up some commonly used variables ahead of time. It's *technically* wasteful when not all of them will come into use, but I like the clarity of avoiding macros or redefining the needed bits in every single instruction:
- NNN = OPCODE & 0x0FFF // our 12 bit JUMP address
- NN = OPCODE & 0x00FF // the lowest byte (also seen as KK in other guides)
- Now for the individual nibbles.
- P = (OPCODE & 0xF000) >> 12 // 1st nibble - most significant
- X = (OPCODE & 0x0F00) >> 8 // 2nd nibble - also known as X in opcodes
- Y = (OPCODE & 0x00F0) >> 4 // 3rd nibble - also known as Y in opcodes
- N = OPCODE & 0x000F // 4th nibble - least significant
- And yes, there are more efficient ways to arrange these if you want. You can figure them out if you think about them, but easier understanding is what I'm going for.
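- To make the fetch and the field extraction concrete, here is a small Python sketch with a worked value. The function names fetch and decode are made up for this example; the masks and shifts are the ones shown above.

```python
def fetch(mem, pc):
    """Assemble a 2-byte big-endian opcode starting at `pc`."""
    return (mem[pc] << 8) | mem[pc + 1]


def decode(opcode):
    """Split an opcode into the commonly used fields."""
    return {
        "NNN": opcode & 0x0FFF,
        "NN": opcode & 0x00FF,
        "P": (opcode & 0xF000) >> 12,
        "X": (opcode & 0x0F00) >> 8,
        "Y": (opcode & 0x00F0) >> 4,
        "N": opcode & 0x000F,
    }
```

- For the opcode 0x8AB4 this gives P = 0x8, X = 0xA, Y = 0xB, N = 0x4, NN = 0xB4 and NNN = 0xAB4.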
- Now we can tackle the instructions themselves. There's a switch/case tree at the bottom of this doc that showcases an example structure for the opcode matching process. If you have a different approach you're always welcome to experiment, just make sure not to leave any holes. It's bad practice in general to allow any kind of potential mismatching.
- Let's get to bashing the opcode into actual code.
- .... 00CN - scroll display N lines down (SUPER-CHIP)
- Its purpose is to scroll the VRAM a certain amount of rows downwards, depending on the N value. An N value of 0 is invalid, and thus you should either throw an error for incorrect instruction, or handle as a no-op. The rows that go off-bounds do not get wrapped around and are discarded.
- // The provided example tackles the aforementioned method A
- for(y = 63; y >= N; y--)
-     for(x = 0; x < 128; x++)
-         VRAM[x][y] = VRAM[x][y-N]
- for(y = 0; y < N; y++)
-     for(x = 0; x < 128; x++)
-         VRAM[x][y] = 0
- .... 00E0 - clear the screen
- Its purpose is to clear out the VRAM, so everything must go back to the default initialization value.
- // The provided example tackles the aforementioned method A
- for(y = 0; y < 64; y++)
-     for(x = 0; x < 128; x++)
-         VRAM[x][y] = 0
- .... 00EE - return from subroutine
- Its purpose is to return back to the instruction stored in the last STACK entry, as denoted by the stack index pointer (SP). Do note that there's potential for OOB access here.
- if SP
-     PC = STACK[--SP]
- else
-     QUIT_WITH_MSG: EXIT FROM EMPTY STACK
- .... 00FB - scroll display 4 pixels to the right (SUPER-CHIP)
- Its purpose is to scroll the VRAM 4 columns to the right. The columns that go off-bounds do not get wrapped around and are discarded.
- // ^^ The provided example tackles the aforementioned method A
- for(y = 0; y < 64; y++)
-     for(x = 127; x >= 4; x--)
-         VRAM[x][y] = VRAM[x-4][y]
-     for(x = 0; x < 4; x++)
-         VRAM[x][y] = 0
- .... 00FC - scroll display 4 pixels to the left (SUPER-CHIP)
- Its purpose is to scroll the VRAM 4 columns to the left. The columns that go off-bounds do not get wrapped around and are discarded.
- // ^^ The provided example tackles the aforementioned method A
- for(y = 0; y < 64; y++)
-     for(x = 0; x < 124; x++) // stop at 124 so that x+4 stays in bounds
-         VRAM[x][y] = VRAM[x+4][y]
-     for(x = 124; x <= 127; x++)
-         VRAM[x][y] = 0
- .... 00FD - stop signal (SUPER-CHIP)
- Much like its name implies, it's a signal to stop. What this would actually do for you is, well, up to you. I like to stop further instruction fetching.
- QUIT_WITH_MSG: RECEIVED STOP SIGNAL
- .... 00FE - disable extended screen mode (SUPER-CHIP - will run at 64x32)
- Its purpose is to change the resolution of your display. Typically this means limiting your visibility to the top left quadrant of the VRAM.
- .... 00FF - enable extended screen mode (SUPER-CHIP - will run at 128x64)
- Its purpose is to change the resolution of your display. Typically this means extending the visibility to the entirety of the VRAM.
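- One possible way to handle 00FE/00FF is to keep a single flag and let the renderer decide how much of the VRAM to show. The Python sketch below assumes method A (VRAM[x][y]); the names visible_area and visible_pixels are invented for this example.

```python
def visible_area(hires):
    """Width and height of the visible region for the current mode."""
    return (128, 64) if hires else (64, 32)


def visible_pixels(vram, hires):
    """Return the visible region as a list of rows, top to bottom."""
    width, height = visible_area(hires)
    return [[vram[x][y] for x in range(width)] for y in range(height)]
```

- 00FE would then simply set the flag to false and 00FF to true. Whether the VRAM should also be cleared on a mode switch is not covered by these notes, so the sketch leaves it untouched.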
- .... 0NNN - ML routines
- This category is for anything outside of the aforementioned instructions. If it's a 0x0000 then you've just hit blank memory. Either the rom is malformed, or you did something wrong somewhere. Anything else in this range is a machine-language routine and those should either be no-op'd or stop emulation, as you can't emulate them without emulating the actual computer that chip-8/super-chip used to run on.
- if NNN
-     QUIT_WITH_MSG: MACHINE CODE <OPCODE> NOT SUPPORTED
- else
-     QUIT_WITH_MSG: CALL TO 0x0000 DETECTED
- .... 1NNN - jump to address
- Its purpose is to set the program counter to address NNN. Fairly simple. In many roms, it's used as a method of "stopping" by jumping to itself. You may wish to prevent needless execution by catching such a scenario.
- if PC - 2 == NNN
-     QUIT_WITH_MSG: ADDRESS JUMP LOOP DETECTED
-     // ^^ lots of games initiate one to signal they're done
- else
-     PC = NNN
- .... 2NNN - call subroutine
- Its purpose is to store the current program counter to the STACK at the slot denoted by the stack index pointer (SP), then jump to the address NNN. Do note that there's potential for OOB access here.
- if SP >= 16
-     QUIT_WITH_MSG: CALL STACK OVERFLOW
- else
-     STACK[SP++] = PC
-     PC = NNN
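- The call (2NNN) and return (00EE) pair can be sketched together in Python to show how the bounds checks guard against OOB access. The state is a plain dict here, and raising RuntimeError stands in for QUIT_WITH_MSG; both are choices made for this example only.

```python
def op_2nnn(state, nnn):
    """Push the current PC, then jump to NNN."""
    if state["SP"] >= 16:
        raise RuntimeError("CALL STACK OVERFLOW")
    state["STACK"][state["SP"]] = state["PC"]
    state["SP"] += 1
    state["PC"] = nnn


def op_00ee(state):
    """Pop the last stored PC and resume from it."""
    if state["SP"] == 0:
        raise RuntimeError("EXIT FROM EMPTY STACK")
    state["SP"] -= 1
    state["PC"] = state["STACK"][state["SP"]]
```

- Since PC was already bumped by 2 right after the fetch, the stored value points at the instruction following the call, which is exactly where the return should land.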
- .... 3XNN - skip next instruction if VX == NN
- if V[X] == NN
-     PC += 2
- .... 4XNN - skip next instruction if VX != NN
- if V[X] != NN
-     PC += 2
- .... 5XY0 - skip next instruction if VX == VY
- if V[X] == V[Y]
-     PC += 2
- .... 6XNN - set VX = NN
- V[X] = NN
- .... 7XNN - set VX = VX + NN
- If you don't have a single-byte type, you'll need to mask like in the commented snippet.
- V[X] += NN
- // V[X] &= 0xFF
- .... 8XYN - Arithmetic Instructions. This note is not an instruction itself. I just want to interject and clarify that the order of operations seen here for instructions 8xy4 through 8xyE is important. Remember, either V[X] or V[Y] could, actually, be a V[0xF] underneath.
- .... 8XY0 - set VX = VY
- V[X] = V[Y]
- .... 8XY1 - set VX = VX | VY
- V[X] |= V[Y]
- .... 8XY2 - set VX = VX & VY
- V[X] &= V[Y]
- .... 8XY3 - set VX = VX ^ VY
- V[X] ^= V[Y]
- .... 8XY4 - set VX = VX + VY, VF = carry
- SUM = V[X] + V[Y] // SUM needs a size of at least 2 bytes
- V[X] = SUM & 0xFF
- V[15] = SUM >> 8
- .... 8XY5 - set VX = VX - VY, VF = !borrow
- FLAG = V[X] >= V[Y]
- V[X] = V[X] - V[Y]
- // V[X] &= 0xFF
- V[15] = FLAG
- .... 8XY7 - set VX = VY - VX, VF = !borrow
- FLAG = V[Y] >= V[X]
- V[X] = V[Y] - V[X]
- // V[X] &= 0xFF
- V[15] = FLAG
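- Here is why the order matters, sketched in Python with function names made up for this example. The flag is computed from the original operands first, the result is written second, and VF is written last, so the flag wins even when X or Y happens to be 0xF.

```python
def op_8xy4(V, x, y):
    """VX = VX + VY, VF = carry."""
    total = V[x] + V[y]
    V[x] = total & 0xFF
    V[0xF] = total >> 8


def op_8xy5(V, x, y):
    """VX = VX - VY, VF = !borrow."""
    flag = 1 if V[x] >= V[y] else 0
    V[x] = (V[x] - V[y]) & 0xFF
    V[0xF] = flag
```

- Take V[0] = 10 and V[0xF] = 3 with the opcode 80F5: the correct outcome is V[0] = 7 and VF = 1. Writing VF before the subtraction would clobber the operand and give V[0] = 9 instead.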
- .... 8XY6 - set VX = VX >> 1, VF = carry
- This instruction has a discrepancy due to confused documentation. The original (chip-8) method is to shift VY into VX, whereas the alternative (super-chip) is to shift VX itself. The SHIFTQUIRK variable in this case is TRUE if we want the latter behavior.
- if SHIFTQUIRK Y = X
- FLAG = V[Y] & 1
- V[X] = (V[Y] >> 1) & 0xFF
- V[15] = FLAG
- .... 8XYE - set VX = VX << 1, VF = carry
- This instruction has a discrepancy due to confused documentation. The original (chip-8) method is to shift VY into VX, whereas the alternative (super-chip) is to shift VX itself. The SHIFTQUIRK variable in this case is TRUE if we want the latter behavior.
- if SHIFTQUIRK Y = X
- FLAG = V[Y] >> 7
- V[X] = V[Y] << 1
- // V[X] &= 0xFF
- V[15] = FLAG
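- Both shift variants in one Python sketch, with the quirk passed in as a parameter. The function names and the shift_quirk argument are invented for this example; the behavior follows the description above.

```python
def op_8xy6(V, x, y, shift_quirk):
    """Shift right by 1; VF receives the bit that was shifted out."""
    src = x if shift_quirk else y
    flag = V[src] & 1
    V[x] = V[src] >> 1
    V[0xF] = flag


def op_8xye(V, x, y, shift_quirk):
    """Shift left by 1; VF receives the bit that was shifted out."""
    src = x if shift_quirk else y
    flag = V[src] >> 7
    V[x] = (V[src] << 1) & 0xFF
    V[0xF] = flag
```

- With V[1] = 4 and V[2] = 3, the original behavior of 8126 leaves V[1] = 1 and VF = 1, while the super-chip behavior leaves V[1] = 2 and VF = 0.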
- .... 9XY0 - skip next instruction if VX != VY
- if V[X] != V[Y]
-     PC += 2
- .... ANNN - set I = NNN
- I = NNN
- .... BNNN - jump to NNN + V0 (or V[X])
- This instruction has a discrepancy due to confused documentation. The original (chip-8) method is to add V[0] to the address NNN for the jump, whereas the alternative (super-chip) is to add V[X] instead. The JUMPQUIRK variable in this case is TRUE if we want the latter behavior.
- if JUMPQUIRK
-     PC = NNN + V[X]
- else
-     PC = NNN + V[0]
- .... CXNN - set VX = RND & NN
- V[X] = RND(256) & NN
- .... DXYN - draw sprite
- // TBD
- .... EX9E - skip next instruction if key VX is held
- The system polls the key denoted at V[X] to check if it's held down. Since the system only had 4 hardware lines for the keyboard, all bits past the 4th of the V[X] value are ignored, thus the masking.
- if KEY[V[X] & 0xF] == 1
-     PC += 2
- .... EXA1 - skip next instruction if key VX is not held
- The system polls the key denoted at V[X] to check if it's held down. Since the system only had 4 hardware lines for the keyboard, all bits past the 4th of the V[X] value are ignored, thus the masking.
- if KEY[V[X] & 0xF] == 0
-     PC += 2
- .... FX07 - set VX = delaytimer
- V[X] = DELAY_TIMER
- .... FX0A - wait for key press and release, set VX = key
- This instruction waits for a key to be pressed and subsequently released. To accomplish this, you need to be able to compare the current key state with that of the last frame. Example code that'd take place along with your timer decrements:
- // for(z = 0; z < 16; z++)
- //     CACHED_KEY[z] = CUR_KEY_STATE[z]
- //     CUR_KEY_STATE[z] = iskeyheld(z)
- You can also cheat a little by checking only for a key release; it will work fine:
- for(z = 0; z < 16; z++)
-     if CACHED_KEY[z] & !CUR_KEY_STATE[z]
-         V[X] = z
-         return // < terminate early if we got a match
- PC -= 2 // if the loop didn't detect a key release, we must backtrack
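- The release-only approach, sketched in Python. The function name and the choice to return the new PC instead of mutating it are specific to this example; cached_key and cur_key_state mirror the two arrays described above.

```python
def op_fx0a(V, x, pc, cached_key, cur_key_state):
    """Return the PC to continue from; stores the released key in V[x]."""
    for z in range(16):
        # A key counts as released if it was held last frame and is not held now.
        if cached_key[z] and not cur_key_state[z]:
            V[x] = z
            return pc
    # No release seen: step back so this same instruction runs again.
    return pc - 2
```

- Because PC was already advanced past the instruction, returning pc - 2 makes the emulator re-run FX0A until a release shows up, while the timers keep ticking normally in between.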
- .... FX15 - set delaytimer = VX
- DELAY_TIMER = V[X]
- .... FX18 - set soundtimer = VX
- SOUND_TIMER = V[X]
- .... FX1E - set I = I + VX
- I += V[X]
- You may have seen other guides suggesting to set VF according to whether the I register overflowed past 0xFFF. This is incorrect. The only rom that makes use of this is called Spacefight 2091, a super-chip game. No overflow occurs, it just wants VF to be set to 0 because the game is buggy. Do not implement this behavior, look for the patched rom instead.
- .... FX29 - point I to 5-byte-tall numeric sprite for value in VX
- The system used a jump table originally, and would mask the value of V[X] to ensure it doesn't go OOB.
- I = (V[X] & 0xF) * 5
- .... FX30 - point I to 10-byte-tall numeric sprite for value in VX (SUPER-CHIP)
- The system used a jump table originally, and would mask the value of V[X] to ensure it doesn't go OOB.
- I = ((V[X] & 0xF) * 10) + 80
- .... FX33 - store BCD of VX in memory at I, I+1 and I+2
- The instruction merely separates the hundreds, tens, and singles numbers from V[X] and stores them into memory in sequence.
- MEM[I] = V[X] / 100
- MEM[I+1] = (V[X] / 10) % 10
- MEM[I+2] = V[X] % 10
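- A quick worked example in Python, where the divisions must be integer divisions (the // operator). The function name is made up for this example.

```python
def op_fx33(mem, i, vx):
    """Store the hundreds, tens and ones digits of `vx` at mem[i..i+2]."""
    mem[i] = vx // 100
    mem[i + 1] = (vx // 10) % 10
    mem[i + 2] = vx % 10
```

- For a V[X] of 254 this stores 2, 5 and 4 in sequence; for a V[X] of 7 it stores 0, 0 and 7.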
- .... FX55 - save V0..VX in memory at I..I+X
- This instruction has a discrepancy due to confused documentation. The original (chip-8) method increments the I register for each loop iteration, whereas the alternative (super-chip) does not. The LOADSTOREQUIRK variable in this case is TRUE if we want the latter behavior.
- for(n = 0; n <= X; n++)
-     MEM[I+n] = V[n]
- if !LOADSTOREQUIRK
-     I += X + 1
- .... FX65 - load V0..VX from memory at I..I+X
- This instruction has a discrepancy due to confused documentation. The original (chip-8) method increments the I register for each loop iteration, whereas the alternative (super-chip) does not. The LOADSTOREQUIRK variable in this case is TRUE if we want the latter behavior.
- for(n = 0; n <= X; n++)
-     V[n] = MEM[I+n]
- if !LOADSTOREQUIRK
-     I += X + 1
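- FX55 and FX65 side by side in Python, with the quirk passed in as a parameter. The function names and the choice to return the new value of I are specific to this example.

```python
def op_fx55(mem, V, i, x, load_store_quirk):
    """Save V0..VX to memory starting at I; return the new I."""
    for n in range(x + 1):
        mem[i + n] = V[n]
    return i if load_store_quirk else i + x + 1


def op_fx65(mem, V, i, x, load_store_quirk):
    """Load V0..VX from memory starting at I; return the new I."""
    for n in range(x + 1):
        V[n] = mem[i + n]
    return i if load_store_quirk else i + x + 1
```

- With I = 0x300 and X = 3, the original behavior leaves I at 0x304, whereas the super-chip behavior leaves it at 0x300.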
- .... FX75 - save V0..VX (X<8) in the RPL flags (SUPER-CHIP)
- This instruction, in the original super-chip implementation, was limited to 8 RPL registers. If it was called with an X of 8 or larger, it was capped to 7. You *probably* don't have to worry about emulating that.
- for(n = 0; n <= X; n++)
-     RPL[n] = V[n]
- .... FX85 - load V0..VX (X<8) from the RPL flags (SUPER-CHIP)
- This instruction, in the original super-chip implementation, was limited to 8 RPL registers. If it was called with an X of 8 or larger, it was capped to 7. You *probably* don't have to worry about emulating that.
- for(n = 0; n <= X; n++)
-     V[n] = RPL[n]
- ///////////////////////////////////////////////////////////////////////////////
- ///////////////////////////////////////////////////////////////////////////////
- On this part I'll show you a structure for matching the opcode to the appropriate instructions. I'll try to keep it simple, but let's review the variables we defined earlier so you don't have to scroll all the way back up, context is important at all times!
- NNN = OPCODE & 0x0FFF // our 12 bit JUMP address
- NN = OPCODE & 0x00FF // the lowest byte (also seen as KK in other guides)
- P = (OPCODE & 0xF000) >> 12 // 1st nibble - most significant
- X = (OPCODE & 0x0F00) >> 8 // 2nd nibble - also known as X in opcodes
- Y = (OPCODE & 0x00F0) >> 4 // 3rd nibble - also known as Y in opcodes
- N = OPCODE & 0x000F // 4th nibble - least significant
- You will want to break out after executing any opcode. Don't forget it.
- CASE P
-   0x0 :
-     CASE (OPCODE & 0x0FF0) >> 4
-       0x0C :
-         CASE N
-           0x0 : INVALID
-           DEF : 00CN (SUPER-CHIP)
-       0x0E :
-         CASE N
-           0x0 : 00E0
-           0xE : 00EE
-           DEF : INVALID
-       0x0F :
-         CASE N
-           0xB : 00FB (SUPER-CHIP)
-           0xC : 00FC (SUPER-CHIP)
-           0xD : 00FD (SUPER-CHIP)
-           0xE : 00FE (SUPER-CHIP)
-           0xF : 00FF (SUPER-CHIP)
-           DEF : INVALID
-       DEF : INVALID
-   0x1 : 1NNN
-   0x2 : 2NNN
-   0x3 : 3XNN
-   0x4 : 4XNN
-   0x5 : 5XY0 // if N > 0 that's invalid
-   0x6 : 6XNN
-   0x7 : 7XNN
-   0x8 :
-     CASE N
-       0x0 : 8XY0
-       0x1 : 8XY1
-       0x2 : 8XY2
-       0x3 : 8XY3
-       0x4 : 8XY4
-       0x5 : 8XY5
-       0x6 : 8XY6
-       0x7 : 8XY7
-       0xE : 8XYE
-       DEF : INVALID
-   0x9 : 9XY0 // if N > 0 that's invalid
-   0xA : ANNN
-   0xB : BNNN
-   0xC : CXNN
-   0xD : DXYN
-   0xE :
-     CASE NN
-       0x9E : EX9E
-       0xA1 : EXA1
-       DEF : INVALID
-   0xF :
-     CASE NN
-       0x07 : FX07
-       0x0A : FX0A
-       0x15 : FX15
-       0x18 : FX18
-       0x1E : FX1E
-       0x29 : FX29
-       0x30 : FX30 (SUPER-CHIP)
-       0x33 : FX33
-       0x55 : FX55
-       0x65 : FX65
-       0x75 : FX75 (SUPER-CHIP)
-       0x85 : FX85 (SUPER-CHIP)
-       DEF : INVALID
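- For reference, the same tree can be expressed compactly in Python with lookup tables. This sketch only names the matched instruction rather than executing it; the function name classify is made up, and every unmatched pattern falls through to "INVALID" so that no holes are left. Note that the 0x0 branch shifts the masked value right by 4 so it can be compared against 0x0C, 0x0E and 0x0F.

```python
ZERO_E = {0x0: "00E0", 0xE: "00EE"}
ZERO_F = {0xB: "00FB", 0xC: "00FC", 0xD: "00FD", 0xE: "00FE", 0xF: "00FF"}
EIGHT = {0x0: "8XY0", 0x1: "8XY1", 0x2: "8XY2", 0x3: "8XY3", 0x4: "8XY4",
         0x5: "8XY5", 0x6: "8XY6", 0x7: "8XY7", 0xE: "8XYE"}
KEYS = {0x9E: "EX9E", 0xA1: "EXA1"}
MISC = {0x07: "FX07", 0x0A: "FX0A", 0x15: "FX15", 0x18: "FX18",
        0x1E: "FX1E", 0x29: "FX29", 0x30: "FX30", 0x33: "FX33",
        0x55: "FX55", 0x65: "FX65", 0x75: "FX75", 0x85: "FX85"}
SIMPLE = {0x1: "1NNN", 0x2: "2NNN", 0x3: "3XNN", 0x4: "4XNN", 0x6: "6XNN",
          0x7: "7XNN", 0xA: "ANNN", 0xB: "BNNN", 0xC: "CXNN", 0xD: "DXYN"}


def classify(opcode):
    """Return the name of the instruction pattern matching `opcode`."""
    p = (opcode & 0xF000) >> 12
    n = opcode & 0x000F
    nn = opcode & 0x00FF
    if p == 0x0:
        mid = (opcode & 0x0FF0) >> 4
        if mid == 0x0C:
            return "00CN" if n else "INVALID"
        if mid == 0x0E:
            return ZERO_E.get(n, "INVALID")
        if mid == 0x0F:
            return ZERO_F.get(n, "INVALID")
        return "INVALID"
    if p == 0x5:
        return "5XY0" if n == 0 else "INVALID"
    if p == 0x9:
        return "9XY0" if n == 0 else "INVALID"
    if p == 0x8:
        return EIGHT.get(n, "INVALID")
    if p == 0xE:
        return KEYS.get(nn, "INVALID")
    if p == 0xF:
        return MISC.get(nn, "INVALID")
    return SIMPLE[p]
```

- A real interpreter would map these names (or the table entries directly) to handler functions instead of returning strings.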