Bring_Stabity

SRW: GC VWF writeup DRAFT!

Feb 19th, 2015
This document is not a tutorial, but rather documentation of how I added a VWF (variable width font) to Super Robot Wars: GameCube.

Speaking of which, I made a VWF!

And in case that doesn't sound impressive enough...

I added a VWF to a game by hex editing the compiled PowerPC binaries without the help of an assembler.

Tools:
Dolphin Debug version.
IDA - Interactive DisAssembler
Gekko CPU manual
Windows calculator (in programmer mode)
Notepad

OK, now onto what I did. Dashman had already made a custom font, figured out how to inject it into a fixed location in memory, and used AR (Action Replay) codes to change the font pointers (all six of them) to point to his custom font. However, we needed a way of doing this without an AR code if we wanted this to run on actual hardware. Therefore, we'd have to edit the pointers some time after they're written, but before they're used. Dash proposed editing part of the character drawing routine to modify the pointers, although that would require the pointers to be rewritten each time we wanted to draw a character. Such an inefficient proposal is anathema to machine code programmers (seriously, when I go full ASM, I will shank you for an extra clock cycle or two of savings). Therefore, I ventured out to find the location in code where the pointers were written.

I fired up Dolphin, set up a memory breakpoint for all the font pointers and tried to run it (turns out the memory breakpoints don't work correctly). So I decided that I would work my way through the function tree to find where the memory was changed. I walked through the code, and every time a subroutine was called, I'd save state before the call and put a breakpoint on the instruction right after it. If the memory location had changed by the time that breakpoint hit, I knew the write was somewhere in that sub-tree; if it hadn't, I moved on. I actually started fairly deep inside the function tree, because I had started where the memory breakpoint dropped me off. When I had identified the precise instruction that wrote to that memory location, I fired up IDA and examined it from there. What I found was a long chain of...

mr r4, r29
addi r3, r31, <var1>
bl sub_80242014
stw r3, <var2>(r13)

That branch and link operation called the decompression routine, and that store word operation wrote the returned pointer to the memory address in question. I had an idea of how to change the pointers if I could safely sneak in a "bl" operation somewhere and find the free space to add my own code. This, however, was a silver platter. I had two registers that I could safely modify, and four operations per pointer store. I needed either one register and three operations per store, or two registers and 2.5 operations per store on average (since I was able to reuse a temporary value in 3/6 cases). I replaced three adjacent blocks of the above code, which wrote three of the six font pointers to memory, with the following code.

lis r4, -0x7F62          # r4 = 0x809E0000, the top half shared by all three pointers
ori r3, r4, 0x548C       # r3 = 0x809E548C
nop
stw r3, -0x6E8C(r13)
nop
ori r3, r4, 0xB170       # r3 = 0x809EB170
nop
stw r3, -0x6E80(r13)
nop
ori r3, r4, 0xE7E4       # r3 = 0x809EE7E4
nop
stw r3, -0x6E44(r13)
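
Since none of this went through an assembler, every line above had to be turned into a 32 bit word by hand before it could be typed into a hex editor. As a rough sketch of that arithmetic (Python is only standing in for the programmer-mode calculator here; the helper names are made up for illustration, and the layout assumed is the standard PowerPC D-form: 6 bit opcode, two 5 bit register fields, 16 bit immediate):

```python
# Hypothetical helpers for illustration; not a tool used in the writeup.
def d_form(opcode, field1, field2, imm):
    """Pack a PowerPC D-form word: opcode | reg field | reg field | 16 bit immediate."""
    return (opcode << 26) | (field1 << 21) | (field2 << 16) | (imm & 0xFFFF)

def lis(rd, imm):
    # lis rD, imm is shorthand for addis rD, 0, imm (opcode 15)
    return d_form(15, rd, 0, imm)

def ori(ra, rs, imm):
    # ori rA, rS, imm (opcode 24); the source register sits in the first field
    return d_form(24, rs, ra, imm)

def stw(rs, disp, ra):
    # stw rS, disp(rA) (opcode 36)
    return d_form(36, rs, ra, disp)

NOP = ori(0, 0, 0)  # the standard PowerPC nop is ori r0, r0, 0

# First four instructions of the patched block.
words = [
    lis(4, -0x7F62),
    ori(3, 4, 0x548C),
    NOP,
    stw(3, -0x6E8C, 13),
]

for w in words:
    print(f"{w:08X}")
# 3C80809E
# 6083548C
# 60000000
# 906D9174
```

The negative immediates are simply masked to 16 bits, which is why -0x7F62 shows up as 809E in the encoded word.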

As an aside, I must point out that the Gekko dispatcher consumes nops, so there is no performance hit to actually using them. If there was, you can bet your ass that I would have moved all the instructions close together and branched to the next real operation. That would have saved five whole clock cycles (unconditional branches are also free) (I'm a crazy ASM guy, huh?).

What this code does: r4 contains the top half of the font pointer. r3 gets the bottom half of the pointer ORed with the top half, or in other words, the complete pointer. In my original version I messed up here: I used addi instead of ori. However, addi takes a signed 16 bit immediate, so for any bottom half of 0x8000 or above (0xB170 and 0xE7E4 here) I was actually subtracting from the top half instead of adding to it. ori takes an unsigned 16 bit immediate and performs a bitwise OR. The stw operation stayed mostly the same between the original and my code. I shifted the order of the stores so that I could group my pointers logically and reduce the number of lis commands that I'd have to use (remember that part where I mentioned that ASM coders are crazy?).
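
The addi versus ori mix-up is easy to reproduce on paper. A minimal sketch, assuming only that addi sign-extends its 16 bit immediate and ori zero-extends it (the function names are illustrative):

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number, the way addi does."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def lis_then_ori(high, low):
    """lis rX, high followed by ori rY, rX, low."""
    return ((high << 16) | (low & 0xFFFF)) & 0xFFFFFFFF

def lis_then_addi(high, low):
    """lis rX, high followed by addi rY, rX, low."""
    return ((high << 16) + sign_extend16(low)) & 0xFFFFFFFF

for low in (0x548C, 0xB170, 0xE7E4):
    print(f"{low:04X}: ori -> {lis_then_ori(0x809E, low):08X}, "
          f"addi -> {lis_then_addi(0x809E, low):08X}")
# 548C: ori -> 809E548C, addi -> 809E548C
# B170: ori -> 809EB170, addi -> 809DB170
# E7E4: ori -> 809EE7E4, addi -> 809DE7E4
```

Only the first pointer survives the addi version; the other two land 0x10000 too low. The usual way to keep addi is to bump the lis value by one whenever the bottom half has its top bit set, and ori avoids that bookkeeping entirely.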

This gave us the correct font, but it still didn't give us the correct widths. The game would still try to draw Shift-JIS characters as 18 pixels wide, and ASCII characters as 10 pixels wide. This was in fact a saving grace for me: the game was already capable of doing a dual width font, so I just needed to expand it to a true variable width font. Dashman had already located the area in the code where the comparison was done, and we had a workaround that replaced the S-JIS width call with a hard coded "li r6, 9", which would halve the width of the S-JIS characters, giving us nice S-JIS alphanumerics (the event text needs to be in S-JIS because they use ASCII characters as control codes) at the cost of unreadable Japanese characters.

At this point, I need to go on another aside about the operation that finds whether a character is S-JIS or ASCII. The next character to be printed is stored in register 31. An ASCII character will look like 0000 00xx. An S-JIS character will look like 0000 8xxx. I would have used a plain compare. They used a mask and compare operation: the value was ANDed with the mask 0000 FF00 and the result compared against 0. If the result was 0, it was an ASCII character and the next branch would catch it. If it was an S-JIS character, the comparison would fail and execution would fall through the branch.
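
For reference, the two ways of telling the character classes apart can be sketched like this. The mask version follows the description above; the plain compare is only a guess at the kind of check meant by "a compare", not something taken from the game:

```python
def is_ascii_mask(ch):
    """Mask and compare: a zero high byte means a single-byte character."""
    return (ch & 0xFF00) == 0

def is_ascii_compare(ch):
    """Plain compare alternative (an assumption): anything up to 0xFF is single-byte."""
    return ch <= 0xFF

# 0x41 is ASCII 'A'; 0x8260 is the full-width 'A' in Shift-JIS.
for ch in (0x41, 0x8260):
    print(f"{ch:04X}: mask -> {is_ascii_mask(ch)}, compare -> {is_ascii_compare(ch)}")
# 0041: mask -> True, compare -> True
# 8260: mask -> False, compare -> False
```

For 16 bit character values both checks agree, so the choice between them is mostly a matter of instruction count.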

There was no space for me to put the variable width code in place, so I had to find some free space. Dash and I had already tested expanding the size of the DOL, and that didn't work, so the code had to go somewhere inside the existing binary. I found plenty of room inside of the "Metrowerks Target Resident Kernel for PowerPC". So, I replaced the two fixed font width memory calls with subroutine calls into the kernel. Fortunately, this didn't cause a problem. Bugs in my code caused a few major problems. Here's my current S-JIS variable width code. I need to update it to take into account a few special characters (and improve the efficiency, it hurts to look at that code right now).
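
.set r4_stack, -0x28
.set r3_stack, -0x24

stw r3, r3_stack(r1)      # VWF for S-JIS: save the two scratch registers
stw r4, r4_stack(r1)
cmplwi r30, 0x829A        # characters above 0x829A get the default width
bgt loc_8000352C
lis r3, -0x7F61
ori r3, r3, 0x6F7C        # r3 = 0x809F6F7C, base of the S-JIS width table
addi r4, r30, -0x7000     # r4 = character - 0x8140, done in two steps because
addi r4, r4, -0x1140      # -0x8140 does not fit in addi's signed 16 bit immediate
add r4, r4, r3
lbz r6, 0(r4)             # r6 = width byte for this character
b loc_80003530
loc_8000352C:
li r6, 0x12               # default width: 18 pixels
loc_80003530:
lwz r3, r3_stack(r1)      # restore the scratch registers
lwz r4, r4_stack(r1)
blr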

Of course I didn't write this without plenty of hilarious errors. First, I accidentally used lwz instead of lbz. That resulted in me trying to draw a character that was 151 million pixels wide. That gave a nice error. Also, I didn't keep my registers straight the next two times, so I ended up addressing the wrong point in memory and got noise for my widths. That noise resulted in the characters being several letters wide, repeating the same letter each time, but scrolling up a little each time. It was hilarious. Kudos if you can imagine a file structure which would result in that happening.
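
The 151 million figure is what falls out if lwz picks up four neighbouring width bytes at once. A small sketch, assuming (this is not stated above) that the four bytes read were all 9, the same width as the earlier hard coded workaround:

```python
import struct

# Assumed table contents: four neighbouring characters, each 9 pixels wide.
table = bytes([9, 9, 9, 9])

as_byte = table[0]                            # what lbz loads: one byte, zero-extended
(as_word,) = struct.unpack(">I", table[:4])   # what lwz loads: four bytes, big-endian

print(as_byte)   # 9
print(as_word)   # 151587081
```

0x09090909 is 151,587,081, which matches the reported width rather nicely.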

Just for completeness' sake, here's the code for the ASCII characters. Also, note that Shift-JIS has ASCII characters at the ASCII codepoints, but also half-width characters in the upper half of the 8 bit codespace. Notice that I was much smarter with the code this time. I use only one register (saving me a push and a pop) and I set up the memory offset to take into account that the ASCII character set starts at 0x20.
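
stw r3, var_24(r1)        # VWF for ASCII: save the one scratch register
cmplwi r30, 0x7D          # characters above 0x7D get the default width
bgt loc_80003570
lis r3, -0x7F61
ori r3, r3, 0x70BC        # r3 = 0x809F70BC, offset chosen so 0x20 hits the first entry
add r3, r30, r3
lbz r6, 0(r3)             # r6 = width byte for this character
b loc_80003574
loc_80003570:
li r6, 0xA                # default width: 10 pixels
loc_80003574:
lwz r3, var_24(r1)        # restore the scratch register
blr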
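
To see which bytes the two routines actually read, the address arithmetic can be modelled in a few lines. This is only a sketch of the listings above: the constants are copied from them, the function names are made up, and nothing here says what the table bytes contain.

```python
SJIS_BASE = 0x809F6F7C    # base loaded by the S-JIS routine
ASCII_BASE = 0x809F70BC   # base loaded by the ASCII routine

def sjis_width_address(ch):
    """Address the S-JIS routine reads, or None when it falls back to 18 pixels."""
    if ch > 0x829A:
        return None
    # Same two-step subtraction as the addi pair: ch - 0x7000 - 0x1140 = ch - 0x8140
    return SJIS_BASE + (ch - 0x7000 - 0x1140)

def ascii_width_address(ch):
    """Address the ASCII routine reads, or None when it falls back to 10 pixels."""
    if ch > 0x7D:
        return None
    return ASCII_BASE + ch

print(f"{sjis_width_address(0x8140):08X}")   # 809F6F7C, first S-JIS entry
print(f"{sjis_width_address(0x829A):08X}")   # 809F70D6, last S-JIS entry
print(f"{ascii_width_address(0x20):08X}")    # 809F70DC, entry for the space character
print(f"{ascii_width_address(0x7D):08X}")    # 809F7139, last ASCII entry
```

With these bases the last S-JIS entry sits just below the first ASCII entry, which suggests the two width tables were packed nearly back to back. Neither routine checks a lower bound, so the model only makes sense for characters at or above 0x8140 and 0x20 respectively.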

Offset notes:
0x801501AC: Start of the method that calls the decompression routine and stores the pointers to memory.
0x80150228: The first decompression call.
0x8040AF80: The value stored in r13. All pointers to decompressed assets are stored at a fixed offset from here.
0x80003500: Location of the injected S-JIS VWF function.
0x80003550: Location of the injected ASCII VWF function.
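
Combining the r13 value with the displacements from the patched stw instructions gives the absolute addresses of the three font pointer slots. A quick sketch of that sum (the function name is illustrative; the numbers come from the listing and the notes above):

```python
R13 = 0x8040AF80   # value of r13 from the offset notes

def slot_address(displacement):
    """Absolute address written by stw rS, displacement(r13)."""
    return (R13 + displacement) & 0xFFFFFFFF

for disp in (-0x6E8C, -0x6E80, -0x6E44):
    print(f"stw r3, -0x{-disp:X}(r13) -> 0x{slot_address(disp):08X}")
# stw r3, -0x6E8C(r13) -> 0x804040F4
# stw r3, -0x6E80(r13) -> 0x80404100
# stw r3, -0x6E44(r13) -> 0x8040413C
```

These would be the addresses to watch in a debugger to confirm that the patched pointers landed where the originals used to go.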