Thurler

Un-nerfing Extra Attack

Apr 20th, 2023 (edited)
  1. "Bro why is Sakuya so bad" - someone playing PD with nerfed Extra Attack, probably. Were the nerfs too harsh? Would it have been better to keep it as it was? Well, maybe we can figure out a better way if we can change how Extra Attack and the other skills relating to it work:
  2.  
  3. 01. How Extra Attack works in the code
  4. 02. The proc rate
  5. 03. The subclass proc rate nerf
  6. 04. The MP cost nerf
  7. 05. Patching the MP cost nerf
  8. 06. The MP cost un-nerf code
  9.  
  10. ==================================================================
  11.  
  12. 01. How Extra Attack works in the code
  13.  
  14. Using Ghidra and Cheat Engine, we can isolate the code that decides whether Extra Attack will proc or not, and what to do if it does. This code runs right after the animation for a move is played and all of that move's side effects (like inflicting ailments) are applied:
  15.  
  16. int skill_level(character, skill_id):
  17.     if character cannot learn skill_id:
  18.         return 0;
  19.     else:
  20.         return character.skills.find(skill_id).level;
  21.  
  22. int extra_attack_id = 0x118;
  23. int vengeful_cat_step_id = 0x19d;
  24. int perilous_blossom_id = 0x1bc;
  25.  
  26. bool will_proc = false;
  27. int proc_rate = 0;
  28. proc_rate = proc_rate + (15 * skill_level(attacker, extra_attack_id));
  29. proc_rate = proc_rate + (10 * skill_level(attacker, vengeful_cat_step_id));
  30.  
  31. if attack_used is a subclass attack:
  32.     proc_rate = 2 * proc_rate / 3; // integer division
  33.  
  34. int random_number = rand(99); // generates a number 0-99
  35. if perilous_blossom_last_concentrate:
  36.     will_proc = true;
  37.     perilous_blossom_last_concentrate = false;
  38. else if random_number < proc_rate:
  39.     will_proc = true;
  40.  
  41. if skill_level(attacker, perilous_blossom_id) > 0:
  42.     if attack_used is "concentrate":
  43.         will_proc = false;
  44.  
  45. if will_proc:
  46.     if target is still alive:
  47.         if attack_used requires 0 or more MP:
  48.             int restore = attack_used.required_mp / 2; // integer division
  49.             attacker.mp = attacker.mp + restore;
  50.         do_move(attack_used, attacker, target); // starts a new animation
  51.     else:
  52.         for each enemy_slot of the 5 enemy slots:
  53.             if enemy_slot is not empty:
  54.                 int restore = attack_used.required_mp / 2; // integer division
  55.                 attacker.mp = attacker.mp + restore;
  56.                 do_move(attack_used, attacker, enemy_slot); break; // starts a new animation; only the lowest filled slot is used
  57.  
  58. if skill_level(attacker, perilous_blossom_id) > 0:
  59.     if attack_used is "concentrate":
  60.         perilous_blossom_last_concentrate = true;
  61.  
  62. I know that was quite the long block of code to go through, but we can break it down into a few major sections: calculating the proc rate, deciding if the skill will proc, deciding on a new target if the previous one died, restoring part of the used MP, and finally calling a new move recursively. At the end it updates the flag that controls functionality for Perilous Blossom Spring. Let's analyze each part of it:
  63.  
  64. The base proc rate is 15 times the skill level. Note that this goes against the in-game description, which claims it to be 16: with a random number from 0 to 99 and a strict < comparison, only 15 of the 100 possible values trigger a proc at level 1. The description would be correct if the comparison were made with <= instead of just <. We then add 10 to the base proc rate for each level of Vengeful Cat's Erratic Step learned. A final step is performed if the attack being used is from a subclass, which doubles the proc rate and then divides it by 3 using integer division.
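
The arithmetic above can be modeled in a few lines of Python. This is only a sketch of the pseudocode, not the game's code: the function names `proc_rate` and `proc_chance_percent` and their parameters are made up for illustration.

```python
def proc_rate(extra_attack_level: int, cat_step_level: int, is_subclass_attack: bool) -> int:
    """Model of the proc rate calculation from the pseudocode (hypothetical helper)."""
    rate = 15 * extra_attack_level + 10 * cat_step_level
    if is_subclass_attack:
        rate = 2 * rate // 3  # integer division, like the game does
    return rate


def proc_chance_percent(rate: int, inclusive: bool = False) -> int:
    """Count how many of the 100 possible rolls (0-99) would trigger a proc."""
    if inclusive:
        return sum(1 for roll in range(100) if roll <= rate)
    return sum(1 for roll in range(100) if roll < rate)


print(proc_chance_percent(proc_rate(1, 0, False)))        # 15, what the strict < comparison gives
print(proc_chance_percent(proc_rate(1, 0, False), True))  # 16, what a <= comparison would give
print(proc_rate(2, 1, True))                              # 26, i.e. (30 + 10) * 2 // 3
```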
  65.  
  66. In order to decide if Extra Attack procs, we must first check the conditions of Perilous Blossom Spring: if we used Concentrate on the last turn, then Extra Attack must proc. If we are using Concentrate right now, however, it must not proc, in order to prevent an infinite recursive loop when Concentrate is called a second time. If none of these special cases apply, we take the proc rate and compare it to a random value 0-99.
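
The decision logic can be sketched as a small Python function. It mirrors the order of checks in the pseudocode; `decide_proc` and its argument names are hypothetical, and the returned flag stands in for perilous_blossom_last_concentrate.

```python
def decide_proc(roll: int, rate: int, last_concentrate_flag: bool,
                has_perilous_blossom: bool, using_concentrate: bool):
    """Return (will_proc, new_flag) following the pseudocode's order of checks."""
    will_proc = False
    if last_concentrate_flag:
        # Concentrate was used last turn: guaranteed proc, and the flag is consumed
        will_proc = True
        last_concentrate_flag = False
    elif roll < rate:
        will_proc = True

    if has_perilous_blossom and using_concentrate:
        # Concentrate itself never procs, but it arms the flag for the next move
        will_proc = False
        last_concentrate_flag = True
    return will_proc, last_concentrate_flag


print(decide_proc(99, 0, True, True, False))     # (True, False): forced proc
print(decide_proc(0, 100, False, True, True))    # (False, True): Concentrate never procs
print(decide_proc(15, 15, False, False, False))  # (False, False): 15 is not < 15
```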
  67.  
  68. If the original target is still alive, Extra Attack will continue targeting it. If it is not, however, the game picks the lowest enemy slot that is filled as the new target. This has a possibly unintended side effect of changing rows from top to bottom if all 5 slots are filled, slot 4 dies, slot 5 lives, and slot 4 was originally targeted. Regardless, note that the code that restores MP and calls the move recursively is repeated in both the IF and ELSE statements.
  69.  
  70. There are only 2 minor differences: the first is the target of the new recursive move call, and the other is a check, present only when the original target is still alive, that the move used does not have a negative MP cost, to avoid having the game subtract the MP recovered from using Concentrate.
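
The retargeting rule described above can be illustrated with a short Python sketch. The list-of-slots representation and the name `pick_extra_attack_target` are assumptions made for this example; `None` stands for an empty or dead slot.

```python
from typing import List, Optional


def pick_extra_attack_target(slots: List[Optional[str]], original_index: int) -> Optional[int]:
    """Return the slot index Extra Attack would hit, or None if no enemy is left."""
    if slots[original_index] is not None:
        return original_index  # original target still alive: keep hitting it
    for index, enemy in enumerate(slots):
        if enemy is not None:
            return index  # lowest filled slot wins
    return None


# All 5 slots were filled, slot 4 (index 3) was targeted and just died, slot 5 lives:
slots = ["enemy 1", "enemy 2", "enemy 3", None, "enemy 5"]
print(pick_extra_attack_target(slots, 3))  # 0, so the attack jumps to slot 1 rather than slot 5
```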
  71.  
  72. Now that we have an understanding of how the code works, we have 4 main goals when trying to apply balance patches to Extra Attack:
  73.  
  74. > Change the proc rate of Extra Attack
  75. > Change the proc rate of Vengeful Cat's Erratic Step
  76. > Change/Remove the subclass skill nerf
  77. > Change/Remove the MP cost nerf
  78.  
  79. ==================================================================
  80.  
  81. 02. The proc rate
  82.  
  83. Thankfully, the assembly code that handles the skill level and base proc rate multiplication is already generic enough, so all we have to do is change the constants it is multiplying the level by:
  84.  
  85. 6b f0 0f imul esi,eax,0xf
  86. 6b c8 0a imul ecx,eax,0xa
  87.  
  88. Here we change that 0F and that 0A into whatever number 00-7F we want. In practice, 0x64 (100 in decimal) already guarantees Extra Attack will always proc with non-subclass attacks, which can cause all sorts of soft-locks. Make sure to leave some room for probability to snap you out of Extra Attack - or just make sure you're actually dealing damage. Or healing / buffing. Or concentrating.
  89.  
  90. Offset 0x7a15b in the exe contains the proc rate per level of Extra Attack (default 0xf)
  91. Offset 0x7a175 in the exe contains the added proc rate per level of Vengeful Cat's Erratic Step (default 0xa)
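
If you would rather script these edits than use a hex editor, a minimal Python sketch could look like the one below. The helper `patch_byte` is hypothetical; it refuses to write unless the byte currently at the offset is the expected default, which helps catch a wrong exe version. The demo runs on a 3-byte in-memory copy of the IMUL instruction instead of the real file.

```python
def patch_byte(image: bytearray, offset: int, expected: int, new_value: int) -> None:
    """Overwrite one byte in place after checking the original value (hypothetical helper)."""
    if not 0x00 <= new_value <= 0x7F:
        raise ValueError("value must be in the range 00-7F")
    if image[offset] != expected:
        raise ValueError(f"unexpected byte {image[offset]:#04x} at offset {offset:#x}")
    image[offset] = new_value


# Stand-in for the exe: just the bytes of "imul esi,eax,0xf"
image = bytearray.fromhex("6b f0 0f")
patch_byte(image, 2, expected=0x0F, new_value=0x14)  # 20 per level instead of 15
print(image.hex(" "))  # 6b f0 14
```

Against the real file you would load the whole exe into a bytearray, call the helper with offset 0x7a15b (expected 0x0f) or 0x7a175 (expected 0x0a), and write the result back out, ideally to a copy.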
  92.  
  93. ==================================================================
  94.  
  95. 03. The subclass proc rate nerf
  96.  
  97. The assembly code for this already has a multiplication and a division, so it should be trivial to change it to any fraction we want, like 1/1 to make the nerf disappear, or 0/1 to make it only proc for personal spells. Just don't divide by zero!
  98.  
  99. In reality, the multiplication by 2 is implemented as a left shift, which means we don't have room to change the numerator at all. Unless we make some space, that is. The way the game checks if we are using a subclass spell is by comparing the attack's ID against 0x514 and 0x5db: if the ID falls within that range (that is, strictly between 0x513 and 0x5dc), it knows we're using a subclass spell. The assembly code looks like this:
  100.  
  101. 8b 45 f8 mov eax,DWORD PTR [ebp-0x8]
  102. 8b 88 f4 06 00 00 mov ecx,DWORD PTR [eax+0x6f4]
  103. 81 b9 8c 00 00 00 14 05 00 00 cmp DWORD PTR [ecx+0x8c],0x514
  104. 7c 2b jl skip_nerf
  105. 8b 45 f8 mov eax,DWORD PTR [ebp-0x8]
  106. 8b 88 f4 06 00 00 mov ecx,DWORD PTR [eax+0x6f4]
  107. 81 b9 8c 00 00 00 db 05 00 00 cmp DWORD PTR [ecx+0x8c],0x5db
  108. 7f 16 jg skip_nerf
  109.  
  110. So we load a pointer to the attack data into ECX, compare the ID stored at offset 0x8c to each bound, and jump if we are outside the nerf range. Note that the two MOV instructions that load ECX are repeated before the second comparison. Because the CMP instruction doesn't modify any registers, ECX still holds the same pointer, and since its value isn't used after the jumps, we can safely omit the second pair of loads, reducing the code to:
  111.  
  112. 8b 45 f8 mov eax,DWORD PTR [ebp-0x8]
  113. 8b 88 f4 06 00 00 mov ecx,DWORD PTR [eax+0x6f4]
  114. 81 b9 8c 00 00 00 14 05 00 00 cmp DWORD PTR [ecx+0x8c],0x514
  115. 7c 2b jl skip_nerf
  116. 81 b9 8c 00 00 00 db 05 00 00 cmp DWORD PTR [ecx+0x8c],0x5db
  117. 7f 16 jg skip_nerf
  118.  
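  119. This saves us 9 bytes, which we can then use to change the multiplication code. Since the second jump now sits 9 bytes earlier while its destination stays put, its displacement grows from 0x16 to 0x1f in the final patch. We only really need 1 of the freed bytes, since the IMUL instruction takes up 3 bytes compared to shift left's 2, meaning we need 8 NOPs of padding: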
  120.  
  121. d1 e0 shl eax,1
  122. becomes
  123. 6b c0 02 imul eax,eax,0x2
  124.  
  125. In order to patch this in the exe, simply paste the following 29 (0x1d) bytes at offset 0x7a193:
  126. 81 B9 8C 00 00 00 DB 05 00 00 7F 1F 8B 85 54 FE FF FF 6B C0 02 90 90 90 90 90 90 90 90
  127.  
  128. Now we have full control over the numerator and denominator of the original 2/3 fraction. Simply change the bytes at the offsets of the 0x2 and 0x3:
  129.  
  130. Offset 0x7a1a7 is the numerator for the nerf (default 0x2)
  131. Offset 0x7a1b2 is the denominator for the nerf (default 0x3)
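
As a sanity check, the patch length and the numerator offset can be recomputed from the byte listing itself. The Python sketch below only parses the hex string from this section; the variable names are made up for the example.

```python
PATCH_OFFSET = 0x7A193
PATCH_HEX = (
    "81 B9 8C 00 00 00 DB 05 00 00 7F 1F 8B 85 54 FE FF FF "
    "6B C0 02 90 90 90 90 90 90 90 90"
)

patch = bytes.fromhex(PATCH_HEX)  # fromhex ignores the spaces between byte pairs

# "6B C0 xx" is imul eax,eax,xx; the immediate is the third byte of the instruction
imul_index = patch.find(bytes.fromhex("6B C0"))
numerator_offset = PATCH_OFFSET + imul_index + 2

print(len(patch), hex(len(patch)))  # 29 0x1d
print(hex(numerator_offset))        # 0x7a1a7
```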
  132.  
  133. ==================================================================
  134.  
  135. 04. The MP cost nerf
  136.  
  137. The assembly code for halving the MP cost is a very simple shift right. That only takes up 2 bytes, which we can easily NOP out to remove the nerf, but maybe we can reach a better outcome by patching out this logic into something else, like an arbitrary fraction (say, recover 3/4 of the spent MP) or restore X less than the MP cost (say, recover 4MP if the spell cost 5), so that triggering Extra Attack still presents some form of penalty to the MP pool.
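
To compare the ideas above before touching any assembly, here is a small Python model of how much MP each approach would give back. The function names are hypothetical, the MP cost is assumed to be zero or positive, and clamping the fixed-penalty result at zero is an assumption about what we want the patch to do.

```python
def restore_vanilla(cost: int) -> int:
    """Vanilla behavior: restore half the cost, rounded down."""
    return cost // 2


def restore_fraction(cost: int, numerator: int, denominator: int) -> int:
    """Restore an arbitrary fraction of the cost, using integer division."""
    return cost * numerator // denominator


def restore_minus_fixed(cost: int, penalty: int) -> int:
    """Restore the cost minus a fixed penalty, never going below zero."""
    return max(cost - penalty, 0)


print(restore_vanilla(5))            # 2
print(restore_fraction(5, 3, 4))     # 3, recovering 3/4 of a 5MP spell
print(restore_minus_fixed(5, 1))     # 4, recovering 4MP if the spell cost 5
print(restore_minus_fixed(1, 2))     # 0, the penalty is larger than the cost
```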
  138.  
  139. This would be impossible, if not for the fact that the code that accomplishes this is repeated between the target alive if/else statements. By carefully manipulating the execution flow, we can remove the repeated code and use the newly acquired space to add more steps to the MP recovery function.
  140.  
  141. In order to better visualize what we're about to do, here's an attempt at explaining how the execution flow goes in the vanilla code. Pieces of code that are exactly the same are labeled with the same name:
  142.  
  143. (Target alive)    (Target dead)
  144.  
  145. +-----------+     +-----------+
  146. | Check Pos |     |  Rec MP   |
  147. +-----------+     +-----------+
  148. | JNE Prep4 |     |  Prep 4   |
  149. +-----------+     +-----------+
  150. |  Rec MP   |     |  Prep 3'  |
  151. +-----------+     +-----------+
  152. |  Prep 4   |     |  Prep 2   |
  153. +-----------+     +-----------+
  154. |  Prep 3   |     |  Prep 1   |
  155. +-----------+     +-----------+
  156. |  Prep 2   |     | Call Move |
  157. +-----------+     +-----------+
  158. |  Prep 1   |     |  JMP Out  |
  159. +-----------+     +-----------+
  160. | Call Move |
  161. +-----------+
  162. |  JMP Out  |
  163. +-----------+
  164.  
  165. To explain the labels a bit, "Check Pos" checks if the MP cost value is zero/positive, Prep 1-4 will calculate and prepare the 4 arguments to the function call that performs the move, which happens at "Call Move". The "JMP Out" simply jumps to the next batch of code in the function that houses this logic, and the "JNE Prep 4" just jumps to that side's Prep 4 block, but only if the MP cost is negative.
  166.  
  167. The block that takes up the most space in the function above is the MP recovery, so if we can make both sides depend on the same block, we're golden. We can do that with some jumping around, which is sort of represented below:
  168.  
  169. (Target alive)     (Target dead)
  170.  
  171. +------------+     +-----------+
  172. | Check Pos  |     | Call R-MP |
  173. +------------+     +-----------+
  174. | JNE A-Neg  |     |  Prep 3'  |
  175. +------------+     +-----------+
  176. | JMP A-Pos  |     |  Prep 2   |
  177. +------------+     +-----------+
  178. |   Rec MP   |     |  Prep 1   |
  179. +------------+     +-----------+
  180. |   Prep 4   |     | Call Move |
  181. +------------+     +-----------+
  182. |   Return   |     |  JMP Out  |
  183. +------------+     +-----------+
  184.                    | Call R-MP | (A-Pos points here)
  185.                    +-----------+
  186.                    |  Prep 3   |
  187.                    +-----------+
  188.                    | JMP Prep2 |
  189.                    +-----------+
  190.                    | CallPrep4 | (A-Neg points here)
  191.                    +-----------+
  192.                    | JMP Prep3 |
  193.                    +-----------+
  194.  
  195. If you follow the jumps and the function calls, the blocks are executed in the exact same order as before, but the flow is much less straightforward. It might also seem like the right side is imbalanced, but the argument preparation is much, much smaller than the recover MP part of the code. This allows for no code repetition, freeing up space in both code regions. Now we can freely patch the left side to remove the shift right and do something else with the MP recovery.
  196.  
  197. What makes the code redirection above work are the clever ways we use function calls and returns, as well as chaining jumps on the right side. Because we return to the same place we were at when we call a function, we can call R-MP from 2 different places on the right, and have the different Prep 3 sections accordingly. The case for negative MP cost does a function call straight into Prep 4, just like the original flow. Assembly doesn't really care where the function actually starts - it just needs to know where to start running code from.
  198.  
  199. To patch the left side, we simply place the following 139 (0x8b) bytes at offset 0x7a2a8:
  200.  
  201. 0F 8C 0C 01 00 00 E9 F5 00 00 00 8B 45 F8 8B 88 EC 06 00 00 8B 55 F8 8B 4C 8A 24 8B 55 F8 8B 82 F4 06 00 00 8B 80 94 00 00 00 99 2B C2 D1 F8 99 03 81 B8 00 00 00 13 91 BC 00 00 00 8B 4D F8 8B 89 EC 06 00 00 8B 75 F8 8B 4C 8E 24 89 81 B8 00 00 00 89 91 BC 00 00 00 8B 45 F8 8B 88 F4 06 00 00 8B 91 90 00 00 00 C3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  202.  
  203. To patch the right side, we simply place the following 134 (0x86) bytes at offset 0x7a374:
  204.  
  205. E8 3A FF FF FF 52 8B 85 48 FE FF FF 50 8B 4D F8 8B 91 EC 06 00 00 52 8B 45 F8 8B 88 F4 06 00 00 8B 91 8C 00 00 00 52 8B 4D F8 E8 91 53 F9 FF E9 52 00 00 00 E8 06 FF FF FF 52 8B 45 F8 8B 88 F0 06 00 00 51 EB C7 E8 41 FF FF FF EB EC 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  206.  
  207. And now, to patch the shift right, we just change a few instructions in the Rec MP part of the code we patched. That of course depends on what we want to accomplish. If we simply want to NOP out the shift right to make Extra Attack not consume any extra MP, all we have to do is change the 2 bytes for that instruction:
  208.  
  209. d1 f8 sar eax,1
  210. becomes
  211. 90 nop
  212. 90 nop
  213.  
  214. If we want an arbitrary fraction, we must insert a multiplication and a division where there was only a shift right. During the calculation, we must use register ECX to hold the divisor, so we push it onto the stack to avoid losing its value, popping it after we're done:
  215.  
  216. 51 push ecx
  217. 6b c0 01 imul eax,eax,0x1
  218. b9 02 00 00 00 mov ecx,0x2
  219. f7 f9 idiv ecx
  220. 59 pop ecx
  221.  
  222. The code above is 10 bytes bigger than the original, so it turns out our Prep 4 block gets shifted by 10 bytes, and we need to update our call to it on the right side to match the new address:
  223.  
  224. E8 41 FF FF FF call prep4
  225. becomes
  226. E8 4B FF FF FF call prep4
  227.  
  228. If, instead, we want to subtract a fixed value from the MP regeneration, we will need a subtraction instruction, with a fixed value. We do need some extra logic to handle negative values though, in case we're trying to subtract more than the spell originally cost (e.g. subtracting 2 from Cat's Walk 1MP cost):
  229.  
  230. 83 e8 01 sub eax,0x1
  231. 73 02 jae skip_xor
  232. 31 c0 xor eax,eax
  233.  
  234. Applying a xor operation with the same register in both operand slots is the best way to zero out a register, meaning no MP will be restored if we end up with a negative value. This code is 5 bytes longer than the original, which means we need to adjust our call offset here as well:
  235.  
  236. E8 41 FF FF FF call prep4
  237. becomes
  238. E8 46 FF FF FF call prep4
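
Both call adjustments above can be double-checked by recomputing the displacement of the CALL instruction. The Python sketch below assumes the positions of the call (70 bytes into the right-side patch) and of Prep 4 (88 bytes into the unmodified left-side patch), which were counted from the byte listings in this section. It works with file offsets, which is fine as long as both pieces of code live in the same section of the exe.

```python
import struct


def call_rel32(call_offset: int, target_offset: int) -> bytes:
    """Encode a 5-byte near CALL (opcode E8) located at call_offset that lands on target_offset."""
    # The displacement is measured from the end of the 5-byte instruction
    displacement = target_offset - (call_offset + 5)
    return b"\xE8" + struct.pack("<i", displacement)


CALL_PREP4 = 0x7A374 + 70  # assumed position of "call prep4" in the right-side patch
PREP_4 = 0x7A2A8 + 88      # assumed position of Prep 4 in the unmodified left-side patch

print(call_rel32(CALL_PREP4, PREP_4).hex(" "))       # e8 41 ff ff ff (shift right kept or NOPed)
print(call_rel32(CALL_PREP4, PREP_4 + 10).hex(" "))  # e8 4b ff ff ff (fraction, 10 bytes longer)
print(call_rel32(CALL_PREP4, PREP_4 + 5).hex(" "))   # e8 46 ff ff ff (subtraction, 5 bytes longer)
```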
  239.  
  240. ==================================================================
  241.  
  242. 05. Patching the MP cost nerf
  243.  
  244. For convenience, this section will list everything that needs to be patched in order to change the behavior of the MP cost nerf. Three options are provided:
  245.  
  246. >>> OPTION 1 - Restore a fraction of the MP cost
  247.  
  248. Patch the following 139 (0x8b) bytes at offset 0x7a2a8:
  249. 0F 8C 0C 01 00 00 E9 F5 00 00 00 8B 45 F8 8B 88 EC 06 00 00 8B 55 F8 8B 4C 8A 24 8B 55 F8 8B 82 F4 06 00 00 8B 80 94 00 00 00 99 2B C2 51 6B C0 01 B9 02 00 00 00 F7 F9 59 99 03 81 B8 00 00 00 13 91 BC 00 00 00 8B 4D F8 8B 89 EC 06 00 00 8B 75 F8 8B 4C 8E 24 89 81 B8 00 00 00 89 91 BC 00 00 00 8B 45 F8 8B 88 F4 06 00 00 8B 91 90 00 00 00 C3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  250.  
  251. Patch the following 134 (0x86) bytes at offset 0x7a374:
  252. E8 3A FF FF FF 52 8B 85 48 FE FF FF 50 8B 4D F8 8B 91 EC 06 00 00 52 8B 45 F8 8B 88 F4 06 00 00 8B 91 8C 00 00 00 52 8B 4D F8 E8 91 53 F9 FF E9 52 00 00 00 E8 06 FF FF FF 52 8B 45 F8 8B 88 F0 06 00 00 51 EB C7 E8 4B FF FF FF EB EC 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  253.  
  254. Offset 0x7a2d8 is now the numerator for the restore fraction (default 0x1)
  255. Offset 0x7a2da is now the denominator for the restore fraction (default 0x2)
  256.  
  257. >>> OPTION 2 - Restore a fixed X less than the MP cost
  258.  
  259. Patch the following 139 (0x8b) bytes at offset 0x7a2a8:
  260. 0F 8C 0C 01 00 00 E9 F5 00 00 00 8B 45 F8 8B 88 EC 06 00 00 8B 55 F8 8B 4C 8A 24 8B 55 F8 8B 82 F4 06 00 00 8B 80 94 00 00 00 99 2B C2 83 E8 01 73 02 31 C0 99 03 81 B8 00 00 00 13 91 BC 00 00 00 8B 4D F8 8B 89 EC 06 00 00 8B 75 F8 8B 4C 8E 24 89 81 B8 00 00 00 89 91 BC 00 00 00 8B 45 F8 8B 88 F4 06 00 00 8B 91 90 00 00 00 C3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  261.  
  262. Patch the following 134 (0x86) bytes at offset 0x7a374:
  263. E8 3A FF FF FF 52 8B 85 48 FE FF FF 50 8B 4D F8 8B 91 EC 06 00 00 52 8B 45 F8 8B 88 F4 06 00 00 8B 91 8C 00 00 00 52 8B 4D F8 E8 91 53 F9 FF E9 52 00 00 00 E8 06 FF FF FF 52 8B 45 F8 8B 88 F0 06 00 00 51 EB C7 E8 46 FF FF FF EB EC 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  264.  
  265. Offset 0x7a2d7 is now how much MP should NOT be replenished from the original cost (default 0x1)
  266.  
  267. >>> OPTION 3 - Restore all MP cost
  268.  
  269. Patch the following 139 (0x8b) bytes at offset 0x7a2a8:
  270. 0F 8C 0C 01 00 00 E9 F5 00 00 00 8B 45 F8 8B 88 EC 06 00 00 8B 55 F8 8B 4C 8A 24 8B 55 F8 8B 82 F4 06 00 00 8B 80 94 00 00 00 99 2B C2 90 90 99 03 81 B8 00 00 00 13 91 BC 00 00 00 8B 4D F8 8B 89 EC 06 00 00 8B 75 F8 8B 4C 8E 24 89 81 B8 00 00 00 89 91 BC 00 00 00 8B 45 F8 8B 88 F4 06 00 00 8B 91 90 00 00 00 C3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  271.  
  272. Patch the following 134 (0x86) bytes at offset 0x7a374:
  273. E8 3A FF FF FF 52 8B 85 48 FE FF FF 50 8B 4D F8 8B 91 EC 06 00 00 52 8B 45 F8 8B 88 F4 06 00 00 8B 91 8C 00 00 00 52 8B 4D F8 E8 91 53 F9 FF E9 52 00 00 00 E8 06 FF FF FF 52 8B 45 F8 8B 88 F0 06 00 00 51 EB C7 E8 41 FF FF FF EB EC 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  274.  
  275. ==================================================================
  276.  
  277. 06. The MP cost un-nerf code
  278.  
  279. This is what the left part of our diagram looks like post-patch:
  280.  
  281.     jl A_NEG
  282.     jmp A_POS
  283. REC_MP:
  284.     mov eax,DWORD PTR [ebp-0x8]
  285.     mov ecx,DWORD PTR [eax+0x6ec]
  286.     mov edx,DWORD PTR [ebp-0x8]
  287.     mov ecx,DWORD PTR [edx+ecx*4+0x24]
  288.     mov edx,DWORD PTR [ebp-0x8]
  289.     mov eax,DWORD PTR [edx+0x6f4]
  290.     mov eax,DWORD PTR [eax+0x94]
  291.     cdq
  292.     sub eax,edx
  293.     sar eax,1
  294.     cdq
  295.     add eax,DWORD PTR [ecx+0xb8]
  296.     adc edx,DWORD PTR [ecx+0xbc]
  297.     mov ecx,DWORD PTR [ebp-0x8]
  298.     mov ecx,DWORD PTR [ecx+0x6ec]
  299.     mov esi,DWORD PTR [ebp-0x8]
  300.     mov ecx,DWORD PTR [esi+ecx*4+0x24]
  301.     mov DWORD PTR [ecx+0xb8],eax
  302.     mov DWORD PTR [ecx+0xbc],edx
  303. PREP_4:
  304.     mov eax,DWORD PTR [ebp-0x8]
  305.     mov ecx,DWORD PTR [eax+0x6f4]
  306.     mov edx,DWORD PTR [ecx+0x90]
  307.     ret
  308.  
  309. This is what the right part of our diagram looks like post-patch:
  310.  
  311.     call REC_MP
  312.     push edx
  313.     mov eax,DWORD PTR [ebp-0x1b8]
  314.     push eax
  315. PREP_2:
  316.     mov ecx,DWORD PTR [ebp-0x8]
  317.     mov edx,DWORD PTR [ecx+0x6ec]
  318.     push edx
  319.     mov eax,DWORD PTR [ebp-0x8]
  320.     mov ecx,DWORD PTR [eax+0x6f4]
  321.     mov edx,DWORD PTR [ecx+0x8c]
  322.     push edx
  323.     mov ecx,DWORD PTR [ebp-0x8]
  324.     call DO_MOVE
  325.     jmp DONE
  326. A_POS:
  327.     call REC_MP
  328. PREP_3:
  329.     push edx
  330.     mov eax,DWORD PTR [ebp-0x8]
  331.     mov ecx,DWORD PTR [eax+0x6f0]
  332.     push ecx
  333.     jmp PREP_2
  334. A_NEG:
  335.     call PREP_4
  336.     jmp PREP_3
  337. DONE:
  338.     ...rest of function...