- Discussion time!
- Context:
- Over the past week or so, I have been working with @EllipticEllipsis to add "message extraction support" to ZAPD. The goal was to be able to handle every language/encoding that OoT and MM support.
- The first problem was the "format codes" the game uses to produce special effects, like printing an item icon in the message, playing a sound, showing a high score, printing a controller button in the text, etc. We decided to handle those with macros and C's concatenation of adjacent string literals. There are more problems here (mainly enum problems), but I'll list them at the end.
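- A rough sketch of the macro idea, just to make it concrete. The macro names and byte values below are placeholders made up for illustration, not necessarily what ZAPD emits or what the game uses:

```c
#include <stddef.h>

/* Hypothetical format-code macros. Names and byte values are placeholders
 * for illustration only. */
#define MSG_COLOR(c) "\x05" c
#define MSG_ITEM_ICON(id) "\x13" id
#define MSG_END "\x02"

/* Adjacent string literals are merged into one, so the macros can be
 * interleaved with plain text to build a single message. Note that every
 * macro argument has to be a string literal as well. */
static const char sExampleMessage[] =
    "You got the " MSG_COLOR("\x41") "Hookshot" MSG_COLOR("\x40") "!" MSG_END;
```

- Each escape lives in its own literal, so `"\x05" "\x41"` ends up as two separate bytes. The fact that arguments must be string literals is also where the enum problem comes from.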
- Then we had issues with Japanese, mainly the 2-byte encoding used, which is Shift-JIS. If the text is exported as-is, it isn't easily editable. One solution would be to export the text as-is and tell everyone who wants to edit it to open the file as Shift-JIS instead of the default UTF-8 (it is easy to do in VS Code and probably a lot of other editors too). Another possible solution would be to convert the text to UTF-8 during extraction and convert it back to Shift-JIS during compilation. We felt this was a big enough decision that it should be discussed here.
- After that, we wanted to extract messages from the iQue version too. The main problem with this version is the lack of pre-leak documentation, so we had to figure out how it works ourselves. We ended up discovering that the iQue version uses the "format codes" of the non-Japanese encoding and adds a few more to handle Chinese characters as a 2-byte encoding (maybe somebody already knew this, idk). We currently don't know which encoding iQue uses; we only know that each 2-byte sequence is directly and sequentially mapped to a texture in the font file (which I named `cn_font_static`). If anybody knows something about the iQue encoding, let us know!
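- To illustrate what "directly and sequentially mapped" means here, a minimal sketch. It assumes the 2-byte code is stored big-endian and that the n-th code maps to the n-th texture; `firstCode` and `glyphSize` are hypothetical parameters, since the real values aren't stated here:

```c
#include <stddef.h>
#include <stdint.h>

/* Reads one 2-byte character code from the message data.
 * Big-endian byte order is an assumption. */
static uint16_t ReadCharCode(const uint8_t* msg) {
    return (uint16_t)((msg[0] << 8) | msg[1]);
}

/* Byte offset of a glyph inside the font file, assuming the n-th code maps
 * to the n-th texture. firstCode and glyphSize are hypothetical. */
static size_t GlyphOffset(uint16_t code, uint16_t firstCode, size_t glyphSize) {
    return (size_t)(code - firstCode) * glyphSize;
}
```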
- Then MM. The messaging system changed in MM, because of course it would be different. In OoT, each message is just kind of a raw string (`const char[]`) with a few format-code shenanigans. MM, on the other hand, decided that each message needs a header before the actual message (which is slightly different for Japanese vs non-Japanese messages). MM also decided it needs something like triple the number of format codes/special characters and doesn't reuse any of OoT's format codes, so a whole new set of macros needs to be made for MM. On top of that, MM still needs the OoT macros/format codes because it uses them for the ending credits (`staff_message_data_static`). It isn't all bad, but writing another bunch of dumb macros is tiring.
- (I think it is funny that the iQue version is more similar to "normal" OoT than MM.)
- Current state:
- Currently, ZAPD is able to extract non-Japanese text. Accented characters (e.g. é or ü) are extracted in a UTF-8-compatible way. In the actual compilation phase, a small Python script converts those characters back to the corresponding format code. This way, compiling `nes_message_data_static`, `fra_`, `ger_` and `staff_` is :OK: in the current OoT repo.
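- The conversion step is done by a Python script in practice; the idea itself can be sketched in C like this. The table entries and byte values are illustrative and unverified, and `EncodeMessage` is a made-up name:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char* utf8;   /* UTF-8 byte sequence as it appears in the source file */
    unsigned char code; /* single-byte code emitted in its place */
} CharMapEntry;

/* Illustrative entries only; the byte values are unverified placeholders. */
static const CharMapEntry sCharMap[] = {
    { "\xC3\xA9", 0x96 }, /* é */
    { "\xC3\xBC", 0x9E }, /* ü */
};

/* Copies src to dst, replacing mapped UTF-8 sequences with their single-byte
 * code. dst must have room for strlen(src) + 1 bytes. Returns the number of
 * bytes written, not counting the terminator. */
static size_t EncodeMessage(const char* src, unsigned char* dst) {
    const size_t n = sizeof(sCharMap) / sizeof(sCharMap[0]);
    size_t out = 0;

    while (*src != '\0') {
        size_t i;

        /* Look for a table entry that matches at the current position. */
        for (i = 0; i < n; i++) {
            if (strncmp(src, sCharMap[i].utf8, strlen(sCharMap[i].utf8)) == 0) {
                break;
            }
        }
        if (i < n) {
            dst[out++] = sCharMap[i].code;
            src += strlen(sCharMap[i].utf8);
        } else {
            dst[out++] = (unsigned char)*src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```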
- [add random screenshot]
- ZAPD can extract Japanese text right now too, but it is currently limited to extracting it as Shift-JIS, so external tools would be needed to properly mod those files. I wasn't able to test whether compilation would be :OK:, since this file is not part of the PAL version of the game, but looking at the compiled .o file with vbindiff, it looks like it should be :OK:.
- [add japanese screenshot]
- iQue is being extracted too, but it still has the issue of using an unknown encoding.
- [add chinese screenshot]
- MM will need to add the message header structs to its repo, but the extraction is working (I'm halfway through writing the macros, but the text is legible).
- [add MM screenshot]
- Current problems:
- Finally, here is a list of problems that need to be discussed:
- 1. Since the messages are extracted as `char[]`, we can't use the enums we have for sfx, item IDs, etc. as macro arguments.
- 2. Should we convert the Japanese messages to UTF-8 during extraction and back to Shift-JIS during compilation? Or would it be better to just open those files as Shift-JIS?
- 3. iQue: which encoding does it use?
- Minor problems:
- 1. When compiling Japanese, for any 2-byte character of the form `0xXX5C` (the low byte is `5C`), the `5C` part is dropped and the rest of the current message is shifted. `0x5C` is the ASCII backslash, so presumably the compiler treats it as the start of an escape sequence. A workaround is escaping that character, but this is far from optimal.
- 2. Japanese has an unknown symbol at the very end of `jpn_message_data_static` which is not part of Shift-JIS. It is (probably) not used in normal gameplay. The current workaround is a macro.
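- For minor problem 1, the escaping workaround could be automated with something like the sketch below. The function names are made up, and it assumes the text is plain Shift-JIS where the only bytes that need touching are `0x5C` trail bytes of 2-byte characters:

```c
#include <stddef.h>

/* Shift-JIS lead bytes of 2-byte characters are 0x81-0x9F and 0xE0-0xFC. */
static int IsSjisLeadByte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Copies len bytes of Shift-JIS text from src to dst, doubling the trail
 * byte of any 2-byte character that ends in 0x5C, so that a compiler reading
 * the file byte by byte sees an escaped backslash. Single-byte backslashes
 * are left alone. dst needs room for len + len / 2 bytes in the worst case.
 * Returns the number of bytes written. */
static size_t EscapeSjisBackslashes(const unsigned char* src, size_t len,
                                    unsigned char* dst) {
    size_t in = 0;
    size_t out = 0;

    while (in < len) {
        if (IsSjisLeadByte(src[in]) && in + 1 < len) {
            dst[out++] = src[in++]; /* lead byte */
            dst[out++] = src[in];   /* trail byte */
            if (src[in] == 0x5C) {
                dst[out++] = 0x5C;  /* extra backslash to escape it */
            }
            in++;
        } else {
            dst[out++] = src[in++];
        }
    }
    return out;
}
```

- For example, the katakana ソ is `0x83 0x5C` in Shift-JIS, so it would come out as `0x83 0x5C 0x5C`.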
- I really want to thank @EllipticEllipsis for taking the time to help me make decisions and investigate encodings both inside and outside the game, among other things. Without his help I would have had a lot more trouble with Japanese, and iQue wouldn't even be a possibility.