I am glad you like the challenge.

I am open to suggestions for the scoring system. Regarding the concern that `this just results in people implementing all features and creating long and complex programs`: most features should be fairly straightforward to implement, and some may take no more than one or two lines of code. It's mostly basic text processing; most of the difficulty comes from not being familiar with Japanese. Only the final feature takes a little more work (but not that much).

I have to explain many things that would be obvious to anybody who understands Japanese (but that's part of the fun!). If you were all familiar with Japanese, this could almost be a single code-golf question. Having to implement all features at once might put some people off. This way, you can post a partially working answer and improve on it later.

I have edited the question to clarify the points you addressed. Use the edit history for easier navigation.

"READING will never contain any KATAKANA." This refers to the user input your program gets, namely to the `READING` part. I did not count the dictionary as (user) input, but more like a separate source file. As I indicated (?), you probably want to convert all `KATAKANA` from the dictionary file to `HIRAGANA`.

I want to leave the output format as open as possible. Output does not need to be an array, as long as there is a 1:1 mapping from your output to the array format. Joining the Kanji and the Kana of the output must result in the input `MOONGLYPH`s and kana, however.
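As a minimal sketch of that constraint, assume (hypothetically) that your output can be mapped to a list of `(glyphs, kana)` pairs. The function name `round_trips` and the pair layout are made up for illustration; any format with a 1:1 mapping to it would do.

```python
def round_trips(segments, moonglyphs, reading):
    """Check that joining the output pieces reproduces the input.

    segments   -- list of (glyph_part, kana_part) pairs (an assumed output shape)
    moonglyphs -- the MOONGLYPHS input string
    reading    -- the READING input string
    """
    joined_glyphs = "".join(glyph_part for glyph_part, _ in segments)
    joined_kana = "".join(kana_part for _, kana_part in segments)
    return joined_glyphs == moonglyphs and joined_kana == reading


# Example: 静岡 split into 静 (しず) and 岡 (おか).
print(round_trips([("静", "しず"), ("岡", "おか")], "静岡", "しずおか"))
```

A check like this is only a sanity test on the output shape; it says nothing about whether the split itself is correct.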

The basic feature and features 1 and 2 were meant as an introduction, to get you started and lead the way. Most of the other features assume that feature 3 is implemented. I think I'll merge features 1 and 2 into the basic feature.

`it may appear multiple times` 木々々 should be treated as equivalent to 木木木, etc. 人々 needs feature 4 to produce the correct result; if your program outputs `no match found` here, it still counts as having implemented feature 6. I'll change the example.
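One possible way to handle repeated `々`, sketched under the assumption that each `々` simply stands for the glyph before it (the helper name `expand_iteration_marks` is hypothetical):

```python
ITERATION_MARK = "々"


def expand_iteration_marks(moonglyphs: str) -> str:
    """Replace every 々 with the glyph that precedes it.

    Consecutive marks all repeat the same glyph, because each one copies
    the last character already written to the output.
    """
    out = []
    for ch in moonglyphs:
        if ch == ITERATION_MARK and out:
            out.append(out[-1])  # copy the previous (already expanded) glyph
        else:
            out.append(ch)  # ordinary glyph, or a leading 々 with nothing to copy
    return "".join(out)


print(expand_iteration_marks("木々々"))
```

After this step the rest of the matcher can ignore `々` entirely. Whether the repeated glyph then gets the right reading (as in 人々) is a separate matter for feature 4.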

`does our program need to handle unmatched punctuation can we just assume that punctuation characters will always agree between MOONGLYPHS and READING?` That is exactly what I was trying to say with `always appear at the same abstract position and should be ignored`. In general, I don't want to make this unnecessarily difficult; you should be able to concentrate on the main task. Just don't remove the punctuation characters entirely: they still need to appear in the output.

Feature 8: If you don't know Japanese, just think of it like this. There are squiggly symbols, and there's a list of readings/`KANA`s attached to each of them. You don't need to know anything at all about these symbols. `ヵ` is nothing more than yet another `MOONGLYPH` with the given readings. In fact, I implemented this feature, as well as the kana feature, more or less simply by adding all kana and `ヵ` to the dictionary at runtime. It may also help to think of `KATAKANA` as uppercase, `HIRAGANA` as lowercase, and `MOONGLYPH` as Egyptian hieroglyphs. You can write down their pronunciation in either uppercase or lowercase; it doesn't matter, although it may be conventional to use all capitals.
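A rough sketch of that "add kana to the dictionary" idea, not my actual implementation. It assumes the dictionary is held as a mapping from glyph to a set of hiragana readings, and the names `add_kana_entries` and `extra_readings` are invented here. The reading passed for `ヵ` in the example is only a placeholder; use the list given in feature 8.

```python
def add_kana_entries(dictionary, extra_readings=None):
    """Return a copy of `dictionary` in which every kana is a glyph read as itself.

    dictionary     -- assumed shape: {glyph: set of hiragana readings}
    extra_readings -- optional {glyph: readings} for special cases such as ヵ
    """
    result = {glyph: set(readings) for glyph, readings in dictionary.items()}
    # Hiragana ぁ..ゔ (U+3041..U+3094); the matching katakana sit 0x60 higher.
    for code in range(0x3041, 0x3095):
        hiragana = chr(code)
        katakana = chr(code + 0x60)
        result.setdefault(hiragana, set()).add(hiragana)
        result.setdefault(katakana, set()).add(hiragana)
    # Special glyphs (ヵ, extra readings for ケ, ...) are supplied by the caller.
    for glyph, readings in (extra_readings or {}).items():
        result.setdefault(glyph, set()).update(readings)
    return result


# Placeholder data: one dictionary entry and one illustrative reading for ヵ.
augmented = add_kana_entries({"月": {"つき"}}, {"ヵ": {"か"}})
print(sorted(augmented["マ"]), sorted(augmented["ヵ"]))
```

With the kana stored like any other glyph, the matcher needs no special case for them.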

`You refer to "the last four",` I was also referring to `kana` in general before that ;)  `Or are all other kana simply read as themselves while those aren't?` Yes. And ケ has got a few other readings as well, but that's part of feature 8.

`discrepancy between MOONGLYPHS and READING come from in the example you already have? Is one Katakana and one Hiragana` If you are talking about the example for feature 12, then yes. `READING` always contains only `HIRAGANA` (and punctuation). `MOONGLYPH`s may contain both `HIRAGANA` and `KATAKANA`, and they are, for the most part, read as themselves. So both `ま`(ma) and `マ`(ma) are read as `ま`(ma).
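Since `READING` is always `HIRAGANA`, one simple approach (a sketch, not a requirement) is to fold `KATAKANA` to `HIRAGANA` before comparing, both for dictionary readings and for kana found in the `MOONGLYPH`s. This relies on the two Unicode blocks being laid out in parallel, 0x60 code points apart; the function name `to_hiragana` is my own.

```python
# Distance between a katakana code point and its hiragana counterpart.
KATAKANA_TO_HIRAGANA_OFFSET = 0x60


def to_hiragana(text: str) -> str:
    """Convert katakana ァ..ヶ (U+30A1..U+30F6) to hiragana.

    Everything else (kanji, punctuation, the long-vowel mark ー, ...) is
    returned unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:
            out.append(chr(code - KATAKANA_TO_HIRAGANA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(to_hiragana("マ"), to_hiragana("ツキ"))
```

Note that this also maps `ヵ` and `ヶ` to their small hiragana forms, so if you treat those as `MOONGLYPH`s (feature 8), apply the conversion only to readings.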

`How are readings to be treated which aren't found in the dictionary files, like those from features 8 and 12?` Adding those readings is part of those features. If you choose not to implement these features, your program should behave the same way as it does for other impossible matches (see the previous feature 1, `静岡`/`つき`/`静,岡`/`[]`). Your program should indicate that there is no match (e.g. by outputting nil or an empty array, or by crashing).

And yes, I like lunatic references.