Feb 25th, 2018
  1. MarI/O FAQ
  2. We get lots of questions about what MarI/O is and how it works, so here is a compilation of the most frequently asked ones. If you're interested in MarI/O, I'd suggest you read this and watch some of the videos in the YouTube description before asking your questions in chat. If you're not really interested, then why are you here? Leave, and do something else.
  3. Further down there is a more in-depth FAQ that covers the technical side in more detail. I'd recommend reading that too.
  5. Q: What is this? / What's the point of this?
  6. A: It's a self-learning AI (Artificial Intelligence) made by SethBling that emulates evolution to get through the game "Super Mario Bros." one level at a time. The point is to see just how far this AI can make it in the game. So far we've only encountered one level that was impossible for it with the original code (2-1), but it should be possible with Akisame's new tweaks. More on those tweaks further down.
  8. Q: What is Gen, species and genome?
  9. A: A gen (generation) is a set of 300 genomes. When a gen ends, the AI decides which species to keep and which genomes to breed based on the average fitness they got, and then a new gen begins.
  10. A genome is a "run", and a species contains genomes (runs) that are similar to each other.
  11. On every level the AI starts fresh with 300 species, 1 genome in each, and as time goes by the species that don't do well in the level get killed off.
  12. It's easy to think this is just one bot playing over and over, but in fact it's more like 300 different players with different skills competing to see who does best in the level. Those who do best get to reproduce to create the next generation of players. Eventually one of those players gets good enough and finishes the level.
  14. Q: What is fitness?
  15. A: How far to the right it goes, and how fast. It's how the AI calculates progress.
  17. Q: What does the percentage mean?
  18. A: The percentage after "genome" is how far we are into this generation, and the max fitness percentage is how many of the genomes in the current generation have made it close to max fitness.
  20. Q: Does MarI/O know the layout of the level?
  21. A: It has a simplified "map" where moving objects are black blocks and stationary objects are white blocks.
  23. Q: Can it go left?
  24. A: It can do anything a real player can: walk right and left, press up and down, and press A and B. It rarely goes left, though, since that affects fitness negatively.
  26. Q: Why does the AI just stop sometimes?
  27. A: It sometimes stops as a random action, or rather lack of action, just like it can do other things randomly.
  29. Q: Why does MarI/O reset even though it didn't die?
  30. A: The code says to reset to the savestate after not gaining any fitness for x amount of time, which means that it will not only reset when it dies, but also when it gets stuck on an obstacle.
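The reset rule above can be sketched in a few lines of Python. This is only an illustration, not the actual script: the names `RunState`, `step` and `TIMEOUT_FRAMES` are made up, the value 20 is a placeholder for the "x amount of time", and progress is assumed to mean reaching a new rightmost position.

```python
from dataclasses import dataclass

# Hypothetical value: the FAQ only says "x amount of time".
TIMEOUT_FRAMES = 20


@dataclass
class RunState:
    rightmost: int = 0             # furthest x position reached so far
    timeout: int = TIMEOUT_FRAMES  # frames left before the run is abandoned


def step(state: RunState, mario_x: int) -> bool:
    """Advance one frame and return True when the run should be reset."""
    if mario_x > state.rightmost:
        # New progress: remember it and refill the timer.
        state.rightmost = mario_x
        state.timeout = TIMEOUT_FRAMES
    else:
        # No progress (standing still, stuck on an obstacle, going left).
        state.timeout -= 1
    return state.timeout <= 0
```

Under these assumptions, a run that stays stuck at the same spot is reset once the timer runs out, even though Mario never died.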
  32. Q: Is the AI completely reset between levels?
  33. A: Yes, the streamer has to manually move it over to the next level and reset it.
  35. Q: Why is it reset between levels?
  36. A: Because it can't use knowledge from a previous level on another level. All levels are different so it would confuse it if it was not reset.
  38. Q: I sometimes see it starting in the middle of the level, why is that and does it mess up the learning?
  39. A: Around the middle of most levels there's an invisible checkpoint. The game simply loads the checkpoint before the AI gets a chance to reset. Since the fitness is basically frozen when it loads the checkpoint (it has already been past that point) and since it hasn't moved on to the next genome yet, this does not affect the learning whatsoever.
  41. Q: Why is it learning so slowly? / Why is it so dumb?
  42. A: Because learning takes time, and evolution takes even longer. Also, it isn't really meant to be run in real time, but rather sped up a lot. That isn't very viewer-friendly, though, so we run it in real time, and that takes time.
  44. Q: Can it run other games as well?
  45. A: No, it can't. That would require rewriting the RAM addresses to match that game as well as changing the fitness system.
  47. Q: MarI/O just got a lot further than it does now! What gives?
  48. A: Each species is different and thus will get a different fitness. It happens quite often that one species is way in the lead. Eventually the other species will either catch up or die out.
  50. Q: Something is wrong!! MarI/O suddenly gained 1000 fitness near the end of the level!!
  51. A: SethBling originally wrote MarI/O to play level 1-1, so he added a bonus of 1000 fitness for completing the level. However, not all levels have the same length. This was not a problem in world 1 because level 1-1 was the longest, but it is a problem for a few levels: 2-1 and 3-1 gain the bonus 32 pixels before the finish, 2-3 gains it 416 pixels before the finish, 3-2 gains it 176 pixels before the finish and 4-1 gains it 432 pixels before the finish. This behavior might be removed later, but we deemed this "bug" too minor to warrant its own update.
  53. Q: How long did each level take?
  54. A: 1-1 finished in about 36 hours
  55. 1-2 finished in 13 days
  56. 1-3 finished in 16 days
  57. 1-4 finished in about 36 hours
  58. 2-1 was skipped twice: once after 31 days, having been stuck at a pit for over 2 weeks, and once after we encountered an issue where the spring at the end of the level didn't spawn at all.
  59. 2-2 was finished in 14 days 12 hours
  60. 2-3 was finished in 5 days 5 hours
  61. 2-4 was finished in 5 days 8 hours
  62. 3-1 was skipped after about 41 days, primarily because the RAM issues with BizHawk were getting a bit too much (BizHawk has a known issue where the RAM isn't cleared like it's supposed to be, making the RAM usage sky high), but also because many people were frustrated with how long the level took to finish.
  63. 3-2 was finished in 22 hours and 28 minutes
  64. 3-3 was finished in 12 days 9 hours
  68. For the people who are more interested in the technical side of it all, Akisame took it upon himself to write a more in-depth FAQ, which follows below.
  70. Q: In what language is MarI/O written?
  71. A: MarI/O is written in Lua. Lua is a lightweight scripting language designed to be embedded in other programs. Lua is mainly focused on speed, portability, extensibility and ease of use. In this case Lua was used because of its speed and its ability to work together with emulators.
  73. Q: What tweaks did Akisame make?
  74. A: A few simple tweaks. The pipes and pits caused problems with calculating the fitness correctly, so that calculation had to be changed.
  75. The banner was adjusted a bit to be slightly smaller and include a counter for species that reach a fitness that is close to the max fitness (< 30).
  76. The tweaks include a few more fixes. For more detailed documentation of the code, check the link in the description.
  78. Q: How is fitness calculated?
  79. A: The fitness is currently calculated as follows: fitness = rightmost - currentFrame / 2 - 39.5 + timebonus. In this formula, rightmost is the rightmost pixel MarI/O has reached and currentFrame is the number of frames that have passed since MarI/O started this run. The -39.5 compensates for the loading frames so that we start at 0 fitness, and timebonus is a value that is used to freeze the fitness when Mario is standing still, falls into a pit or enters a pipe. The timebonus variable used here will change slightly when the fix for looping levels is introduced.
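As a worked example, here is the formula above written out in Python. The function name, the argument names and the constant `LOADING_OFFSET` are chosen for this illustration only; the real script is written in Lua and may structure this differently.

```python
# Compensation for the loading frames, taken from the formula in the FAQ.
LOADING_OFFSET = 39.5


def fitness(rightmost: float, current_frame: int, timebonus: float = 0.0) -> float:
    """fitness = rightmost - currentFrame / 2 - 39.5 + timebonus"""
    return rightmost - current_frame / 2 - LOADING_OFFSET + timebonus


# Reaching pixel 1000 after 600 frames, with no time bonus:
print(fitness(1000, 600))     # 1000 - 300 - 39.5 = 660.5

# Every frame costs 0.5 fitness. One way a time bonus could "freeze" the
# fitness (an assumption, not confirmed by the FAQ) is to grow by 0.5 for
# every frame spent standing still: 100 idle frames are offset by a bonus of 50.
print(fitness(500, 200))      # 360.5
print(fitness(500, 300, 50))  # 360.5 again
```

The division by 2 is what rewards speed: two runs that reach the same pixel differ by half a point of fitness for every extra frame the slower one needed.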
  81. Q: How are species removed?
  82. A: Once a gen has ended, it dumps the lower half of the genomes of each species (it removes the genomes with the lowest fitness). It then sorts the species based on the fitness they attained. The genomes in these species are then allowed to breed based on a random factor and a check that applies a bias towards better performing species. This could remove species, but it is more likely that species are removed either by being deemed a weak species, which means that they are below a certain percentage of the max fitness, or by not improving for 25 generations.
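Two of the steps described above, dropping the lower half of each species and removing species that have not improved for 25 generations, can be sketched like this. The `Species` class and both function names are hypothetical, a genome is reduced to just its fitness value, and the breeding bias and the "weak species" check are left out.

```python
from dataclasses import dataclass
from math import ceil

# From the FAQ: a species is removed after 25 generations without improvement.
STALE_LIMIT = 25


@dataclass
class Species:
    genome_fitness: list      # one fitness value per genome (never empty)
    top_fitness: float = 0.0  # best fitness this species has ever reached
    staleness: int = 0        # generations since top_fitness last improved


def cull_lower_half(species: Species) -> None:
    """Keep only the better half of the genomes (rounded up)."""
    ranked = sorted(species.genome_fitness, reverse=True)
    species.genome_fitness = ranked[:ceil(len(ranked) / 2)]


def remove_stale(pool: list) -> list:
    """Update each species' staleness and drop those that stopped improving."""
    survivors = []
    for species in pool:
        best = max(species.genome_fitness)
        if best > species.top_fitness:
            species.top_fitness = best
            species.staleness = 0
        else:
            species.staleness += 1
        if species.staleness < STALE_LIMIT:
            survivors.append(species)
    return survivors
```

In this sketch a species with fitness values 10, 50, 30 and 20 keeps only 50 and 30 after culling, and a species whose best result has not changed for 25 generations in a row disappears from the pool.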
  84. Q: Why is this AI so slow?/Why can't this AI see what it needs to do to progress?
  85. A: Imagine you are given a grid of boxes and are told to connect them to 6 other boxes. You are then given a number that ranks your connections and told: "Go and get the highest number you can, good luck". It is basically the same for the AI. It makes nodes that connect a grid representing its vision to outputs (with some logic added to these connections). It has no knowledge of what went wrong or how it went wrong. All it "knows" is that a certain combination of nodes on this grid gives a certain score (what we've called "fitness"). It makes variations on the combination of nodes that worked best and sends them back to the emulator for testing. As you can imagine, this is SLOW and based on luck, as the AI needs to make the right variations. An important side note is that the AI can and does learn that some nodes should not be changed.
  87. Q: Am I wrong to assume that this setup is effectively working toward creating a perfect set of button presses for each level, instead of creating a toolset for all levels?/Is this an automatically generated TAS (Tool Assisted Speedrun)?
  88. A: The AI will "learn" to play a level not by generating the perfect key presses but by generating "nodes". MarI/O "sees" the blocks around it, and these nodes are linked to blocks at a position relative to Mario. The AI can then react to the blocks that are around Mario. There are 2 kinds of blocks, stationary blocks and enemy blocks, and 2 types of nodes, green and red.
  89. A green node will give a "push button" signal on a stationary block and a "release button" signal on an enemy or moving block. Red nodes are reversed and will give a "push button" signal on an enemy or moving block and a "release button" signal on a stationary block.
  90. So, for example, a red node that's hooked up to the jump button will press that button when it sees an enemy but release it when it encounters a stationary block, and vice versa for a green node.
  91. Besides nodes there are also switches that will flip the output of a node or select the average of all the nodes connected to the switch.
  92. With this simplistic system the AI will learn to react to the "configurations" of blocks on the screen that will give it the highest fitness.
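The green and red behaviour described above can be modelled with signs. In this sketch (an assumed encoding, not taken from the script) a stationary block is +1, an enemy or moving block is -1 and an empty cell is 0, while a green node multiplies by +1 and a red node by -1. A button counts as pushed when the sum of its signals is positive. The names `button_pressed`, `GREEN`, `RED` and the grid coordinates are made up for the example, and the switches are left out.

```python
# Assumed encoding of what MarI/O "sees" in a grid cell.
STATIONARY = 1
ENEMY = -1
EMPTY = 0

# Assumed encoding of the two node colours.
GREEN = 1
RED = -1


def button_pressed(connections, vision) -> bool:
    """Decide whether one button is pushed.

    connections: list of (cell, colour) pairs wired to this button,
                 where cell is an (x, y) offset relative to Mario.
    vision:      dict mapping (x, y) offsets to STATIONARY or ENEMY;
                 cells that are missing count as EMPTY.
    """
    total = sum(colour * vision.get(cell, EMPTY) for cell, colour in connections)
    return total > 0


# An enemy one cell to the right of Mario, a block one cell below him.
vision = {(1, 0): ENEMY, (0, 1): STATIONARY}

# A red node on the enemy's cell pushes the jump button...
print(button_pressed([((1, 0), RED)], vision))    # True
# ...while a green node on the same cell releases it.
print(button_pressed([((1, 0), GREEN)], vision))  # False
```

With several nodes wired to the same button their signals are simply added here, which is one simple way to let nodes reinforce or cancel each other.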
  94. Q: It always seems to me that the early gens just try things randomly. Is that correct?
  95. A: Yes, it does try out stuff randomly. MarI/O doesn't know what works and what doesn't. It needs to test that, and that is where the fitness comes in. After testing a set of nodes (also known as a genome) it knows the fitness of that SET of nodes.
  96. After that it will mutate and mix these nodes with the nodes from other genomes, and sometimes add a node, and test what that does to the fitness. If the fitness increases, that new set of nodes is promoted, and if the fitness decreases, that set is less likely to be bred again.
  97. Breeding allows for the mixing of nodes. That way you get genome children that have a lot of the "bad" nodes and genome children that have a lot of the "good" nodes. The bad nodes will die out and only the "good" nodes will remain.
  98. Keep in mind however that MarI/O can't determine if a node is "good" or "bad", it can only determine that a set of nodes is "good" or "bad".
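The mixing and mutating described above can be illustrated with a deliberately simplified sketch. It is not the breeding code of the actual script: a genome is reduced to a plain set of node names, and the functions `breed` and `mutate`, the node names and the 50% chances are all invented for the example.

```python
import random


def breed(parent_a: set, parent_b: set, rng: random.Random) -> set:
    """Mix two genomes: shared nodes are kept, the rest are a coin flip."""
    child = set()
    for node in sorted(parent_a | parent_b):
        shared = node in parent_a and node in parent_b
        if shared or rng.random() < 0.5:
            child.add(node)
    return child


def mutate(genome: set, candidate_nodes, rng: random.Random,
           add_chance: float = 0.5) -> set:
    """Sometimes add one node the genome does not have yet."""
    mutated = set(genome)
    unused = sorted(set(candidate_nodes) - mutated)
    if unused and rng.random() < add_chance:
        mutated.add(rng.choice(unused))
    return mutated


rng = random.Random(0)  # fixed seed so the example is repeatable
parent_a = {"run_right", "jump_on_enemy"}
parent_b = {"run_right", "duck"}

child = breed(parent_a, parent_b, rng)
child = mutate(child, ["run_right", "jump_on_enemy", "duck", "go_left"], rng)
print(sorted(child))
```

Note that nothing in this code judges an individual node. Only the fitness of the whole child, measured by actually playing the level, decides whether its set of nodes gets bred again.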
  100. Q: After, say, beating 1-1, is it better equipped to take on 1-2 because of the arrangement of the nodes it has created?
  101. A: Yes and no. Yes, because MarI/O will perform better in the beginning, and no, because it can get stuck in a local maximum more easily (and some situations might take a lot longer depending on what nodes were selected).
  102. For example, say you teach MarI/O to do world 1-1. It learns to run and occasionally jump a pipe, but in 1-2 it suddenly encounters completely different stuff like the many pillars in the beginning, turtle shells or moving platforms.
  103. MarI/O will have to learn to deal with that, but since there is no longer a large selection of random nodes that might do this, you are stuck waiting for a random evolution for every obstacle that is completely new.
  104. MarI/O will still learn to get over it but it could potentially take a lot longer.
  106. Q: With enough time it seems that these evo-AIs could effectively learn to do anything.
  107. A: Yes. Given a large enough population and enough time with a perfect fitness system, they could do anything. The problem is that it sometimes takes an (impractically) LONG time and that local maxima are a problem.
  108. Local maxima are sets of nodes that work together perfectly but are unable to finish a level. Any mutation to these sets will decrease the fitness and make it less likely to survive. This can make local maxima a very time consuming problem.
  110. Q: Why does the AI do stupid things that make Mario get stuck?
  111. A: An evolutionary AI does not plan ahead. That means it doesn't know anything about the road ahead. It just selects nodes that increase the fitness the most.
  112. If that gives MarI/O slightly better fitness right then and there but puts Mario in a position where he will be stuck, then that is what MarI/O will learn.
  114. Q: Why does MarI/O always get stuck at the stairs?
  115. A: The stairs require a very specific set of nodes to work without interfering with a normal run. In the very beginning MarI/O is able to walk up stairs quite easily since there are no other nodes to interfere with, but as MarI/O progresses you might notice that the stairs Mario was hopping up in the beginning are now jumped over entirely. This has 2 reasons: 1) jumping over stairs is faster, so it gives more fitness. 2) the (early) nodes that walk up stairs tend to interfere with MarI/O's normal behavior. That second reason is why stairs are so difficult for MarI/O: it needs a specific configuration of nodes to walk up stairs without interfering with the normal run. You can see this if you look at the nodes that activate for genomes that are able to walk up stairs, as they are all very similar.
  117. Q: The new generation started at X% of the genomes! / A genome/species was skipped! What happened?
  118. A: The system that determines which genomes have already been measured does so by checking whether each one has a fitness. There is a bug in the code that doesn't reset the fitness of some genomes. We already know about it and how to fix it, but we determined that this bug was not game-breaking, so we decided to leave it in because we want to keep the script as original as possible.