AI Learns to Play Super Mario Bros!

3 years ago

Using a genetic algorithm and a neural network, a population of AIs learned to play different levels of Super Mario Bros. for the NES.
Code: github.com/Chrispresso/SuperM...
Blog:
Music: / ashamaluevmusic
first song: Memory
second song: Cinematic Orchestral
third song: My World

COMMENTS: 665

@Kosmicd12 3 years ago

Haha the walljump in 2-1 was sick

@Chrispresso 3 years ago

New meta for your next speed run? But seriously, your explanations of speed runs for the game helped me a ton in understanding some of this stuff....even though I still got some of them wrong.

@Kosmicd12 3 years ago

@@Chrispresso unfortunately with the walljump in 2-1 you can only get 6 on the TIME remaining so you get 6 fireworks :( I'm really surprised about the 7-1 fpg it did, since what it did there would actually waste a lot of time lol. No problem! Great vid!

@Joshua_Gomez15 3 years ago

Nice, kosmic watched this!

@zantly 3 years ago

tbh i accidentally did a wall jump on a pipe in 1-1

@Eshyyyyy 3 years ago

Gaming

@amanofnoreputation2164 3 years ago

AI: executes frame perfect walljumps. Also AI: dies to *coins.*

@Chrispresso 3 years ago

Mario's greed ultimately killed him

@easyaspi31415 3 years ago

Shiny, but deadly coins

@LancerloverLL 3 years ago

I usually don't like comments but when this one said 99 I just couldn't help it.

@veggiet2009 3 years ago

@@easyaspi31415 I rate this comment an unbelievably exceptional 5 out of 5 fuzzies!

@ashercd6487 3 years ago

@@veggiet2009 we don't have royal fuzzies, and we don't have rainbow fuzzies. How about 5 *BIG CHUNGUS FUZZIES!?!*

@elijahrunyon3347 3 years ago

Mario has no natural fear of piranha plants

@nguyentranduy7053 3 years ago

Year 2030: YouTube notification popup: AI let's play episode 5 uploaded!

@deano1 3 years ago

Ai: I’m scared to hold the run button Also Ai: does three wall jumps in a row and first try bullet bill glitch

@Chrispresso 3 years ago

Maybe wall jumps are just easier than running. Maybe I've been playing wrong...

@Cyril29a 2 years ago

@@Chrispresso Well if it can only see three types of things, -1,0,1 then running limits interpretation time and leads to errors and death. Wall jumps are done on already interpreted object encounters so from its point of view I would say wall jumps are easier. Running is just too risky.

@DyrianLightbringer 3 years ago

when you stopped rewarding the AI for points, probably should have made it stop seeing coins

@Chrispresso 3 years ago

Because coins can be an indicator of when to jump I kept them in. Ultimately it may have been its downfall...

@laurinneff4304 3 years ago

@@Chrispresso you could've maybe made it see coins as a separate thing, similar to how enemies are seen separately from blocks

@TrickShotKoopa 3 years ago

@@laurinneff4304 "The reason I did 3 was so I could give normalized values (-1, 0 or 1) in this case for the blocks. If I wanted to use more than the three I needed a way to encode the inputs (one-hot encoding or something similar) but I was running this on a pretty old laptop at the time and wanted to simplify the inputs." - Chrispresso

@BlueTelevisionGames 3 years ago

This was fun to watch!

@DeusVult838 3 years ago

hi darby!

@DanPellegrino486 2 years ago

A wild Darby has appeared.

@codeyisafk9487 2 years ago

Hi BTG!

@thehelmsdepot 3 years ago

I think allowing the AI to "see" more of the screen would improve its performance. While we only "focus" on a certain section of screen, our peripheral vision does allow us to take in the entire thing and subconsciously react to upcoming obstacles. Also, by seeing the top of the screen it could potentially learn the level-skip shortcut on world 1-2, or even other glitches unknown to us.

@ret4kind 3 years ago

8:00 that's the exact location I predicted the AI can't handle. Still great work.

@Chrispresso 3 years ago

Thanks :D Glad you enjoyed! Ya... that section was just tricky in general unless you can either slow down or know that jump is coming. There are techniques that could get around that like using a memory. Maybe next time!

@skilz8098 3 years ago

@@Chrispresso I'm thinking that maybe... have an extra backpropagation, rewind, instant replay type mechanism for the last couple of hundred frames only when he dies... This would be different from the long term memory of learned traits... This way, when the A.I. runs through a level and dies... upon death, the last few hundred frames or approximately the last 3 - 5 seconds of gameplay is recorded into a buffer net only when the death condition is met, and this is replayed through its own mini-neural net to be trained upon... Then these become learned traits upon the next trial... You may have to extend that frame count or time interval depending on the nature of the death... Was it contact with a creature, or was it from falling into a pit... if falling into a pit, then it depends on the height and time taken for the fall. You would want to rewind back to where he was at least a solid second or two on the previous stable platform... That and you may need to add in a dynamic look-ahead feature too.
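The rolling "last few hundred frames" buffer suggested in this comment could look roughly like the sketch below. It is only an illustration of the idea, not something the project does: the class name `DeathReplayBuffer`, the 300-frame window (about 5 seconds at 60 fps), and the `(state, action)` tuples are all assumptions.

```python
from collections import deque


class DeathReplayBuffer:
    """Keep only the most recent frames; snapshot them when Mario dies."""

    def __init__(self, max_frames=300):  # hypothetical window, ~5 s at 60 fps
        self.frames = deque(maxlen=max_frames)  # old frames drop off automatically
        self.saved_deaths = []

    def record(self, state, action):
        """Store one frame of gameplay as a (state, action) pair."""
        self.frames.append((state, action))

    def on_death(self):
        """Copy the frames leading up to the death, then start a fresh window."""
        self.saved_deaths.append(list(self.frames))
        self.frames.clear()


buffer = DeathReplayBuffer(max_frames=300)
for frame_number in range(500):
    buffer.record(state=frame_number, action="right")
buffer.on_death()
```

What to do with the saved frames (for example, training a second network on them) is the open part of the suggestion and is not shown here.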

@MaDrung 3 years ago

OR anyone else for that matter. Thinking back on my childhood ;)

@harm991 3 years ago

Basically the AI needs more different inputs in vision? (it has like 3 right now? Empty, Block or Enemy?)

@Chrispresso 3 years ago

Yep! Very good observation. The reason I did 3 was so I could give normalized values (-1, 0 or 1) in this case for the blocks. If I wanted to use more than the three I needed a way to encode the inputs (one-hot encoding or something similar) but I was running this on a pretty old laptop at the time and wanted to simplify the inputs.

@red-k2048 3 years ago

@@Chrispresso DO it again with the right amount of inputs pls :D I'm super curious off AI limits. For example, if it thinks it found the fastest route, will it still look for a different even faster one. And will the AI ever discover he can go down the pipes ever? Is there a way to make it look for the highest score in the fastest way possible. Man I love AI, thank u for the video, I subbed btw ;)

@Corpsecreate 3 years ago

@@Chrispresso You can make it a CNN, with multiple binary channels representing different types of objects.

@Chrispresso 3 years ago

@@Corpsecreate That's true. I might have to try that since it should use less weights overall as well.
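The trade-off discussed in this thread can be sketched in a few lines. The snippet below is a hypothetical illustration, not the project's code: the tile labels (`EMPTY`, `BLOCK`, `ENEMY`, `COIN`), the choice of -1 for enemies and 1 for solid tiles, and both function names are assumptions. It contrasts the single-value encoding, where a coin and a brick look identical, with a one-hot encoding that yields one binary channel per tile type, which is the kind of input a CNN would take.

```python
import numpy as np

# Hypothetical tile codes; the actual project may label tiles differently.
EMPTY, BLOCK, ENEMY, COIN = 0, 1, 2, 3


def encode_ternary(tiles):
    """One value per tile: 0 empty, 1 solid (bricks AND coins), -1 enemy."""
    tiles = np.asarray(tiles)
    out = np.zeros(tiles.shape, dtype=np.float32)
    out[(tiles == BLOCK) | (tiles == COIN)] = 1.0  # coins are indistinguishable from bricks
    out[tiles == ENEMY] = -1.0
    return out


def encode_one_hot(tiles, num_classes=4):
    """One binary channel per tile type, shape (num_classes, rows, cols)."""
    tiles = np.asarray(tiles)
    classes = np.arange(num_classes)[:, None, None]
    return (classes == tiles[None, :, :]).astype(np.float32)


grid = [[EMPTY, COIN, EMPTY],
        [EMPTY, EMPTY, ENEMY],
        [BLOCK, BLOCK, BLOCK]]

ternary = encode_ternary(grid)    # 9 inputs
channels = encode_one_hot(grid)   # 4 x 9 = 36 inputs
```

The one-hot version multiplies the input count by the number of tile types, which is the cost the author mentions wanting to avoid on an old laptop.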

@Austin1990 3 years ago

Awesome work! The fact that the AI got so far without differentiating coins from bricks is astonishing. It is such a drastic simplification that you more or less got away with. I find that far more impressive than the fact that it failed a level.

@avo_k 3 years ago

you are currently my favorite channel on youtube, your content is absolutely amazing! thank you very much for all those clean, simple, beginner friendly videos; and thank you even more for the code

@Chrispresso 3 years ago

Wow! That means a lot! Thank you! I definitely want to make sure the content stays beginner friendly and have been thinking about including some bits of code in the videos but haven't been sure. Also considering doing some stuff with Unity to get some eventual better graphics and environments for AI.

@viniciusqueiroz2713 3 years ago

Hi! Awesome job and explanation! I just wanted to point out the reason why the NN stops to wait for the Piranha Plant to go back down in the pipe, instead of doing what the speedrunners usually do. As you mentioned, your visual representation (on the left) uses one block per 16 pixels. That is, a ground block has 16x16 pixels, while your AI is only seeing one whole block. It thinks the Piranha Plant takes up the whole pipe (which would be 32 pixels in width), whereas there is a small gap on each side of the pipe where Mario can stand and stay alive while the Piranha is there. I know the model was running on an old laptop, and this workaround would take much more computational resources to train the NN (more than double, if you halve the number of pixels per input block), but I wanted to point it out, just to guarantee you are taking that into account! =)

@stxcxv 3 years ago

i love finding videos like these late at night, theyre always so calming idk why. thank you mr ai guy, this made my night. have a good week

@Chrispresso 3 years ago

Happy to hear that!

@melo-7904 1 year ago

Ai in media: destroys everything due to faulty programming Ai in real life: merio

@Selicre 3 years ago

An AI performing FPG? No way. This is by far the furthest I've seen this game taken, AI-wise.

@Chrispresso 3 years ago

Ya, I was very surprised that these AI were able to learn some of those tricks!

@scottbigbrain3944 3 years ago

@@Chrispresso I am impressed with the breakthroughs that your ai's have been able to do. You should make tutorials for this stuff.

@johnforrestboone1 3 years ago

i am skeptical. using the algorithm time loss is not (in my observation) weighted heavily enough to merit the discovery of those glitches. those glitches save fractions of a second.

@vbag42 2 years ago

@@johnforrestboone1 THIS and it took years for some of the greatest Mario players to (kinda accidentally) figure out these glitches and it also took a huge amount of game knowledge to come up with something like forward dash. Speedrunners were saving a pixel of forward movement cuz the world updated every 20 frames instead of every frame or something like that and this AI has what 10k generations of knowledge of smashing keys lmao

@Zod_JB 2 years ago

That’s really bad ass bro! I’d honestly be interested in more training to see if it can eventually compete with speed runs.

@StaffandStormcloud 3 years ago

The music was so beautiful and tender. I cried A LOT

@frankycornejo2047 3 years ago

Very cool, would love to see more videos like this, it would be particularly cool to see the AI’s improvement and learning over time.

@Chrispresso 3 years ago

Noted!

@frankycornejo2047 3 years ago

@@Chrispresso wow, thank you for taking the time to read my comment. I look forward to watching more of your amazing videos.

@hawnshill7441 3 years ago

I'm going to subscribe because I want to see the final results. Oh yeah this was in my recommended and it's the first video of yours I've watched.

@Chrispresso 3 years ago

Welcome! Glad you're at least being recommended things you're interested in! Definitely will revisit Mario in the future.

@BatBeardGames 3 years ago

Brilliant I love this, just subscribed.

@Scn64 3 years ago

Great video! I've been running your code just about 24/7 and trying different strategies for a little over two months now. I've got it running on my desktop and laptop simultaneously. I think I'm addicted, haha!

@Chrispresso 3 years ago

Thanks! I'm glad you're able to enjoy it! I know what you mean. I found myself watching way longer than I should have at times....

@TWFDeadzoneII 3 years ago

did it beat the game?

@Scn64 3 years ago

@@TWFDeadzoneII It just finished level 6-1 and I'll be going to 6-2 soon. So far it's beaten every level except for maybe one or two. I think world 2-2 is one that it had some trouble finishing. It would get through the pipe at the end but had trouble going all the way to the flagpole at that point. For the most part, it's been great at figuring out the levels.

@TWFDeadzoneII 3 years ago

@@Scn64 that's cool. Have you made any changes to the original code?

@Scn64 3 years ago

@@TWFDeadzoneII I've made some changes in the config file just to test different strategies. The default strategy that the creator put in still seems to be the best though. I've also made some minor changes to the Python code that just affects how some information is displayed to the user...the core functionality is still all original though.

@nico7654321 3 years ago

I love your solution teaching the AI to solve little scenarios that can be repeated through the game, allowing the machine to play new unknown levels

@Chrispresso 3 years ago

Thanks!

@warmCabin 3 years ago

8:23 Shiny, yet deadly coins

@eloygarcia9221 3 years ago

ceave gaming

@novarender_ 3 years ago

@@eloygarcia9221 game ceaving

@svrem 3 years ago

You make awesome videos dude. Keep going!

@Chrispresso 3 years ago

Thanks :) Definitely working on more. Just takes time to build some of the AI. Hoping to get onto a more regular upload schedule in the future!

@VaibhavSharma-zj4gk 1 year ago

bro keep making these videos....they are very helpful...

@bradenjuengel9104 3 years ago

The best part is that you taught the AI that trust only leads to being hurt.

@jggerald7877 1 year ago

@Chrispresso, you're one of the best! Those who can program artificial intelligence, machine learning, and neural networks I consider to be geniuses! ;)

@JohnSmith-ts2od 1 year ago

Thanks for your research, i am working on something where i can play Mario with the AI and this really helps ❤

@O.M.JaYY3 3 years ago

Hello! This video was fantastic, my friend. People can get lost pretty easily listening to things like this being explained, I should know! Happens to me often >:) It NEVER felt that way, from beginning to end. Ending it all with a few facts was a nice touch. New SUB! Thank you Chrispresso.

@Chrispresso 3 years ago

Glad you enjoyed it! And I'm happy it wasn't boring or overly complicated :D

@user-wc1pt5st8b 1 year ago

Id love if you could explain more math, and machine learning stuff, just like in the blog but also in the video (in depth).

@LeinaDZiur 3 years ago

4:36 - The AI has learned something we call 'swag'

@AwesSK 3 years ago

Hey man I would just like to say congrats for hitting the YouTube algorithm. Congrats on 10k subs.

@Chrispresso 3 years ago

Haha I'm not there yet... but maybe some day :D

@AwesSK 3 years ago

@@Chrispresso Oh sorry. I'll come back in 4 months. Should be applicable then.

@virojsiriwattanakamol2672 2 years ago

Thank you very much for making AI so fun. We are teaching AI to our high school students. Training AI to win games like Mario can really inspire them. Thanks a million times.

@martinsosmucnieks8515 3 years ago

Another great video, thanks for the great content

@Chrispresso 3 years ago

Glad you enjoyed it

@user-zb9iu5mf8p 6 months ago

I like how AI actually tries to speedrun the game instead of learning it

@nxtboyIII 1 year ago

Hey this is pretty cool. How did you transfer the onscreen gameplay onto the 16x16 grid shown on the left side in your video?

@Exachad 2 years ago

Good job. Huge improvement over MarI/O

@TheMazyProduction 3 years ago

Awesome video man, you got a sub.

@Chrispresso 3 years ago

Glad you enjoyed it and thanks for the sub!

@BunnyRaptor 3 years ago

And I can't even get an NN to solve beginner minesweeper...good work!

@Mingura666 3 years ago

If I may correct, at 3:56 it doesn't happen once every 60 pixels but once every 60 frames. Mario must jump in a specific way, with certain speed and hit a specific pixel. The game runs at 60 fps so Mario stands on that pixel for 1/60th of a second. During that little time he's standing on the block and therefore he can jump off of it. Your AI figured it out. That's scary and amazing.

@Chrispresso 3 years ago

Oops, sorry for the mistake, but thanks for the clarifications! Guess the AI knows more about it than me...

@Joqer88 3 years ago

Found your channel yesterday. I love your content! I would also love to see the process of getting to the result you show with some coding. :)

@Chrispresso 3 years ago

Glad you enjoyed! I will definitely try to add that in future videos :D

@boo7948 3 years ago

Same over here, just found you and id love to see the progress

@Allen_lena 3 years ago

So taking Mar I/O, deciding to use a fixed structure instead of NEAT, and giving it raw training time and you end up having wall jumps and FPG. That's crazy. I wonder how it would work if it could see outside of the pink box. Maybe not everything, but a maxpooling of what's outside, in order for it to see everything while keeping input space small.

@Chrispresso 3 years ago

You can make the pink box as large (tall and wide) as you'd like. I made the code pretty flexible with it. I thought about that but didn't end up adding it. Even if you use the blocks but pooled them into 2x2 for safe/unsafe/empty it could potentially help. Might end up trying that or even just try using a CNN.
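The 2x2 pooling idea mentioned here might be sketched as follows. This is a guess at one reasonable rule, not code from the project: the function name `pool_2x2` and the priority order (any enemy makes the cell unsafe, otherwise any block makes it solid, otherwise it is empty) are assumptions, as is the -1/0/1 labeling of the grid.

```python
import numpy as np


def pool_2x2(grid):
    """Pool a ternary grid (-1 enemy, 0 empty, 1 block) into 2x2 cells.

    Assumed priority: unsafe (-1) beats solid (1), which beats empty (0).
    """
    g = np.asarray(grid)
    rows, cols = g.shape
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    # Regroup so the last axis holds the four tiles of each 2x2 cell.
    cells = (g.reshape(rows // 2, 2, cols // 2, 2)
              .transpose(0, 2, 1, 3)
              .reshape(rows // 2, cols // 2, 4))
    has_enemy = (cells == -1).any(axis=-1)
    has_block = (cells == 1).any(axis=-1)
    return np.where(has_enemy, -1, np.where(has_block, 1, 0))


grid = [[0, 0, 0, -1],
        [0, 0, 0,  0],
        [1, 1, 0,  0],
        [1, 1, 1,  0]]

pooled = pool_2x2(grid)  # 16 inputs reduced to 4
```

Pooling like this lets a wider area of the screen fit into the same number of inputs, at the cost of the positional detail inside each cell.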

@schwegelbin 2 years ago

I subbed, before seeing 1 second of any video from him

@LancerloverLL 3 years ago

AI is an incredible thing and I'm looking forward to it improving many aspects of our life. I am most excited about autonomous cars. Just imagine how many hours, weeks or even years of our lives are being wasted away staring through a windshield every day.

@Julenuri 3 years ago

Amazing video, dude

@kriller3771 3 years ago

Wow, I can’t believe the AI found fpg in 7-1. It’s so close to full flagpole glitch / bullet bill glitch, it would be so sick to see it get BBG. Also, just so you know, because of how floor detection works in smb1 wall jumping has a 5 Y-pixel window. So every block (16 pixels), there is a 5 Y-pixel range where you can hit a wall jump pixel and get a wall jump.

@Chrispresso 3 years ago

It would be! Maybe some day it will. And thanks for that. I thought it was just the one pixel, but I was just reading about it from some speed running websites and probably just misunderstood it.

@kriller3771 3 years ago

@@Chrispresso Its actually a very common misconception that there is only 1 pixel you can land on, and not a lot of people know its actually a 5 pixel window, so places you read might have actually been saying its just a 1 pixel window lol. For example, this bismuth video about SMB3 compares the smb3 wall jump to the smb1 wall jump (link is time stamped), and this video says that smb1 WJ has only a 1 pixel window, and bismuth didn't know otherwise until I told him ukposts.info/have/v-deo/bIeggaJknXuB1Zc.html

@copperymarrow1583 3 years ago

everybody gangsta till it does bullet bill glitch

@scottbigbrain3944 3 years ago

You might want to expand the area that Mario can see to include some of the area behind him. This way the AI could learn to avoid Goombas and turtles that are chasing him.

@Chrispresso 3 years ago

I thought about this but it wasn't an issue early on. By the time I realized I should have done it, I'd spent so much time training the current AIs that I didn't want to do it over.

@tails_the_god 3 years ago

@@Chrispresso hmm in a different version?

@casperdewith 3 years ago

8:26 Shiny yet deadly coins

@colinbaker7614 3 years ago

AI performing flag pole glitch is insane. It’s one of the hardest speedrunning strats for this game. Well done.

@Chrispresso 3 years ago

The AI figured out how to do it before I even really knew what it was....

@IntegralKing 3 years ago

@@Chrispresso "ah crap I wrote a bug into my code" speedrunners: "OMGWTFBBQ IT'S SO GOOD. DOUBLE FLAG POLE + INFINITE WALLJUMPS"

@eskimoprime09 3 years ago

It would be cool to see a Human vs AI project. Have one person play a new game for the first time, and give them some time to practice speedrunning and skill at the game. At the same time, someone else build a learning AI and have it learn for the same amount of time. See who comes out on top at the end. I feel like at first the human will obviously be better because we don't have an initial huge hurdle to jump through, but over time the AI will surpass.

@micmarlen 1 year ago

I downloaded the code and I'm having so much fun, this is great, great job, also I always wanted to see a NN learn how to play Mario, also I discovered that making the pink box bigger makes the AI learn way faster. Is there a way to extend it behind Mario?

@PastaMaster115 1 year ago

I'm curious to see what it would do in a X-4 level. Also why was the AI almost constantly using the "up" input? Up has very little use in SMB.

@electra_ 3 years ago

Really interesting! I've worked on a very similar project myself that was an extension of Sethbling's MarI/O called LuigI/O, which was eventually able to beat all the levels (though it required separate neural networks for each level, and often multiple networks per level. I think it overfitted too much.) I'm impressed that this was able to find the bullet bill flagpole glitch. Were all the final runs you showed using the same final network, or did each level need a different network? You said some of the things were shared, referencing the wall jump, but to be honest I kind of feel like a lot of the wall jumps that AI find are pure luck, especially since the inputs provide no way to actually time anything frame perfectly or pixel perfectly as they are only 16-block precise. Also, how does it fare on the other levels like the -2, -3, and -4 stages? You only showed the first level of each world here.

@Chrispresso 3 years ago

Ah, so that's what LuigI/O is! Just looked up the channel and that's pretty cool! They were different neural networks. They used the same base from previous levels but could change if they needed to. So the weights from 1-1 carries to 2-1, then 2-1 to 3-1, etc. Ya I don't think wall jumps were intentional, but once it knows it, it's pretty easy to do again. Since it's able to see ahead for the blocks, a lot of times it would do a small jump into a wall jump. Obviously it only works under certain conditions though since the weights are unique and not shared like in a CNN. I still need to test on other levels like -2, -3. Originally I had planned to but the states I needed for OpenAI gym weren't there and only included the -1 stages. I do plan to come back to this in the future with a more advanced AI and see how it fares compared to this AI and finally test it on all stages (-1, -2, -3, -4).

@electra_ 3 years ago

ah, okay the weights just carried over, yeah that's what ours did as well. But, it's impressive you were able to beat up to 7-1. 8-1 is an extremely hard level since it's the longest one in the game. I never thought about the coins fooling it since they look like blocks to the input viewer, but that could certainly be a reason why it screws up there. That would probably be pretty easy to fix. Definitely a lot of cool things you can add to try and improve this!

@Chrispresso 3 years ago

@@electra_ I imagine there will be a few levels that will give it difficulty when I try again. Anything with warp pipes might be a problem since I'm not sure how to encode some of that information or what it will look like to the fitness of the individual at that moment. Definitely excited to try out some new techniques though!

@electra_ 3 years ago

only level with warp pipes that really matter is 8-4, and that level is... basically impossible for AI lmao there's both land and water movement which work totally differently, there's pipes that you have to just know which one to go down exactly, there's a section with a required hidden block (or wall jump) and it's long. I believe LuigI/O needed 3-4 networks split up over various points to beat it at all, as well as a custom fitness function to tell it when it was on top of the high pipe, and a penalty for whenever it passed the pipe it needed to go in.

@HandledToaster2 3 years ago

Hey Chris. I read your blog post on how this whole thing works. I am currently at the start of my Data Science & ML learning journey, taking some Udemy courses by Jose Portilla and Kirill Eremenko. I could somewhat grasp what you meant by action = f(state) and the fitness function, but otherwise I have no idea what happened or what all those nodes even mean. I just want to say... this is inspiring. I want to come back here in a few weeks, perhaps even months, and read your blog post again, but this time understanding everything perfectly like it's just an intro programming class. Then maybe I'll reverse engineer it and do your project from the beginning as exercise. Thanks for posting awesome things :)

@Chrispresso 3 years ago

Glad it's inspiring! All my stuff is always freely available on GitHub. I've got a few other things currently in the works, so you should always have some stuff to follow along with :D

@HandledToaster2 3 years ago

@@Chrispresso awesome man, got the notifications on for your channel, can't wait to see some more :)

@years8703 3 years ago

This is so fascinating

@HeduAI 1 year ago

Thanks for this awesome video! Can you please tell what is meant by "a total playtime of 5 years"? How are 3 weeks of training time equivalent to 5 years? Thank you!

@hiteshKrawal 1 year ago

It would have been more satisfactory to watch if Mario jumps on the turtles.

@veggiet2009 3 years ago

Did you know about Mari/o project beforehand? And if so did you tackle this project any differently?

@Chrispresso 3 years ago

Ya I saw MarI/O like 4 or 5 years ago or something like that and it's actually one of the reasons I first wanted to learn about AI. They are very different projects from what I can tell. MarI/O uses something called NEAT which basically allows the neural network to change architecture. I instead did a fixed architecture with a deep network and had a genetic algorithm control the weights of the network. NEAT has similar attributes to a genetic algorithm but not quite as good (my opinion) since it has to build the network from the ground up and often doesn't work well without re-training from one environment to the other. From what I can tell, MarI/O allows any box on the screen to be considered for input. I restricted mine to a certain view since I used a dense (fully connected) neural network. I'd say the biggest similarities are NEAT and genetic algo + neural network both rely on generations of improvement and they both use boxes of pixels as inputs. I also carried weights over between levels to help advance the next populations and to carry over some information like danger of enemies. Hopefully that helps and isn't too confusing!
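The approach described in this reply, a fixed dense network whose weights are evolved by a genetic algorithm, can be illustrated with a toy sketch. Everything concrete below is an assumption made for illustration: the layer sizes, the uniform crossover, the Gaussian mutation, the elitism, and the stand-in fitness function. The project's actual operators and settings may differ; see the linked repository for those.

```python
import numpy as np

rng = np.random.default_rng(0)
LAYER_SIZES = [8, 6, 4]  # hypothetical sizes: inputs, hidden units, button outputs


def random_weights():
    """One individual: a list of weight matrices for the fixed architecture."""
    return [rng.uniform(-1, 1, size=(n_out, n_in))
            for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]


def forward(weights, x):
    """Dense forward pass: ReLU on hidden layers, sigmoid on the output layer."""
    for i, w in enumerate(weights):
        x = w @ x
        x = np.maximum(x, 0.0) if i < len(weights) - 1 else 1.0 / (1.0 + np.exp(-x))
    return x


def crossover(a, b):
    """Uniform crossover: each weight is taken from one parent at random."""
    return [np.where(rng.random(wa.shape) < 0.5, wa, wb) for wa, wb in zip(a, b)]


def mutate(weights, rate=0.05, scale=0.2):
    """Add Gaussian noise to a random subset of the weights."""
    return [w + (rng.random(w.shape) < rate) * rng.normal(0.0, scale, w.shape)
            for w in weights]


def next_generation(population, fitnesses, elite=2):
    """Keep the best individuals unchanged, fill the rest with mutated children."""
    order = np.argsort(fitnesses)[::-1]  # best first
    parents = [population[i] for i in order[:max(elite, len(population) // 2)]]
    children = [population[i] for i in order[:elite]]
    while len(children) < len(population):
        i, j = rng.choice(len(parents), size=2, replace=False)
        children.append(mutate(crossover(parents[i], parents[j])))
    return children


# Toy fitness standing in for "how well Mario did": total output activation
# on one fixed input. A real run would play the level instead.
probe = rng.uniform(-1, 1, size=LAYER_SIZES[0])


def toy_fitness(weights):
    return float(forward(weights, probe).sum())


population = [random_weights() for _ in range(10)]
best_per_generation = []
for _ in range(5):
    scores = [toy_fitness(w) for w in population]
    best_per_generation.append(max(scores))
    population = next_generation(population, scores)
```

Because the architecture never changes, carrying weights from one level to the next, as the reply describes, amounts to seeding the next level's starting population with these same weight lists.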

@veggiet2009 3 years ago

@@Chrispresso very cool, I could tell that it was an improvement, as It seemed to be adaptive, though I couldn't quite tell why

@abrasmage 3 years ago

@@Chrispresso is reinforcement any better than genetic?

@Chrispresso 3 years ago

@@abrasmage Hard to say if it's better or not. RL algorithms like A3C and PPO have seen a lot of success in video games though. I would say the main advantage of RL is that you let it learn even more by itself and let it figure out what it cares about.

@chariouibouchaib4416 1 year ago

Hello Chrispresso. Do you know Dofus Retro - Ankama? Could you code an AI that can learn to fight a battle with monsters and win it?

@Keyshooter 3 years ago

"unfortunately Mario was unable to beat this level" and my heart sank in sadness

@Chrispresso 3 years ago

Don't worry. Mario will come back with a vengeance.

@Magma_Bolt 2 years ago

This video is absolutely amazing! There's one thing that really confused me though, how is it possible to record all of these games played by the A.I.? If there's the capability of finding a specific Mario's run which did specific things of interest and utilising the visual recording of their run to aid in the explanation of the technique being discovered it must have been recorded but I have no idea how it's possible to do so given the amount of data it seems that would need to have been stored given the huge number of runs? I'm also confused about how it's possible to find the point at which the AI discovered these strategies/techniques? Please can someone let me know? :)

@decyattysyachpchyol 1 year ago

He probably ran example runs with graphical output after X% clear rate.

@saqibperwaiz4043 3 years ago

Brother it would be very helpful if you could suggest topics that I should learn related to this.I know basics of Neural Network and have started learning Reinforcement learning.

@markopancic3078 2 years ago

Hey I'm trying a similar project and I was wondering how you went about detecting enemies and blocks etc? I was going down the route of using open CV to try match the sprites with match template but it seems too slow. color filtering and edge detection seems too fiddley and hard. Hoping you could throw some insight my way haha

@markopancic3078 2 years ago

or did you do that and that is the reason you only did over world levels

@Chrispresso 2 years ago

I read values from RAM for the emulator. You can check out the source code for more information.

@ph30nix62 3 years ago

Have you tried increasing the complexity of your reward system? Such as diminishing returns or having conflicting reward triggers? 🤔 how would the NBC balance out rewards that have a cost of some type? Also what does it call a wall jump?

@NeunEinser 3 years ago

Shiny, yet deadly coins.

@maurox1614 3 years ago

Just a question. What input are you giving to the ai? Just one "simplified" frame at a time or multiple frames? Are you giving other inputs like the buttons pressed in the previous frame/s ? What I don't understand here is how the ai can develop the concept of "velocity" that is strictly connected to the concept of time, if the input is only a single frame and therefore there is no correlation with the previous frame/action. Thanks in advance if you want to respond me!

@Chrispresso 3 years ago

I just give it the current frame it sees. This AI has no "memory" and no experience replay buffer. It doesn't have a concept of velocity necessarily, but it might figure out that being too close to X while holding Y is dangerous. So it may end up releasing X. That could cause it to slowdown if the buttons correspond to running towards the right.
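A memoryless policy of the kind described in this answer is just a function from the current frame to button presses. The sketch below assumes a tiny dense network, a made-up button order, and a 0.5 press threshold; none of those details come from the project. The point it shows is that the same frame always produces the same buttons, no matter what was seen in between.

```python
import numpy as np

BUTTONS = ["up", "down", "left", "right", "A", "B"]  # hypothetical output order


def act(weights, frame, threshold=0.5):
    """Return the buttons pressed for one frame; nothing is kept between calls."""
    x = np.asarray(frame, dtype=np.float64).ravel()
    for i, w in enumerate(weights):
        x = w @ x
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)          # ReLU on hidden layers
    probs = 1.0 / (1.0 + np.exp(-x))        # sigmoid on the output layer
    return [b for b, p in zip(BUTTONS, probs) if p > threshold]


rng = np.random.default_rng(1)
weights = [rng.uniform(-1, 1, (9, 12)), rng.uniform(-1, 1, (6, 9))]
frame_a = rng.choice([-1, 0, 1], size=(3, 4))
frame_b = rng.choice([-1, 0, 1], size=(3, 4))

first = act(weights, frame_a)
act(weights, frame_b)            # an unrelated frame in between changes nothing
again = act(weights, frame_a)    # identical to `first`
```

Any sense of speed therefore has to be inferred indirectly, for example from how close an obstacle appears in the current frame.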

@PalaceDude 7 months ago

I feel so stupid, my whole life I did not know you could jump on a wall like that, and this AI figured it out on the second world. Bruh.

@pooglechen3251 3 years ago

Looks awesome, I'd love to see this with other side scrollers such as Adventure Island, Ninja Garden, or Kirby.

@Chrispresso 3 years ago

That would be cool! I'll look into some of them! Side scrollers are a good environment for this type of AI.

@divyanshukumar2605 2 months ago

The ending was emotional

@martinmartossimon8078 2 years ago

Great job!!

@TxoriCom 3 years ago

Hi. I was wondering if your AI code could be plugged on other kind of games or if it needs to be this particular ROM. Does it read the screen or the ROM code in order to do the magic? Because I made some Flash games a while back, and I'm very interested to see if an AI could learn how to play them. Is this possible? Edit: Oh, I see that you are reading the RAM :)

@Chrispresso 3 years ago

This one does in fact read from the RAM, but you can definitely read from the screen instead!

@yommmrr 3 years ago

I was thinking itll never hold a record if it cant flag pole glitch and then it learns it. This is awesome.

@Chrispresso 3 years ago

Haha it might still never hold a record, but at least it was able to learn a lot.

@QuestforaMeaningfulLife 1 year ago

Brilliant!

@tanmad21 3 years ago

I'm curious how it learned the routes to take on the maze castles - did learning from previous castles cause it to take the wrong route on later castles?

@NobungaGames 2 years ago

What did you use for the visualization on the left to show your neural network?

@SUNILKUMAR-og4vm 3 years ago

Hi, Nice implementation. I read your blog, for fitness function what is the reason for choosing values 1.8 and 1.5 as exponent?

@Chrispresso 3 years ago

I just wanted a slightly larger exponent for rewarding distance than penalizing time. I also wanted both to be exponential to more greatly impact large differences between values. The actual values of 1.8 and 1.5 were somewhat random, but the difference of .3 was chosen by graphing potential distances/times and ensuring that the fitness would continually increase and not go negative.

@SUNILKUMAR-og4vm 3 years ago

@@Chrispresso Okay, you got the difference by plotting a graph. To plot this how many simulation you ran (How many data points)?

@Chrispresso 3 years ago

@@SUNILKUMAR-og4vm I didn't run any simulations with the AI for that. I just took the total distance of the level and gave different random times it could probably finish in. I just wanted to make sure that if the AI finished the level in 20s and another AI finished in 40s, that the fitness between the two would be quite large. So I just created a bunch of fake data to see what it would look like.
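The fake-data check described in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function name `fitness`, the choice of frames as the time unit, the 60 frames-per-second conversion, the small positive floor, and the level length of 3000 are all assumptions made for the example. Only the exponents 1.8 (distance) and 1.5 (time) come from the comments above.

```python
def fitness(distance: float, frames: float) -> float:
    """Hypothetical fitness: reward distance (exponent 1.8), penalize time (exponent 1.5)."""
    raw = distance ** 1.8 - frames ** 1.5
    # Assumed floor so that the fitness never goes negative.
    return max(raw, 0.00001)


# Fake data: one assumed level length and several made-up finishing times.
LEVEL_DISTANCE = 3000  # hypothetical total distance of a level
FRAMES_PER_SECOND = 60  # assumed conversion from seconds to frames

for seconds in (20, 40, 60, 80):
    frames = seconds * FRAMES_PER_SECOND
    print(f"{seconds:>3}s -> fitness {fitness(LEVEL_DISTANCE, frames):,.0f}")
```

Printing or plotting a table like this for a range of distances and times is one way to confirm that a faster finish always scores higher and that the value stays positive for plausible inputs.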

@SUNILKUMAR-og4vm 3 years ago

@@Chrispresso Nice approach, thank you for sharing. Can you suggest any resources to read about developing and optimizing a fitness function?

@Chrispresso 3 years ago

@@SUNILKUMAR-og4vm I actually can't. I have mainly learned through trial and error and I know there is research being done in that area. I would say just think about what you really care about and make sure not to add bias. Like I could have rewarded "enemies killed", but then that doesn't weight equally between levels since there isn't the same number of enemies. It could also add unwanted bias toward certain sections of levels where enemy spawns are substantially more. A rule of thumb for myself is to think about the end goal and reward the end goal. Anything in the middle is bias. Chess? Reward winning. Don't reward capturing a piece, otherwise you introduce bias to the AI. If capturing a piece is important, the AI can figure that out. Same applies to a lot of different scenarios.

@shromp2034 3 years ago

Technically you can do the bullet bill glitch faster (in 8-2) by using it to hit the bottom block. It skips Mario walking to the castle and counts the timer points instantly. Still impressive nonetheless.

@robgable2426 3 years ago

Hey!! Take it easy there Skynet!! We don't want Terminator Mario running around looking for turtles to murder do we?!! 🤔😅

@PeaceInExile 3 years ago

This is pretty cool. I'd like to see how an AI like this would handle the original Ninja Gaiden.

@yenkina 3 years ago

Does it possibly mean we could use AI to find speedrun glitches in the future ??! Interesting

@Chrispresso 3 years ago

Definitely! I'm not sure how long it took people to find out about the flagpole glitch or wall jumps, but it didn't take too long for this AI. So it's very possible to use it to find other glitches in all types of games/environments!

@henrikbrautmeier6534 3 years ago

Automatised runs are already a tool to find bugs in the classic development process. But, usually the studios don't use AI for that

@snoo2496 3 years ago

@@Chrispresso This is already used in Celeste Classic with a few Python scripts and some restrictions. It's not really an AI, just brute forcing with restrictions.

@relic374 3 years ago

Three things: 1) This game is already *super* optimized. It is possible, but unlikely in my opinion. 2) There is such a thing as a TAS, which is when a human enters inputs for a computer to execute. So I guess that's like AI. 3) @Chrispresso I'm sure that wall jumps were run into during development, or at least the wall jump pixels.

@lostnumbr 3 years ago

@@relic374 A TAS is just a predetermined set of inputs given by a person. It's nothing like a self-taught AI. A TAS isn't learning how to beat the game on its own.

@garycraft1101 2 years ago

One thing that came to mind is that an adjustment might affect the simulation. It seems like in this simulation the AI can push any button whenever it's needed, but in reality at least the arrow keys are limited by the physical mechanism of the controller. I believe that on a real Nintendo controller you cannot push up and down at the same time, and left and right cannot be pushed at the same time either. There were also some limits on how many buttons could be pressed at once. If I remember correctly, if you pressed too many buttons at the same time, the controls just jammed for as long as you held them. So this simulation could be made more realistic by adding the real controller limits for the AI. I think this works because it was made with a Nintendo emulator, but if the controls were restricted to real Nintendo inputs, the result might be totally chaotic.
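One way to impose the D-pad limit suggested above would be to filter the network's button outputs before they reach the emulator. The sketch below is hypothetical: the button names, the function name `sanitize_buttons`, and the policy of dropping both directions when opposites are pressed together are assumptions for illustration, not something taken from the video or its code.

```python
from typing import Set

# Pairs of D-pad directions that a physical controller cannot press together.
OPPOSING_PAIRS = (("left", "right"), ("up", "down"))


def sanitize_buttons(pressed: Set[str]) -> Set[str]:
    """Return a copy of `pressed` with opposing D-pad directions removed.

    Assumed policy: if both directions of a pair are requested,
    neither is sent to the game.
    """
    result = set(pressed)
    for first, second in OPPOSING_PAIRS:
        if first in result and second in result:
            result -= {first, second}
    return result


print(sanitize_buttons({"left", "right", "A"}))
```

A different policy, such as keeping whichever direction has the stronger network output, would be just as easy to swap in.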

@GameplayUploaded 3 years ago

I would love to see a competition between AI and TAS someday.

@hugo-garcia 3 years ago

AI : Learn to play super mario bros SpeedRunners : Hold my beer

@KSATica 1 year ago

How are you able to show the matrices on the left side of the screen and map the performance of what Mario is doing?

@hamtsammich 3 years ago

How did you make the view on the left? Would it be possible to make something similar for megaman? (for example) If so, how?

@Chrispresso 3 years ago

All my code is on GitHub so you can check it out in the description. I just parsed the RAM addresses for block types (easier for NES, but much harder for N64). The neural network is drawn with just a bunch of lines and circles. You could do something similar for Mega Man. I haven't played it in forever but most platformers could follow a similar approach to what I did here. Maybe I'll look into Mega Man a bit more....

@hamtsammich 3 years ago

@@Chrispresso I've been thinking about a Mega Man version of this for quite some time, but I've been stuck on how to get the neural net to recognize the game state. I don't know much about reading the ROM data, especially when my vision includes using data from the sequels.

@airdouannoymous 1 year ago

And by learning that not all safe blocks were really safe, AI Mario lost his trust and began to question reality.

@thepoordog 3 years ago

do you think you could make an AI for the new super mario 3d world on nintendo switch? Coming up with a fitness function/loss function for that would be difficult i'd imagine

@Chrispresso 3 years ago

It would probably need to use something more advanced than this type of AI. This AI doesn't receive feedback until the end of an episode. For something like Super Mario 3D it would need to update as it progresses.

@bagorolin 2 years ago

How do you actually make the AI "see" the game? Is it really just picture recognition, or is there some API used?

@captainwilliams1325 3 years ago

shiny yet deadly coins, now bane of a.i.

@RodriHermo 3 years ago

Very interesting video. Also, which is the AI PB until 8-1?

@diegovalentino2083 2 years ago

Chrispresso in level 8: Oh, by the way, AI, I forgot to tell you that coins aren't blocks. AI: The disappointment, the betrayal, brother

@grim66 3 years ago

This is a really interesting video, but as someone who isn't super familiar with how neural networks work, is there an easy way to describe what the 10 "additional" inputs (the gray/blue dots) are? I heard them named by something about a row but I don't get what they represent.

@Chrispresso 3 years ago

Ah, good question. The 10 additional inputs are a "one-hot encoded variable". That means that at most one of those 10 can be "active" (have a value of 1 in this case) at a time. The reason for doing this is to know which row of the pink rectangle it's in for the input. You give it a one-hot encoded variable to get rid of potential bias. So you're no more likely to think being in row 1 is better than being in row 2 or 3, etc.
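A one-hot encoding of the row can be sketched like this. The function name `one_hot_row`, the zero-based row index, and the choice to return all zeros when the row falls outside the 10 tracked rows are assumptions for illustration, consistent with "at most one" input being active.

```python
from typing import List


def one_hot_row(row: int, num_rows: int = 10) -> List[int]:
    """Encode a zero-based row index as a list with at most one 1."""
    encoding = [0] * num_rows
    # Assumed behaviour: an out-of-range row leaves every input inactive.
    if 0 <= row < num_rows:
        encoding[row] = 1
    return encoding


print(one_hot_row(3))   # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(one_hot_row(12))  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Feeding the row as a single number (0 to 9) would instead imply an ordering, where a larger value looks like "more" of something to the network. Ten separate inputs avoid that.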

@grim66 3 years ago

@@Chrispresso So they're basically just a height indicator? "Which row of the screen is Mario in" In that case, what exactly does the gray/blue state represent?

@spencermoran2970 3 years ago

How did you identify the types of blocks? Did you need to run some sort of separate classification algorithm to know if a group of pixels was an enemy or obstacle?

@Chrispresso 3 years ago

That was hardcoded. I wrote a portion of code to read the RAM values and compare what they are to determine the type of block and then just grouped them accordingly.
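A hardcoded lookup of that kind might look roughly like the sketch below. Everything specific here is made up for illustration: the byte values, the group names, and the fallback for unknown values are hypothetical and are not the real Super Mario Bros. RAM values or the author's code.

```python
from enum import Enum
from typing import Dict, List


class TileGroup(Enum):
    EMPTY = 0
    SAFE = 1
    ENEMY = 2


# Hypothetical byte values, NOT taken from the actual game RAM.
HYPOTHETICAL_TILE_GROUPS: Dict[int, TileGroup] = {
    0x00: TileGroup.EMPTY,
    0x10: TileGroup.SAFE,   # e.g. ground
    0x11: TileGroup.SAFE,   # e.g. brick
    0x20: TileGroup.ENEMY,  # e.g. a walking enemy
}


def classify_tile(value: int) -> TileGroup:
    """Map one RAM byte to a coarse group; unknown values count as EMPTY (assumption)."""
    return HYPOTHETICAL_TILE_GROUPS.get(value, TileGroup.EMPTY)


def classify_row(ram_bytes: bytes) -> List[TileGroup]:
    """Classify every byte in a slice of RAM that represents one row of tiles."""
    return [classify_tile(value) for value in ram_bytes]


fake_row = bytes([0x00, 0x10, 0x20, 0x7F])  # made-up snapshot of one row
print([group.name for group in classify_row(fake_row)])
```

The grouped values, rather than the raw bytes, are what a network would take as input under this approach.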

@christaltaylor473 2 years ago

nice 3-1 wall jump save

@jodeboyd8386 3 years ago

last time i checked every other time it gets to a point where it stands next to the pipe and just looks at it

@Chrispresso 3 years ago

Do you expect a plumber NOT to inspect a pipe???

@Steev42 3 years ago

Aww, I really wanted to watch one try one of the maze levels.

@Chrispresso 3 years ago

Don't worry. Next time it will

@kakyoindonut3213 2 years ago

AI: accidentally learns the wall jump and flag glitch and is able to execute them smoothly in every play. Also AI: "yeah I'm pretty confident I could jump on that block"