Anyone for a game of Space Invaders? – Page 5 – Educational thought experiments in AI/ML written up with fun in mind along with source-code in GitHub

Getting a very high score

To understand why the game is beatable, you need to be aware of how limited it is. Once you’re an ace (apparently, scoring 10k+ puts you among the top players), the game stops getting harder. It ends only because of a momentary lapse of hand-eye coordination (or perhaps a toilet break).

Game Difficulty

Just two factors control the difficulty:

the starting height of the invaders (lower = harder)
the firing rate of the invaders

Reaching and completing level 7 is the forever-play moment.

On level 7, the racks of aliens start at their lowest point. Topher’s disassembly contains a table that defines the height at which the invaders start each round, which I have drawn below for you.

As you can clearly see, levels 7-9 are identical and 10 gets easier – the aliens are near the top.

Next, let’s look at the “reload” table:

I originally divided by 256, because that is how MSB/LSB pairs are normally handled in assembler. However, Space Invaders scoring uses “BCD” (binary-coded decimal), so an MSB of hex “30” represents 3000, not 48×256 = 12,288.
Dave makes mistakes too.

When the score exceeds 3000 points, the reload fire rate of the aliens is at its fastest… By the time you’ve reached level 3, you’ll probably exceed 3000 points. By level 7, you’re at maximum reload fire rate with the invaders starting at their lowest point.
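To make the BCD point concrete, here is a minimal C# sketch that decodes a two-byte packed-BCD score and contrasts it with the mistaken plain-binary reading. The class and method names are mine, purely for illustration; they are not taken from the disassembly or from the project source.

```csharp
using System;

/// <summary>
/// Illustrative helper (hypothetical names, not from the project source) for
/// decoding a 4-digit packed-BCD score held in two bytes.
/// </summary>
public static class BcdScore
{
    /// <summary>Decodes one packed-BCD byte: each nibble is a decimal digit (0x30 -> 30).</summary>
    public static int DecodeByte(byte value)
    {
        int tens = value >> 4;
        int units = value & 0x0F;
        if (tens > 9 || units > 9)
            throw new ArgumentException("Not a valid packed-BCD byte.", nameof(value));
        return tens * 10 + units;
    }

    /// <summary>Decodes a score from its most and least significant BCD bytes.</summary>
    public static int Decode(byte msb, byte lsb) => DecodeByte(msb) * 100 + DecodeByte(lsb);

    /// <summary>The mistaken reading: treating the same two bytes as a plain binary word.</summary>
    public static int DecodeAsBinary(byte msb, byte lsb) => msb * 256 + lsb;
}
```

With these helpers, `BcdScore.Decode(0x30, 0x00)` gives 3000, whereas `BcdScore.DecodeAsBinary(0x30, 0x00)` gives 12288, which is the error described above.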

The game cannot get any more difficult.

Bonus Points

Hitting the saucer for 300 points (by counting your shots) takes incredible skill, especially at later levels, but it’s a distraction when you automate.

I often reward the AI for hitting the saucer, but I don’t actively encourage it.

When something can kill every invader you throw at it, the score increases constantly. Hitting the saucer could give 300 points, but instead, it simply shoots more bottom invaders to get the same points. If it can’t die, and it isn’t in a hurry, why expend the effort to hit them?

If you feel differently, feel free to train it to treat saucer destruction more seriously.

How to make an AI that wins

Given that the game does not get progressively more difficult, it’s easy to win.

Breaking the game into “distinct” difficulties

Earlier levels are at a slower firing rate.
During level 3/4 it reaches the fastest firing rate (saucers bring it on sooner), which continues for ALL subsequent levels.
Level 1 has the aliens positioned highest, with them getting lower until on level 7 they are at their lowest and remain so for 8 & 9.
At level 10, the aliens are positioned at the height they were at on level 2.

Here’s the table showing both why level 7 is the forever level (lowest and fastest firing), and also the repetition:

That means an AI only needs to beat levels 1-4, 5, 7, 10 & 11 to be 100% perfect.
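The repetition can be expressed as a small lookup. The sketch below encodes only what this post states: levels 7-9 share the lowest start height, level 10 starts at the level 2 height, and level 11 at the level 3 height. Wrapping every 8 levels beyond that is my assumption, and all the names are hypothetical.

```csharp
using System;

/// <summary>
/// Illustrative lookup (hypothetical names) for which earlier level a given
/// level repeats in terms of invader start height.
/// </summary>
public static class LevelRepetition
{
    /// <summary>
    /// Returns the earliest level with the same start height as <paramref name="level"/>.
    /// This covers height only: level 10 matches level 2 in height but is played
    /// at the fastest reload rate, so it still needs its own training.
    /// </summary>
    public static int EquivalentHeightLevel(int level)
    {
        if (level < 1) throw new ArgumentOutOfRangeException(nameof(level));

        // Assumption: from level 10 onwards the heights wrap back to level 2 every 8 levels.
        int wrapped = level <= 9 ? level : 2 + (level - 2) % 8;

        // Levels 7, 8 and 9 all start at the lowest height.
        return wrapped >= 7 ? 7 : wrapped;
    }
}
```

Under that assumption, levels 8 and 9 map to 7, level 10 maps to 2, and level 11 maps to 3.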

We can go about training the easy way or the hard way.

The hard way would be to say we must have a single neural network that completes every level perfectly. How big does it need to be; how long will it take? A while I suspect, and I strongly suggest you don’t try evolutionary approaches if that’s what you plan – stick to reinforcement learning. You may find an AI that goes thru 3 levels perfectly, then gets stuck on the 4th. I managed to get 3 hidden neurons to complete levels 8-12, and it was on level 13 when I stopped it. But that doesn’t mean it will complete all the levels any time soon.

An evolution where you score and select the best with each epoch is inefficient but hugely practical if you frame the problem properly, and that’s where my approach wins.

The process goes like this:

Start with score 0, level 1 only, and let the AI learn in “quick mode”. Make sure you reward the AI predominantly on invaders (goal = clear all), bonus points for saucers destroyed on the way, and additional points for lives at the end of the level.
When it has reached level 2, and all 3 lives are present, it is trained.
Press B to save the brain as a template. Restart the app.
Repeat the process for each subsequent level 2, 3, 4, 5, 6, 7, 10 and 11. Be careful on level 2 that you wait for it to finish with 4 lives (as it passes 1500 and gains a life).
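The reward priorities in the first step could be sketched roughly as follows. The post only gives the ordering (invaders first, a bonus for saucers, extra for lives at the end of the level), so the weights and every name here are hypothetical placeholders rather than the values used in the actual source.

```csharp
using System;

/// <summary>
/// Illustrative per-level fitness (hypothetical names and weights).
/// </summary>
public static class LevelFitness
{
    // Hypothetical weights: only their relative ordering comes from the post.
    private const int PerInvader = 100;
    private const int PerSaucer = 50;
    private const int PerLifeRemaining = 200;

    public static int Score(int invadersDestroyed, int saucersDestroyed, int livesAtEnd, bool levelCleared)
    {
        if (invadersDestroyed < 0 || saucersDestroyed < 0 || livesAtEnd < 0)
            throw new ArgumentOutOfRangeException();

        // Invaders dominate the reward; saucers are only a small bonus.
        int fitness = invadersDestroyed * PerInvader + saucersDestroyed * PerSaucer;

        // Lives only count once the level is cleared, so merely surviving is not rewarded.
        if (levelCleared)
            fitness += livesAtEnd * PerLifeRemaining;

        return fitness;
    }
}
```

Only counting lives after the level is cleared is a design choice of this sketch: it stops a brain from scoring well by hiding behind a shield and never finishing.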

Training complete. Technically if you reach 9999 it goes back to 0, and so you gain another life at 11,500 points. I did not implement that, intentionally. It helps if one does not zero the score if you want to see it break records.

Now go onto the “AI Play Config” tab, and enter each template such that the score it starts the next level with corresponds to the score the prior level completed with.

It only learnt 1-7, 10 and 11.

For 1-4, enter the level

For 6, enter 6,+6,+1,+1

For 7, enter 7,+1,+1,+6

For 10, enter 10,+8

Lastly, for 11, enter 11,+8.

Set the starting level to 1, starting score to 0, and untick “single level”.

Click “AI Play”. It will now run forever, or at least until you stop it. If you use the .json config files and /Templates, all you need to do is click “AI Play”; it comes pre-trained.

I thus claim the #1 top score for AI Space Invaders!

Is this approach fair? Absolutely. If you’ve followed my blog, I’ve shown I can do AND/OR/XOR etc gates, even COUNTER using neural networks. Therefore I could chain the tiny AI neural networks together and make them into one “brain”. That’s less than all the filters and kernels used by “smart” AI.

The only thing irking me slightly is the predictability of the game. Which invader fires is deterministic. For non-rolling-shot bullets, as per the original, the sequence is dictated by tables indicating which one fires. Obviously, if you destroy invaders, the shot then comes from an invader higher up, or not at all. The rolling shot depends on where the player’s ship is, as it targets the player. Given the AI plays consistently for the same inputs, every level that is a “repetition” of another will look the same as the earlier one. Having searched the source code, I found no random number generator built into the original. The only open question is whether, because the game uses “timers”, the first bullet fired differs when the timer is on a different number at the start of the game. That doesn’t stop the approach from working; it just might require either a little more learning or a tweak to the learning process, i.e. check the impact of the timer, and maybe score the AI on playing one game for each variation of the start timer value. Not taxing to do…
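Scoring the AI across every start timer value could look something like the sketch below. Taking the worst result rather than the average is my choice, not something from the project source, and the `playGame` delegate stands in for whatever runs one game with a given start timer value and returns its fitness.

```csharp
using System;
using System.Linq;

/// <summary>
/// Illustrative helper (hypothetical names) for scoring a brain over every
/// start timer value instead of a single game.
/// </summary>
public static class TimerRobustFitness
{
    /// <summary>
    /// Plays one game per start timer value (0 .. timerValues - 1) and returns
    /// the worst fitness, so a brain only scores well if it copes with all of them.
    /// </summary>
    public static int WorstCase(Func<int, int> playGame, int timerValues)
    {
        if (playGame == null) throw new ArgumentNullException(nameof(playGame));
        if (timerValues < 1) throw new ArgumentOutOfRangeException(nameof(timerValues));

        return Enumerable.Range(0, timerValues).Min(playGame);
    }
}
```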

The following is proof; please feel free to skip to “summary/challenges”.

To infinity and beyond

I am not Buzz Lightyear, so maybe not quite “infinity and beyond”. But I’ll prove it works beyond doubt. The top AI score is 154,380 on the Atari 2600 version, and the top human score on the real game is 218,870. Therefore my intended benchmark is the latter. You wouldn’t expect any less from me, I hope.

3 neurons, tournament mode, 40% random, 10% elites saved.
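For readers unfamiliar with those settings, here is one possible reading of them as code: the top 10% of brains are kept unchanged, 40% of the next generation are brand-new random brains, and the rest are winners of two-way tournaments with a small mutation. This is a sketch under my own assumptions (including the interpretation of “40% random”, the genome as a flat weight array, and the mutation step); it is not the selection code from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

/// <summary>
/// Illustrative generation step (hypothetical names): elites + random + tournament.
/// </summary>
public static class Evolution
{
    public static List<double[]> NextGeneration(
        IReadOnlyList<double[]> population, IReadOnlyList<double> fitness, Random rng,
        double eliteFraction = 0.10, double randomFraction = 0.40)
    {
        if (population.Count == 0 || population.Count != fitness.Count || population[0].Length == 0)
            throw new ArgumentException("Population and fitness must match and genomes must be non-empty.");

        int size = population.Count;
        int genomeLength = population[0].Length;
        int eliteCount = Math.Max(1, (int)(size * eliteFraction));
        int randomCount = Math.Min(size - eliteCount, (int)(size * randomFraction));

        // Elites: the best brains are copied across unchanged.
        List<double[]> next = Enumerable.Range(0, size)
            .OrderByDescending(i => fitness[i])
            .Take(eliteCount)
            .Select(i => (double[])population[i].Clone())
            .ToList();

        // Random: brand-new brains with weights in [-1, 1).
        for (int n = 0; n < randomCount; n++)
            next.Add(Enumerable.Range(0, genomeLength).Select(_ => rng.NextDouble() * 2 - 1).ToArray());

        // Tournament: pick two at random; the fitter one is copied and lightly mutated.
        while (next.Count < size)
        {
            int a = rng.Next(size), b = rng.Next(size);
            double[] child = (double[])population[fitness[a] >= fitness[b] ? a : b].Clone();
            child[rng.Next(genomeLength)] += (rng.NextDouble() - 0.5) * 0.2; // hypothetical mutation step
            next.Add(child);
        }

        return next;
    }
}
```

The population size stays constant from one generation to the next, and the fittest brain always survives as the first entry of the returned list.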

Scoring as follows:

The config for the AI is radar + position, with shields, starting at level 1 with a score of “0” and we’re learning a single level.

Click Start AI Learning.

A few minutes or less later, it completed level 1 with 3 lives intact.

We’ve restricted it to 1 level only. Would it have completed any more levels?

Let’s see. We’ll configure the AI Play. We also need to turn off “single level”.

Watching it, with 3 neurons, it chose to shoot a hole in the first shield and hide. It lost lives in level 2 at around 1400 and 1600 points, eventually dying at 1800 points (8 invaders left). To be fair, we didn’t train it on level 2, so it did quite well.

We could theoretically train it on multiple levels, but that takes longer, so let’s not.

It finished level 1 on 990 (no saucers). We now reinstate the single level and ask it to learn level 2. Because the game is different as the score increases (until max reload speed), we need to put in the starting score.

Given this wasn’t terrible at level 2, we’ll use the level 1 template as the basis for level 2.

Don’t forget to set “first time at” to quite a large value. If you use a “template” and don’t set this large, it will use the template but kill all the brains off at something like 250 frames. By that point, the template brains won’t yet have differentiated themselves from non-template ones, and you won’t benefit from the template.

Click Start AI Learning, and off it goes on mastering level 2. Time to get a cold drink from downstairs, and it’s already reached the goal – level 2 complete with 3 lives intact.

3 lives intact. Oh, not great. Remember at 1500 it gets an extra life. This level straddles that point (990->1980). So it’s not yet ready for us to move on to the next level. It needs to do it with 4 lives.
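The “is it trained yet?” rule can be written down as a tiny check. The sketch below uses only the facts stated in this post (3 starting lives, one extra life at 1500 points, no second extra life implemented); the names are hypothetical and not from the project source.

```csharp
/// <summary>
/// Illustrative check (hypothetical names) for whether a level was completed
/// without losing a life.
/// </summary>
public static class LifeRules
{
    public const int StartingLives = 3;
    public const int ExtraLifeScore = 1500;

    /// <summary>
    /// Lives a perfect run should hold at the end of a level: the starting 3,
    /// plus the single extra life once the score has reached 1500.
    /// </summary>
    public static int LivesForPerfectRun(int scoreAtEndOfLevel) =>
        StartingLives + (scoreAtEndOfLevel >= ExtraLifeScore ? 1 : 0);

    /// <summary>True when the brain finished the level with every life it could have.</summary>
    public static bool IsTrained(int scoreAtEndOfLevel, int livesAtEndOfLevel) =>
        livesAtEndOfLevel >= LivesForPerfectRun(scoreAtEndOfLevel);
}
```

For level 2, which ends on 1980, `LifeRules.IsTrained(1980, 3)` is false and `LifeRules.IsTrained(1980, 4)` is true, which is why 3 lives is not good enough here.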

Previously I’ve just let it play and checked when it reached the level with the required lives. But whilst writing this, I realised there is something we can do to speed things up: add a check box, and corresponding code, that ends the game as soon as a life is lost, since we only want AIs that can complete the level perfectly.

/// <summary>
/// Player lost a life, decrement lives.
/// </summary>
private void DecrementLives()
{
  --gameState.Lives;

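  // The game state controller will display the lives remaining.

  // When training for a perfect run, losing any life ends the game immediately,
  // so only brains that keep every life can complete the level.
  if (endGameIfSingleLifeLost)
  {
    gameState.GameOver = true;
  }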
}

Now we have 4 lives and are ready for training level 3.

We update the play config and set it up for the next level. We’re going to try with the template from level 2. It may or may not be a good template given level 3 has faster bullets.

This one is taking a bit longer. 100 epochs is not a lot. You can see it completed the level within 9 epochs; alas, it lost lives in the process.

After leaving it whilst I did some gardening, it had completed level 3 with all its lives.

And the next (level 4), with all lives intact.

That’s level 6 done.

Level 7 is important as it provides for levels 7, 8 and 9. This is the forever level. Once you beat level 7, you are home and dry.

Level 10 is like level 2 with faster shooting aliens.

Level 11 is the last one to beat. It’s level 3 with faster shooting aliens.

With that done, you configure them in the “AI Play Config”, turn off “1 level only”, set the start level to “1”, and start score to 0.

Click the AI Play button.

Your computer will now spend the rest of eternity running thru level after level.

Please note: the actual trained AI files differ slightly from above, because I chose to (after training) prefix filenames with the input/output types, and had to correct the reload frequency for BCD (required retraining of some). But the principle is the same. The source code has the saved templates to win.

The next page summarises and mentions a few of the challenges.
