Anyone for a game of Space Invaders? – Page 4 – Educational thought experiments in AI/ML written up with fun in mind along with source-code in GitHub

Choosing the neural network configuration

This is where my latest creation deviates from my previous experiments.

Up until now, they all shared the same basic neural network code base. At the start of the experiments, the network layer sizes were set rigid, with the same definition between generations. Evolution came about from mutating weightings and biases. And in case you hadn’t noticed, I pretty much always used TanH as an activation function… They also never had self-connections.

You can still initialise the network in that manner, but the preferred way is to decide how many neurons you want, which types of neuron are allowed and so on, and then let it create random networks.

The first choice is “Perceptron” or “Random”.

If you pick “Random”, you choose the permitted range for the number of neurons and how many to start with. There are configuration options relating to the neurons’ behaviour, but we’ll cover them in due course. A range is needed because mutation is allowed to add and remove cells.

If instead you pick “Perceptron”, then you provide a comma-delimited list of hidden neurons in each layer. “0” is a special case that substitutes the number of inputs.
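As a minimal sketch of how such a list might be interpreted (written in Python for brevity; the function name `parse_hidden_layers` is hypothetical and not taken from the project's source):

```python
from typing import List


def parse_hidden_layers(spec: str, input_count: int) -> List[int]:
    """Turn a comma-delimited spec such as "0, 20, 10" into hidden layer sizes.

    A "0" entry is replaced by the number of inputs, as described above.
    """
    sizes = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas or spaces
        size = int(part)
        if size < 0:
            raise ValueError(f"layer size cannot be negative: {size}")
        sizes.append(input_count if size == 0 else size)
    return sizes


print(parse_hidden_layers("0, 20, 10", input_count=55))  # prints [55, 20, 10]
```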

Training is performed by AIs competing to get the best overall fitness score (which is different to the game score). You could theoretically run 500 games at once, but the time each “round” of games takes will increase proportionately. I typically pick 50 games if using the video screen as an input, or 100 if not. The reason for 50 is that the screen input slows each AI down, due to the large number of inputs and connections required.

Next, you get to decide the mutation parameters. The “first time at” value is the number of moves each AI is allowed in the first generation; the number of moves then increases by 2% with each generation. It can be used in one of two ways: start low or start high. If you start low, you get to reinforce the behaviour you want early (discarding non-conformant AIs). But the games end prematurely (terminated due to running out of moves, not because of lives lost), so you run a risk of discarding a potentially good AI that is slow but consistent.

Personally, I pick 200 or so for new configurations and see how it goes.
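To get a feel for how slowly a 2% increase builds up, here is a rough sketch. It assumes the increase compounds from one generation to the next, which the description above suggests but does not spell out; the function names are made up for illustration.

```python
def move_budget(first_time_at: int, generation: int, growth: float = 0.02) -> int:
    """Moves allowed in a generation (0 = first), assuming compound growth."""
    return round(first_time_at * (1.0 + growth) ** generation)


def generations_to_reach(first_time_at: int, target_moves: int,
                         growth: float = 0.02) -> int:
    """Number of generations before the budget first reaches target_moves."""
    if first_time_at <= 0 or growth <= 0:
        raise ValueError("first_time_at and growth must be positive")
    generation = 0
    while move_budget(first_time_at, generation, growth) < target_moves:
        generation += 1
    return generation


# Show how a starting budget of 200 moves grows over time.
for gen in (0, 10, 50, 100):
    print(gen, move_budget(200, gen))
```

Under that assumption, a budget takes roughly 35 generations to double, which is worth bearing in mind when deciding how low to start.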

If you pick a template and forget to set “first time at” large enough, it certainly won’t start with the score it was saved under (because it is killed when it runs out of moves).

You next need to decide the selection type. As the UI says, this determines how it chooses which parent brains to “breed”.

The process can be viewed as:

play 100 games
at the end of the game rank them based on configurable fitness scoring parameters
sort the highest scoring to the top of the list of “brains”, the best are known as elites
for the “% preserved” (aka elite), we put them through to the next round
for a number of brains (“% random”) we create completely new random brains for the next round
if “% templated” is non-zero, we create that many brains from the supplied template.
that leaves us with a number out of the 100, that have yet to be considered
for those, we make “offspring” by picking 2 parents and performing a cross-over (swap genes)
lastly for those “offspring”, we sometimes mutate them
we put the “offspring” into the next round
repeat, using the latest cohorts

Important to note:

Elites do not get mutated and are preserved in that round; they can and should lose out to better-performing AIs in due course.
Offspring are based on 2 elites, but only sometimes with a random mutation. It could be that merging two good neural networks provides a better one. It could also provide a worse one. By sometimes applying a mutation, you get the best of both worlds.
It always starts each round with the same number of AI brains (the number you asked for).
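The process and notes above can be sketched as a single function. This is an illustration rather than the project's code: parents are picked uniformly from the elites as a stand-in for the configurable selection type, and every name (`next_generation`, `mutation_chance`, the callbacks) is hypothetical.

```python
import random
from typing import Callable, List, Optional, TypeVar

Brain = TypeVar("Brain")


def next_generation(
    ranked: List[Brain],                      # already sorted, best fitness first
    pct_preserved: float,
    pct_random: float,
    pct_templated: float,
    mutation_chance: float,
    new_random: Callable[[], Brain],
    from_template: Callable[[], Brain],
    crossover: Callable[[Brain, Brain], Brain],
    mutate: Callable[[Brain], Brain],
    rng: Optional[random.Random] = None,
) -> List[Brain]:
    rng = rng or random.Random()
    size = len(ranked)
    if size < 2:
        raise ValueError("need at least two brains to breed")

    # Keep at least two elites so that crossover always has two parents.
    n_elite = max(2, int(size * pct_preserved / 100))
    n_random = int(size * pct_random / 100)
    n_template = int(size * pct_templated / 100)

    elites = ranked[:n_elite]
    cohort = list(elites)                                    # elites pass through unmutated
    cohort += [new_random() for _ in range(n_random)]        # completely new brains
    cohort += [from_template() for _ in range(n_template)]   # brains from the template

    while len(cohort) < size:                 # the remainder become offspring
        mum, dad = rng.sample(elites, 2)
        child = crossover(mum, dad)
        if rng.random() < mutation_chance:    # only sometimes mutate
            child = mutate(child)
        cohort.append(child)

    return cohort[:size]                      # always the same number of brains
```

Here a brain can be anything the callbacks understand, for example a flat list of weights, with `crossover` splicing two lists and `mutate` nudging a few values.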

Activation functions

An activation function decides whether a neuron should be activated or not. In other words, it uses a simple mathematical operation to decide whether the neuron’s input to the network is important to the prediction. A byproduct of some functions is that they stop outputs from amplifying and hitting infinity; TanH, for example, squashes its output into the range -1..1.

I’ve provided the option to pick which activation functions it is allowed to use during creation and mutation:

Typically I will unselect most of them, and pick Gaussian, HardTanH, Absolute and Not.
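For reference, here is a small sketch of some of the functions named above, using their common textbook definitions (the project's own versions may differ, for instance in how the Gaussian is scaled). “Not” is left out because its definition isn't given here.

```python
import math


def tanh(x: float) -> float:
    return math.tanh(x)              # smooth squash into -1..1


def hard_tanh(x: float) -> float:
    return max(-1.0, min(1.0, x))    # linear in the middle, clamped at -1 and 1


def gaussian(x: float) -> float:
    return math.exp(-x * x)          # peaks at 1 when x is 0, falls towards 0


def absolute(x: float) -> float:
    return abs(x)                    # unbounded, so outputs can still grow


# Hypothetical lookup table, e.g. for picking a function by name during mutation.
ACTIVATIONS = {
    "TanH": tanh,
    "HardTanH": hard_tanh,
    "Gaussian": gaussian,
    "Absolute": absolute,
}

for name, fn in ACTIVATIONS.items():
    print(name, [round(fn(x), 3) for x in (-2.0, 0.0, 2.0)])
```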

I have no idea which ones work best. A fair question to ask is “Why?”. Simply because each time you run it, a new neural network is created, with random connections, biases and weightings. And that randomness is using a cryptographic generator… It’s hard to decide if the AI performs well because of the activation function or if the random biases/weightings are better.

The list of activation functions is based on those I found on the internet. If you know of one I should include, please add a comment and let me know.

Cell Types

We then get to decide which “cell types” it should create neurons as. Not content with simple perceptrons, I wanted to introduce some more specialist cell behaviours! The weight determines how prevalent they will be when it randomly creates the brains.
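One plausible reading of “weight” is a weighted random draw. A minimal sketch, where the cell type names and weights are invented for illustration and are not the actual options in the UI:

```python
import random
from typing import Dict


def pick_cell_type(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick a cell type with probability proportional to its weight."""
    names = [name for name, weight in weights.items() if weight > 0]
    if not names:
        raise ValueError("at least one cell type needs a positive weight")
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical configuration: perceptrons are four times as likely as AND cells.
cell_weights = {"perceptron": 8.0, "and": 2.0, "or": 0.0}
rng = random.Random()
print([pick_cell_type(cell_weights, rng) for _ in range(10)])
```

A weight of zero simply switches that cell type off.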

I learnt basic electronics around the same time as coding and can see why you would want the brain to handle logic.

For example: IF I sense a bullet that will hit in 10 frames AND I am not under a shield, THEN I need to move.
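That rule could be wired up from logic-style cells along these lines. This is only a sketch of the idea: the 0.5 threshold and the cell functions are assumptions, not how the project implements its cell types.

```python
def is_true(signal: float, threshold: float = 0.5) -> bool:
    """Treat any signal above the threshold as logical 'true' (an assumption)."""
    return signal > threshold


def and_cell(*signals: float) -> float:
    return 1.0 if all(is_true(s) for s in signals) else 0.0


def not_cell(signal: float) -> float:
    return 0.0 if is_true(signal) else 1.0


def should_move(bullet_hits_within_10_frames: float, under_shield: float) -> float:
    # IF a bullet will hit soon AND I am NOT under a shield THEN move.
    return and_cell(bullet_hits_within_10_frames, not_cell(under_shield))


print(should_move(1.0, 0.0))  # bullet incoming, no shield
print(should_move(1.0, 1.0))  # bullet incoming, but sheltered
```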

Does this make my AI better than others? That I cannot answer, yet. It’s possible greater minds have tried it and found good reasons why it doesn’t work, or that they use it too. My desire is to experiment and come up with originality in due course, not to spend years trying to keep up with others’ ideas…

Mutation

If you create a layered network, it may not be optimal. In fact, it’s possible no amount of bias/weighting tweaks will make that network ever do what you require. It might need extra neurons in certain places. Technically one could argue that you could simply make the network a lot deeper. You then use identity (passthrough) as an activation function. But you cannot do self-connections or have lower connections feeding higher up if it is a top-down / left-right network. Is that important? I would argue, absolutely. Real brain cells are not coupled in one direction (left to right layers) but to each other.

The interface allows the brain to grow additional cells, remove existing ones, change how cells work, grow new connections etc.
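To illustrate what structural mutation means in practice, here is a simplified sketch. The `Brain` class, its fields and the three mutation operations are hypothetical and far cruder than the real thing, but they show self-connections and growth within a min/max range of neurons.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Brain:
    neurons: Set[int] = field(default_factory=set)
    # (source, target) -> weight; source == target is a self-connection
    connections: Dict[Tuple[int, int], float] = field(default_factory=dict)


def add_neuron(brain: Brain, max_neurons: int) -> None:
    if len(brain.neurons) < max_neurons:
        brain.neurons.add(max(brain.neurons, default=-1) + 1)


def remove_neuron(brain: Brain, min_neurons: int, rng: random.Random) -> None:
    if len(brain.neurons) > min_neurons:
        victim = rng.choice(sorted(brain.neurons))
        brain.neurons.remove(victim)
        # Drop every connection that touched the removed neuron.
        brain.connections = {
            pair: w for pair, w in brain.connections.items() if victim not in pair
        }


def add_connection(brain: Brain, rng: random.Random) -> None:
    if brain.neurons:
        ids = sorted(brain.neurons)
        source, target = rng.choice(ids), rng.choice(ids)  # may be the same neuron
        brain.connections[(source, target)] = rng.uniform(-1.0, 1.0)


def mutate_structure(brain: Brain, min_neurons: int, max_neurons: int,
                     rng: random.Random) -> None:
    """Apply one randomly chosen structural change, staying within the range."""
    action = rng.choice(["grow", "shrink", "connect"])
    if action == "grow":
        add_neuron(brain, max_neurons)
    elif action == "shrink":
        remove_neuron(brain, min_neurons, rng)
    else:
        add_connection(brain, rng)
```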

Determining the “fitness” of an AI

You run 100 games, which AI was best? You might well shout out “score” as the answer. Of course, our goal is to maximise the score.

But the goal is actually broader and more complex. For example, chasing a saucer (max 300pts) at the expense of moving to the next level becomes a limiting factor. Each completed level is worth 990 points; if you do that an infinite number of times your score is infinite, and the number of saucers becomes a moot point.

You want to end levels with all lives intact. By doing so, you stand more chance of playing for an infinite time. Equally, there is no point in having all 3 lives left if the game ends because the invaders reached the ground.

Poor accuracy alone doesn’t necessarily cause failure, whereas better accuracy helps clear the level sooner.

An AI that clears all 55 invaders and kills 2 saucers is better than an AI that clears them but kills no saucers, but only if you’re constrained by time (or dying).

Therefore I have included multipliers for a lot of performance indicators, enabling you to decide how best to move to the optimal player.
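A sketch of what a multiplier-based fitness score might look like follows. The indicator names and the multiplier values are made up for illustration; the real list of indicators is longer and configurable.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class GameStats:
    score: int
    levels_completed: int
    lives_left: int
    shots_fired: int
    shots_hit: int
    saucers_killed: int


# Hypothetical multipliers: completing levels and keeping lives outweigh saucers.
MULTIPLIERS = {
    "score": 1.0,
    "level": 1000.0,
    "life": 200.0,
    "accuracy": 100.0,
    "saucer": 50.0,
}


def fitness(stats: GameStats, m: Dict[str, float] = MULTIPLIERS) -> float:
    accuracy = stats.shots_hit / stats.shots_fired if stats.shots_fired else 0.0
    return (
        m["score"] * stats.score
        + m["level"] * stats.levels_completed
        + m["life"] * stats.lives_left
        + m["accuracy"] * accuracy
        + m["saucer"] * stats.saucers_killed
    )


level_clearer = GameStats(score=990, levels_completed=1, lives_left=3,
                          shots_fired=110, shots_hit=55, saucers_killed=0)
saucer_hunter = GameStats(score=1200, levels_completed=0, lives_left=1,
                          shots_fired=200, shots_hit=40, saucers_killed=3)
print(fitness(level_clearer), fitness(saucer_hunter))
```

With these example weights the level clearer ranks higher despite the lower game score, which is the behaviour argued for above.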

Settings

The options for the more fundamental things, which tell the AI what is expected of it, are somewhat limited. You choose what it must use to play (its inputs), and how you want it to steer and fire. But there are a few other things worth mentioning.

AI has better computational speed than us carbon-based lifeforms. It doesn’t need shields, and they are more of an inconvenience. They reduce accuracy (blocking shots). Experiment by turning them off.

The “death score” is an optimization. If an AI player dies with a score of 900, and another AI player has a score of 20 and then loses a life, we can make an assumption that it isn’t going to reach 900 on the final life and kill it off early. This speeds up generations.
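In its simplest form, the check might look like the sketch below. It assumes the setting is a plain score threshold that is tested whenever a life is lost; the exact rule used may well be more subtle, and the function name is made up.

```python
def should_kill_early(score: int, just_lost_life: bool, death_score: int) -> bool:
    """Give up on a player that loses a life while still below the threshold."""
    return just_lost_life and score < death_score


print(should_kill_early(score=20, just_lost_life=True, death_score=900))   # True
print(should_kill_early(score=950, just_lost_life=True, death_score=900))  # False
```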

Another reason AI players can be killed off is if they blow up their neural network. Neurons have min/max restrictions; once a value exceeds them, any subsequent processing is likely to cause an error. To protect the environment, such players are terminated.
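A guard of that kind could be as simple as the following sketch. The limit of 1e6 and the names are illustrative assumptions, not the project's actual restrictions.

```python
import math
from typing import List, Optional


class BrainBlewUp(Exception):
    """Raised when a neuron value leaves the permitted range."""


def checked(value: float, limit: float = 1e6) -> float:
    # NaN fails every comparison, so it has to be tested for explicitly.
    if math.isnan(value) or abs(value) > limit:
        raise BrainBlewUp(f"neuron value out of range: {value}")
    return value


def safe_outputs(raw_outputs: List[float]) -> Optional[List[float]]:
    """Return the checked outputs, or None if the player must be terminated."""
    try:
        return [checked(v) for v in raw_outputs]
    except BrainBlewUp:
        return None


print(safe_outputs([0.1, -0.9]))          # passes through unchanged
print(safe_outputs([0.1, float("inf")]))  # None: this player gets terminated
```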

Starting level and score, and single level are interesting experiments.

You can easily train AI to complete ONE level without losing a life. As a result, you can train 10 AIs to complete individual levels and chain them together. It’s not cheating. The gotcha in this is to remember the game changes based on the score. The invaders fire more frequently as speed increases. That’s where the “starting score” comes in. If you train all 10 levels with a starting score of “0”, then it’s not going to fare well as it goes through the levels; in fact, it won’t get far. I got burnt by forgetting that!

“Single level” is useful because it stops the game from progressing to the next level. This enables it to end the level sooner, and therefore it learns sooner.

After training them, you can provide the respective start scores and templates to use. Don’t make the mistake I made. If you run the AI for level 6, and it gets midway through level 9 (yes, I did that despite the difficulty), the score you enter is the score at the start of level 9, not the score when it dies in level 9.

You’re probably thinking, how do I save a template? Press “B”, and it puts it in the /bin/Templates folder for the next run.

I cannot realistically paste the code for the AI here. If interested, go to GitHub and look in the /AI folder.

It’s probably worth me mentioning “quick mode” – press “Q” to turn it on/off.

In quick mode, it runs as a background worker without rendering Bitmaps, which makes it faster; instead, it plots progress graphs at the end of each epoch. I am quite pleased with how the telemetry turned out (and my simple graph component that plots 500 epochs, and different data points).