Neural Network Football
Neural Network Football is an experiment of developing a football-playing artificial intelligence by utilizing neural networks that are trained using reinforcement learning and genetic algorithms.
Original Neural Network Football was published in summer 2012.
The environment
The goal of this project is to see if it is possible to teach a neural network AI to play a highly simplified version of football in an entertaining fashion.
“Entertaining” in this case does not necessarily mean “human smart” way of playing football but any kind of game play that is fun to observe.
To keep things simple, the game of football was reduced to few key elements: two teams, two opposite goals, one ball. The team that drives the ball to opponents goals more often is the winner.
If team scores, ball is moved back to the center of the pitch and teams are returned to their starting positions. There are no further rules, e.g. all ball out of play situations are omitted.
The edges of the pitch are solid so that the ball or the players can not accidentally get out of the pitch.
Screenshot of the match in progress:
Structure of the neural network
Each football player is controlled by neural network AI.
The same neural network is copied to each of the team members, which means that any member of the team would act similarly on same inputs.
The neural network is given the following input vector:
- Player’s distance to ball along x- and y-axis
- Player’s distance to opponent goal along x- and y-axis
- Player’s distance to nearest team mate along x- and y-axis
- Player’s distance to nearest opponent along x- and y-axis
- Player’s distance to nearest field edge
The neural network consists of two 10-node hidden layers and connections between the nodes. Each connection has its own weight that is applied to the value of the node.
This means we have three matrices of weights with the following sizes: (number of inputs)x10, 10×10 and 10x(number of outputs).
Each node calculates a nonlinear weighted sum of its inputs.
The four outputs of the neural network control four parameters of the player:
- Player movement speed
- Direction of player movement
- Power of kick
- Kick direction
Movement speed and kick power are limited certain ranges. Player can kick ball to any direction while moving to any other direction.
For simplicity, player tries to kick the ball on every game step, i.e. there is no decission making whatever or not player should try to kick.
This also allows player to run with the ball by applying small kicks to the ball on each game step.
The final structure of the neural network:
Evolving the network
Initially there are 24 randomly generated neural networks – one for each team. Teams attend to a tournament in which all teams play twice against each other, a home game and an away game.
For each match the teams are awarded with tournament points and fitness points. For winning a match, team receives 3 tournament points, a tie is worth 1 point and losing a match does not give any tournament points.
Fitness points are based on the overall performance of the team and they accumulate during the tournament:
- Distribution of the team’s players gives fitness points. The more distributed the better (to a certain extent)
- Each players’ distance to the ball gives points. The closer the better (to a certain extent)
- The distance of the ball from the opponent goal gives points.
- Each successful kick of the ball gives points. Powerful kicks are more valuable
- Each goal gives points.
Once the tournament is complete, the teams are ranked based on their tournament points, and if the points are equal, by their fitness points. The best five team are kept for the next tournament.
Rest of the teams are given new neural networks that are generated from the Top-5 teams’ neural networks by randomly selecting two of the top networks and mixing the connection weights of those networks.
The resulted network is then mutated by multiplying the weights with 1-mean normal distribution random value and by adding 0-mean normal distribution random value to them (standard deviation varies based on how “severe” mutation we want to apply).
This next generation of teams then starts a fresh tournament with all of their points resetted.
Creating new neural network from two winning networks:
Increasing the performance
As the neural networks require lots of matrix operations, resolving even a single match takes approximately 10-20 seconds on a modern CPU (using single core).
Each tournament has 552 matches and tens of tournaments need to be played before any progress in the evolved AI can be noticed.
In order to speed up the match resolving, a distributed solution was created. The Host initiates a tournament by creating a pool of all matches in that particular tournament.
An Agent queries the Host for a match that needs to be resolved. The Host serializes the match parameters (team names, their neural networks etc.) and sends them to the Agent.
The Agent then resolves the match results (team score and fitness points) and records the match events (player and ball movement) and sends the results back to the Host.
The Host collects the data and once all of the matches have been resolved, declares the tournament complete and ranks the teams based on their performance. Then the Host initiates the next tournament and the cycle continues.
Since the Host and the Agent communicate over network, Agents can be run on multiple PCs – and multiple Agents can be run in parallel on PCs with a multi-core CPU.
This gives a significant boost on the tournament resolving performance and therefore speeds up the evolution as more generations of neural networks can be tried in the same timeframe.
Random thoughts
- The actual neural network football software is written in Python, and uses Numpy for math operations and Pygame for runtime visualization.
- Each match record can be visualized via this web page’s HTML5 + JavaScript player. Currently, only the latest 2 tournaments are displayed completely in order to keep the loading times at bay.
- The structure of the neural network is far from perfect. It would be interesting to change (or add) the inputs to something completely different (e.g. distance from center of own/opponent team).
Endless possibilities but unfortunately changes to neural network pretty much mean scrapping the current networks and restarting from “generation 0”. - Currently there are 5 to 10 agents resolving the matches. If there’s enough interest, I might investigate a way to share the agent software.
- Sometimes evolution can become stalled. Good example was tournaments from ~250 to 286 where certain playstyle was so dominant that there was practically no difference between the top-5 teams which meant that crossover produced similar networks
and mutation itself was not able to produce enough variation to overcome the dominating playstyle as its standard deviation was ~0.1. This was solved by making the standard deviation of the addition component to have a relation to the mean value of the weights of connections between two layers.
This ensures that the addition component has same scale (e.g. 1, 10, 100, 1000 etc.) as the weight it is being added to.
~315700 matches played.
Match with highest points: Tournament 449, match 0: AA 8 – BB 0 (8375530 – 280569)
Match with most goals: Tournament 449, match 93: EE 8 – BB 0 (8342491 – 287079)
Tournament 571 | Matches played: 552/552
Tournament 570 | Matches played: 552/552
Random matches from older generations that show how the AI has improved over time
Tournament 286, match 258: LL 6 – FF 0 (6395939 – 237654)
Tournament 227, match 368: QQ 4 – AA 0 (4422505 – 288856)
Tournament 200, match 42: BB 0 – UU 0 (588763 – 205841)
Tournament 100, match 1: AA 0 – CC 0 (99621 – 179896)
Tournament 50, match 80: DD 0 – MM 0 (2271456 – 220127)
Tournament 10, match 24: BB 0 – CC 0 (256687 – 236597)
Tournament 0, match 13: AA 0 – OO 0 (347255 – 341457)
Leave a Reply