Optimization of CNN Architecture using Genetic Algorithm for Image Classification

Sai Nivedh
Nov 11, 2020

To achieve the best accuracy, a CNN usually has to be tried in many configurations, varying the number of filters, the kernel size, the number of layers, and so on. To avoid this tedious trial-and-error method, this article uses a Genetic Algorithm to obtain an optimum CNN architecture. The project is implemented using TensorFlow and Keras on the Face Database from AT&T Laboratories, Cambridge University [2].

1. INTRODUCTION

Feed-forward neural networks (ANNs) can be used to solve many kinds of regression and classification problems, but they fall short in computer vision: the number of parameters to optimize in fully connected layers is very high, and ANNs struggle to identify objects in a given image. For these reasons, ANNs are not recommended for identifying objects in images. Convolutional Neural Networks (CNNs), by contrast, have achieved remarkable success in image classification in recent years. Automating the design of CNNs helps users with limited domain knowledge fine-tune the architecture to reach the desired performance and accuracy. Evolutionary methods such as Genetic Algorithms help simplify and automate the design of CNN architectures and also improve their performance [1].

Section 2 presents the working of the Genetic Algorithm in detail. Section 3 describes the methodology of the proposed genetic CNN, and Section 4 wraps up with the results and conclusions.

2. GENETIC ALGORITHM

A Genetic Algorithm (GA) is a heuristic search algorithm inspired by biological evolution, in which the fittest chromosomes are crossed over to generate offspring. Genetic Algorithms work by applying random changes to current solutions in order to create new ones. To select the best parameters, a fitness function is used and the solutions with the highest fitness values are chosen. The figure below explains the steps involved in implementing a Genetic Algorithm.

Image by Author

INITIAL POPULATION

Image by Author

The initial population is a set of randomly generated chromosomes. Given the allowed values for the number of filters in each layer and for the kernel size of each convolution, the NumPy random generator produces the initial population.
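As a rough illustration, a population like this could be generated as follows. This is only a sketch: the chromosome layout (filter counts first, then kernel sizes), the function name `init_population`, and the upper bound of 100 filters are assumptions made for the example. Only the maximum kernel size of 20 comes from the results section of this article.

```python
import numpy as np

def init_population(pop_size, n_layers=3, max_filters=100, max_kernel=20, seed=None):
    """Return an integer array of shape (pop_size, 2 * n_layers).

    Columns [0, n_layers) hold filter counts and columns
    [n_layers, 2 * n_layers) hold kernel sizes (assumed layout).
    """
    rng = np.random.default_rng(seed)
    # Generator.integers excludes the upper bound, hence the + 1
    filters = rng.integers(1, max_filters + 1, size=(pop_size, n_layers))
    kernels = rng.integers(1, max_kernel + 1, size=(pop_size, n_layers))
    return np.hstack([filters, kernels])

population = init_population(pop_size=10, seed=0)
print(population.shape)  # (10, 6)
```

Each row is one candidate architecture: three filter counts followed by three kernel sizes.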

FITNESS FUNCTION

For each individual of the randomly generated initial population, the CNN is trained on the training data and its accuracy is calculated on the test dataset. Here, accuracy multiplied by 100 is used as the fitness function to be maximized. The parents in the next step are chosen based on these fitness values.
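The fitness step can be sketched as a loop over the population. Here `train_and_score` is a hypothetical callable standing in for "build the CNN described by this chromosome, train it, and return test accuracy between 0 and 1". The `dummy_score` function is a made-up stand-in so that the example runs without training anything.

```python
import numpy as np

def evaluate_population(population, train_and_score):
    """Fitness of each chromosome = accuracy * 100.

    train_and_score is a hypothetical callable that takes one chromosome
    and returns an accuracy in [0, 1].
    """
    return np.array([100.0 * train_and_score(ch) for ch in population])

def dummy_score(chromosome):
    # Stand-in only: pretends that smaller kernels give higher accuracy
    kernels = chromosome[3:]
    return 1.0 / (1.0 + kernels.mean() / 20.0)

population = np.array([[32, 64, 128, 3, 3, 3],
                       [16, 16, 16, 20, 20, 20]])
fitness = evaluate_population(population, dummy_score)
```

In a real run, most of the time is spent inside `train_and_score`, since every individual means training one CNN.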

PARENTS SELECTION

Image by Author

At each step, the individual with the highest fitness value in the population is selected. This process continues until the required number of parents is obtained for crossover and mutation in the next steps.
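Picking the best individual repeatedly is equivalent to sorting by fitness and keeping the top entries, which is what this minimal sketch does. The function name `select_parents` and the two-gene toy chromosomes are illustrative only.

```python
import numpy as np

def select_parents(population, fitness, n_parents):
    """Return the n_parents chromosomes with the highest fitness, best first."""
    order = np.argsort(fitness)[::-1]  # indices sorted by descending fitness
    return population[order[:n_parents]]

population = np.array([[8, 3], [16, 5], [32, 1], [64, 7]])
fitness = np.array([90.0, 97.5, 85.0, 95.0])
parents = select_parents(population, fitness, n_parents=2)
print(parents.tolist())  # [[16, 5], [64, 7]]
```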

CROSSOVER FUNCTION

Crossover Operation. Image by Author.

The top two chromosomes of the parent population are chosen to create a new individual by applying crossover. Here a fixed crossover point is used, although a randomly selected crossover point can also be used. It works by taking half of the values from parent I and the rest from parent II. The process is shown in the figure below, and the child population is presented in Table 3.

Image by Author
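A fixed-point crossover of this kind might look like the sketch below. The crossover point is the midpoint of the chromosome. Pairing parent k with parent k + 1 (wrapping around) to produce several children is an assumption of this example, not something stated above.

```python
import numpy as np

def crossover(parents, n_children):
    """Fixed single-point crossover at the midpoint of the chromosome."""
    n_parents, n_genes = parents.shape
    point = n_genes // 2
    children = np.empty((n_children, n_genes), dtype=parents.dtype)
    for k in range(n_children):
        p1 = parents[k % n_parents]        # parent I
        p2 = parents[(k + 1) % n_parents]  # parent II (assumed pairing rule)
        children[k, :point] = p1[:point]
        children[k, point:] = p2[point:]
    return children

parents = np.array([[32, 64, 128, 3, 1, 1],
                    [16, 8, 4, 5, 7, 9]])
print(crossover(parents, n_children=2).tolist())
# [[32, 64, 128, 5, 7, 9], [16, 8, 4, 3, 1, 1]]
```

If the chromosome stores filter counts first and kernel sizes second, a midpoint split means each child takes its filters from one parent and its kernels from the other.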

MUTATION FUNCTION

Image by Author

The mutation function adds variety to the population by introducing new values into the individuals. It works by selecting a random layer for each individual and adding or subtracting a randomly generated number from the current value. This helps the genetic algorithm try new parameters rather than continuing with the values of the initial population. The table above shows the mutated child population.
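One possible way to write such a mutation is sketched below. The step range (`max_step`) and the clipping to per-gene bounds `low` and `high` are assumptions added here so that mutated values stay valid; they are not taken from the original implementation.

```python
import numpy as np

def mutate(children, low, high, max_step=5, seed=None):
    """Perturb one randomly chosen gene per individual, clipped to [low, high]."""
    rng = np.random.default_rng(seed)
    mutated = children.copy()  # leave the input array untouched
    n_individuals, n_genes = mutated.shape
    genes = rng.integers(0, n_genes, size=n_individuals)             # which gene to change
    steps = rng.integers(-max_step, max_step + 1, size=n_individuals)  # how much to add
    mutated[np.arange(n_individuals), genes] += steps
    return np.clip(mutated, low, high)

low = np.array([1, 1, 1, 1, 1, 1])
high = np.array([100, 100, 100, 20, 20, 20])  # assumed bounds: filters, then kernels
children = np.array([[32, 64, 128, 5, 7, 9],
                     [16, 8, 4, 3, 1, 1]])
mutated = mutate(children, low, high, seed=0)
```

Note that the first child already has 128 filters in its third layer, which is above the assumed bound of 100, so clipping brings it back into range.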

GENERATION

The parent and child populations are combined to create a new population, the fitness of every individual in it is calculated, and the whole process is repeated until the maximum fitness value of the population stabilizes or converges.
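Putting the steps together, the whole loop can be sketched in a few lines. To keep the example fast and self-contained, `toy_fitness` replaces CNN training with a made-up score; the bounds, population size, number of parents, and pairing rule are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
LOW = np.array([1, 1, 1, 1, 1, 1])
HIGH = np.array([100, 100, 100, 20, 20, 20])  # assumed bounds: filters, then kernels

def toy_fitness(ch):
    # Placeholder for "train the CNN and return accuracy * 100"
    return 100.0 - np.abs(ch[3:] - 3).sum()

def next_generation(population, fitness, n_parents):
    n_genes = population.shape[1]
    # Selection: keep the fittest individuals as parents
    parents = population[np.argsort(fitness)[::-1][:n_parents]]
    # Crossover: first half from one parent, second half from the next
    n_children = len(population) - n_parents
    point = n_genes // 2
    children = np.empty((n_children, n_genes), dtype=population.dtype)
    for k in range(n_children):
        children[k, :point] = parents[k % n_parents, :point]
        children[k, point:] = parents[(k + 1) % n_parents, point:]
    # Mutation: nudge one random gene per child, then clip to valid values
    genes = rng.integers(0, n_genes, size=n_children)
    children[np.arange(n_children), genes] += rng.integers(-5, 6, size=n_children)
    children = np.clip(children, LOW, HIGH)
    # New population = parents + mutated children
    return np.vstack([parents, children])

population = rng.integers(LOW, HIGH + 1, size=(10, 6))
best_per_generation = []
for generation in range(10):
    fitness = np.array([toy_fitness(ch) for ch in population])
    best_per_generation.append(fitness.max())
    population = next_generation(population, fitness, n_parents=4)
```

Because the parents are carried over unchanged and `toy_fitness` is deterministic, the best fitness never decreases here. With real CNN training the fitness of the same chromosome can vary between runs, so the curve may not be monotonic.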

Awesome! Be proud: you now know one of the most powerful yet easiest search algorithms.

3. METHODOLOGY

Here we discuss the complete set of steps involved in optimizing the CNN architecture using the Genetic Algorithm.

DATA SET

Face Dataset (40 people, 10 images each). Image by Author

The dataset used for training and testing is the Face Database from AT&T Laboratories, Cambridge University [2]. Since I had no access to high-performance GPUs, a small dataset was chosen, but the procedure remains the same for any more complex dataset. The face dataset contains 10 images each of 40 different people, with varying facial expressions and details (glasses, smiling, eyes closed, etc.). The figure shows random faces with class labels from the dataset.

CNN CONFIGURATION

Sample CNN Architecture to be optimized. Image by Author

The CNN to be optimized consists of 3 convolution layers with max pooling layers in between. The output of the last convolution layer is connected to a fully connected neural network, which gives predicted probabilities over the 40 classes. The figure shows a sample CNN architecture (? denotes the batch dimension, which is set when the model is trained).
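A Keras model of this shape could be built from a chromosome roughly as follows. This is a sketch under several assumptions: the input shape of 112 x 92 grayscale pixels, `"same"` padding, ReLU activations, a single softmax output layer, and the optimizer and loss are choices made for the example, and `build_cnn` is a hypothetical helper name.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn(filters, kernels, input_shape=(112, 92, 1), num_classes=40):
    """Build a small CNN: conv layers with max pooling in between, then a softmax."""
    model = tf.keras.Sequential()
    model.add(layers.Input(shape=input_shape))
    last = len(filters) - 1
    for i, (n_filters, k) in enumerate(zip(filters, kernels)):
        # "same" padding keeps large kernels (up to 20) valid on small feature maps
        model.add(layers.Conv2D(int(n_filters), int(k), padding="same", activation="relu"))
        if i < last:
            model.add(layers.MaxPooling2D(pool_size=2))  # pooling between conv layers
    model.add(layers.Flatten())
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn(filters=[8, 16, 32], kernels=[3, 1, 1])
```

Calling `model.summary()` prints the layer output shapes, with the batch dimension left unspecified.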

VGG16 CONFIGURATION

Complex VGG16 Architecture. Image by Author

The genetic CNN architecture is compared with a more complex VGG16 network pretrained on ImageNet. The convolution layer weights are kept frozen and only the fully connected layers are trained on the face dataset. The figure shows the layers and the number of parameters involved in the VGG16 architecture.
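A frozen-base setup like this can be sketched with `tf.keras.applications.VGG16`. The size of the fully connected head (one hidden layer of 256 units), the input shape, and the helper name `build_vgg16_classifier` are assumptions for illustration. Note that the ImageNet weights expect 3-channel input, so grayscale faces would need to be repeated across three channels.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_vgg16_classifier(input_shape=(112, 92, 3), num_classes=40, weights="imagenet"):
    """VGG16 convolution base with frozen weights plus a small trainable head."""
    base = tf.keras.applications.VGG16(include_top=False,
                                       weights=weights,
                                       input_shape=input_shape)
    base.trainable = False  # keep all convolution layer weights constant
    model = tf.keras.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),  # assumed head size
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_vgg16_classifier()` downloads the pretrained weights on first use; passing `weights=None` builds the same structure with random weights, which is handy for checking shapes offline.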

TRAIN AND TEST CNN

The dataset is split into training and testing datasets. First, both CNNs are trained on the training dataset for a maximum of 20 epochs with a batch size of 32. At the end of the generations, the individual with the maximum fitness in the population is selected and evaluated on the test dataset to obtain the final accuracy.
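The split ratio is not stated above, so the following is only one plausible way to do it: hold out a fixed number of images per person so that every class appears in both sets. The value of 2 test images per class and the name `stratified_split` are assumptions.

```python
import numpy as np

def stratified_split(labels, n_test_per_class=2, seed=None):
    """Return (train_idx, test_idx) with n_test_per_class test samples per label."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))  # shuffle this class
        test_idx.extend(idx[:n_test_per_class])
        train_idx.extend(idx[n_test_per_class:])
    return np.array(train_idx), np.array(test_idx)

labels = np.repeat(np.arange(40), 10)  # 40 people, 10 images each
train_idx, test_idx = stratified_split(labels, seed=0)
print(len(train_idx), len(test_idx))  # 320 80
```

The index arrays can then be used to slice the image and label arrays before calling `model.fit(..., epochs=20, batch_size=32)` with the settings given above.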

4. RESULTS AND DISCUSSION

The Genetic Algorithm is run with the following parameters.

Image by Author.

Due to the randomness involved in initializing TensorFlow tensors during training, the maximum fitness (accuracy on test data) kept oscillating around 97.5%, as shown in the figure.

Image by Author

After 10 generations the best architecture obtained is:

Image by Author.

Note that even though kernel sizes of up to 20 were allowed, the best kernels obtained were of size 3, 1, 1. Small kernels like these are common in well-known CNN architectures such as VGG16 and ResNet. The plot below shows the percentage classification accuracy of the best genetic CNN architecture for each face label.

Image by Author

For comparison, the more complex VGG16 model achieves an accuracy of 100% on the test data.

In conclusion, this article demonstrates the application of a Genetic Algorithm to finding the best CNN architecture in the field of computer vision. The approach helps people who are not familiar with the complex architectures available to obtain a model without losing much accuracy, and it saves time compared to the regular trial-and-error method. Genetic Algorithms can be applied in many other fields where a search over parameters is required to maximize a fitness function.

Please feel free to ask any questions regarding the implementation. See you soon. Till next time, enjoy Machine Learning!

Github Repo: Genetic-CNN

5. REFERENCES

  1. Yanan Sun, Bing Xue, Mengjie Zhang, “Automatically Designing CNN Architectures Using Genetic Algorithm for Image Classification”. IEEE Transactions on Cybernetics, 2020.
  2. Face Database, AT&T Laboratories, Cambridge University: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
