Face mask image classification using CNN in Google Colab

We will see how to classify images of faces with or without a mask using a Convolutional Neural Network (CNN) in Google Colab.

illustration of the content

Pre-requisites: Image dataset(download)

We will be using Google Colab, a platform for working with machine learning. Colab provides storage and enhanced hardware resources for computationally intensive work. You need not install anything on your system, as it runs in the browser. All you need is a Google account.

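Convolutional Neural Networks usually give better accuracy than plain fully connected networks on image data, because their convolution layers pick up local patterns such as edges and textures.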

Importing dataset into Google Colab

Make sure you have the dataset downloaded. Go to Colab and click on the folder icon in the left pane. Now you can either upload the dataset to the Colab session or to your Google Drive. Files uploaded to the Colab session disappear once the session expires. If you want the dataset to be persistent, you can upload it to Google Drive and then import it into Colab.

I’ll be uploading the dataset to Drive and then importing it into Colab.
- Click the folder icon in the leftmost pane
- Click the Drive icon in the Files section to mount your Drive

google drive mounted in files section

Now let's view the contents of the dataset. First, let's import the required Python packages. You need not install anything; you can just go ahead and import any of the Python packages used here.

import numpy as np
import cv2
from imutils import paths
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from sklearn.metrics import classification_report

Paste the above lines into a cell and click the play button on the cell, or press Ctrl+Enter, to run it. Running a cell executes its contents. You can split the code into different cells and run each piece separately, which I find a very clean way to analyse data.

Go to the dataset in the Files section and you will find two folders named with_mask and without_mask. Right-click on each, select Copy path, and paste the paths into a Colab cell.

withMaskPath = "/content/drive/MyDrive/dataset/with_mask"
withoutMaskPath = "/content/drive/MyDrive/dataset/without_mask"

This is the path I got. This path is required to read the contents of those folders.

withMaskImagePaths = list(paths.list_images(withMaskPath))
withoutMaskImagePaths = list(paths.list_images(withoutMaskPath))

Using paths from the imutils package, we create a list of all the image paths. withMaskImagePaths and withoutMaskImagePaths each hold a list of image paths.
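For readers curious about what such a listing step involves, here is a rough standard-library sketch of the same idea: walk a folder recursively and keep the files whose extension looks like an image. The function name list_image_files and the extension set are assumptions made for illustration; this is not how imutils itself is implemented.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def list_image_files(folder):
    """Return the sorted paths of image files found under folder (recursively)."""
    return sorted(
        str(p) for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Calling list_image_files on the with_mask folder path should give a list comparable to the one returned by paths.list_images, though the ordering may differ.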

printing 10 items from withMaskImagePaths list

Let's read the images and create a list of image array values. We need to have inputs and outputs. Here the input to the neural network is an image with or without a mask. The output is an integer that tells us whether the image shows a mask or not.

Dataset preprocessing

print('length of with mask data', len(withMaskImagePaths))
print('length of without mask data', len(withoutMaskImagePaths))
combinedDataset = withMaskImagePaths + withoutMaskImagePaths
print('length of combined dataset:', len(combinedDataset))
imageSize = 50
x = []  # resized pixel arrays, in the same order as combinedDataset
for imagePath in combinedDataset:
    print('imagePath:', imagePath)
    imageArray = cv2.imread(imagePath)  # pixel array in BGR order
    newImageArray = cv2.resize(imageArray, (imageSize, imageSize))
    x.append(newImageArray)

The above code reads each image and stores its pixel values in x. The original images have a higher resolution; to reduce the computational cost and make all the data uniform, we resize each image to 50x50 pixels. To read and manipulate the images we use OpenCV, a computer vision library available for both C++ and Python.

Before and after image resizing

Choose a resize resolution at which the image is still recognizable. Below 50 pixels the image loses its distinguishing detail.

y = np.array([1]*len(withMaskImagePaths) + [0]*len(withoutMaskImagePaths))  # 1 = with mask, 0 = without mask
x = np.array(x)
print('length of x:', len(x))
print('length of y:', len(y))
print('shape of x:', x.shape)
print('shape of y:', y.shape)


length of x: 3843 
length of y: 3843
shape of x: (3843, 50, 50, 3)
shape of y: (3843,)

If we inspect the output, we see that the length of x is 3843. x is the complete list of both with_mask and without_mask image data. x is the input and y is the output. We have created a list of integers representing the output: 1 represents with_mask and 0 represents without_mask. The shape of x is (3843, 50, 50, 3), which means it has 3843 items of 50x50 resolution, and each pixel holds a BGR value. Color images are usually stored as RGB, but OpenCV reads them as BGR.
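The channel order matters mostly when you display an image with matplotlib, which expects RGB. As a small illustration on a made-up 2x2 array (not an image from the dataset), reversing the last axis turns BGR into RGB; cv2.cvtColor(image, cv2.COLOR_BGR2RGB) does the same job through OpenCV.

```python
import numpy as np

# A tiny stand-in for one OpenCV image: 2x2 pixels, channels in B, G, R order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # fill the first channel, which is blue in BGR order

# Reversing the last axis swaps the B and R channels, giving R, G, B order.
rgb = bgr[..., ::-1]

print(bgr[0, 0].tolist())  # [255, 0, 0]
print(rgb[0, 0].tolist())  # [0, 0, 255]
```

The model itself does not care which order is used, as long as training and prediction use the same one.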

Here we are using color images, but you can also convert them to grayscale and use those. Working with grayscale images is less computationally intensive.
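If you want to try the grayscale route, the sketch below shows one possible way to convert a whole batch with NumPy, using the common luminance weights for the blue, green and red channels. The helper name bgr_to_gray and the dummy batch are assumptions for illustration; cv2.cvtColor with cv2.COLOR_BGR2GRAY is the per-image OpenCV alternative.

```python
import numpy as np

def bgr_to_gray(images):
    """Convert a batch of BGR images (N, H, W, 3) to grayscale (N, H, W, 1)."""
    # Luminance weights for the B, G, R channels, in that order.
    weights = np.array([0.114, 0.587, 0.299], dtype=np.float32)
    gray = images.astype(np.float32) @ weights  # weighted sum over channels, shape (N, H, W)
    return gray[..., np.newaxis]                # keep a channel axis for Conv2D layers

batch = np.full((4, 50, 50, 3), 200, dtype=np.uint8)  # dummy data, not the real dataset
gray_batch = bgr_to_gray(batch)
print(gray_batch.shape)  # (4, 50, 50, 1)
```

With grayscale input, the input shape of the first layer of the model would have to be (50, 50, 1) instead of (50, 50, 3).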

Splitting the dataset

Now that the data is processed and ready, let's prepare to train the model. We have 3843 items, so let's split the dataset into two parts. We use one part for training the model and the other for testing the accuracy of the trained model. For splitting the dataset we’ll use the train_test_split method from sklearn.

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)
print('x_train shape', x_train.shape)
print('y_train shape', y_train.shape)
print('x_test shape', x_test.shape)
print('y_test shape', y_test.shape)


x_train shape (2690, 50, 50, 3)
y_train shape (2690,)
x_test shape (1153, 50, 50, 3)
y_test shape (1153,)

The inputs (x) and outputs (y) are passed to the train_test_split method.
test_size: the fraction of the original dataset to set aside for testing (0.3 means 30%).
random_state: the seed for the shuffle done before splitting, so the same split is produced on every run.

🗒️If you look at the shapes of the train and test datasets, you’ll see the 70/30 division: 2690 items for training and 1153 for testing. x is the image pixel data and y is the integer (1 or 0) saying whether the image is with mask or without mask.
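To make the arithmetic behind those shapes concrete, here is a minimal NumPy sketch of a shuffled split. It is a simplified stand-in, not sklearn's implementation, and it will not reproduce the exact ordering train_test_split gives; the function name simple_split and the dummy arrays are assumptions for illustration.

```python
import numpy as np

def simple_split(x, y, test_size=0.3, seed=1):
    """Shuffle indices with a seeded generator, then slice off the test part."""
    n_samples = len(x)
    n_test = int(np.ceil(test_size * n_samples))  # 3843 * 0.3 rounds up to 1153
    order = np.random.default_rng(seed).permutation(n_samples)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]

# Dummy arrays with the same number of samples as the dataset in this article.
x_demo = np.arange(3843 * 2).reshape(3843, 2)
y_demo = np.arange(3843) % 2
x_tr, x_te, y_tr, y_te = simple_split(x_demo, y_demo)
print(x_tr.shape, x_te.shape)  # (2690, 2) (1153, 2)
```

Because the generator is seeded, calling simple_split again with the same seed returns the same split, which is the role random_state plays in train_test_split.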

Training the model

Let's build our model first and then train it with the training dataset we have.

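model = Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(50, 50, 3)),  # scale pixels to [0, 1]
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),  # assumed placement: one pooling layer after each convolution
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(2)  # output layer: one score per class
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'])  # assumed settings
model.fit(x_train, y_train, epochs=10)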

We are defining the shape of the input data to be (50, 50, 3). The output layer has only 2 nodes, as we have two types of output: with mask or without mask.

MaxPooling2D: downscales the input by keeping the maximum value in each small window
Flatten: converts the multi-dimensional feature maps into a one-dimensional array
activation: function that defines how the weighted sum of a node's inputs is transformed into its output
epochs: number of complete passes over the training dataset
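To see how pooling and flattening change the data, the short sketch below traces the feature-map shape through the network with plain arithmetic. It assumes one 2x2 max pooling layer after each of the three convolution layers, which is an assumption about the layer order rather than something stated above; trace_shapes is a hypothetical helper name.

```python
def trace_shapes(size=50, conv_filters=(16, 32, 64), pool=2):
    """Follow the feature-map shape through conv + pooling blocks, then flatten."""
    shapes = [(size, size, 3)]
    for filters in conv_filters:
        # padding='same' keeps height and width; only the channel count changes.
        shapes.append((size, size, filters))
        # 2x2 max pooling halves height and width, rounding down.
        size = size // pool
        shapes.append((size, size, filters))
    flattened = size * size * conv_filters[-1]
    return shapes, flattened

shapes, flattened = trace_shapes()
print(shapes[-1], flattened)  # (6, 6, 64) 2304
```

Under that assumption the 50x50 image shrinks to 25, 12 and then 6 pixels per side, and Flatten hands 2304 values to the dense layers.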


Now that the model is trained, let's use the test data to check its accuracy.

model.evaluate(x_test, y_test)

The evaluate function returns the loss of the trained model on the test data, followed by its accuracy.

Accuracy of the trained model
y_preds = model.predict(x_test)
y_pred_classes = [np.argmax(element) for element in y_preds]  # index of the highest score per image
print('classification report:', classification_report(y_test, y_pred_classes))

classification_report gives more detail about the quality of the model: precision, recall and F1 score for each class.
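As a rough guide to reading that report, the sketch below computes precision, recall and F1 score by hand for one class. The labels are made up for illustration and are not results from the model; precision_recall_f1 is a hypothetical helper, not part of sklearn.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one class from label arrays."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))  # correctly predicted positives
    fp = np.sum((y_pred == positive) & (y_true != positive))  # predicted positive, actually not
    fn = np.sum((y_pred != positive) & (y_true == positive))  # missed positives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return float(precision), float(recall), float(f1)

# Made-up labels for illustration (1 = with mask, 0 = without mask).
y_true_demo = [1, 1, 1, 0, 0, 0]
y_pred_demo = [1, 1, 0, 0, 0, 1]
print(precision_recall_f1(y_true_demo, y_pred_demo))
```

Here two of the three "with mask" predictions are right and two of the three real "with mask" images are found, so precision, recall and F1 all come out at two thirds.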

classification_report output
classes = ['withoutMask', 'withMask']  # index 0 = without mask, index 1 = with mask
imageArray = cv2.imread(withoutMaskImagePaths[1000])
newImageArray = cv2.resize(imageArray, (imageSize, imageSize))
image = np.array(newImageArray, dtype="float32")
image = np.expand_dims(image, axis=0)  # add a batch axis: (1, 50, 50, 3)
prediction = model.predict(image)
index = np.argmax(prediction[0], axis=0)
print('prediction:', classes[index])

We trained the model on data with shape (50, 50, 3), so for prediction we need to pass image data with the same dimensions. We read an image from the without mask list and resize it. We then convert the image data to a NumPy array and use np.expand_dims() to add a batch axis at the front, so the shape becomes (1, 50, 50, 3), because the model expects a batch of images. The output is an array with 2 values. Based on which index has the highest value, we display the prediction: with mask or without mask.

classes is a list containing [‘withoutMask’, ‘withMask’]. Based on which index value has highest value in the output list, the prediction is shown.
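The mechanics of that last step can be tried without a trained model. The sketch below uses a blank placeholder image and hypothetical output scores (both are assumptions for illustration) to show the batch axis and the argmax lookup; the softmax helper is only there to show how raw scores can be read as probabilities.

```python
import numpy as np

classes = ['withoutMask', 'withMask']  # index 0 and index 1, as in the article

image = np.zeros((50, 50, 3), dtype="float32")  # placeholder for a resized image
batch = np.expand_dims(image, axis=0)           # shape becomes (1, 50, 50, 3)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)           # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

fake_scores = np.array([-1.2, 2.3])             # hypothetical model output for one image
probabilities = softmax(fake_scores)
index = int(np.argmax(probabilities))
print(batch.shape, classes[index])  # (1, 50, 50, 3) withMask
```

Whether a model returns raw scores or probabilities depends on its final layer, but the index of the largest value is the same either way.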

Predicting from an image from without mask path list
imageArray = cv2.imread(withMaskImagePaths[900])
newImageArray = cv2.resize(imageArray, (imageSize, imageSize))
image = np.array(newImageArray, dtype="float32")
image= np.expand_dims(image, axis=0)
prediction = model.predict(image)
index = np.argmax(prediction[0],axis=0)
print('prediction:', classes[index])
with mask prediction

In the above code the image is read from the with mask path list. As you can see, the predictions are correct for both the with mask and the without mask image.

Try with different images. Most predictions should be correct, as we got an accuracy of 96%.

Exporting the model

We saw that the model we trained gives good accuracy. What if we have to use the same model in some other project? We can't train it every time, right? There is a way: we can export the trained model and use it later by just loading it back.


The model's save method, for example model.save('/content/face-mask-image-classfication.h5'), writes the model to a file with the .h5 extension. h5 is a common format for saving models. We can import it back again using

new_model = tf.keras.models.load_model('/content/face-mask-image-classfication.h5')

The above line loads the trained model from the h5 file. Just pass the path to the model. After loading the model, prediction works the same way as before.

prediction from a loaded model

We are not limited to predicting on a single image at a time. We can load live video from a webcam using OpenCV and make a prediction on each frame to detect whether a person is wearing a mask or not. Make sure you apply the same image preprocessing before feeding a frame into the model.

We are also not limited to two classes; we can classify many more categories in the same way.

🟡 You can view my Colab notebook here.

I hope this gave you a good insight into preparing image data and the steps involved in training and testing a deep learning model.

Thank yourself for learning something new

Stay Curious… Stay Creative…

Techie Explorer