A facial recognition system is a biometric technology that analyzes unique facial features to verify or identify a person. It works by detecting and measuring distinctive features of the face, such as the distance between the eyes, the shape of the nose, and the contour of the jawline.
Deep learning involves training artificial neural networks with vast amounts of data to perform complex tasks like image recognition, natural language processing, and speech recognition. Deep learning has been highly successful in various applications, including facial recognition.
Facial recognition using deep learning is accomplished by training a deep neural network to extract high-level features from facial images. These features are then used to identify or verify a person’s identity.
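To make the idea of using features to verify identity more concrete, here is a minimal sketch that compares two feature vectors with cosine similarity. The vectors, the helper names, and the 0.8 threshold are illustrative assumptions; in a real system the vectors would come from a trained network and the threshold would be tuned on validation data.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_same_person(features_a, features_b, threshold=0.8):
    """Treat two feature vectors as the same identity if they are similar enough.

    The 0.8 threshold is an arbitrary value chosen for illustration.
    """
    return cosine_similarity(features_a, features_b) >= threshold

# Made-up 4-dimensional feature vectors (real networks produce far longer ones)
enrolled = [0.9, 0.1, 0.4, 0.3]
probe_same = [0.85, 0.15, 0.42, 0.28]
probe_other = [0.1, 0.9, 0.2, 0.7]

print(is_same_person(enrolled, probe_same))   # similar vectors
print(is_same_person(enrolled, probe_other))  # dissimilar vectors
```

Verification asks whether two faces match, as above. Identification repeats the same comparison against every enrolled person and keeps the best match.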
In this blog, you will learn how a face recognition system can be developed using a modern machine-learning approach, namely deep learning.
Step 1: Collect & preprocess the dataset
The first step in building a facial recognition system is to collect and preprocess the dataset. The dataset should include sufficient images of people’s faces that cover a range of poses, expressions, and lighting conditions. It’s essential to preprocess the dataset to remove noise, normalize the images, and extract facial features such as eyes, nose, and mouth.
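As a rough illustration of the collection step, the sketch below gathers image paths and labels from a folder tree. It assumes a hypothetical layout with one subfolder per person (for example dataset/alice/001.jpg); the folder name and the list_labeled_images helper are made up for this example and are not part of any library.

```python
from pathlib import Path

def list_labeled_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted (image_path, person_name) pairs.

    Assumes one subfolder per person directly under `root`.
    Files with other extensions are skipped.
    """
    samples = []
    root = Path(root)
    for person_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(person_dir.iterdir()):
            if image_path.suffix.lower() in extensions:
                samples.append((image_path, person_dir.name))
    return samples

if __name__ == "__main__":
    dataset_root = Path("dataset")  # hypothetical folder name
    if dataset_root.is_dir():
        samples = list_labeled_images(dataset_root)
        print(f"Found {len(samples)} images")
```

Keeping the label in the folder name means no separate annotation file is needed while the dataset is still small.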
For preprocessing, we will use OpenCV, a popular computer vision library for processing image data. Here’s an example of how to preprocess an image using OpenCV.
import cv2
# Load the image
image = cv2.imread('image.jpg')
# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Detect faces in the image (cv2.data.haarcascades is the folder of cascade files bundled with opencv-python)
face_detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
# Loop over each detected face
for i, (x, y, w, h) in enumerate(faces):
    # Extract the face region from the image
    face = image[y:y+h, x:x+w]
    # Resize the face image to (100, 100) for model compatibility
    face = cv2.resize(face, (100, 100))
    # Save each preprocessed face to its own file so that multiple faces do not overwrite each other
    cv2.imwrite(f'preprocessed_image_{i}.jpg', face)
- We first load the image and convert it to grayscale.
- We then detect faces in the grayscale image using the Haar Cascade face detection algorithm.
- Finally, we loop over each detected face. For each face, we extract the face region from the original image, resize it to (100, 100), and save the preprocessed image.
The above code crops each detected face out of the image and resizes the crop to 100 by 100 pixels. These face crops are what the model later uses to learn distinguishing features such as the shape of the nose and the proportions of the face.
Step 2: Train the deep-learning model
The next phase in building a facial recognition system is to train the deep learning model using the preprocessed dataset. A common approach is to take a pre-trained deep neural network, such as a VGG or ResNet model, and fine-tune it on the dataset. To keep this example simple, the code below instead defines a small convolutional network from scratch using the Keras library. If you are not familiar with Keras, it is a high-level deep learning framework that allows you to easily build, train, and evaluate neural networks. Note that the network ends in a single sigmoid output, so it is a binary classifier: for example, it can decide whether a face belongs to one particular person or not.
# import keras library
import keras
# importing specific layers from Keras
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
# Define the model architecture
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
- import keras: imports the Keras library.
- from keras.layers import ...: imports specific layers from Keras, namely Conv2D, MaxPooling2D, Flatten, and Dense. These layers will be used to construct the neural network.
- from keras.models import Sequential: imports the Sequential model from Keras. Sequential is a linear stack of layers, meaning that each layer is connected to the previous one.
- model = Sequential(): creates an instance of the Sequential model.
- First Conv2D layer: the first argument, 32, specifies the number of filters in the layer. The second argument, (3, 3), specifies the size of the filters. The activation argument specifies the activation function to be used, in this case the rectified linear unit (ReLU). The input_shape argument specifies the shape of the input, which is a 100×100 image with 3 color channels (OpenCV loads them in BGR order).
- First MaxPooling2D layer: max pooling is a technique used to reduce the spatial size of the representation. The argument (2, 2) specifies the size of the pooling window.
- Second Conv2D layer: another convolutional layer, this time with 64 filters.
- Second MaxPooling2D layer: another max pooling layer with a 2×2 window.
- Third Conv2D layer: another convolutional layer with 128 filters, each with a size of 3×3. The activation function used is ReLU.
- Third MaxPooling2D layer: another max pooling layer with a pool size of 2×2.
- Flatten layer: flattens the output from the previous layer into a 1D array. This is necessary so that the output can be passed through a fully connected layer.
- First Dense layer: a fully connected layer with 128 neurons. The activation function used is ReLU.
- Second Dense layer: a fully connected layer with a single neuron. The activation function used is sigmoid. Sigmoid is commonly used as the activation function for binary classification problems because it outputs values between 0 and 1, which can be interpreted as probabilities.
- model.compile(...): compiles the model. The optimizer used is Adam, which is a popular optimizer for training deep neural networks. The loss function used is binary_crossentropy, which is a common loss function for binary classification problems. The metric reported during training and evaluation is accuracy.
Compiling only configures the model. To actually train it, call model.fit with the preprocessed face images and their labels.
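To see how the layer sizes fit together, the following sketch traces the shape of a 100×100 input through the three convolution and pooling stages. It assumes the Keras defaults used in the model code (no padding and stride 1 for Conv2D, stride equal to the pool size for MaxPooling2D); the helper names are made up for this illustration.

```python
def conv_output_size(size, kernel, stride=1):
    """Output size of a convolution with no padding ('valid')."""
    return (size - kernel) // stride + 1

def pool_output_size(size, pool):
    """Output size of max pooling with stride equal to the pool size and no padding."""
    return size // pool

size = 100  # input height and width
shapes = []
for filters in (32, 64, 128):
    size = conv_output_size(size, kernel=3)
    shapes.append(("conv", size, size, filters))
    size = pool_output_size(size, pool=2)
    shapes.append(("pool", size, size, filters))

# Flatten turns the last feature map into a single vector
flattened = size * size * 128
# Dense(128) has one weight per (input, neuron) pair plus one bias per neuron
dense_params = flattened * 128 + 128

for name, height, width, channels in shapes:
    print(f"{name}: {height} x {width} x {channels}")
print("flattened length:", flattened)
print("parameters in Dense(128):", dense_params)
```

Under these assumptions the feature map shrinks from 100×100 to 10×10, and most of the model's parameters sit in the first fully connected layer rather than in the convolutions.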
Step 3: Test the model
Once we have trained the model, we need to test its performance on a set of test images. In this step, we use the trained model to predict the identity of each test image and calculate the accuracy of the predictions.
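import cv2
import numpy as np
# Load the test images (placeholder file names; replace them with your own face crops)
test_paths = ['test_face_0.jpg', 'test_face_1.jpg']
test_images = [cv2.imread(path) for path in test_paths]
# True labels for the test images (placeholder values: 1 = target person, 0 = someone else)
true_labels = np.array([1, 0])
# Preprocess the test images: resize to the model's 100x100 input and stack them into one batch
preprocessed_test_images = np.array([cv2.resize(img, (100, 100)) for img in test_images], dtype='float32')
# Predict the probability that each test image shows the target person
probabilities = model.predict(preprocessed_test_images).ravel()
# Convert the probabilities to 0/1 predictions using a 0.5 threshold
predictions = (probabilities >= 0.5).astype(int)
# Calculate the accuracy of the predictions
accuracy = (predictions == true_labels).mean()
print(f"Accuracy: {accuracy}")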
In the above code, we first load and preprocess the test images. We then use the trained model to predict the identity of each test image & calculate the accuracy of the predictions.
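Because the model ends in a single sigmoid unit, model.predict returns probabilities between 0 and 1 rather than class labels, so the outputs need to be thresholded before they can be compared with the true labels. The sketch below shows that calculation in plain Python; the binary_accuracy helper, the sample probabilities, and the 0.5 default threshold are assumptions for illustration.

```python
def binary_accuracy(probabilities, true_labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the true 0/1 label."""
    if len(probabilities) != len(true_labels):
        raise ValueError("probabilities and true_labels must have the same length")
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for p, t in zip(predicted, true_labels) if p == t)
    return correct / len(true_labels)

# Made-up model outputs and ground truth for four test images
probabilities = [0.92, 0.15, 0.60, 0.40]
true_labels = [1, 0, 0, 1]

print(binary_accuracy(probabilities, true_labels))
print(binary_accuracy(probabilities, true_labels, threshold=0.7))
```

Raising the threshold makes the system stricter about declaring a match, which trades false accepts for false rejects.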
Step 4: Deploy the model
Once we have tested the model and are satisfied with its performance, we can deploy it in a real-world application. In this step, we integrate the model into a system that can capture live video or images, preprocess them, and use the trained model to recognize the faces in the images.
import cv2
import numpy as np
# Initialize the video capture device
cap = cv2.VideoCapture(0)
# Load the face detector (cv2.data.haarcascades is the folder of cascade files bundled with opencv-python)
face_detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
while True:
    # Capture a frame from the video stream
    ret, frame = cap.read()
    # Stop if no frame could be read from the camera
    if not ret:
        break
    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces in the frame
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # Loop over each detected face
    for (x, y, w, h) in faces:
        # Extract the face region from the frame
        face = frame[y:y+h, x:x+w]
        # Resize the face image to (100, 100) for model compatibility
        face = cv2.resize(face, (100, 100))
        # Preprocess the face image: add a batch dimension so the shape is (1, 100, 100, 3)
        preprocessed_face = np.expand_dims(face.astype('float32'), axis=0)
        # Use the trained model to predict the probability that this is the target person
        probability = float(model.predict(preprocessed_face)[0][0])
        # Turn the probability into a text label (the 0.5 threshold is an assumption)
        label = 'match' if probability >= 0.5 else 'unknown'
        # Draw a bounding box around the face and label it with the predicted identity
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
        cv2.putText(frame, f"{label} ({probability:.2f})", (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    # Display the frame
    cv2.imshow('Facial Recognition', frame)
    # Exit the program if the 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# Release the video capture device and close the window
cap.release()
cv2.destroyAllWindows()
In this code snippet, we initialize the video capture device and load the face detector. We then capture frames from the video stream, detect faces in the frames, and use the trained model to predict the identity of each face. Finally, we draw a bounding box around each face and label it with the predicted identity, and display the frame in a window.
I hope this step-by-step guide helps you build your facial recognition system using deep learning. Let us know if you have any further questions or need any clarification. Happy learning with Algoideas!!