Face Detection and Recognition using OpenCV


In this article, we will learn how to use OpenCV to implement a face detection and recognition system. Let’s start!!

What are face detection and face recognition?

Face recognition is a method of identifying or verifying a person from the face in an image or video frame. Humans recognize faces almost without effort, but the task is challenging for a computer. Low resolution, occlusion, lighting variations, and other factors all make it harder for a computer to detect and recognize faces accurately.

Let us understand the difference between face detection and face recognition.

Face detection is a computer vision process that uses artificial intelligence (AI) algorithms to find human faces in digital photographs. Face detection technology is used in a variety of fields, including security, biometrics, law enforcement, entertainment, and personal safety, to enable real-time surveillance and tracking of people.

Face Detection: Face detection is defined as the process of locating and extracting faces (in terms of location and size) from an image for use by a face recognition algorithm.

Face Recognition: Face recognition identifies the features of a face that are unique to a person, such as the shape of the eyes or mouth, and uses them to identify the person or to verify whether a certain person is present in the given image.

Working of Face detection and recognition algorithm

Face detection programs employ algorithms and machine learning to find human faces within larger images that frequently include non-face items such as landscapes, buildings, and other human body parts such as feet and hands.

Human eyes are one of the easiest characteristics to recognize, thus face detection algorithms usually start with locating the eyes on a face. After that, the algorithm might try to recognize brows, mouth, nose, nostrils, and iris. Once the algorithm has determined that a facial region has been discovered, it performs additional checks to ensure that it has indeed spotted a face.

The algorithms must be trained on massive data sets containing hundreds of thousands of positive and negative images to achieve accuracy. The training enhances the algorithms’ capacity to determine whether and where there are faces in an image.

Implementation using HAAR Cascade Algorithm

The HAAR cascade is a machine learning technique that involves training a cascade function using a large number of positive and negative images. Images that contain faces are positive images, whereas images without faces are negative images. In face detection, image features are numerical values extracted from an image that can distinguish one image from another.

Every feature is applied to all of the training images. At the outset, each image is assigned the same weight. For each feature, the algorithm determines the threshold value that best classifies the images as positive (face) or negative (non-face). Errors and misclassifications are unavoidable, so we choose the features with the lowest error rate, since these are the ones that best distinguish between face and non-face images.
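
The threshold selection described above can be sketched as a weighted decision stump, the kind of weak classifier used in AdaBoost-style training (the boosting method used by the original Viola-Jones detector). This is only an illustration under stated assumptions, not OpenCV's trainer: the function name `best_stump`, the polarity convention, and the two toy features are all hypothetical.

```python
def best_stump(values, labels, weights):
    """Find the threshold and polarity with the lowest weighted error.

    values  -- one feature's response on every training image
    labels  -- 1 for a face image, 0 for a non-face image
    weights -- per-image weights (all equal at the outset)
    """
    best_threshold, best_polarity, best_error = None, 1, float("inf")
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            error = 0.0
            for value, label, weight in zip(values, labels, weights):
                # Predict "face" when the value lies on the polarity side of the threshold.
                predicted = 1 if polarity * value < polarity * threshold else 0
                if predicted != label:
                    error += weight
            if error < best_error:
                best_threshold, best_polarity, best_error = threshold, polarity, error
    return best_threshold, best_polarity, best_error


# Toy data (assumed): responses of two hypothetical features on six images.
labels = [0, 0, 0, 1, 1, 1]
weights = [1 / len(labels)] * len(labels)   # same weight for every image
feature_values = {
    "feature_a": [1, 2, 3, 10, 11, 12],     # separates the two classes cleanly
    "feature_b": [5, 9, 2, 6, 1, 8],        # mostly noise
}

# Keep the feature whose best stump has the lowest weighted error.
stumps = {name: best_stump(v, labels, weights) for name, v in feature_values.items()}
best_name = min(stumps, key=lambda name: stumps[name][2])
print(best_name, stumps[best_name])
```

In a full boosting loop, the weights of misclassified images would then be increased before the next feature is chosen; that step is omitted here to keep the sketch short.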

To calculate the numerous characteristics, all feasible sizes and locations of the kernel are used.
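
To make "all feasible sizes and locations of the kernel" concrete, here is a small sketch of one Haar-like feature. It assumes the usual Viola-Jones setup, which the text above does not spell out: rectangle sums are read from an integral image (summed-area table), and the detection window is 24x24 pixels. The helper names and the toy image are hypothetical.

```python
def integral_image(image):
    """Return a summed-area table with an extra zero row and column."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (image[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the pixels in the w x h rectangle whose top-left corner is (x, y)."""
    return (table[y + h][x + w] - table[y][x + w]
            - table[y + h][x] + table[y][x])


def two_rect_feature(table, x, y, w, h):
    """Two-rectangle feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(table, x, y, half, h) - rect_sum(table, x + half, y, half, h)


def count_placements(window, base_w, base_h):
    """Count every size and position of a base_w x base_h kernel in a square window."""
    total = 0
    for w in range(base_w, window + 1, base_w):
        for h in range(base_h, window + 1, base_h):
            total += (window - w + 1) * (window - h + 1)
    return total


# A 4x4 toy image: bright left half, dark right half.
image = [[9, 9, 1, 1] for _ in range(4)]
table = integral_image(image)
print(two_rect_feature(table, 0, 0, 4, 4))   # large value: strong vertical edge
print(count_placements(24, 2, 1))            # placements of one 2x1 kernel shape
```

Even a single kernel shape yields tens of thousands of candidate features in one window, which is why training keeps only the few features with the lowest error.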

HAAR-Cascade Detection in OpenCV

OpenCV provides both the trainer and the detector for cascade classifiers. Using OpenCV, we can train a classifier for any object, such as vehicles, planes, and buildings. Working with a cascade classifier involves two primary stages: the first is training, and the second is detection.
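
At detection time, the "cascade" part means that a candidate window goes through a sequence of stages and is discarded at the first stage it fails, so most non-face windows are rejected cheaply. The following is a simplified sketch of that idea, not OpenCV's implementation: real stages sum weighted votes of many weak classifiers, while here each stage just sums a few hypothetical feature responses and compares the total with an assumed threshold.

```python
def passes_cascade(feature_values, stages):
    """Return (is_face, stages_evaluated) for one candidate window.

    feature_values -- dict mapping a feature name to its response on the window
    stages         -- list of (feature_names, stage_threshold) pairs, cheapest first
    """
    for index, (feature_names, stage_threshold) in enumerate(stages, start=1):
        score = sum(feature_values[name] for name in feature_names)
        if score < stage_threshold:
            return False, index      # rejected early: later stages never run
    return True, len(stages)


# Hypothetical stages and feature responses, for illustration only.
stages = [
    (["eyes"], 0.5),
    (["eyes", "nose"], 1.2),
    (["eyes", "nose", "mouth"], 2.0),
]
face_window = {"eyes": 0.9, "nose": 0.8, "mouth": 0.7}
background_window = {"eyes": 0.1, "nose": 0.9, "mouth": 0.9}

print(passes_cascade(face_window, stages))
print(passes_cascade(background_window, stages))
```

The background window is dropped after a single stage, while the face window has to pass all three.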

In OpenCV, the opencv_haartraining and opencv_traincascade applications are used for training cascade classifiers; opencv_haartraining is the older of the two and is now obsolete. The two applications save the trained classifier in different file formats.

A set of samples is required for training. There are two different kinds of samples:

Negative Samples

Negative samples are collected from random images that do not contain the object. The negative samples are listed in a text file, and each line of the file contains the filename of one negative sample image. This file must be produced manually. The listed images can be of different sizes.
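
Producing that file "manually" does not have to mean typing it by hand. Below is a minimal sketch that writes one image path per line. The function name `write_negative_list`, the accepted file extensions, and the output name `bg.txt` are assumptions for illustration; the demo creates empty placeholder files in a temporary folder instead of using real images.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions to include in the list.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_negative_list(image_dir, list_path):
    """Write one image path per line and return the number of entries."""
    image_dir = Path(image_dir)
    paths = sorted(p for p in image_dir.iterdir()
                   if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES)
    Path(list_path).write_text("".join(f"{p.as_posix()}\n" for p in paths))
    return len(paths)


# Demo with placeholder files; "notes.txt" is skipped because it is not an image.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("road.jpg", "tree.png", "notes.txt"):
        (Path(tmp) / name).touch()
    count = write_negative_list(tmp, Path(tmp) / "bg.txt")
    listed = (Path(tmp) / "bg.txt").read_text().splitlines()

print(count, [Path(line).name for line in listed])
```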

Positive samples

The opencv_createsamples utility generates positive samples, so they do not have to be created manually. These samples can be made from a single image containing the object or from an existing collection of images with the object. Because the opencv_createsamples utility only applies perspective transformations to the given input, we still need a large dataset of positive samples for a robust classifier.

Implementation

# Importing OpenCV
import cv2

# Importing matplotlib.pyplot
import matplotlib.pyplot as plt

# Importing numpy
import numpy as np

# Reading the image
img = cv2.imread(r"C:\Users\tushi\Downloads\PythonGeeks\tom cruise 2.jpg")
img =  cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_original = img.copy()
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

# Reading the cascade classifier for frontal face and for eyes
face_cascade = cv2.CascadeClassifier(r"C:\Users\tushi\Downloads\images\DATA\haarcascades\haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(r"C:\Users\tushi\Downloads\images\DATA\haarcascades\haarcascade_eye.xml")

# Detecting the faces and drawing the bounding boxes on separate copies of the image,
# so that both result images exist even when nothing is detected
img_face = img.copy()
img_eyes = img.copy()
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x,y,w,h) in faces:
    cv2.rectangle(img_face,(x,y),(x+w,y+h),(255,0,0),5)
    roi_gray = gray[y:y+h, x:x+w]
    # Detecting the eyes inside the face region; eye coordinates are relative to that region
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex,ey,ew,eh) in eyes:
        cv2.rectangle(img_eyes,(x+ex,y+ey),(x+ex+ew,y+ey+eh),(0,255,0),5)

        
# Setting the grid size        
plt.figure(figsize=(20,20))

# Displaying the original image
plt.subplot(131)
plt.title('Original')
plt.imshow(img_original)

# Displaying the image with face detection
plt.subplot(132)
plt.title('Face detection')
plt.imshow(img_face)

# Displaying the image with eye detection
plt.subplot(133)
plt.title('Eyes detection')
plt.imshow(img_eyes)

Implementation using dlib and face_recognition libraries

dlib: The dlib library is a modern C++ toolkit that includes machine learning techniques and tools for writing complicated C++ software that solves real-world problems. It contains the implementation of “deep metric learning” that is used to build the face embeddings for the recognition process.

face_recognition: The face recognition package wraps the facial recognition functionality of dlib, making it easier to use.

OpenCV: OpenCV library processes the images and prepares our dataset.

Implementation

1. Read the image and use the face_recognition library to detect whether a face is present in it.

2. Identify the face location and draw a bounding box. Once the face has been located in the image, we draw a bounding box around it. The coordinates of the box corners come from the face_recognition library’s face_locations function, which returns each face as a (top, right, bottom, left) tuple. The library finds faces automatically and works only on faces.

3. Convert the training image into an encoding. We store this encoding, then read a second image to validate the performance of the algorithm.

# Installing the libraries (run in a terminal, or prefix each command with "!" in a notebook)
# pip install cmake
# pip install dlib
# pip install face_recognition

# Importing the libraries
import dlib
import face_recognition
import cv2
import matplotlib.pyplot as plt

# Reading the image for face detection using the face_recognition library
img = face_recognition.load_image_file(r"C:\Users\tushi\Downloads\PythonGeeks\tom cruise.jpg")

# Displaying the original image
plt.imshow(img)

# Extracting the location of the face on the image using the face_locations function
face = face_recognition.face_locations(img)[0]
img_copy = img.copy()
# Creating a bounding box around the detected face
cv2.rectangle(img_copy, (face[3], face[0]),(face[1], face[2]), (0,255,0), 2)

# Displaying the result
plt.imshow(img_copy)

# Training using the image read and creating encodings
train_encodings = face_recognition.face_encodings(img)[0]
train_encodings
# Validating using the train encodings on another image read
test = face_recognition.load_image_file(r"C:\Users\tushi\Downloads\PythonGeeks\tom cruise 2.jpg")

# Creating encodings for the test image and comparing with encodings of the train image.
test_encode = face_recognition.face_encodings(test)[0]
print(face_recognition.compare_faces([train_encodings],test_encode))
# compare_faces returns one boolean per known encoding, so a match prints:
# [True]
# Displaying the read image
plt.imshow(test)

# Extracting the location of the face on the image using the face_locations function 
face = face_recognition.face_locations(test)[0]
img_copy = test.copy()
# Creating a bounding box around the detected face
cv2.rectangle(img_copy, (face[3], face[0]),(face[1], face[2]), (0,255,0), 5)
# Displaying the result
plt.imshow(img_copy)

Overview of face recognition

1. Face Detection: The initial objective is to detect faces in an image or a video stream. Once we have the exact coordinates and location of a face, we can extract it for processing.

2. Feature Extraction: Once we have separated the detected face from the rest of the image, we can extract specific features from it using face embeddings. A neural network takes an image of a person’s face as input and produces a vector that reflects the most essential elements of that face. In machine learning such a vector is known as an embedding, and hence we call this vector a face embedding. Through its training, the neural network learns to output similar vectors for faces that resemble one another.

3. Face Comparison: The face embeddings for each face in our data are saved in a file. To recognize a face in a new image that is not in our data, we first compute the face embedding of the new image and then compare it with the embeddings saved for each face in our data. If the new embedding is close enough to one of the saved embeddings, the face is recognized.
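
The comparison step can be sketched without any face library: measure the Euclidean distance between the new embedding and each saved embedding, and accept the closest one only if it is within a tolerance. This mirrors how the face_recognition package compares encodings (its default tolerance is 0.6), but the code below is an independent illustration: the 3-value vectors stand in for real 128-value encodings, and the names and the `identify` helper are hypothetical.

```python
import math


def face_distance(known_encodings, candidate):
    """Euclidean distance from the candidate to every known encoding."""
    return [math.dist(known, candidate) for known in known_encodings]


def identify(known_encodings, names, candidate, tolerance=0.6):
    """Return the name of the closest known face, or 'Unknown' if none is close enough."""
    if not known_encodings:
        return "Unknown"
    distances = face_distance(known_encodings, candidate)
    best = min(range(len(distances)), key=distances.__getitem__)
    return names[best] if distances[best] <= tolerance else "Unknown"


# Toy embeddings (assumed values, far shorter than real encodings).
known_encodings = [[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]]
names = ["Alice", "Bob"]

print(identify(known_encodings, names, [0.15, 0.2, 0.3]))   # close to the first face
print(identify(known_encodings, names, [5.0, 5.0, 5.0]))    # far from every known face
```

A smaller tolerance makes matching stricter (fewer false matches, more faces reported as unknown); a larger one does the opposite.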

Implementation

# Importing OpenCV
import cv2

# Importing numpy
import numpy as np

# Importing face_recognition
import face_recognition

# Importing os
import os

# Defining the path for training image
path = r'C:\Users\tushi\Downloads\PythonGeeks\Face recognition'

# Creating empty lists to append image encodings and class names
images = []
classNames = []

# Reading the training images and the classes and appending to lists
for img in os.listdir(path):
    image = cv2.imread(f'{path}/{img}')
    images.append(image)
    classNames.append(os.path.splitext(img)[0])

print(classNames)

# Scale factor for shrinking each frame before detection; boxes are scaled back by 1/scale
scale = 0.25
box_multiplier = 1/scale

# Finding the image encodings
def findEncodings(images):
    # Empty list to append the image encodings
    encodeList = []
    
    for img in images:
        # Converting images from BGR to RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        
        # Extracting the face encodings and appending to list
        encode = face_recognition.face_encodings(img)[0]
        print(encode)
        encodeList.append(encode)
    return encodeList

# Known encodings are the encodings for the training images
knownEncodes = findEncodings(images)

# Reading the video from the webcam
video = cv2.VideoCapture(0)
 
while True:
    ret, frame = video.read()

    # Stop if the webcam did not return a frame
    if not ret:
        break
    
    # Resizing the frame according to the scale we have defined
    Current_image = cv2.resize(frame,(0,0),None,scale,scale)
    Current_image = cv2.cvtColor(Current_image, cv2.COLOR_BGR2RGB)

    # Finding the face location and the encoding for the current frame
    face_locations = face_recognition.face_locations(Current_image, model='cnn')  
    face_encodes = face_recognition.face_encodings(Current_image,face_locations)

    # Finding matches for each face detection in the image
    for encodeFace,faceLocation in zip(face_encodes,face_locations):
        matches = face_recognition.compare_faces(knownEncodes,encodeFace, tolerance=0.6)
        faceDis = face_recognition.face_distance(knownEncodes,encodeFace)
        matchIndex = np.argmin(faceDis)

        # If the detected face matches, then display the corresponding class name
        if matches[matchIndex]:
            name = classNames[matchIndex].upper()

        else:
            name = 'Unknown'
            
        # Defining the coordinates for rectangular bounding box
        y1,x2,y2,x1 = faceLocation
        y1, x2, y2, x1 = int(y1*box_multiplier),int(x2*box_multiplier),int(y2*box_multiplier),int(x1*box_multiplier)

        # Drawing rectangle around the detected face
        cv2.rectangle(frame,(x1,y1),(x2,y2),(0,255,0),2)
        cv2.rectangle(frame,(x1,y2-20),(x2,y2),(0,255,0),cv2.FILLED)
        cv2.putText(frame,name,(x1+6,y2-6),cv2.FONT_HERSHEY_COMPLEX,0.5,(255,255,255),2)

    # Displaying the output    
    cv2.imshow('Webcam',frame)
    if cv2.waitKey(1) == ord('q'):
        break

video.release()
cv2.destroyAllWindows()
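
The bounding-box arithmetic in the loop above is easy to get wrong, because face locations arrive in (top, right, bottom, left) order on the downscaled frame, while the drawing calls expect (x, y) corners on the full-size frame. A small helper, shown here as a sketch with the hypothetical name `scale_box`, isolates that conversion:

```python
def scale_box(face_location, scale):
    """Map a (top, right, bottom, left) box from the resized frame to the original frame.

    Returns the two corners in (x, y) form: top-left, then bottom-right.
    """
    multiplier = 1 / scale
    top, right, bottom, left = (int(round(v * multiplier)) for v in face_location)
    return (left, top), (right, bottom)


# A face found on a frame shrunk to a quarter of its size.
top_left, bottom_right = scale_box((30, 100, 80, 50), 0.25)
print(top_left, bottom_right)
```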

In the output, we can see the class labels. Each class label is associated with the encoding of that class’s image. Encodings are created for each image in the training dataset and stored. When we read a video frame or any other test data, the image is first converted to an encoding, which is then compared with the encoding of each image in the dataset.

The output lists three class labels, which means that our training dataset contains images of three classes. Therefore, the picture of Robert Downey Jr. given as test input is identified and correctly labeled, but the input image of Orlando Bloom is marked as unknown because the model has no matching encoding.

['Elon Musk', 'Robert Downey', 'Tom Cruise']
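
The webcam script recomputes the training encodings on every run. Since the text describes the encodings as being stored, one possible way to persist them is a small JSON file, sketched below. The file name `encodings.json`, the record layout, and both helper functions are assumptions for illustration; the toy 3-value vectors stand in for real encodings.

```python
import json
import tempfile
from pathlib import Path


def save_encodings(path, names, encodings):
    """Store names with their encodings as JSON (values converted to plain floats)."""
    records = [{"name": name, "encoding": [float(x) for x in encoding]}
               for name, encoding in zip(names, encodings)]
    Path(path).write_text(json.dumps(records))


def load_encodings(path):
    """Return (names, encodings) in the order they were saved."""
    records = json.loads(Path(path).read_text())
    return [r["name"] for r in records], [r["encoding"] for r in records]


# Round-trip demo in a temporary folder.
names = ["Elon Musk", "Tom Cruise"]
encodings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]   # toy vectors, not real encodings
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "encodings.json"
    save_encodings(store, names, encodings)
    loaded_names, loaded_encodings = load_encodings(store)

print(loaded_names)
```

Encodings loaded this way are plain lists, so they may need converting back to NumPy arrays before being passed to a comparison function that expects arrays.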

Limitations of face recognition

1. Illumination: Lighting is extremely important in face recognition. Even a small change in lighting conditions can have a significant impact on the result, so the outcome for the same face may differ under low and high illumination.

2. Background: The face detection algorithm is also affected by the background behind the face. Because the factors that influence its performance change with the location, the result outdoors may not be the same as the result indoors.

3. Pose: The face recognition process is sensitive to variations in pose. Changes in the appearance of the face caused by head movement or different camera positions can lead to incorrect results.

4. Occlusion: Occlusion refers to parts of the face being hidden by a beard, a mustache, or accessories (goggles, caps, masks, etc.), which interferes with the facial recognition system’s estimate.

5. Expressions: The same person can have many different expressions, and a face recognition algorithm has to account for this. For the same person, a change in facial expression can produce a different outcome.

6. Low Resolution: Low-resolution images contain too little detail for the model to extract reliable features, so the recognizer should be trained on images of sufficiently high resolution.

7. Aging: As people become older, the shape, wrinkles, and texture of their faces change, which poses yet another problem.

Applications of Face Recognition

Face recognition is being used in a lot of applications to make the world safer, smarter, and more convenient. Face recognition algorithms find use in many real-life tasks ranging from day-to-day chores to military applications.

1. Finding a missing person or recognizing a person through a database

2. Unlocking phones, laptops, and other gadgets

3. Identifying accounts on social media

4. Attendance systems in schools and institutes

5. Recognizing drivers in cars

6. Smart door locks operated through facial recognition

Conclusion

In this article, we learned about face detection and face recognition with OpenCV. We covered the difference between the two and how they work, implemented face and eye detection with HAAR cascades, and implemented recognition with the dlib and face_recognition libraries. Finally, we looked at the limitations of face detection and recognition and at their applications.



PythonGeeks Team

