AI: Programming Face Detection with Your Webcam

--

Artificial Intelligence Journey

When I joined Tinubu Square, I was impressed by the number of rules and decisions handled automatically by the Credit Insurance platform, and I wondered whether some typical insurance use cases could be predicted by AI. Our Lab was already skilled in this area through its work, but I wanted to learn more, so I looked into both TensorFlow (Google AI) and Torch (Facebook AI). Inspiration also came from reading Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron. Clearly, it is possible to be a senior manager and an eternal beginner at the same time, still learning and marveling at the range of possibilities offered by a new technology.

A senior manager and an eternal beginner at the same time

Face Detection with your Webcam

Following is a playful example of what can be done with an AI classification algorithm to recognize faces with a webcam. The complete code is available on my GitLab; see the last section of this article for the download link.


Challenges

The first challenge: there are plenty of tutorials and code examples explaining how to perform face recognition and how to handle video streaming, but surprisingly, there was no end-to-end example. Moreover, they only work on a desktop, because they are written in Python. The idea here is to manage both the video stream and the face detection within a web browser, using a mix of Python and JavaScript. The second challenge is that the desktop's CPU and memory impose limitations: Python, as well as the AI dependencies (OpenCV, numpy…), would have to be installed on the desktop. To overcome this, I looked instead at a fully web-based development environment: Google Colaboratory notebooks (https://colab.research.google.com/notebooks/welcome.ipynb#recent=true), which execute code within a web browser and benefit from the power of the Google Cloud infrastructure (CPU, GPU or TPU can be selected).
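For readers unfamiliar with how Colab lets Python and JavaScript talk to each other, here is a minimal sketch of the bridge pattern used throughout this article (the function name echoBack is purely illustrative):

from IPython.display import display, Javascript
from google.colab.output import eval_js

# Register a JavaScript function in the browser page...
display(Javascript('''
  async function echoBack(msg) {
    return 'browser received: ' + msg;
  }
'''))

# ...then call it from Python and read back the value it resolves with
print(eval_js('echoBack("hello from Python")'))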

Project setup with Google Colab and Google Drive

Please see the last section of this article to download the full code and set it up for execution.

Step 1: Mount the Google Drive in order to store Videos and AI datasets

The first lines of code allow you to mount your Google Drive in order to store captured videos, images and config files:

from google.colab import drive
drive.mount('/content/drive')

The first time, you will be asked to log in to your Google account and confirm permissions. You should then be able to see your Google Drive folders directly from Google Colab.
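As a quick sanity check, you can list the top-level folders of the mounted Drive (a minimal sketch; the output depends on your own Drive contents):

import os
# The Drive root is mounted under /content/drive/My Drive
print(os.listdir('/content/drive/My Drive'))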

Step 2: Manage the Webcam and prepare video captures

This part is a mix of Python and JavaScript. It initializes some global variables (such as the root folders of captured videos…), then starts managing the webcam using navigator.mediaDevices.getUserMedia(). Finally, it plugs the stream into an HTML <div> and displays it on the screen.

from IPython.display import display, Javascript
from google.colab.output import eval_js

def start_webcam():
    js = Javascript('''
        async function startWebcam() {
            const div = document.createElement('div');
            const video = document.createElement('video');
            video.style.display = 'block';

            // Ask the browser for access to the webcam
            const stream = await navigator.mediaDevices.getUserMedia({video: true});

            document.body.appendChild(div);
            div.appendChild(video);
            video.srcObject = stream;
            await video.play();

            ...
        }
    ''')
    display(js)
    data = eval_js('startWebcam()')

start_webcam()

Your web browser should now display the webcam as follows:

Step 3: Capture a video that will be used for AI Training

The goal of this step is to record a video that will be used to train the AI. It is a kind of reference video that the AI will later compare with other videos in order to detect faces.

Basically, this JavaScript code starts recording the webcam video stream and converts it into binary data. A button is also displayed to stop the recording. Once the button is pushed, the binary stream is returned, saved to a .mp4 file (the payload itself is WebM, per the mimeType set below) and stored on your mounted Google Drive.

Here, the MediaRecorder object is used for the stream conversion.

...
var handleSuccess = function(stream) {
    videoStream = stream;
    var options = {
        mimeType: 'video/webm;codecs=vp9'
    };
    recorder = new MediaRecorder(stream, options);
    recorder.ondataavailable = function(e) {
        // Preview the recorded data in the page
        var url = URL.createObjectURL(e.data);
        var preview = document.createElement('video');
        preview.controls = true;
        preview.src = url;
        document.body.appendChild(preview);

        // Convert the recorded data to a base64 data URL for the Python side
        reader = new FileReader();
        reader.readAsDataURL(e.data);
        reader.onloadend = function() {
            base64data = reader.result;
        };
    };
    recorder.start();
};
...
navigator.mediaDevices.getUserMedia({video: true}).then(handleSuccess);

Now your screen should display the recorded video for the training phase.

The video will be stored in your Google Drive video dataset folder (the ‘video_file_train’ variable in the code).
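The Python side of this round trip is not shown in the snippet above. A minimal sketch of it, assuming the JavaScript entry point (called stopRecording here, a hypothetical name) resolves with the base64data data URL built by the FileReader:

import base64
from google.colab.output import eval_js

# stopRecording() is a hypothetical JS entry point resolving with the data URL
data_url = eval_js('stopRecording()')
# Strip the 'data:video/webm;base64,' prefix and decode the payload
header, encoded = data_url.split(',', 1)
with open(video_file_train, 'wb') as f:
    f.write(base64.b64decode(encoded))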

Step 4: Capture a video that will be used for the Face detection

The goal of this step is to record a video that will be used to test the AI. The test video will be used to compare the data stream with the training video in order to detect faces.

I suggest behaving differently in this recording to make for a starker comparison with the previous one (try wearing glasses, moving, smiling :)).

Technically, it uses the same code as Step 3. The video is recorded as a .mp4 stream and stored in your Google Drive video dataset folder (the ‘video_file_test’ variable in the code).

Step 5: Extract the first 30 images of the Training Video file and store them into the Dataset folder (as PNG images)

Now that the training and test videos are recorded and uploaded, we will focus again on the training video. The objective is to extract its first 30 frames, then store them as .png images in the Google Drive image datasets folder (the variable named ‘datasets’ in the code).

Each image will first be converted from color to grayscale. Then, faces will be pre-detected by an OpenCV algorithm, CascadeClassifier.detectMultiScale(), which detects objects and returns a list of rectangles. OpenCV will use a specific config file: haarcascade_frontalface_default.xml. A Haar cascade is basically a classifier used to detect particular objects in a source image; haarcascade_frontalface_default.xml is a Haar cascade XML file provided by OpenCV to detect frontal faces.
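In the downloadable code, the haar_file variable points to this XML file. If you need to obtain the file yourself, it ships with the opencv-python package (a minimal sketch, assuming that package is installed):

import cv2

# Path to the pre-trained frontal-face cascade bundled with opencv-python
haar_file = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'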

...

# Analyse the video training file
face_cascade = cv2.CascadeClassifier(haar_file)
webcam = cv2.VideoCapture(video_file_train)

# Capture the first 30 images of the train video
count = 1
while count <= 30:
    if not webcam.isOpened():
        print('Unable to load the video file.')
        sleep(5)
        continue
    (_, im) = webcam.read()
    gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 4)
    for (x, y, w, h) in faces:
        # Draw a rectangle around the detected face
        cv2.rectangle(im, (x, y), (x + w, y + h), (255, 0, 0), 2)
        # Crop the face, resize it and save it as a PNG in the dataset folder
        face = gray[y:y + h, x:x + w]
        face_resize = cv2.resize(face, (width, height))
        cv2.imwrite('%s/%s.png' % (path, count), face_resize)
    count += 1

Finally, the images are resized to a smaller format and stored in the Google Drive ‘Image Datasets/Offer’ folder (Offer is my first name; you can change it in the code).

Step 6: Train the AI with the image dataset then try to recognize faces

The image dataset root folder now contains one subfolder per person who can potentially be detected by the AI, named with that person's first name. This folder name will be used to display, in green, the name of the person next to the detected face rectangle. First, the image dataset folder (and its subfolders) will be scanned to build the list of stored images and their corresponding names; then OpenCV will be trained with this list of images and labels, as shown below.
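The scan itself is not reproduced in the article; a minimal sketch of what it could look like (the names dictionary, mapping label ids to first names, is an assumption of this sketch, reused later in the display step):

import os
import cv2
import numpy as np

images, labels, names = [], [], {}
for person_id, subdir in enumerate(os.listdir(datasets)):
    names[person_id] = subdir  # e.g. 'Offer'
    subject_path = os.path.join(datasets, subdir)
    for filename in os.listdir(subject_path):
        images.append(cv2.imread(os.path.join(subject_path, filename), 0))  # grayscale
        labels.append(person_id)
images, labels = np.array(images), np.array(labels)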

# Train the model from the images with OpenCV Face Recognizer
model = cv2.face.LBPHFaceRecognizer_create()
model.train(images, labels)

Finally, the OpenCV cascade classifier class (CascadeClassifier) is used to detect faces. The first 30 frames of the test video are used for the face detection: each of them is analyzed with the detectMultiScale() and predict() methods to recognize faces. The name of the detected person is displayed, in green, together with a rectangle and a prediction confidence indicator.

# Use OpenCV Cascade Classifier to detect faces 
face_cascade = cv2.CascadeClassifier(haar_file)
webcam = cv2.VideoCapture(video_file_test)


count = 1
while count <= 30:
    ...
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        ...
        # Try to recognize the face
        prediction = model.predict(face_resize)
        cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 3)
        ...
    count += 1
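The elided display logic could look like the following sketch; names is the id-to-first-name mapping assumed in the scan sketch above, and predict() returns a (label, confidence) pair:

# Display the person's name and the confidence score in green
label_text = '%s - %.0f' % (names[prediction[0]], prediction[1])
cv2.putText(im, label_text, (x - 10, y - 10),
            cv2.FONT_HERSHEY_PLAIN, 1, (0, 255, 0))

Note that for the LBPH recognizer, a lower confidence value means a closer match.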

You should see this kind of image displayed…

Or the following one. It also works if I wear glasses and have a scary face :)

Conclusion:

The CascadeClassifier is just one of many uses of classification algorithms. Classifiers can be applied in various domains to predict outcomes and make decisions automatically: finance, insurance, banking, etc.

Many thanks for reading this article! Any comments or suggestions for the article and the code are welcome.

The next section will describe how you can download and run it.

How to download the code and execute it:

You can download the entire code on my GitLab : https://gitlab.com/osadey/face-detection

Once downloaded as a zip, please proceed as follows:

  • Unzip the archive ‘face-detection-master.zip’
  • Copy the ‘face-detection-master’ root folder into your Google Drive
  • Rename it ‘Colab Notebooks’ in your Google Drive
  • If needed, rename the sub-folder called ‘Offer’ (my first name) and change it in the code. This is the folder that stores the .png images (first 30 frames) extracted from the training video
  • Right-click the ‘Face_Detection.ipynb’ file and choose ‘Open With > Google Colab’
  • Run each step separately to see the results
