Face mask detector using FaceNet — Live streaming

Make it live, have more fun!

In this last post of the series, we will walk you through putting everything we have built into a live-streaming version. After all this hard work, you can see yourself being recognized on your webcam!


The key new package here is imutils. Here are the essential packages for this section.

The basic idea is to use VideoStream and FPS to open a pointer to the live stream and start the FPS timer. As we loop over each frame of the live stream, we extract faces, classify them as masked or bare (and whether the mask is worn correctly), and recognize the people who are in our database. Given this project outline, we need to save our previous models as pickle files. Here are the models we need:

  • face align/face detection: to extract the faces in an image. Previously, we used the MTCNN embedded in FaceNet. However, that TensorFlow-based MTCNN is not compatible with imutils’s FPS, possibly due to a threading issue. So here I downloaded another pre-trained model from OpenCV to serve the same purpose. I am happy to hear any possible solution to my problem of using MTCNN in a live stream.
  • Embedding model: 20180402-114759.pb, the pre-trained model we used in the face mask detection section.
  • Recognizer: the classification model that tells whether a face is masked or bare, whether the mask is worn correctly, and who the person is, based on our database.
  • Encoder: a pickle file that maps each classified label to the actual name of that class, for example 1: masked face, 0: bare face, etc.
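As a reminder, persisting a trained model as a pickle file takes only a few lines. The helper names and file names below are just examples:

```python
import pickle

def save_pickle(obj, path):
    """Serialize any Python object (e.g. a trained classifier) to disk."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)

def load_pickle(path):
    """Reload a pickled object, e.g. at the top of the streaming script."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Example usage (names are placeholders for the models from the earlier posts):
# save_pickle(recognizer, "recognizer.pickle")
# save_pickle(label_encoder, "le.pickle")
```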

Load all the models we need

This is all you need to begin this live-stream journey!

Open a video stream

First, initialize the video stream and allow the camera sensor to warm up; then start the FPS throughput estimator.

Next, we loop over all the captured frames, treating each one as a single image, just as we did in the previous sections.

Extract Faces

To capture the images and extract faces from them:

Add labels

Loop over the detections to filter out unqualified faces, then apply the embedding and classification models.

Update and Output

Finally, we draw the label on the output frame. Now you can see yourself recognized in the live video!

Find the original code here.

The above demo video shows how it works. It can also capture multiple faces in a single shot.

Have fun with it!