Table of Contents
- Face Detection with Landmarks and Emotion Recognition
- APIs’ Capabilities
- Implementation
- Technical Challenges
- Try it
- References
Face Detection with Landmarks and Emotion Recognition -> Try it!
This project demonstrates a comprehensive web-based face detection and emotion recognition system using JavaScript’s face-api.js
library. The system runs entirely in the browser, providing real-time face analysis capabilities including detection, landmark identification, and emotion classification.
Key Features:
- Face detection: Gives a bounding box for every face detected.
- Face landmarks recognition: Gets the coordinates of the eyes, ears, cheeks, nose, and mouth of every face detected.
- Emotion recognition: Determines whether a person is happy, sad, angry, disgusted, fearful, or surprised.
- Track faces across video frames: Gets an identifier for each unique detected face. The identifier is consistent across invocations.
- Process video frames in real time: Face detection is performed on the device and is fast enough to be used in real-time applications.
APIs’ capabilities
Face detection
The most accurate face detector is an SSD (Single Shot Multibox Detector), which is basically a CNN based on MobileNet V1 with some additional box prediction layers stacked on top of the network. Furthermore, face-api.js implements an optimized Tiny Face Detector, basically an even tinier version of Tiny Yolo v2 that uses depthwise separable convolutions instead of regular convolutions, making it a much faster, but slightly less accurate, face detector than SSD MobileNet V1. The networks return the bounding boxes of each face with their corresponding scores, i.e. the probability of each bounding box showing a face.
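Whichever detector you choose is selected by passing the corresponding options object to the detection call. A minimal sketch (the option values shown here are illustrative, not tuned recommendations):
// SSD MobileNet V1: more accurate but slower; drop boxes below a confidence threshold
const ssdOptions = new faceapi.SsdMobilenetv1Options({ minConfidence: 0.5 })
// Tiny Face Detector: faster, slightly less accurate; a smaller inputSize trades accuracy for speed
const tinyOptions = new faceapi.TinyFaceDetectorOptions({ inputSize: 416, scoreThreshold: 0.5 })
Either options object can then be passed to faceapi.detectAllFaces, as shown in the Implementation section below.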
Face Landmarks
For that purpose face-api.js implements a simple CNN, which returns the 68 point face landmarks of a given face image:
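As a small illustrative sketch (assuming result is one element of the detections array produced in the Implementation section below), the individual landmark groups can be read off the result:
const landmarks = result.landmarks      // FaceLandmarks68 object
const allPoints = landmarks.positions   // all 68 {x, y} points
const leftEye = landmarks.getLeftEye()  // points outlining the left eye
const nose = landmarks.getNose()        // points outlining the nose
const mouth = landmarks.getMouth()      // points outlining the mouth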
Face expressions
The face expression recognition model is lightweight, fast and provides reasonable accuracy. The model has a size of roughly 310kb and it employs depthwise separable convolutions and densely connected blocks. It has been trained on a variety of images from publicly available datasets as well as images scraped from the web. Note, that wearing glasses might decrease the accuracy of the prediction results.
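As a rough sketch (again assuming result is one element of the detections array obtained with .withFaceExpressions() below), the per-emotion probabilities can be inspected and the most likely one picked out:
const expressions = result.expressions   // maps each expression (happy, sad, angry, ...) to a probability
const [topExpression, probability] = Object.entries(expressions)
  .sort((a, b) => b[1] - a[1])[0]        // sort by probability, take the highest
console.log(topExpression + ': ' + (probability * 100).toFixed(1) + '%')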
Implementation
Including the Script
First of all, get the latest build from dist/face-api.js or the minified version from dist/face-api.min.js and include the script:
<script src="face-api.js"></script>
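Alternatively, if your app uses a bundler, face-api.js is also published on npm and can be pulled in as a module (a minimal sketch):
// npm install face-api.js
import * as faceapi from 'face-api.js';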
Loading the Model Data
Depending on the requirements of your application you can load only the models you need, but to run the full end-to-end example we need the face detection, face landmark, and face expression models. The model files can simply be provided as static assets in your web app, or you can host them somewhere else and load them by specifying the route or URL to the files. Let's say you are providing them in a models directory along with your assets under public/models:
Promise.all([
    faceapi.nets.tinyFaceDetector.loadFromUri('/models'),
    faceapi.nets.faceLandmark68Net.loadFromUri('/models'),
    faceapi.nets.faceExpressionNet.loadFromUri('/models')
]).then(startVideo)
Making predictions
The neural nets accept HTML image, canvas, or video elements, as well as tensors, as inputs.
const detections = await faceapi.detectAllFaces(video, new faceapi.TinyFaceDetectorOptions()).withFaceLandmarks().withFaceExpressions()
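If only a single face is expected, the same chain works with the single-face variant of the call (a small sketch):
const result = await faceapi
  .detectSingleFace(video, new faceapi.TinyFaceDetectorOptions())
  .withFaceLandmarks()
  .withFaceExpressions()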
Displaying Detection Results
Preparing the overlay canvas:
const canvas = faceapi.createCanvas(video)
document.body.append(canvas)
const displaySize = {width: video.width, height: video.height}
faceapi.matchDimensions(canvas, displaySize)
face-api.js predefines some high-level drawing functions you can utilize. The raw detections first have to be resized to the display size of the canvas:
const resizedDetections = faceapi.resizeResults(detections, displaySize)
faceapi.draw.drawDetections(canvas, resizedDetections)
faceapi.draw.drawFaceLandmarks(canvas, resizedDetections)
faceapi.draw.drawFaceExpressions(canvas, resizedDetections)
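Putting the pieces together, a common pattern (sketched here rather than copied verbatim from this project) is to run detection on an interval once the video starts playing, clearing the overlay canvas before each redraw:
video.addEventListener('play', () => {
  setInterval(async () => {
    // Detect all faces with landmarks and expressions in the current frame
    const detections = await faceapi
      .detectAllFaces(video, new faceapi.TinyFaceDetectorOptions())
      .withFaceLandmarks()
      .withFaceExpressions()
    // Scale the results to the canvas and redraw the overlay
    const resizedDetections = faceapi.resizeResults(detections, displaySize)
    canvas.getContext('2d').clearRect(0, 0, canvas.width, canvas.height)
    faceapi.draw.drawDetections(canvas, resizedDetections)
    faceapi.draw.drawFaceLandmarks(canvas, resizedDetections)
    faceapi.draw.drawFaceExpressions(canvas, resizedDetections)
  }, 100) // roughly ten detections per second
})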
Technical Challenges
Accessing Your Webcam in HTML
We can communicate with our webcam and access its video stream from our browser with just some JavaScript code. We only need a browser that supports the getUserMedia function.
Two components are essential in getting data from our webcam displayed on our screen:
- The HTML video element
- The JavaScript getUserMedia function
The video element is pretty straightforward in what it does. It is responsible for taking the video stream from your webcam and actually displaying it on the screen.
<video id="video" width="720" height="560" autoplay="true"></video>
By setting the autoplay attribute to true, we ensure that our video starts to display automatically once we have our webcam video stream.
On its own, though, the video element won't show anything yet. That is because we haven't added the JavaScript that ties our video element to the webcam. We'll do that next! The getUserMedia function allows us to do three things:
- Specify whether we want to get video data from the webcam, audio data from a microphone, or both.
- If the user grants permission to access the webcam, specify a success function to call where you can process the webcam data further.
- If the user does not grant permission to access the webcam, or your webcam runs into some other kind of error, specify an error function to handle the error conditions.
// Grab the video element declared in the HTML above
const video = document.getElementById('video');

function startVideo() {
    navigator.getMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia ||
                         navigator.msGetUserMedia;
    // Capture video only (no audio)
    navigator.getMedia({
        video: true,
        audio: false
    }, function(stream) {
        // Success: feed the webcam stream into the video element
        video.srcObject = stream;
        video.play();
    }, function(error) {
        // An error occurred (e.g. the user denied webcam access)
    });
}
For what we are trying to do, we call the getUserMedia function and tell it to only retrieve the video from the webcam. Once we retrieve the video, we tell our success function to send the video data to our video element for display on our screen.
Some error handling:
This error occurs because, in newer versions of Google Chrome, the createObjectURL function no longer accepts a MediaStream object.
I changed this:
video.src=vendorUrl.createObjectURL(stream);
video.play();
to this:
video.srcObject=stream;
video.play();
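For reference, newer browsers expose the same capability through the promise-based navigator.mediaDevices.getUserMedia API, so the vendor-prefixed callback version above could also be written roughly like this:
function startVideo() {
    navigator.mediaDevices.getUserMedia({ video: true, audio: false })
        .then(function(stream) {
            // Feed the webcam stream into the video element
            video.srcObject = stream;
            video.play();
        })
        .catch(function(error) {
            console.error('Could not access the webcam:', error);
        });
}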
Try it
Live Demo
Experience the face detection system in action: Try it live!
Running the Browser Examples
To run the project locally:
- Download this repository
- Open the terminal and navigate to the downloaded repository
- Run the command
caddy file-server --browse --listen :8080
- Browse to http://localhost:8080/
System Requirements:
- Modern web browser with WebRTC support
- Webcam access permissions
- Stable internet connection for model loading
Performance Notes:
- Initial model loading may take 10-15 seconds
- Real-time processing at 15-25 FPS depending on hardware
- Works best with good lighting conditions