Facial Detection Tested

By IPVM Team, Published Nov 16, 2018, 07:17am EST

Facial detection and recognition are increasingly offered by video surveillance manufacturers.

Facial detection finds faces in an image or video but does not determine whose face it is. Facial recognition, where the system attempts to determine the identity of each face, depends on facial detection first: the system cannot attempt to recognize a face until it has detected that an object is a face. As such, facial detection is a prerequisite for facial recognition systems.
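The dependency described above can be sketched as a two-stage pipeline. This is a minimal illustration, not IPVM's test code: `identify_faces`, `detect`, and `recognize` are hypothetical names, and the two stages are passed in as plain callables so any detector or recognizer could be substituted.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height (assumed layout)


def identify_faces(
    frame: object,
    detect: Callable[[object], Sequence[Box]],
    recognize: Callable[[object, Box], str],
) -> List[Tuple[Box, str]]:
    """Run detection first, then recognition on each detected face box."""
    results = []
    for box in detect(frame):
        # Recognition only ever sees regions the detector flagged as faces.
        results.append((box, recognize(frame, box)))
    return results


# Hypothetical stand-ins for a real detector and recognizer.
def no_faces(frame: object) -> Sequence[Box]:
    return []


def always_unknown(frame: object, box: Box) -> str:
    return "unknown"


print(identify_faces("frame-0", no_faces, always_unknown))  # []
```

If the detector returns no boxes, the recognizer is never called, which is why a missed detection cannot be recovered later in the pipeline.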


3 fundamental approaches to facial detection exist:

  • HAAR Cascades
  • HOG (Histogram of Oriented Gradients)
  • CNN (Convolutional Neural Network) / Deep Learning
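To give a feel for the first approach, here is a minimal sketch of the building block behind Haar cascades: a two-rectangle Haar-like feature computed from an integral image (summed-area table). It is an illustration only. A real cascade chains many such features with learned thresholds, and the function names and the tiny 4x4 `patch` below are assumptions made for the example.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> Image:
    """Return a summed-area table padded with a leading zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the rectangle at (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Haar-like edge feature: sum of the top half minus the bottom half."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


# Toy patch: a dark band above a bright band, loosely like eyes above cheeks.
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)  # -1520
```

Because every rectangle sum costs four lookups regardless of its size, features like this are cheap to evaluate at many positions and scales, which is what makes the approach light on compute.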

Performance varies on 3 fundamental factors:

  • Accuracy of detecting faces - while it is 'easy' to detect a face looking directly at the camera in a well-lit scene, performance can vary significantly depending on how a person tilts their head (down, left, right, etc.) and the lighting conditions of the scene (shadows, darkness, noise, etc.).
  • Computing load to detect faces - finding the objects in a scene and determining whether each one is a face (instead of a tree, a car, a cat, a bowling ball, etc.) can be very computationally intensive, while many video surveillance devices (e.g., IP cameras, NVRs) have significant processing power constraints.
  • Chip / hardware used - Intel (i5/i7, Movidius 2 & X, FPGA); Nvidia GPUs, etc.
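The first two factors can be measured with a small harness like the one below. This is a sketch under stated assumptions, not the code used in the test: `benchmark`, `fake_detect`, and the dictionary frames are hypothetical, and "detection rate" here means the fraction of face-containing frames in which the detector returned at least one box.

```python
import time
from typing import Callable, Dict, Sequence


def benchmark(
    detect: Callable[[object], Sequence[tuple]],
    frames: Sequence[object],
    has_face: Sequence[bool],
) -> Dict[str, float]:
    """Measure frames per second and a frame-level detection rate."""
    processed = hits = expected = 0
    start = time.perf_counter()
    for frame, truth in zip(frames, has_face):
        boxes = detect(frame)
        processed += 1
        if truth:
            expected += 1
            if boxes:
                hits += 1
    # Elapsed time covers detection plus the (negligible) bookkeeping above.
    elapsed = time.perf_counter() - start
    return {
        "frames": processed,
        "fps": processed / elapsed if elapsed > 0 else float("inf"),
        "detection_rate": hits / expected if expected else 0.0,
    }


# Hypothetical detector: "sees" a face only when it is frontal.
def fake_detect(frame: dict) -> Sequence[tuple]:
    return [(0, 0, 10, 10)] if frame.get("frontal") else []


frames = [{"frontal": True}, {"frontal": False}, {"frontal": True}, {}]
truth = [True, True, True, False]
print(benchmark(fake_detect, frames, truth))
```

Running the same harness with different detectors on the same hardware makes the tradeoff explicit: a model that doubles the detection rate but drops FPS below the video's frame rate may not be usable on a camera or recorder.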

This is the first in a new series of machine learning / video analytics tests that IPVM will be doing.

****** **** ****, ** test ***, **** *** CNN ********** ** ****** detection *********** ***** *********** (FPS) *** ********* ***** required ** ***** **-*****. In ******* *******, ** will **** ** ******** Myriad2, *******, *** ****** GPUs, ** **** ** add ******** ***** (******* Precision). *** **** *** this ****** ** ** Github [**** ** ****** available].

Facial ********* ********** ********

*** ***** ** *** not ******** **** ***, HAAR *** ***, *** 8-minute ***** ***** ********* each *** ******** *** tradeoffs ***** ****:

Code / ******* *** ****

*** **** *** ******* used *** *** **** is ****** ** ****'* public **** ********* **** repository ** ****** [**** no ****** *********].

** ******:

  • **** ***, ***
  • ****** ****
  • ******** ****-*********-******-****

** **********, *** ***** interested ** *** **************, see *** ****** ****** used [**** ** ****** available], ********* ***** *** context:

Test ******* *** ****** *********

***** **** ************ *** far **** ********* ********* than *** *** ****, in *** ****, ***** the **** ***** **-***** for *** * **********, the ******** *** ****** detection ***** ********* ******* frames *** ****** (~*****) as *** **** **** accurate **** ********** ***** delivering ***** *** *** of *** **** ****** accuracy ** ****. *** video ***** ********* ***** results:

*** ****** ** *********:

  • ***** ******** ********** *** ********* *** ** ****** on ***********. ******** ********* model *** ************* **** data-type ********* ********** (****->****) and***** ********* ****; (*) ********** *** model ** *** ****** hardware (**/**/**, ********, ****). This ******** **** * smaller ************ **********+***) ********* better **** ************ ******** results.
  • *******, ~***** *** ** entire ***** **-***** **** still ******** ********** *** video ************ ************. **** chips *** *** *** inside ** ******* (*** to **** *** ***** constraints) *** ***** **** may ** *** ** recorders, ********* ***** **** to ******* *** **** than ***** ** ****** 8, ** ** **** cameras ********* ***** ******** to * ********. ******** IPVM ***** **** ***** Intel's ******** ***** (** particular *** ******* *** MyriadX) ***** *****'* ****** Compute ***** (***) *** NCS2 (***** *** *******).

********

**** *** *** ******* overall ********* ************ ********** struggling **** ****** ***** as ***** ** *** clip *****:

*** ********* ******** ***** but ******** **** **** 'flickering' ** *** **** was *********** **** *** then **-******** ** ***** below:

** ********, *** *** detected ***** ***** ******** without *** ********** ** HOG, ** ***** *****:

Questions / ******** / ***********

**** ** *** ***** test ** **** *** series *** ** *** presenting *********** ******* ** show **** ** *** working ** *** ** encourage *** ** *** questions *** **** *********** for ********* ******, ******, hardware, ***. ** ****.

Comments (10)

I want to welcome and thank Tyler Renelle. Tyler has a popular podcast on machine learning. For those looking to learn more in this area, I'd encourage you to listen to his 30+ episodes.

Tyler started by giving us private lessons and he has expanded into doing tests on video surveillance related machine learning approaches.

Next up is a test of the brand-new Intel Neural Compute Stick 2 (which uses the MyriadX chip that, e.g., Avigilon will be using in their upcoming H5 cameras).

We are also working on training to help industry professionals better understand machine learning fundamentals as they apply to video surveillance.

Again, any questions or suggestions for testing / Tyler, please ask.

Very much looking forward to following the results John. The police in South Wales in the UK did some tests on facial recognition in high-volume spaces, but they haven't returned my mail yet.

Note that dlib has a CNN face detector built in (you can run it in the code as `--models dlib_cnn`). The CNN is an older model, not one of the more modern SqueezeNet/MobileNet varieties. If you have CUDA installed, it will use your GPU by default. If you want it to run without GPU support (i.e., on your CPU), you'll need to compile dlib from source. I tried both with and without GPU:

- i7: dlib_cnn ran so slowly I couldn't even finish the process, i.e., FPS was much less than 1.
- 1080ti: dlib_cnn was very fast this way; faster than any of the options in the report. But again, we're talking very powerful GPU; something not likely available at the edge.

Generally I'd be less interested in using dlib_cnn since it's a model from another time, and more interested in recent models from public repositories / papers. With these you'll also get more control on how they're run (CUDA on GPU, OpenVINO on Intel, etc).

Any thoughts on the viability of running CNNs on the chips used in cameras and NVRs, such as ARM-based ones?

That's a really good question. I'll keep an eye out for an answer here as I work with OpenVINO & reply back if I find one.

Excellent piece. Thank you.  I would like to suggest a similar comparison of these techniques with license plate detection.

Any way to run the tests using our own videos from security cams?

Yeah, grab the code from Github (link in the report), follow the setup instructions, and drop your own videos into the vids/in directory (instead of downloading the recommended samples). I'll keep that repo updated & try to improve on quality, to make it easy to test on custom vids.

Thanks for this report. Looking forward to learning more.

I am also working on the project in Hong Kong now by using AI Deep Learning. Let me share some experience on this shortly. Good topic for discussion.
