UK Facial Recognition Essex Errors Report

By John Honovich, Published Jul 05, 2019, 09:23am EDT

Facial recognition trials in the UK have generated significant controversy and debate over the past few years. This week, it flared again when Sky News / Guardian trumpeted that 81% of 'suspects' flagged were false. However, their posts did not analyze the details nor share the original research report. IPVM has.

london MPS police face rec report

In this report, we share and examine the 128-page report including:

  • The new and enhanced methodology that Essex used to analyze these results from more recent facial recognition trials in 2018 and 2019.
  • Why police / human review of matches was so critical and so difficult
  • How the 'accuracy' statistics vary and what the tradeoffs in different approaches were
  • How logs recorded show logistical challenges
  • How UK / EU / GDPR rules for consent complicate the challenges involved
  • China / PRC facial recognition systems contrasted

Report ********

************* ** ***** ***** Rights ****** *** ******** a ***-**** ******:

Observing ******

*** **** ********* ********** of **** ***** *** here ** ** ********** observe *** ****** ****** recognition *********** ** **** they ***** *** **** happened **** *** ****** were ******* *** ****** recognition *** ********** *******. The ***** **** ***** report ***** *** * trials **** ********:

Problems ********* **** *********** *******

*** ******* ******* *** perhaps **** ********** ******* of *** ****** ** that ********* *** ***** difficulty ******** ********* *******, i.e., ** *** ****** on *** ****** *** same ****** ** *** person *** ****** *********** system **** ********?

** ***** **** *** 6 ******, *** ****** were *** **** ******* the ***** *** *** person ** *** ****** so **** ******* *** person, *** **** ** 8 ** ***** ** times *** *** ****** flagged ** *** ****** the ****** ****** ** the ******, ** *** chart ***** *****:

***** ********* **** *** unreliability ** ****** ********* the ****** ** * key *******:

Sky **** **% *********

*** **% ******** ** Sky **** ******** ** matches **** *** ****** were **** ** ******* dismiss ** ***** ***** matches (*.*., *** ****-**** surveillance ***** *** ******* much ********* **** *** watchlist *****):

***********, *** **% ** wrong ** ***** **** only ** ******* **** were **** ** ** verified *** ** *****, 30 **** ******** ** being *****, *** * rate ** **.*%. *** difference ** **** * matches **** ****** ** be ********. ****** ***, the **** ** '****'.

Debate **** **********

**** *********** * ****** on *** ** ******* the *********** ** ***** systems.

************* ****** ******** **** use *** ***** ******* of ***** ******* ****** them ********** ** **% or ******. *** ****** mentions **** ********:

** **** ****, *** system ****** ******* ********* of ***** ** ****** walking ** **** ***** 6 ****** ********* **** ****** ** be *****. ********** ** this *** (*** ** false / **,*** ***** scanned) ***** * ***** positive **** ** **** 0.3% ** ** '********' of **.*%).

***** ** *****? **** are, ** **** *****. The ******** ******* ***** is **** ********.

Hard ** Visually ******

*** ******* ** **** it ** **** ** visually ****** ******* * match ** ***** ** wrong. ****** **** *****. For *******, ** ********* *********** ****, **** had ** ******** '*************'******* *** ****** *** generating *** **** ******, e.g.:

*** **** *** ********* high ******* ***** ******, not ************ ***** ***** where ******, ********, ******** variances, ***. *** **** it **** **** ** know *** **** ** just ******* ** * screen.

*************, *** ***** *** not ******* *** ******** of *** ******* ** show *** **** ** was ** **** ******* the ********* ***** *** the ****-**** *****.

Logs **** **********

*** *********** ******** * detailed *** ** ************ for *** ** *** test ***********. ** ******** one ******* **** ** front ** * ******* being ********:

** **** ******** ******** mistakes. ******* *** **** interesting ** * ** them ********* **** * minutes. ** *** *****, the ********* ******** ***** whether * ***** ** accurate *** *** ****** one ***** ***** *** ready ** ****** ***** they **** *** ******** processing *** *****:

*** *********** ******* ** that **** ** ***** have ********** *** ********** scores (** *.** *** 0.60). *** ****** ***** that *** **** *********** vendor (***) *** * threshold ** *.**:

*** ******* ********** ***** can ** ********, *** higher ** **, *** more ****** *** ****** rejects ***** ******* *** misses **** ******* *** vice *****. ** **** not ****** ***'* ****** recognition ** ** ****** opine ** *********** ******** but ******* **** ***** would ********* ****** ***** matches *** *** ******* "81%" ***** ****.

Technology ************

*** ****** ***** **** the **** ************ *** the ******* **** ********:

** ******* **** ******** about **** ******* **** used.

Positioning *** *******

*** ******* **** ********* mounted ********, ** ****, as *** ***** ***** from *** ****** *****:

*** *********** ******** ******* is **** *** *** strove ** ******* ******* and ***** ********* **** facial *********** *** ***** performed, ** **** ******* notes:

** *** **** ****, some ****** **** ******** after ******** *** ****** recognition:

China / *** **********

*** ******'* ******** ** China (***) ****** ** such ****** ******* ** facial *********** *** **** it **** *** ******* of *********** ******** ** it ** ************* ********** to ******** *** *********** of ***** *******.

**** ** **** ************** is *** **** ***** PRC ****** *********** ******* work (****** ***** **** been ******* ********* ** the ********, *.*.************************* ***********).

*** ******* **** *** PRC ****** *********** ******* is **** **** **** more ***** ***** ***** on ****, **** ******** data, **** ********* ****** (i.e., ******** ******** **), and **** (** **) legal ************.

*** ******* **** ** that ****** **** *******, especially **** *** *** dealing **** ***** ********** (e.g., ****** **** *** a *** ******* ** a *** ********, ***** PRC ******* ****** ***** have ********). ** ** inevitable **** ***** *** going ** ** ******** mistakes **** **** **** humans ** ******** ****** if **** **** ***** accuracy ** ********.

**********, ***** **** **** and ******* *** ****** about ***** *** ******* performance, ** **** ** hard ** ****** **** how **** ** *****.

UK ****** *********** ******?

***** ******* **** ********** about *** ****** ** facial *********** ** *** UK. ***** *** **** to *** **** (*******, software, *********, ***.) *** the **** ** ******* informed *******, ** ** hard ** *** **** as ***** * '**********' or *********** *******. ***** are ****** *** *** people ***** ******** *** the **** *** ********** involved.

**** ********* *** ****** as *** ********** ******** but *** ********** ** visually ********* ****** *** likely ** ************* ****** given *** **** ****** look ***** ********** ***** the *********** ** ***** quality ** ********** *******, open ****** ***********.

Comments (3)

Interesting review, thank you. It's useful to get real life usage results.

Somewhat similar report about NEC have been published by California Department of Justice https://www.vice.com/en_us/article/kbvkg3/california-spent-nearly-dollar18-million-on-controversial-facial-recognition-software

In our tests that NN trained on common face datasets (CASIA, VGG and such..) performs poorly in real life scenario, where regular 2MP/4MP cameras are used and face position is not as perfect as in datasets.

In comparison NEC "claims" (we had no chance to test ourselves) it has more advanced face recognition technology and historically performed well in regular NIST tests. 

Agree
Disagree
Informative: 3
Unhelpful
Funny

Manufacturers prefer measures that use the total numbers of faces scanned giving them accuracies of 99% or higher....

In this case, the system easily scanned thousands of faces of people walking by over these 6 trials and only 30 were proved to be false. Expressing it this way (say 30 false / 10,000 faces scanned) gives a false positive rate of just 0.3% or an 'accuracy' of 99.7%).

Do the manufacturers in any way take false negatives into account in their statistics? The way this is explained here sounds really fishy.

Agree
Disagree
Informative
Unhelpful
Funny

In marketing, manufacturers generally just give big vague numbers, e.g., here is a Hikvision explanation:

It 'can hit 99%' is pretty vague. When does it hit 99%? How much lower can it be, etc.?

There are certainly ways to take false negatives into account, as there is an inherent tradeoff between false positives and false negatives.

I believe ROC curves are the standard means to demonstrate this, e.g.:

Image result for roc curve facial recognition

However, in manufacturer marketing, they are rarely, if ever shown.

 

Agree
Disagree
Informative: 2
Unhelpful
Funny
Read this IPVM report for free.

This article is part of IPVM's 7,327 reports and 971 tests and is only available to subscribers. To get a one-time preview of our work, enter your work email to access the full article.

Already a subscriber? Login here | Join now
Loading Related Reports