Bias In Facial Recognition Varies By Country, NIST Report Shows

By Zach Segal, Published Jul 15, 2020, 12:17pm EDT

While many argue that face recognition is inherently racist, results from one of the most extensive studies of demographic bias in AI, the Facial Recognition Vendor Test (FRVT) Part 3 by the National Institute of Standards and Technology (NIST), which analyzed over 100 algorithms, show that bias varies by country of development.

IPVM Image

In particular, NIST observed that several algorithms developed by Chinese groups performed better on East Asian faces than on Eastern European faces, while the vast majority of algorithms performed better on Eastern European faces than on East Asian ones.
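The differentials described above are typically measured by comparing per-group error rates. Below is a minimal, illustrative sketch (not NIST's actual methodology or code; all function names, variable names, and scores are hypothetical) of how a false match rate could be computed per demographic group from impostor-pair similarity scores:

```python
# Minimal sketch: comparing false match rates (FMR) across demographic
# groups. All data and names are illustrative, not from the NIST study.

def false_match_rate(impostor_scores, threshold):
    """Fraction of impostor (different-person) comparisons whose
    similarity score meets or exceeds the match threshold."""
    if not impostor_scores:
        return 0.0
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    return false_matches / len(impostor_scores)

def fmr_by_group(impostor_scores_by_group, threshold):
    """Per-group FMR; large ratios between groups suggest bias."""
    return {group: false_match_rate(scores, threshold)
            for group, scores in impostor_scores_by_group.items()}

# Hypothetical similarity scores (0..1 scale) for impostor pairs.
scores = {
    "east_asian":       [0.21, 0.35, 0.62, 0.18, 0.41],
    "eastern_european": [0.12, 0.28, 0.33, 0.19, 0.22],
}
print(fmr_by_group(scores, threshold=0.4))
# → {'east_asian': 0.4, 'eastern_european': 0.0}
```

In this toy example, the first group's false match rate is higher at the same threshold, which is the kind of per-group disparity the study quantifies.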

Executive Summary

***** **** **% ** ******* ********* developed ********** **** ********* ****** ** Eastern ******** *** **** **** ***** ones (***** **-*** ***** ******), ~**% of ***** ********* ********* ********** **** performed ****** ** **** ***** *** than ******* ******** ****.

**** ****** ******* ** *** **** Asian *****, ******** *** ** ********, with ~**% ** ********** ********* ** East ***** ****** ********** ****** ** Eastern ***** ***** **** ********.

Training **** *****

**** ********* ** ***** ** ****** training ****:

**** ******** **** ******** ****, ** perhaps **** ***** ****** ********* ** the ***********, *** ** ********* ** reducing ********** ***** ******** *************. ****, the ******-**** ********** ***** ** *** developers ** *********** *** ******* ** more *******, ******** *******, ******** ****.

*******, **** *** *** ** ********** designed ** ********* ***** *** **** did *** **** ** ******** ****. And, **** ****** *** *** **** with ***** ***** ****** *********** ** South ***** *****, ** ******* ****** may ******* *** *******.

****, ***** ** *** **** ********* regions:

IPVM Image

Study ********

**** ****** **** *********** *********** ** facial *********** ********** ** ***** ****** facial *********** ****** ****, ********* ********** from **** *** ******. *** ****** of ***** ****, ***** ** **** focus **, **** ******** *** ********** in ***** ***** ***** (***) ** people ***** **** ****-******* ** ***** Department *********** ******** (***/*** *****-* **************). An ******* ** * ***** ***** would ** ** * ****** ******** your ****** ***** ***** ****, ******* of *****. ** **** ****, **** compared ***,*** **** *********** ****** **** 441,517 ********* *********** ******, ********* ****** who ******** ** * ****** ** countries ****** *** *** ********** ***********. The **** *********** **** **** ***** of *** ****** ** ***** ******* that ******** (*** ******* **** ***** because *** **** **** ********).

**** ***** *********** ** ***** ***** rates, **** ***** ***** ***** ****** in ******* ********* (******, *******, *** Russia) *** ******* **** **** ******** and **** ****** (*****, *****, *****, Philippines, ********, *** *******). *******, *** study ***** ****

**** * ****** ** ********** ********* in ***** **** ****** ** ********, with *** *****-******** ***** ** **** Asian *****.

******** **** ******** **** ** ********* nationality ***** ** ****** * *** of *** ****** *** ****** **** present ** **** ** *** ********** they ******.

IPVM Image

How to Interpret Performance Differences

***** **-*** ***** **** ******** *** E. ********* **** *. ******, ****** like * ***** **********, **** *** be **********. *** *********** ** ********** can ** **** (**** ********** *** error ***** ** ~.***** *** *. Asians *** ~.****** *** *. *********), and ** **** *****, *, ** close ** * ****** ***** ** observed. ** ** ********* ** **** for ************/*:* ********, **** ** ***** facial *********** ** ****** **** ******, mistakes ***** ** ********** ********** *** the ********* ***** ** ****** ************. However, ** **** *** **************/*:* *********, such ** ******** ** **** *** anyone **** ** *********** *******, *** error **** ********* ************* **** *, and *** ********* *** **** * huge ****** (**** **** **** ***) if * ** ***** ******.

**, ***** *** *********** *********** *** bad, *** ********** *** *** ** an ***** **** *:* ************ *** can ** * **** ***** (**** disparity ****** **** ***) ** **** for *:* **************.
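A small per-comparison error-rate gap compounds when an algorithm searches a large gallery, since a 1:N identification performs one comparison per enrolled face. The sketch below (assuming independent comparisons, a simplification and not the study's model; all rates are hypothetical) shows how the chance of at least one false match grows with gallery size:

```python
# Sketch of why small per-comparison FMR gaps matter more in 1:N search.
# Assumes independent comparisons; rates here are hypothetical.

def prob_any_false_match(fmr: float, n: int) -> float:
    """Chance of at least one false match when searching a gallery of n."""
    return 1.0 - (1.0 - fmr) ** n

fmr_group_a = 1e-5   # hypothetical per-comparison FMR for one group
fmr_group_b = 1e-4   # ten times higher for another group

for n in (1, 10_000, 1_000_000):
    pa = prob_any_false_match(fmr_group_a, n)
    pb = prob_any_false_match(fmr_group_b, n)
    print(f"gallery={n:>9}: group A {pa:.4f}, group B {pb:.4f}")
```

At a gallery of 10,000, a tenfold gap in per-comparison FMR becomes the difference between a roughly 10% and a roughly 63% chance of some false match, illustrating why disparities matter far more for 1:N identification than for 1:1 verification.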

Effect of Developer Origin

**** **% ** ***-****-***** **** ********** performed ****** (* ***** ***** ***** rate) ** ******* ******** ***** **** on **** ***** ****

***, ~**% ** ******* ***** (********* Dahua, *********, *** ******) *** ~**% of **** ***** ***** (******** ** include ********* **** ****** **** *** in **** **** *** *** ** the ***** ** ********* ****** *** the **** *****) ********* ****** ** East ***** ***** **** ******* ********, as *** ***** ***** **********:

IPVM Image

********

***** *** ******** ** ********** *********** bias ** * *******, *** **** that ********** ********* ** ******* *** East ***** ***** ****** * ********* bias **** **** ** ***** *****, implies **** **** ** *** *********** disparity ** **** *********** ******** *** be ****** ** *** *** ** diverse ******** ****. **** ***** **** developers ****** ****** **** **** *** diverse ******** ****, ******* ** **** relying ** **** ** ******* ** use. ** **** ***** **** ********* in ** ***** *** **** ** better **********.

** ********, ***-***** ****** *** ***** demographic *********** ** *** ********** **** are ******** ******* *** **** ********* origin **** *************. ***** ** ********* may ******* ** ********** ** *** test ***********, ** *** ************ ** the *********** ** ** **** **.

Limitations

***** *** ***** *** ********** *** well ***, ** *** *** ******* the ******** **** **** ** **** to ********* ***** *** ******. **** means **** ***** ** *** **** educated ******* ***** ** *** *****'* findings, ** ****** ************ ***** ******** as ****.

****, **** *****:

*** *** *** * ************* *********** for ***** ***** ***** *** *** few ********** [****] **** ********* ** developers [** ***** *** **********].

*******, ** ** ***** ******** **** every ***** ***** **** **** **** that *** *** ******* * ****** percentage ** ***** ***** **** ****** than ***-***** ***** ***** ***. ** is **** ** **** ** ******* data **** ** ***** **** *** large ****** ** ****** ****** *********** neural ********, ** **** ***** *** have ****** ** *** **** ****** accessible ********.

*** *** ** *** **** ******** may ******* *** **** ***** **** clustered ******** *** *** **** *** often **** *******, **** **** ********** performing ****** ** *** **** ******** and ****** ******.


Comments (6)

Zach, good job!

While the NIST report is from 7 months ago, IPVM, with Zach, is ramping up coverage of facial recognition technology. If you have any suggestions for topics to cover, please share.


As your article points out, the NIST report did not look at training data. As the FR industry increasingly depends on machine learning, what is being learned would seem to be more and more critical. Perhaps a deeper dive into training sets, and perhaps a look at what is out there in open source sets, is warranted. What will, in my opinion, be an interesting trend to follow is not only the extent to which open source acquisition and matching code becomes a key building block in products, but also the extent to which open source training sets do. One interesting glitch in this scenario is the privacy considerations around amassing a large data set of people's faces. This leads to the question of to what extent a synthetic training set could be created that would address the challenges of performance and bias.


This leads to the question of, to what extent could a synthetic training set be created that would address the challenges of performance and bias.

Along these lines, I can see a universal type of accreditation for all artificial intelligence. This would not only apply to FR but all AI so that the general public and private entities can depend on ethical application of the algorithms being used.


A group of researchers (Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, Kate Crawford) wrote an interesting article proposing something along those lines, actually: Datasheets for Datasets. They argue that all datasets should come with guidelines for their use and information about how the data was collected and what it contains.

It would be interesting if all algorithms had a "made from blank dataset" label and easily accessible information about that data.
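To make the idea concrete, a datasheet could be machine-readable and shipped alongside the dataset. The sketch below is purely illustrative (the field names loosely echo the paper's question categories; the dataset, values, and helper function are all hypothetical):

```python
# Minimal sketch of machine-readable dataset documentation in the spirit
# of "Datasheets for Datasets". All field names and values are illustrative.

datasheet = {
    "name": "example_face_dataset",  # hypothetical dataset
    "motivation": "Benchmark face verification across demographics.",
    "collection_process": "Consenting volunteers, controlled conditions.",
    "composition": {
        "num_images": 50_000,
        "regions_of_birth": {"East Asia": 0.25, "Eastern Europe": 0.25,
                             "West Africa": 0.25, "Caribbean": 0.25},
    },
    "recommended_uses": ["algorithm benchmarking"],
    "prohibited_uses": ["surveillance without consent"],
}

def coverage_gaps(sheet, min_share=0.1):
    """Flag demographic groups below a minimum representation share."""
    shares = sheet["composition"]["regions_of_birth"]
    return [g for g, share in shares.items() if share < min_share]

print(coverage_gaps(datasheet))  # → [] (balanced in this toy example)
```

A consumer of the dataset could run such checks automatically before training, which is one way a "made from blank dataset" label could be enforced rather than merely stated.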

And, Salvatore, synthetic data is an interesting idea, but it could also lead to increased homogeneity in datasets because a machine learning algorithm will invariably make the synthetic data based on real data. However, it could definitely solve privacy issues.


Hi Zach, great reference. I know the work; it's very useful. There is a lot going on there and around the US and the world. My guess is there will be a lot more to come on this topic. Solving bias is critical to ethical AI, and from a human perspective let's hope this comes to the fore. There is some interesting related work on de-identification and other privacy-related aspects.


Thank you and I agree. It is a very important topic and only growing more relevant.
