The advance of AI promises more sophisticated recognition. For example, Dahua markets the ability to distinguish between angry, calm, confused, disgusted, happy, sad, scared, and surprised.
Dahua's emotion detection made frequent mistakes. Its most common classification was 'calm', applied to people with neutral/expressionless faces, which is the most common state for people. Happy and confused were accurate most of the time, while angry, disgusted, sad, scared, and surprised performed poorly.
This chart provides an overview of the performance for each category Dahua offered:
Because the camera attempts to classify emotions whenever a face is detected, even when the face is blocked, turned away, or at low PPF, and because it tends to classify subjects as "Calm" in these cases, the analytic may appear accurate: most subjects are likely to be expressionless most of the time.
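This base-rate effect can be illustrated with a small sketch (hypothetical numbers, not from our testing, and not Dahua's actual algorithm): a system that labels every detected face "Calm" still scores high overall accuracy whenever most faces in the scene are genuinely neutral.

```python
# Hypothetical illustration of base-rate inflation: a classifier that
# always outputs "calm" looks accurate when most faces are neutral.

def overall_accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Assumed scene: 90 neutral ("calm") faces, 10 showing other emotions.
true_labels = ["calm"] * 90 + ["angry"] * 5 + ["sad"] * 5
always_calm = ["calm"] * len(true_labels)

print(overall_accuracy(true_labels, always_calm))  # prints 0.9
```

This is why per-category results (as in the chart above) matter more than a single overall accuracy figure: the always-"calm" strategy scores 90% overall while detecting zero actual emotions.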
Best Accuracy At High PPF/Shallow Angle (But Still Significant Issues)
In our testing, we began with a simple scene, well lit, at a low angle of incidence, with subjects facing the camera at close range to eliminate possible factors such as harsh angles or low PPF.
Even in these conditions, emotion detection was mostly inaccurate. The only expressions consistently detected accurately were Happy:
And Calm (although the camera classified users as Calm under many conditions, discussed below):
However, other expressions such as angry, sad, surprised, or scared were most often detected as calm, or sometimes happy.
Accuracy Worse At Harsher Angles and Lower PPF
At angles of incidence greater than ~45°, most subjects were classified as Calm, regardless of their actual facial expression.
The camera even classified the backs of subjects' heads as calm during testing:
Similarly, at low PPFs, subjects were only classified as Calm or Happy.
Incorrect Classification When Face Obscured
Since the camera attempts to classify emotions any time a face is detected, it attempted to classify even with much of the subject's face obscured. For example, when partially blocked by cardboard, the camera still classified the subject, but only as Calm or Happy.
Reduced Accuracy On People With Facial Hair
During testing, emotions expressed by people with facial hair were frequently missed and mostly classified as calm. Below, a person with a beard making an angry face was not classified as angry.
No Sad Or Angry Classification
In our tests, no face was ever classified as sad or angry; instead, faces were mostly classified as confused, happy, or calm. Below, the person has an angry expression but is classified as calm.
Confused Accurate On People Without Facial Hair
People walking through the scene showing a clear confused expression, not obscured by facial hair, were accurately detected.
However, when confused subjects had facial hair, Dahua's emotion analytics did not accurately detect the emotion.
Disgusted Inconsistent Detection
The disgusted emotion was sometimes detected on a person walking through the scene with a clear disgusted expression.
However, most of the time the disgusted expression was not detected, even with an exaggerated expression intended to trigger the classification.
Simple Setup
Emotion detection setup is simple: the user turns on video metadata in settings, and the live view screen then shows the metadata, including the emotion classification.
Tech Support And Dahua Feedback
Dahua USA tech support tried but was unable to help improve emotion detection beyond advising a clearer shot of the face, while Dahua management did not reply to our request for comment on the performance.
Pricing
The Dahua IPC-HDBW7442H-IPZ can be found online for ~$920 USD.
For the life of me, I can't think of a real-world use case in the video security world where these analytics would improve a live monitoring or investigative/forensic situation, particularly in light of the inaccuracies and the impossibility of detecting someone's real emotional state from a video camera. I am "surprised" and "disgusted" at the same time more frequently than I'd prefer. How does that get resolved with any analytic?
There may be use cases in interrogations (lie detection perhaps) or marketing analysis but that's about it...I think. And that's based on body language as much as it is facial expression.
If you're a super invasive retailer, you could use this to build profiles of shoppers and more accurately predict what they want.
If this were extended to microexpressions, you might be able to predict if someone in a crowd was about to start trouble. I'm pretty sure there was a paper several years ago about using emotional analytics to spot terrorists before they acted. (Though given some of the stories I've heard, the underlying emotions could be totally different than what you'd expect.)
Of course, if the analytic isn't accurate, then it's hard to see it being useful.
This was the funniest article in a long time... I love the faces (and the corresponding misclassifications). I bet you guys had a great time testing... thank you for making us smile at the end of 2020!
Any actor can make expressions of feelings that they are not really feeling, and anyone can train to do so and to CHEAT any system. It is much easier to train someone to perform as an actor than it is to train someone to fly an airplane, remember 9/11. Why insist on an easily deceived system that clearly will not deliver the expected results? Totally misleading. It is NOT SECURITY. It is a FAIRY TALE.

What really exists and works are tiny movements caused by facial muscles that DO NOT DEPEND on the person's will. Which means that from an internal emotional sensation, the person INVOLUNTARILY produces a tiny muscular movement on the face. This movement is detected by people who are experts in the subject, or by someone who has studied this to perceive people's reactions at business meetings and conduct the themes according to the receptivity or not of the people, expressed by these imperceptible and again INVOLUNTARY facial movements.

This does exist, it is real. But ONLY PEOPLE can perceive these muscle movements, TECHNOLOGY DOES NOT do this yet for a very simple reason: there are MORE THAN 250,000 known micro movements, each one expressing a result, which would need to be written in the algorithm in order to individualize these patterns for each person. That simple. Easy and quick to do, isn't it?🤔
Yes, it would be impossible to get anywhere near 100% accuracy. But even if you can get a measly 50% accuracy when live monitoring or getting notifications, it could possibly prove helpful. If it prevents one violent act, I think it could all be worth it.
To be fair, it would be better to test this in real life. Trying to make expressions for feelings you are not really having is not an accurate test. I would love to see this used for a period of time in a real life situation and then compare what it finds by someone who is trained to read expressions. I do understand that this would be difficult. And I am sure it would not be much better, but it would be the right way to test it.
Trying to make expressions for feelings you are not really having is not an accurate test
Kris, if I understand you correctly, you believe Dahua realized that Rob (below) was just faking being upset and that Dahua realized that Rob was actually happy deep inside?
While I am partially joking, my point is that any of these video-based emotion detection systems are inherently claiming to recognize the visible expression of emotions, not the actual emotional state of the person. That's fair, no?
To be clear, because of this, there is an inherent risk in any video-based system since it assumes the visible appearance of 'happiness' is equal to actual happiness when for many reasons people might give the appearance of being happy when they are not (or vice versa).
That said, if a system categorizes Rob's expression above as happiness, it indicates it has some fundamental problems.
Btw, we tested this in real life as well with various recorded videos, and the patterns were the same.
Yeah, I am not saying that it would turn out any different. Was Rob happy? When was the last time you saw an angry face actually look like that? Maybe the system was correct. (half joke) That is my point. It is just hard to judge the system when not using it the way it was intended. I would prefer to see the real-world results rather than posed facial expressions. Also, I am sure it will get better over time. Again, nowhere near perfect, but better. Enough to possibly be a first line of defense.
It is just hard to judge the system when not using it the way it was intended.
Kris, my point is that is the way it is intended. Dahua is not claiming that they can read minds. Dahua is claiming that when presented with an expression that stereotypically looks like 'calm', or 'anger' or 'surprise', etc., they will report appropriately.
In this case, the happy icon represents the lips physically turned upward, which people typically associate with happiness. Rob's lips are turned downward, which typically is associated with being unhappy:
And, again, we tested it in 'real life scenarios' as well, with the same fundamental results, but we are showing these simple, ideal ones because if we show a real-life one where Rob is angry but he is farther away or at an angle or his head is partially obscured, people will complain that it's just because of the bad camera positioning.
Enough to possibly be a first line of defense.
Sure, if you want to use it, that's fine. Ultimately, like the flawed gender video analytics we tested, it depends on what accuracy one needs and how much they can tolerate mistakes.
Not saying that it works... But also, an emoji is not necessarily an actual representation of what the system may be looking for. What you may consider the "stereotypical" expression may not be what the system is looking for. To me, he looks like he has an itch in his mustache and is trying to scratch it without touching it. I think that would be more of an annoyed feeling, though. Regardless of how you guys did it and posted your findings for this article, I would still really be interested to know what it would do over a period of time in the real world. It is obviously something that you cannot recreate, so it is not fair to you guys. This is not a feature that I would ever need to use or sell, just interesting to see what is being tried with technology.
I wonder if it saw his beard as a smile. (serious) You should have him shave then try it again. (joke)