Detection, Classification, Recognition and Identification (DCRI)
Detection, classification, recognition and identification all seem easy until you put them into practice. Then you realize that significant practical and subjective problems make them difficult to use.
Academically, the terms are straightforward and reflect what they mean in real English:
- Detection - The ability to detect if there is some 'thing' vs nothing.
- Recognition - The ability to recognize what type of thing it is (person, animal, car, etc.)
- Identification - The ability to identify a specific individual from other people
Obviously, each level is harder and requires more detail, going from detection to recognition to identification.
Many manufacturers provide overviews of these categories, including an Axis tutorial and a Bosch white paper [link no longer available]. For instance, here's the pixels per foot (PPF) chart that Bosch uses to provide concrete numbers:
And here is the Axis table:
Both relate a target PPF number to a given level. Even after accounting for Bosch using imperial units and Axis using metric, the two manufacturers' numbers are far apart.
- Bosch recommends ~50ppf
- Axis recommends ~80ppf
- The Swedish National Lab recommends ~150ppf (cited inside the Axis paper)
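To put the three figures on a common footing, here is a minimal sketch that converts each one to pixels per meter and shows how wide a scene a single camera could cover at that target. It relies on the usual definition of PPF (horizontal pixels divided by the width of the scene in feet). The 1920-pixel-wide camera and the function names are assumptions chosen for illustration; they do not come from the Bosch or Axis documents.

```python
FEET_PER_METER = 3.28084  # 1 m = 3.28084 ft


def ppf_to_ppm(ppf: float) -> float:
    """Convert pixels per foot to pixels per meter."""
    return ppf * FEET_PER_METER


def max_fov_width_ft(horizontal_pixels: int, target_ppf: float) -> float:
    """Widest scene (in feet) one camera can cover while meeting target_ppf."""
    return horizontal_pixels / target_ppf


# Approximate figures quoted above. The 1920-pixel width is an assumed camera.
recommendations = {"Bosch": 50, "Axis": 80, "Swedish National Lab": 150}

for source, ppf in recommendations.items():
    ppm = ppf_to_ppm(ppf)
    width = max_fov_width_ft(1920, ppf)
    print(f"{source}: ~{ppf} ppf (~{ppm:.0f} ppm), max scene width {width:.1f} ft")
```

With the assumed 1920-pixel camera, the same unit covers a scene 38.4 ft wide at ~50ppf but only 12.8 ft wide at ~150ppf, which is why the choice of target matters so much.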
Your first reaction might be, "What is going on here? Is someone wrong, lying or confused?"
Any of them might be the case individually, but, unfortunately, two fundamental issues exist with using these metrics.
2 Fundamental Issues
First, the terms are inherently subjective and depend on the perception and preferences of the individual viewer.
What is good enough for 'identification' can vary considerably among people. In our trainings and presentations, we have seen passionate debates break out about what passes for identification.
Try this for yourself. Look at the 3 images side by side below:
Which one, for you, is sufficient for identification? 40 PPF (on the left), 60 PPF (in the center) or 80 PPF (on the right)?
In an IPVM class, it was a three-way tie.
You face the same challenges in determining what's good enough for detection. Again, take these images side by side:
What makes this particularly challenging is that seemingly modest changes in visual details captured can mean the difference of 2x the number of pixels per foot, which can have a massive impact on camera layouts and project design.
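As a rough illustration of that impact, the sketch below estimates how many cameras are needed to cover a straight stretch at two different PPF targets. It assumes cameras placed side by side with no overlap, each 1920 pixels wide, covering a hypothetical 300 ft span. All of these numbers are made up for the example, and real layouts also depend on lens choice, mounting and scene geometry.

```python
import math


def cameras_needed(coverage_ft: float, horizontal_pixels: int, target_ppf: float) -> int:
    """Estimate cameras needed to cover coverage_ft at target_ppf.

    Simplification: cameras sit side by side with no overlap, and each
    covers a width of horizontal_pixels / target_ppf feet.
    """
    width_per_camera_ft = horizontal_pixels / target_ppf
    return math.ceil(coverage_ft / width_per_camera_ft)


# Hypothetical project: a 300 ft span, 1920-pixel-wide cameras.
for ppf in (40, 80):
    count = cameras_needed(300, 1920, ppf)
    print(f"{ppf} ppf target: {count} cameras")
```

Under these assumptions, moving the target from 40 to 80 ppf raises the count from 7 cameras to 13, so a disagreement over which image is 'good enough' translates directly into hardware and installation cost.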
When people look at an image by themselves, the answer often appears 'obvious'. It is only when they confer with other people that it becomes a debate as they realize many others have different perceptions.
How do you resolve this scientifically? It is a tough call and ultimately the 'winner' is whoever is paying for the system.
Pixels and Image Quality
The other issue is that the same number of pixels will deliver different image quality depending on the time of day. You might be OK with 50ppf for identification in perfect, even lighting, but if the sun sets or rises in line with the camera, you might need more. Worse, if you only have street lighting at night, you might need twice as many pixels per foot.
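One way to carry this into a design is to scale the daytime target by a margin for the worst lighting the camera will face. The sketch below does that with a small lookup table. Only the 2x figure for street lighting echoes the paragraph above; the 1.5x value for sun in line with the camera is a placeholder, and the table and function names are hypothetical. Treat the output as a starting point for testing on site, not as a guarantee.

```python
# Illustrative multipliers only. The 2.0 echoes the "twice as many" estimate
# above; the 1.5 is a placeholder assumption, not a measured value.
LIGHTING_MULTIPLIER = {
    "even daylight": 1.0,
    "sun in line with camera": 1.5,
    "street lighting at night": 2.0,
}


def design_ppf(base_ppf: float, conditions: list[str]) -> float:
    """Return the PPF target for the most demanding lighting condition listed."""
    return max(base_ppf * LIGHTING_MULTIPLIER[c] for c in conditions)


# A camera that must work both in even daylight and under street lighting.
target = design_ppf(50, ["even daylight", "street lighting at night"])
print(f"Design target: {target:.0f} ppf")
```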
What to Do?
Using these terms and thinking about them as a goal is useful. Just do not treat this as simple math that is guaranteed to always work.
Most importantly, set the right expectations:
- What level of detail does the user/buyer/decision maker really want? Decide with a visual example; don't just say 'detection' and hope they agree when the system is commissioned.
- What lighting or environmental issues will the system face, and how much of an impact will they have? Make this clear up front with the user/buyer/decision maker so they understand the inherent challenges of real-world operation.