PPF Test - Getting High Quality Surveillance Video

Published Apr 04, 2010 04:00 AM

Megapixel cameras foster hope for much higher quality surveillance video, but how much higher, and in what conditions? In this report, we answer these questions in depth based on extensive testing.

The most aggressive marketing claims suggest a single megapixel camera equals 95 CCTV cameras. Does that mean you could literally replace 95 CCTV cameras? If not 95, is it 25 or 16 or 4, etc.?

A 'Magic Number'

The megapixel vendors are now advocating a 'magic number' of 40 pixels per foot. They claim that if your Field of View provides 40 pixels per foot (e.g., a 1920 x 1080 camera covering a 48 foot wide FoV), then you can see facial details and license plates clearly.
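The arithmetic behind the vendors' example can be checked with a minimal sketch. The function name `pixels_per_foot` is our own illustration, and it assumes pixel density is simply the camera's horizontal pixel count divided by the FoV width in feet:

```python
def pixels_per_foot(horizontal_pixels: int, fov_width_ft: float) -> float:
    """Horizontal pixel density across a scene of the given width."""
    if fov_width_ft <= 0:
        raise ValueError("FoV width must be positive")
    return horizontal_pixels / fov_width_ft


# The example from the text: a 1920 x 1080 camera covering a 48 foot wide FoV.
print(pixels_per_foot(1920, 48))  # 40.0
```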

Some vendors qualify their number by saying it is a 'minimum' but then fail to offer any further disclosure or explanation. This is as helpful as the government coming to your house and telling you that you owe a minimum of $1,000 in taxes.

Our Test

Over a 3 week period, we tested these assumptions using a variety of cameras, resolutions and Fields of View. The video below gives an overview of how we approached our tests:

Our Findings

The following points are our key findings / recommendations:

  • In 'ideal' lighting conditions - even daytime lighting, no shadows, no glare - you need closer to 50 pixels per foot to see facial features clearly and read US license plates.
  • In even slightly challenging lighting conditions - moderate shadow or glare - you need ~20% more pixels to 'overcome' the decrease in contrast.
  • Since almost all surveillance scenes have some levels of shadow or glare throughout the day, if you want to ensure that facial features are clearly seen, you should target 60 pixels per foot.
  • At night, even with 5 to 10 lux of lighting (from street lights), quality will drop significantly. As we show in the results, video will be displayed but the visible noise level will significantly reduce the range where meaningful details can be seen. At night, you might need 120 pixels per foot (or more).
  • Image quality gradually degrades as the FoV width expands. There's no single point where quality goes from good to bad. Details gradually appear or disappear as the FoV width changes.
  • We found numerous levels of varying quality for surveillance use. While traditionally, surveillance applications had 3 quality levels (often called personal, action, scene), we found at least double that number. As quality degrades, some details still remain. Those details can still provide benefits depending on the application.
  • What quality level is good enough is somewhat subjective. Because quality gradually degrades, some users may find different levels of quality to be sufficient. For example, two people may view video from the same camera and one will judge 45 pixels per foot to be sufficient while another may prefer 55 pixels per foot.
  • Vertical distance covered varies dramatically with the focal length / horizontal angle of the lens. With a wide angle lens, it is nearly impossible to get facial features at more than a few feet distance from the camera (even with megapixel). With a telephoto lens, facial features can be captured at fairly far distances. The tradeoff of course is the width of FoV covered.
  • Moving from HD to 5MP provided benefits for wider FoVs. In FoVs narrower than 20-40' wide, it is unlikely that a material difference can be visually observed. At wider FoVs, modest increases in the ability to detect meaningful details were shown.
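To make the pixel targets above concrete, here is a rough sketch that converts them into the widest FoV a camera could cover. The dictionary labels and function names are hypothetical, and the 1920 pixel width is taken from the 1920 x 1080 example earlier in the report. The figures are only the simple division of horizontal pixels by the target density, not measured results:

```python
# Pixels-per-foot targets quoted in the findings above.
TARGET_PPF = {
    "ideal daylight": 50,
    "typical daylight (some shadow or glare)": 60,
    "night with street lighting": 120,
}


def max_fov_width_ft(horizontal_pixels: int, target_ppf: float) -> float:
    """Widest FoV (in feet) that still delivers the target pixel density."""
    if target_ppf <= 0:
        raise ValueError("target pixels per foot must be positive")
    return horizontal_pixels / target_ppf


def widths_for_camera(horizontal_pixels: int) -> dict:
    """Maximum FoV width for each lighting condition in TARGET_PPF."""
    return {
        condition: max_fov_width_ft(horizontal_pixels, ppf)
        for condition, ppf in TARGET_PPF.items()
    }


for condition, width in widths_for_camera(1920).items():
    print(f"{condition}: up to {width:.1f} ft wide")
```

Under these assumptions, the same 1920 pixel wide camera that covers 48 feet at the vendors' 40 pixels per foot covers noticeably less at the 60 pixels per foot target, and less again at the night figure.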

How to Perform Comparisons

Please view the video below to understand how to use the sample videos provided and how we conducted these comparisons.

Download Sample Videos and Comparison Slides

For your review, we are sharing a number of the original sample videos and slides we created.

We recommend you focus your review on the comparison slides as those slides provide a synopsis of images from a variety of FoV widths and resolutions.

Additionally, here are a few of the original video clips so that you can watch the 'raw' video to see how we performed the tests:

Lighting Variance Issues

In the remainder of the report, we provide estimates for scenes with 'ideal', evenly lit daytime conditions. In the screencast below, we emphasize 3 major common lighting variances that create significant issues:

  • Modest shadows/glare can increase the number of pixels needed for the same level of visible identification even during the middle of the day.
  • Outdoor scenes with solid levels of artificial street lighting will still significantly reduce visibility, demanding dramatically higher pixels per foot to identify images at the same level as day.
  • Variances in lighting at night can create shadows and glare so bad that megapixel resolution will provide little to no practical benefits over SD resolution.

Pixels Needed / Quality Provided

For ideal lighting conditions, we have segmented 7 discernible quality levels from the test videos we examined. Below are the categories (overlap occurs because of small variances between cameras, setups, etc.):

  • Difficult to detect person: < 8 pixels per foot
  • Rough guess of person (age, gender): 5 - 12 pixels per foot
  • Higher probability guess of person (hair, accessories, etc.): 15 - 24 pixels per foot
  • Blurry face (could identify if you already knew the person): 25 - 50 pixels per foot
  • Clear face (could identify a stranger): 50 - 80 pixels per foot
  • Like TV quality (very sharp details of face and body): 80+ pixels per foot
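As a rough illustration, the ranges listed above can be turned into a small lookup. This is only a sketch: the names `QUALITY_LEVELS` and `matching_levels` are our own, the shortened labels are paraphrases, and range ends are treated as inclusive except for the "< 8" level. Because the published ranges overlap in places and leave gaps in others, the function returns every level that matches, which may be none:

```python
import math

# (label, low, high, high_is_inclusive), in pixels per foot, from the list above.
QUALITY_LEVELS = [
    ("Difficult to detect person", 0.0, 8.0, False),
    ("Rough guess of person", 5.0, 12.0, True),
    ("Higher probability guess of person", 15.0, 24.0, True),
    ("Blurry face", 25.0, 50.0, True),
    ("Clear face", 50.0, 80.0, True),
    ("Like TV quality", 80.0, math.inf, True),
]


def matching_levels(ppf: float) -> list:
    """Return the labels of all quality levels whose range contains ppf."""
    if ppf < 0:
        raise ValueError("pixels per foot cannot be negative")
    matches = []
    for label, low, high, high_inclusive in QUALITY_LEVELS:
        below_high = ppf <= high if high_inclusive else ppf < high
        if low <= ppf and below_high:
            matches.append(label)
    return matches


print(matching_levels(40))  # ['Blurry face']
print(matching_levels(6))   # ['Difficult to detect person', 'Rough guess of person']
print(matching_levels(13))  # [] (falls in a gap between the listed ranges)
```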
 

In the screencast below, we discuss how we developed this and our observations of quality differences:

Differences in FoV Width for Various Resolutions

Increased resolution can increase the width of FoV that a camera can cover. In the table below, we approximate ranges that a SD, 2MP and 5MP camera can provide. A few key points to note:

  • All of these assume ideal lighting conditions. As shadows and glare increase and light levels decrease, the advantage of more resolution decreases.
  • In narrower FoVs (like 20-30 feet), you are unlikely to find any meaningful difference between a 2MP and 5MP camera.
  • The big jump is from SD to 2MP; a much smaller jump occurs from 2MP to 5MP (to be expected given the relative increase in pixels). Moving from 2MP to 5MP helps with areas of 30 - 150 feet wide.
  • At a 100' width, an SD camera only captures 'blobs' of humans but a 5MP camera can provide details on the age, gender, race and clothing of a suspect.
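The points above can be approximated numerically. In this sketch, the horizontal pixel counts for SD (640) and 5MP (2592) are assumptions typical of such cameras rather than figures from our tests. The 2MP value of 1920 comes from the 1920 x 1080 example earlier in the report. The names `HORIZONTAL_PIXELS` and `ppf_by_resolution` are illustrative:

```python
# Assumed horizontal pixel counts; only the 1920 figure appears in the report.
HORIZONTAL_PIXELS = {"SD": 640, "2MP": 1920, "5MP": 2592}


def ppf_by_resolution(fov_width_ft: float) -> dict:
    """Pixels per foot delivered by each resolution at the given FoV width."""
    if fov_width_ft <= 0:
        raise ValueError("FoV width must be positive")
    return {name: px / fov_width_ft for name, px in HORIZONTAL_PIXELS.items()}


for width_ft in (20, 30, 100, 150):
    row = ppf_by_resolution(width_ft)
    cells = "  ".join(f"{name}: {ppf:6.1f}" for name, ppf in row.items())
    print(f"{width_ft:>4} ft  {cells}")
```

Under these assumptions, both 2MP and 5MP exceed 60 pixels per foot at 20 to 30 feet, which fits the observation that the two are hard to tell apart at narrow widths. At 100 feet, SD falls to roughly 6 pixels per foot while 5MP stays near 26.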

In the screencast below, we discuss how we arrived at this table and some key conclusions of this analysis:

Variances in Pixels Needed for License / Number Plates

License / Number plates show interesting variation. Both the size of the plates and the background images on the plate can cause issues. This is especially important for US plates and for certain states that allow background images that lower contrast on the plate.

In the screencast below, we discuss our findings and how we tested for license plates.

Variances in Distance to Target with Different Lens Choices

In the study so far, we have primarily concentrated on the FoV width. Of course, the same FoV widths can be achieved with various lens options that can radically impact the vertical coverage area.

Many megapixel providers advocate the use of wide angle or super wide angle lenses so that you can cover far greater areas. The downside is that pixel 'density' drops significantly the wider the angle selected. The most important impact is the ability to detect facial details. With a wide angle lens, a target more than 10 feet from the camera is highly unlikely to have their facial details captured. As a practical example of this significant tradeoff, view our test results from Theia's wide angle MP lens.

As a reference, below is the Pixels Per Foot one can expect using various lens options. Notice that the more telephoto the lens, the more gradual the drop off in pixel density. [Note: for a 1/3" imager, a 90 degree FoV is achieved with a 2.4 mm lens, a 45 degree FoV with a 9 mm lens and a 15 degree FoV with an 18 mm lens.]

As an alternative view, see the table below which visualizes this relationship.
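The relationship between lens angle, distance and pixel density can be sketched with basic trigonometry. This is an idealized model, not our measured data: it assumes a distortion-free lens, a target centered in the frame, and a FoV width equal to 2 x distance x tan(horizontal angle / 2). The function names and the 60 pixels per foot target in the example loop are illustrative choices:

```python
import math


def fov_width_ft(distance_ft: float, hfov_deg: float) -> float:
    """Scene width at a given distance for a lens with the given horizontal angle."""
    if not 0 < hfov_deg < 180:
        raise ValueError("horizontal angle must be between 0 and 180 degrees")
    return 2 * distance_ft * math.tan(math.radians(hfov_deg / 2))


def ppf_at_distance(horizontal_pixels: int, distance_ft: float, hfov_deg: float) -> float:
    """Pixels per foot on a target at distance_ft from the camera."""
    if distance_ft <= 0:
        raise ValueError("distance must be positive")
    return horizontal_pixels / fov_width_ft(distance_ft, hfov_deg)


def max_distance_ft(horizontal_pixels: int, target_ppf: float, hfov_deg: float) -> float:
    """Farthest distance at which the target pixel density is still met."""
    if target_ppf <= 0:
        raise ValueError("target pixels per foot must be positive")
    return horizontal_pixels / (target_ppf * fov_width_ft(1.0, hfov_deg))


# Compare the three lens angles mentioned above for a 1920 pixel wide camera.
for angle in (90, 45, 15):
    reach = max_distance_ft(1920, 60, angle)
    print(f"{angle:>2} degree lens: 60 pixels per foot out to about {reach:.0f} ft")
```

In this model, pixel density falls in proportion to distance for every lens, but the narrower the angle, the farther away a given density is still available. The cost is a proportionally narrower FoV at any given distance.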

 

Summary Findings

Repeating our introduction, the following points are our key findings / recommendations:

  • In 'ideal' lighting conditions - even daytime lighting, no shadows, no glare - you need closer to 50 pixels per foot to see facial features clearly and read US license plates.
  • In even slightly challenging lighting conditions - moderate shadow or glare - you need ~20% more pixels to 'overcome' the decrease in contrast.
  • Since almost all surveillance scenes have some levels of shadow or glare throughout the day, if you want to ensure that facial features are clearly seen, you should target 60 pixels per foot.
  • At night, even with 5 to 10 lux of lighting (from street lights), quality will drop significantly. As we show in the results, video will be displayed but the visible noise level will significantly reduce the range where meaningful details can be seen. At night, you might need 120 pixels per foot (or more).
  • Image quality gradually degrades as the FoV width expands. There's no single point where quality goes from good to bad. Details gradually appear or disappear as the FoV width changes.
  • We found numerous levels of varying quality for surveillance use. While traditionally, surveillance applications had 3 quality levels (often called personal, action, scene), we found at least double that number. As quality degrades, some details still remain. Those details can still provide benefits depending on the application.
  • What quality level is good enough is somewhat subjective. Because quality gradually degrades, some users may find different levels of quality to be sufficient. For example, two people may view video from the same camera and one will judge 45 pixels per foot to be sufficient while another may prefer 55 pixels per foot.
  • Vertical distance covered varies dramatically with the focal length / horizontal angle of the lens. With a wide angle lens, it is nearly impossible to get facial features at more than a few feet distance from the camera (even with megapixel). With a telephoto lens, facial features can be captured at fairly far distances. The tradeoff of course is the width of FoV covered.
  • Moving from HD to 5MP provided benefits for wider FoVs. In FoVs narrower than 20-40' wide, it is unlikely that a material difference can be visually observed. At wider FoVs, modest increases in the ability to detect meaningful details were shown.

 

Questions?

The report examines a wide variety of issues, not all of which we have elaborated on. If you want clarification or expansion on certain topics, please ask.