Aspect Ratio 16:9 vs 4:3 Shootout

Published May 03, 2012 04:00 AM

In the past few years, as HD swept into living rooms, people have moved from watching video in the 4:3 aspect ratio to the wider 16:9 format. This has carried over into surveillance, where the wide-screen HD format has become very popular. Indeed, in a recent reader survey asking respondents to choose between 4:3 and 16:9, the wide format won in a landslide:

IPVM Image

The premise behind the 16:9 preference is that most surveillance scenes are wide but not tall (i.e., there are no 10-foot-tall people or 30-foot-tall trucks).

Unfortunately, while this assumption is true, acting on it detracts from real-world performance in practice.

Our Tests

This surprised us as well, but our tests, across a variety of real-world scenes, repeatedly showed clear practical benefits for the 'full' 4:3 aspect ratio. The scenes included:

  • A small indoor conference room
  • An indoor lobby
  • An outdoor intersection - with both wide and telephoto FoVs
  • An outdoor parking lot - with both wide and telephoto FoVs

These sample images were produced by the same camera, in the same location, switching the aspect ratio from 4:3 (1.3 MP - 1280 x 1024, strictly a 5:4 ratio) to 16:9 (720p - 1280 x 720).

Key Factors

There are three key factors that affect which aspect ratio should be used:

Taller Not Wider: The term 'wide' is a misnomer for surveillance applications. For any given sensor, the FoV width for 4:3 and 16:9 is exactly the same. The only thing that differs is the height. For example, with a 1.3MP sensor, the 4:3 aspect ratio is typically 1280 x 1024 while the 16:9 version is 1280 x 720. The number of pixels across stays constant. In 16:9 you simply lose 304 rows of pixels. Ultimately, this is the core of the problem.
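
The arithmetic above can be checked with a short sketch. The function name and the returned fields are illustrative, not part of any camera API; the resolutions are the ones quoted in this article. Note that 1280 x 1024 reduces to 5:4 rather than exactly 4:3, which does not change the row arithmetic.

```python
from fractions import Fraction

def compare_modes(width, full_height, wide_height):
    """Compare a 'full' mode and a 16:9 mode that share the same pixel width."""
    rows_lost = full_height - wide_height
    return {
        # Fraction reduces the ratio to lowest terms (e.g. 1280/720 -> 16/9).
        "full_ratio": Fraction(width, full_height),
        "wide_ratio": Fraction(width, wide_height),
        "rows_lost": rows_lost,
        # Share of the full image's rows (and so of its pixels) that is dropped.
        "fraction_lost": rows_lost / full_height,
    }

comparison = compare_modes(1280, 1024, 720)
print(comparison["rows_lost"])
print(f"{comparison['fraction_lost']:.1%} of the full image is dropped")
```

Because the width is unchanged, the share of rows lost is also the share of total pixels lost, which is just under 30% for this pair of resolutions.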

What may cause confusion is that in TV, unlike surveillance, the wide-screen aspect ratio actually adds more content on the left and right sides. By contrast, in surveillance, you simply lose content at the top and bottom. This image demonstrates how the two applications differ:

IPVM Image

Notice that the wide TV shot shows more details on the left and right sides while the wide CCTV shot actually loses details on the bottom.

Downtilt: If cameras are installed with anything other than slight downtilt, 4:3 better fits the field of view. While 16:9 may remove potentially wasted ceiling or sky from the image, it also removes portions of the scene closer to the camera, where pixel density is highest. Aiming the camera further down to compensate quickly begins to remove wanted areas from the scene, and may cut off subjects' heads or simply not provide a deep enough FOV.
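
The downtilt effect can be illustrated with a rough geometric sketch. It assumes an idealized pinhole camera, flat ground, square pixels, and a 16:9 mode that crops equally from top and bottom; none of these are claims about the cameras tested here. The mounting height, tilt, and horizontal FoV are hypothetical values chosen only for illustration.

```python
import math

def vertical_fov(hfov_deg, cols, rows):
    """Vertical FoV implied by a horizontal FoV and a pixel grid (square pixels)."""
    half_h = math.radians(hfov_deg) / 2
    return 2 * math.degrees(math.atan(math.tan(half_h) * rows / cols))

def ground_coverage(height, tilt_deg, vfov_deg):
    """Nearest and farthest ground distance seen along the camera's centerline.

    Assumes tilt_deg + vfov_deg / 2 is below 90 degrees, i.e. the camera
    does not look straight down past its own mounting point.
    """
    lower = math.radians(tilt_deg + vfov_deg / 2)  # ray through the bottom edge
    upper = math.radians(tilt_deg - vfov_deg / 2)  # ray through the top edge
    near = height / math.tan(lower)
    # If the top edge points at or above the horizon, coverage is unbounded.
    far = height / math.tan(upper) if upper > 0 else math.inf
    return near, far

# Hypothetical installation: 10 ft mounting height, 30 degree downtilt, 60 degree HFoV.
HEIGHT_FT, TILT_DEG, HFOV_DEG = 10, 30, 60

near_full, far_full = ground_coverage(
    HEIGHT_FT, TILT_DEG, vertical_fov(HFOV_DEG, 1280, 1024))
near_wide, far_wide = ground_coverage(
    HEIGHT_FT, TILT_DEG, vertical_fov(HFOV_DEG, 1280, 720))

print(f"full: {near_full:.1f} ft to {far_full:.1f} ft")
print(f"16:9: {near_wide:.1f} ft to {far_wide:.1f} ft")
```

Under these assumed values, the full mode begins covering the ground roughly 7 ft from the camera, against roughly 9 ft for 16:9, and it also reaches farther at the top of the frame.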

Telephoto zoom: In wide fields of view, objects at the periphery, such as trees, shrubs, and walls, are often irrelevant, adding nothing to the scene. With telephoto lenses, however, more of the field of view is likely to contain relevant information. When FOVs are only 5-10' wide, capturing an extra foot of depth provides proportionately more information than in a wide field of view.
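
To put rough numbers on the telephoto case, the sketch below computes the height of the framed area at the target plane for the 5' and 10' wide FoVs mentioned above. It assumes square pixels and measures height in the image plane at the target distance, which is not the same as ground depth under downtilt. The function name is illustrative.

```python
def frame_height(fov_width, cols, rows):
    """Height of the framed area at the target plane, in the unit of fov_width."""
    return fov_width * rows / cols

heights = {}
for width_ft in (5, 10):
    full = frame_height(width_ft, 1280, 1024)  # 'full' mode
    wide = frame_height(width_ft, 1280, 720)   # 16:9 mode
    heights[width_ft] = (full, wide)
    print(f"{width_ft} ft wide: full {full} ft, 16:9 {wide} ft, "
          f"difference {full - wide} ft")
```

Pixel density across the scene is identical in both modes, since the width and column count do not change; only the height of the covered area differs.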

These factors, and their effects, are demonstrated in the application images below.

Indoor Conference Room

In this example, with a camera located at ceiling height, nearly two full seats at the table are lost when switching to 16:9, with no offsetting benefit, since no wasted area is removed from the image. Subjects' actions while seated in these two seats would likely be difficult or impossible to determine, though identification would still be possible as they walked around either side of the table to sit down.

IPVM Image

Indoor Lobby

This lobby example demonstrates the gain that the added depth of 4:3 provides:

IPVM Image

In the 4:3 example, the entry doors are covered, as well as the walkway near the camera. Using the 16:9 aspect ratio, one or the other is sacrificed. Aiming the camera down to better capture the near area points it too low to capture many subjects' faces as they enter. Aiming it higher to capture faces sacrifices the area closest to the camera, which offers the best chance of recognition.

Outdoor Intersection

This scene, a wide-angle view of the intersection, is one example where 4:3 provides a limited benefit over 16:9:

IPVM Image

The 16:9 image contains less wasted space, cutting off landscaping and skyline that the 4:3 image includes. Landscaping in particular may create extra motion in the scene, increasing bandwidth and storage needs.

However, when zooming in, 16:9 becomes problematic. In this case, 16:9 is only able to capture either the far part of the intersection or the near part, while 4:3 is able to capture the entire view at the same resolution:

IPVM Image

Outdoor Parking Lot

The advantages and disadvantages of 4:3 and 16:9 are prominent in this outdoor parking lot scene as well. Using a telephoto lens, the 16:9 image loses an entire lane of traffic in this example:

IPVM Image

However, when using a wide angle lens, only part of a parking space is lost:

IPVM Image

Unlike in the intersection example, 16:9 does not remove any completely irrelevant information here. The portion of the parking space lost is relevant to the scene, however small it may be.

Corridor Format

We did a separate test on the mode that flips 16:9 to 9:16. See: Corridor Mode Tested

Image Cropping/Privacy Masking

Though 4:3 likely fits most scenes better, there still may be a desire to remove some unwanted portion of the scene. This is possible using two camera features:

  • Image cropping: Image cropping allows users to select a custom portion of the camera's FOV to be viewed and recorded, removing the rest. For example, users may remove drop ceiling from interior cameras or blank sky from exterior views, reducing bandwidth and storage. As with corridor format, custom crop views may leave blank space to the sides of the video, since they use non-standard aspect ratios. Additionally, not all manufacturers support image cropping, even among major ones, and those that do may not support it across all cameras in their line.
  • Privacy masking: Privacy masking may be used to remove irregularly shaped areas from the field of view, along edges (similar to cropping), as well as within the video itself. These masks reduce bandwidth and storage proportionately, as no video is sent for the masked portion of the FOV. This may be useful for masking out objects with regular movement irrelevant to the scene, such as the landscaping in our intersection example above. See our Reducing Bandwidth Through Privacy Masks overview for more detail on this subject.

Camera Support

While many cameras are marketed as wide-screen HD (either 720p or 1080p), most of them support the 'full' 4:3 aspect ratio as well. For instance, a '1080p' camera that streams at 1920 x 1080 will often also support a 1920 x 1440 (4:3 aspect ratio) stream.

Carefully check that cameras marketed as HD also support the 'full' 4:3 aspect ratio. Lack of such support should be treated as at least a minor deficiency.
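
One way to run this check against a spec sheet is sketched below. The resolution list is a hypothetical example, not taken from any real camera, and the function name is illustrative. The check looks for modes that keep the camera's maximum width while being taller than 16:9.

```python
from fractions import Fraction

def taller_modes(resolutions):
    """Return modes that keep the maximum width but are taller than 16:9."""
    max_width = max(width for width, _ in resolutions)
    wide = Fraction(16, 9)
    return [
        (width, height)
        for width, height in resolutions
        # A smaller width-to-height ratio means a taller image.
        if width == max_width and Fraction(width, height) < wide
    ]

# Hypothetical spec-sheet listing for a camera marketed as 1080p.
spec_sheet = [(1920, 1440), (1920, 1080), (1280, 720), (640, 480)]
full_modes = taller_modes(spec_sheet)
print(full_modes or "no full-height mode listed")
```

Lower-resolution 4:3 streams such as 640 x 480 are deliberately ignored, since they do not provide the extra rows at the camera's full width.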