SVC - A Better H.264 Coming For Video Surveillance

Published Oct 02, 2008 00:00 AM

SVC will solve a key problem of H.264: H.264 generates a video stream of fixed quality and size, whereas SVC allows the stream to be resized dynamically, which can benefit video surveillance users greatly. The two main benefits are improved remote viewing and more efficient storage utilization.

This report provides an overview of the key elements and benefits. For greater depth, read a more technical tutorial on SVC [link no longer available].

H.264 provides benefits, but these may not be enough to meet video surveillance users' needs. It is sufficient when a small number of cameras share the bandwidth of a corporate network, but it is not efficient enough to reach remote locations over DSL. With megapixel cameras becoming increasingly common, bandwidth consumption is becoming an issue even on corporate networks.

The compression efficiency of H.264 requires significant processing power in both the compression and decompression engines. This raises the cost of encoding subsystems in cameras and DVRs, and makes decoding the stream on portable devices in the field prohibitively expensive. To make the streams more accessible, the surveillance community has fallen back on the techniques of the past: either simulcasting or trans-rating multiple versions of the same stream at different frame rates and resolutions. Each version is targeted at the specific compute and bandwidth characteristics of a particular client or application. In doing so, the costs of encode and decode are incurred multiple times. With the increasing diversity of video-enabled portable devices in the field and the desire to view the exploding number of available feeds from remote locations, this problem is set to get geometrically worse. Enter the Scalable Video Coding (SVC) extension to the H.264 standard.

SVC replaces the “all or nothing” approach to video compression (shared by MPEG-4 and conventional H.264) with a layered, scalable approach. In an SVC encoder, a low frame rate and low resolution version of the source video stream is processed first. This forms a baseline layer of encoded video. A second layer of information is then encoded from a higher frame rate or higher resolution version of the video stream, using the baseline layer to guide the encode process. A third layer of increased resolution or frame rate is then encoded using the second layer as a starting point. This process continues for each successive layer. Using previously encoded information to guide subsequent encodes reduces the overhead that would otherwise be incurred in a multi-encode system. At the end of the encode process, all layers are assembled into a single stream and transmitted.
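The layering idea can be illustrated with a toy example. The sketch below is not the H.264 SVC algorithm; it is a minimal, lossless model of resolution layering on a single row of pixel values, assuming the row length is divisible by 2 for every layer above the base. All function names are hypothetical. Each enhancement layer stores only the difference between the higher resolution row and a prediction made from the layer beneath it.

```python
# Toy model of layered coding on one row of pixel values.
# Illustrative only: real SVC uses far more elaborate inter-layer prediction.

def downsample(samples):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]


def upsample(samples):
    """Double the resolution by repeating each sample (a crude predictor)."""
    return [s for s in samples for _ in range(2)]


def encode_layers(samples, num_layers):
    """Return [base, residual_1, ...], lowest resolution first."""
    # Build the resolution pyramid, starting from full resolution.
    pyramid = [samples]
    for _ in range(num_layers - 1):
        pyramid.append(downsample(pyramid[-1]))
    pyramid.reverse()  # lowest resolution first

    layers = [pyramid[0]]  # the base layer is stored as-is
    for lower, higher in zip(pyramid, pyramid[1:]):
        prediction = upsample(lower)
        # Only what the prediction got wrong is stored for this layer.
        layers.append([h - p for h, p in zip(higher, prediction)])
    return layers


def decode_layers(layers, count):
    """Reconstruct the row using only the first `count` layers."""
    picture = layers[0]
    for residual in layers[1:count]:
        picture = [p + r for p, r in zip(upsample(picture), residual)]
    return picture


row = [10, 12, 20, 22, 30, 34, 40, 44]
layers = encode_layers(row, 3)
print(decode_layers(layers, 1))  # base layer only: [16, 37]
print(decode_layers(layers, 3))  # all layers: the original row
```

The enhancement layers here contain small values clustered around zero, which is the property that lets a real encoder spend fewer bits on them than it would on an independent encode of the same resolution.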

The advantage of this approach is that a client device can decode the received stream, starting with the baseline layer, and then decode incremental information from subsequent layers until the desired frame rate and resolution are achieved. A device with a lower resolution display or less compute power available for decode might elect to stop decoding after the first few layers. A higher powered or high definition client device might decode all of the layers as they arrive, thus obtaining the video at full resolution and frame rate. In this way, a single stream can service any client device simply by allowing the client to decide how much to decode. This characteristic of SVC streams will facilitate the adoption of high definition cameras, whose streams would otherwise need to be re-encoded for legacy devices.
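The client-side decision described above might be sketched as follows. This is a hypothetical model, not a real decoder API: the `Layer` class, the `layers_to_decode` function, and the resolutions and frame rates in the example ladder are all assumptions made for illustration. The client walks the layers in order and stops at the first one that exceeds what its display or processor can handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """What the video looks like once this layer has been decoded."""
    width: int
    height: int
    fps: int


def layers_to_decode(layers, max_width, max_height, max_fps):
    """Return the prefix of `layers` that the client should decode.

    Layers build on one another, so the result is always a prefix.
    The base layer is always included, since nothing is viewable without it.
    """
    chosen = []
    for layer in layers:
        fits = (layer.width <= max_width
                and layer.height <= max_height
                and layer.fps <= max_fps)
        if not fits and chosen:
            break
        chosen.append(layer)
    return chosen


# Hypothetical three-layer stream from a single camera.
stream = [Layer(320, 240, 7), Layer(640, 480, 15), Layer(1280, 720, 30)]

handheld = layers_to_decode(stream, max_width=640, max_height=480, max_fps=15)
workstation = layers_to_decode(stream, max_width=1920, max_height=1080, max_fps=30)
print(len(handheld), len(workstation))  # 2 3
```

Both clients receive the same stream; only the stopping point differs, which is why no per-client encode is needed.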

Another advantage of this approach is that a multi-layered stream can simply be truncated to yield a decodable stream with lower resolution and frame rate. This can be done within the network itself, with the stream being truncated as it passes from a high bandwidth link to a lower bandwidth link. In this way, the stream is sized to match the available network bandwidth, yielding video with reduced resolution or frame rate without the stream ever having to be decoded. This is a major improvement over the alternative, which requires a server in the network to decode the stream, scale the decoded video, and then re-encode the video as it is forwarded.
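In-network truncation can be modelled in a few lines. The sketch below assumes each packet is tagged with the index of the layer it belongs to and that the per-layer bitrates are known to the relay; the packet representation, the function names, and the bitrate figures are hypothetical. The point it illustrates is that the relay only inspects the layer tag and never touches the compressed payload.

```python
def max_layer_for_link(layer_kbps, link_kbps):
    """Highest layer index whose cumulative bitrate fits the link.

    The base layer (index 0) is always forwarded, even on a link that is
    too slow for it, because dropping it would leave nothing decodable.
    """
    total = 0
    highest = 0
    for index, kbps in enumerate(layer_kbps):
        total += kbps
        if total > link_kbps and index > 0:
            break
        highest = index
    return highest


def truncate(packets, highest_layer):
    """Forward only packets at or below `highest_layer`.

    Each packet is a (layer_index, payload) pair. Payloads pass through
    untouched: there is no decode, scale, or re-encode step.
    """
    return [(layer, payload) for layer, payload in packets if layer <= highest_layer]


# Hypothetical bitrates for a three-layer stream, in kbit/s per layer.
layer_kbps = [200, 600, 1800]

packets = [(0, b"a"), (1, b"b"), (2, b"c"), (0, b"d"), (1, b"e"), (2, b"f")]

dsl_limit = max_layer_for_link(layer_kbps, link_kbps=1000)
print(dsl_limit)                          # 1
print(len(truncate(packets, dsl_limit)))  # 4
```

Because the filter is a simple comparison per packet, it is cheap enough to run on a router or gateway rather than on a dedicated transcoding server.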

This same decimation process might occur after the video is captured and stored.  Parsing a stored file to remove some of the higher order layers would quickly and easily recover disk space in a DVR, while having the effect of reducing the video’s resolution or frame rate.  Using the scalability of an SVC encoded stream, a surveillance operator could gracefully degrade video over time to manage storage consumption.  In this way, video could be archived for longer using less storage than would be consumed by a conventionally encoded stream.
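A graceful degradation policy of this kind could look like the sketch below. The retention thresholds, the layer sizes, and the function names are all hypothetical; a real DVR would choose its own policy and would need to parse the stored file format to separate the layers. Here a recording is represented simply as a list of per-layer sizes in megabytes, lowest layer first.

```python
# Hypothetical policy: (maximum age in days, number of layers to keep).
# Recordings older than the last threshold keep only the base layer.
RETENTION_POLICY = [(7, 3), (30, 2)]


def layers_to_keep(age_days, policy=RETENTION_POLICY, minimum=1):
    """Number of layers a recording of this age should retain."""
    for max_age, layers in policy:
        if age_days <= max_age:
            return layers
    return minimum


def prune(layer_sizes_mb, age_days):
    """Drop the higher layers of an aged recording.

    Returns (sizes of the layers kept, megabytes recovered).
    """
    keep = layers_to_keep(age_days)
    kept = layer_sizes_mb[:keep]
    return kept, sum(layer_sizes_mb) - sum(kept)


recording = [100, 300, 900]  # base, second, and third layer sizes in MB

print(prune(recording, age_days=3))   # ([100, 300, 900], 0)
print(prune(recording, age_days=10))  # ([100, 300], 900)
print(prune(recording, age_days=90))  # ([100], 1200)
```

With these illustrative sizes, a recording older than a month occupies 100 MB instead of 1300 MB while remaining viewable at the base layer's resolution and frame rate.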

SVC is set to revolutionize the way video is moved, consumed, and stored. The flexibility afforded by the scalable stream will allow video to be accessed by a larger and more diverse set of consuming devices over a wide range of network bandwidths and technologies. Operators will be able to size encoded video cost-effectively and manage it over time with greater flexibility than ever before.