Is Client Live View Performance IP Video's Weakest Link?

Or am I doing something terribly wrong? I've noticed that several major VMS products really stress a client workstation's CPU when displaying multiple camera images for live viewing, much worse than in the old DVR days. I tend to believe that it is a function of multi-megapixel cameras and H.264 compression.

One example is the control room of a correctional facility where we installed 2 monitoring client PCs (Dell i5 processors and Nvidia graphics cards). Each workstation has 2 large LCD displays on which they monitor the facility. If we place, for example, 16 camera views on each monitor, the CPU utilization goes to 100%, yet the network utilization is only 2-3%. So the bandwidth is low, but the decoding must really stress the CPU. The only solution we've found, after discussing it with the VMS company, is to slow the live view frame rate way down. That greatly reduces CPU load, but makes the live view very choppy. We've seen this with other VMSes as well.

Has anyone else noticed this or discovered any cures for it (short of installing quad Xeons)?

Solution in a few words: Set up a D1@30FPS sub/extra stream for live viewing in your VMS.

Milestone XProtect and Luxriot VMS definitely support this, as we use it every day - however, we know many others support this feature as well.


An i5-3570 can decode only 8 x 2MP 25/30FPS streams, and we want to view more than that on the screen - what do we do?

Solution: Set up a D1@30FPS sub/extra stream for live viewing (when in a 4x4 grid or above) in your VMS (in this case, both Milestone and Luxriot 2.3+ support dual streaming/a dedicated live stream to enable this).

Requirements: The IP cameras must be able to deliver a second extra stream at D1/VGA resolution and 30fps. For example, the Dahua, Hikvision, Vivotek, Dynacolor, ACTi, Bosch and Axis cameras we have tested can all do this.

Why it works: The D1/VGA stream requires only 25% of the CPU that a 2MP stream needs to decode - so now you can decode up to 32 cameras simultaneously at 30fps with an i5-3570.

Extra notes: an i7-3770 can decode 48 D1@30fps streams in our testing.

What if we need to do even more cameras?: If you have that many, the camera windows will be really small - so you can set the substream to CIF/QVGA and again increase the number of displayable cameras by ~3X (not 4X, because by this point the sheer number of threads causes some overhead) to ~100 cameras on an i5 or ~120 cameras on an i7.

If you need even more than that, drop the framerate to 15FPS - in our tests that nets a ~30% saving in CPU usage.
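The figures above can be multiplied out as a rough back-of-the-envelope estimate. The sketch below is only that: it takes the i5-3570 baseline of 8 x 2MP streams and the relative savings quoted in this post (25% CPU for D1, a further ~3X for CIF, ~30% saving at 15FPS). The function and constant names are made up for illustration, and treating the 15FPS saving as extra headroom for more streams is an assumption, not something that was measured here.

```python
# Rough capacity estimate built only from the figures quoted in this thread.
# All names are hypothetical; the numbers are one poster's test results,
# not guarantees for any particular VMS, camera or CPU.

BASELINE_2MP_STREAMS = 8   # i5-3570: 8 x 2MP @ 25/30FPS
D1_COST_FRACTION = 0.25    # a D1/VGA stream needs ~25% of the CPU of a 2MP stream
CIF_GAIN_OVER_D1 = 3       # ~3X (not 4X) more streams at CIF/QVGA
FPS15_CPU_SAVING = 0.30    # ~30% less CPU when dropping to 15FPS


def estimate_streams(resolution: str, use_15fps: bool = False) -> int:
    """Estimate how many streams the baseline CPU can decode at once."""
    if resolution == "2MP":
        streams = float(BASELINE_2MP_STREAMS)
    elif resolution == "D1":
        streams = BASELINE_2MP_STREAMS / D1_COST_FRACTION
    elif resolution == "CIF":
        streams = BASELINE_2MP_STREAMS / D1_COST_FRACTION * CIF_GAIN_OVER_D1
    else:
        raise ValueError(f"unknown resolution: {resolution}")
    if use_15fps:
        # Assumption: a 30% per-stream saving leaves room for 1/0.7 as many streams.
        streams /= 1 - FPS15_CPU_SAVING
    return int(streams)


for res in ("2MP", "D1", "CIF"):
    print(res, estimate_streams(res), estimate_streams(res, use_15fps=True))
```

With these inputs the estimate gives 32 streams at D1 and 96 at CIF, which lines up with the "~100 cameras on an i5" figure. The 15FPS column is an extrapolation and should be checked on real hardware.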

Bohan, in that setting, can you still pick a few cameras and watch them in MP?

In Milestone you can manually switch to the full-res main stream in the XProtect Smart Client. In Luxriot it automatically switches to the full-res main stream when you zoom in to a single-channel view. Other VMS clients have similar features in between these two extremes.

The biggest problem is software decoding, which strains the CPU. One possible solution is to use hardware decoders connected to small monitors (i.e. instead of 4 x 42" monitors, perhaps 24 x 24" monitors), but the cost can be prohibitive.

That was something we looked at during system evaluations but only one of the systems (Dallmeier) we evaluated was really set up to use hardware decoders properly.

Bohan, you're mentioning CPUs being a factor, but isn't that only if you're using the CPU's embedded graphics? I think we've had better results installing aftermarket video cards with good GPUs (they don't have to be the latest, most expensive "gamer" types) and lots of video memory, 1GB or more, to improve stream decoding.

Luis, it is my understanding that only a few VMSes and a few video cards make use of the video card's decoding capabilities. Most systems decode in software, and that's the bottleneck.

Luis: GPUs have almost NO effect on performance as long as the drivers are okay (i.e. no problems with EVR/overlay). We have verified this both by questioning VMS developers and by comparing performance between Intel HD4000 graphics and an nVidia GeForce GTX Titan.

NO VMS client at the moment actually uses GPU accelerated H264 decoding (e.g. nVidia CUDA, Intel IQSV, AMD APP). Part of the reason is that most GPU acceleration SDKs can only support 1-4 streams at a time - creating negligible difference between on and off when decoding 16+ streams in surveillance applications.

Pretty much 100% of decoding is done solely using the CPU on VMS clients.

Note: Blu-ray playback on a PC typically DOES use GPU acceleration - but that is a single-stream scenario that does not scale to the 10/20/30/50+ streams of surveillance apps.

We have actually found the Intel HD4000 graphics built into the i7 to be the best GPU for VMS clients - because of the lack of funny "video quality enhancement" features in the drivers (yes AMD and nVidia - I am speaking of Catalyst and ForceWare here).

Thanks. I'll have to ask them about that. I never claimed to be the smartest of the bunch, just the better looking.

For flexibility, dual or quad CPU 8-core Xeon workstations (16 - 32 CPU cores total) are an option for supporting up to ~400 cameras across ~9 displays (most workstation mainboards will run out of bandwidth once you use more than 3 graphics cards, and most graphics cards cannot support more than 3 monitors).

Pelco's Endura system has "video wall" style hardware decoders as well as what Carl mentioned. However these devices are in danger of being rapidly obsoleted with the increasing use of 5MP+ cameras, H265, 4K cameras and other non-standard/emerging standard video streams.

It all comes down to whether you actually need to adjust the live view layout routinely. If not, then the live view can be spread across multiple screens and client workstations/decoders.

In a recent project we convinced the customer to go for 8 dedicated live view monitors with 8 x 1080p@25fps IP camera streams on each. We used 8 x Dahua NVR5232 200FPS/1080p standalone Onvif NVRs without HDDs as 8CH full-resolution realtime hardware decoders, and a single i7 workstation that was just used for spot investigations. The operator got to know the monitors pretty quickly and was able to bring up a full screen view of any camera in a couple of seconds on the workstation's two monitors - after which he could do playback/search/export/etc...

Bohan, great info and thanks for the detail. Glad it's not just me. I like the idea of a workstation for spot investigations and an "inbetween client" for the video wall type use. Are you separating the Dahua units from the VMS and having them pull directly from the cameras? Are they very expensive? Does that create any bottlenecks from the camera having to sort of multicast?

In my client's situation (and I have several more with the same type of setup), the monitoring views being presented on the large flat screens are very static. They want the guards to simply be able to view the multi-camera layout and not touch it. They even go so far as to remove the keyboard and mouse. In fact, using a PC client for this purpose brings other unintended issues such as Windows updates, reboots, etc., and the fact that I can't figure out how to start up a VMS client and have it automatically populate dual monitors with 2 different saved views.

The specific setup was like this:

64 x 2MP Dahua IP cameras powered directly by 8 x Dahua NVR5232P (each with builtin 120W 8 Port PoE switch and connected to its own monitor via HDMI).

For recording we have 4 x i7-3770 servers, each with 4 gigabit NICs, running Luxriot Advanced (16CH licenses) and manually set up to pull RTSP streams straight off the NVR5232P units on a per-channel basis. This is possible because Luxriot supports direct entry of RTSP URLs as camera channels and the Dahua NVRs can deliver 20 RTSP streams in total per NVR. This way we do not stress the cameras or the network bandwidth, as each NVR is powering 8 cameras and has a dedicated gigabit connection to a dedicated NIC on its corresponding server. The servers do software motion detection, which is more flexible and gives better smart search results than camera-side motion detection. This setup could have been Milestone or Digifort and probably most VMSes out there - RTSP is the universal "standard" here.
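To make the channel allocation concrete, here is a minimal sketch of how the 64 channels could be laid out across the 8 NVRs and 4 recording servers described above. It assumes two NVRs per server, which follows from 8 NVRs and 4 x 16CH licenses. The IP addresses and the RTSP URL pattern are placeholders invented for illustration; the real path depends on the NVR firmware and is not given in this thread.

```python
from collections import Counter

CAMERAS_PER_NVR = 8
NVR_COUNT = 8
SERVER_COUNT = 4
NVRS_PER_SERVER = NVR_COUNT // SERVER_COUNT  # 2 NVRs feed each 16CH server

# Hypothetical URL pattern; replace with whatever the NVR actually exposes.
URL_TEMPLATE = "rtsp://{nvr_ip}:554/hypothetical-path?channel={channel}"


def build_plan():
    """Return one entry per camera channel: which server and NIC pulls which URL."""
    plan = []
    for nvr in range(NVR_COUNT):
        server = nvr // NVRS_PER_SERVER + 1   # servers numbered 1..4
        nic = nvr % NVRS_PER_SERVER + 1       # dedicated NIC per NVR on that server
        nvr_ip = f"192.0.2.{10 + nvr}"        # documentation-range placeholder address
        for channel in range(1, CAMERAS_PER_NVR + 1):
            plan.append({
                "server": server,
                "nic": nic,
                "nvr": nvr + 1,
                "url": URL_TEMPLATE.format(nvr_ip=nvr_ip, channel=channel),
            })
    return plan


plan = build_plan()
per_server = Counter(entry["server"] for entry in plan)
print(len(plan), dict(per_server))  # 64 channels, 16 per server
```

Each server ends up with exactly 16 channels, matching the 16CH licenses, and each NVR serves 8 of its 20 available RTSP streams to the recorder.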

Finally we have the 4 VMS servers connected to a gigabit switch that links to the client workstation for spot investigations.

The setup is actually quite neat with no wasted resources/ports. Also if anything fails only part of the system goes down.

Power efficiency is also not bad as each of those NVRs only eat 20W (excluding the PoE switch of course).

There are quite a few manufacturers now making these standalone "Green" NVRs, which can equally be deployed as live view decoders.

We have seen the Pelco Endura solution in action, and its decoders are fully controllable from the workstation, but you mentioned you do not even want such a capability.

NOTE: none of the standalone NVR business above is actually necessary if you just set up a less resource-hungry live view extra/sub stream in your VMS - how many cameras are you dealing with anyway?

Regarding pricing, the Dahua NVR5232P units are ~$900 each, so similar to the cost of a PC workstation.

About 100 cameras total, but only about 60 or so are Live View monitored. Your solution would also require a change to the cabling topology, as it appears all of the cameras would be wired to the central control room.

John: In your case I would just recommend reconfiguring your VMS, not adding extra hardware. Let us know which VMS and cameras you use and we can help with the settings.

Excellent discussion. Bohan, thanks for the detailed feedback.

What Bohan described initially is typically called / marketed as 'multi-streaming', where the VMS dynamically picks from multiple streams at different quality levels depending on the size of the display frame. For example, if a VMS is showing a 3 x 3 matrix, a lower quality stream might be displayed, saving on CPU. However, if a single camera is displayed by itself, the VMS would dynamically switch to the highest quality stream.
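As a rough illustration of the idea, the sketch below picks the smallest stream that still fills a display pane and falls back to the largest stream otherwise. This is not how any particular VMS implements it; the class, function names and the selection rule are assumptions made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    name: str
    width: int
    height: int


def pane_size(monitor_w: int, monitor_h: int, grid: int) -> tuple[int, int]:
    """Size of one pane in a grid x grid layout (borders ignored)."""
    return monitor_w // grid, monitor_h // grid


def pick_stream(streams: list[Stream], pane_w: int, pane_h: int) -> Stream:
    """Return the smallest stream that covers the pane, else the largest one."""
    by_size = sorted(streams, key=lambda s: s.width * s.height)
    for stream in by_size:
        if stream.width >= pane_w and stream.height >= pane_h:
            return stream
    return by_size[-1]


# Hypothetical camera with a 2MP main stream and a D1 sub stream.
streams = [Stream("main", 1920, 1080), Stream("sub", 720, 480)]

for grid in (1, 2, 3, 4):
    w, h = pane_size(1920, 1080, grid)
    print(f"{grid}x{grid}: pane {w}x{h} -> {pick_stream(streams, w, h).name}")
```

On a 1080p monitor this rule keeps the main stream for single and 2 x 2 views and drops to the sub stream from 3 x 3 upward. A real client might also weigh CPU load or let the operator override the choice.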

As Bohan and others have mentioned, VMSes implement this differently, but most high-end ones support it.

John, what VMSes are you using? We could then speak more specifically about recommendations.

As for Bohan's suggestions of using NVRs as hardware decoders with separate VMS servers for recording, the main limitation (outside of the cost) I see is that those 'decoders'/monitors would only show live video. You'd have to switch client application to see recorded video. Am I understanding this correctly?

"the main limitation (outside of the cost) I see is that those 'decoders'/monitors would only show live video. You'd have to switch client application to see recorded video. Am I understanding this correctly?"

Yes John, this is the main limitation - lack of management. I mentioned that a system like Pelco Endura does have managed hardware decoders - but at many times the cost of a basic standalone NVR.

A lot of good information here.

HD Witness from Network Optix has a nice feature where the VMS automatically switches to the second stream from the camera when you view more cameras, and back to the main stream when you open one camera in full screen. I find it kind of strange that this feature is not more common.

Fredrik, that's multistreaming and a lot of VMSes do have it. I think one differentiator for HD Witness is that it does it automatically whereas some VMSes require you to configure it for each stream.

Btw, as another example, Avigilon often markets 'HDSM' but for H.264 cameras, it's multi-streaming.

It is the lack of automatic switching between the streams, based on how many cameras you view or on the CPU usage, that I'm surprised is not more common. As Bohan mentioned earlier, you can set up both streams in Milestone XProtect, but you will have to switch between them manually.

Here's an example of what Fredrik describes done with OnSSI Ocularis:

Every camera in this system was multi-megapixel, but (my previous employer) configured secondary streams at reduced resolution (800 x 600) and reduced framerate (8 FPS) for the thumbnails. Video was fluid and sharp enough for the small thumbnails, but if another matrix view or a single view was clicked up, the stream changed to higher quality. Despite simultaneously handling over 30 streams, the customer powered this hotspot monitor using a 'Mac mini'.

I'm not sure this customer was ever convinced that so many thumbnails at once was pointless.

Are we talking about MP resolution or normal resolution (4CIF, VGA)?

For troubleshooting, I will do two things: 1) I will try to switch the live view from H.264 to MJPG to see the impact of H.264 decoding. 2) I will see what will happen if I used one monitor with 32 views instead of two monitors with 16 views each.

Tariq, if you have 32 cameras displayed on a monitor, you would ideally want your VMS to display CIF resolution streams for each camera because (1) each video pane would be fairly tiny and (2) this would radically reduce CPU consumption needed to display.

As others have said, I find the best practice is to use MJPEG (sometimes at lower resolution) for live view and H.264 for recording. Trying to simultaneously decode 16 H.264 streams puts a real burden on the client workstation processor. Some VMSes, like Genetec Security Center, can adjust the viewing resolution depending on how many images you have displayed on the screen.

Hi John: Our cameras have issues with MJPEG, so I never got a chance to test this, but according to the IPVM H264 Codec Shootout:

  • We also tested MJPEG. Bandwidth consumption was 10-100x than main profile. Equally interesting, CPU usage was generally significantly higher than either of the H.264 streams.

Can you comment on this? At which resolutions and framerates does MJPEG have a CPU advantage vs H264?

Bohan, That's like Dallmeier. The difference is that Dallmeier's decoders are also recorders with hard drives. Darned expensive, though.

I uploaded some screenshots so you could see for yourself the CPU utilization we are getting. All cameras are 3MP 8fps and have VGA 8fps sub streams. The client you see is actually running on the server. You can see in the bottom right corner of the first image that we are decoding 128fps, which is 16 channels x 8fps. We are not dropping a single frame.
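For anyone who wants to redo the arithmetic, here is a small sketch that computes the aggregate frame rate and the raw pixel rate the client has to decode. The post does not say whether the main or the sub streams are on screen, so both are shown. The 2048 x 1536 size used for "3MP" is an assumption, and pixel rate is only a rough proxy for decode cost.

```python
def decode_load(channels: int, width: int, height: int, fps: int):
    """Return (frames per second, pixels per second) across all channels."""
    frames_per_second = channels * fps
    pixels_per_second = frames_per_second * width * height
    return frames_per_second, pixels_per_second


# 16 channels at 8fps, as in the screenshots described above.
main_fps, main_pixels = decode_load(16, 2048, 1536, 8)  # assumed 3MP frame size
sub_fps, sub_pixels = decode_load(16, 640, 480, 8)      # VGA sub stream

print(f"main: {main_fps} fps, {main_pixels / 1e6:.1f} Mpixel/s")
print(f"sub:  {sub_fps} fps, {sub_pixels / 1e6:.1f} Mpixel/s")
print(f"main/sub pixel ratio: {main_pixels / sub_pixels:.2f}")
```

Both cases give the same 128fps total, but under these assumptions the main streams carry roughly ten times the pixels of the VGA sub streams, which is why the choice of stream matters far more than the frame count alone.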

John, this might be a good topic to do a shootout on. I would like to see which VMS client is the most efficient at decoding video.