Thanks for the report, Ethan. Prior to this feature, we had to use camera-side VMD, which works well enough but probably not as well as server-side VMD. It would be nice to see a report comparing server-side VMD to camera-side VMD. A lot of camera manufacturers seem to offer VMD now. With that in mind, you can purchase a much cheaper recording server and offload the VMD to the cameras (if you even need VMD processing).
You probably saw the recent announcement from Milestone that they are teaming up with NVIDIA to use NVIDIA GPUs for hardware acceleration. This should be interesting. We hope this feature is extended to video decoding in the Milestone Smart Client, which is currently our performance bottleneck.
Good report, Ethan. I would like to add a little to the info.
We still see many customers request Windows 7, and this OS does NOT support QuickSync. Only Windows 8 and newer OS versions do.
For the Professional and lower code versions, QuickSync is also used by the 'Mobile Server' service, according to the documentation:
"If the processor on the mobile server supports hardware accelerated decoding, it is by default enabled"
This service transcodes a video stream wherever it is running. One has to be aware of this load because it can easily swamp an underpowered CPU and starve the primary recording function.
Fortunately, one can install this service on a separate system.
Also note this limitation: "Hardware-accelerated decoding is not supported, if the mobile server is installed in a virtual environment."
This is simply a reflection of the fact that ESX currently does not support the integrated GPU.
Note: we removed "GPU" from parts of the test (including the title) to avoid the impression that this hardware acceleration was effective using a separate hardware GPU (e.g., Nvidia). QuickSync offloads processing to a "GPU", but it's actually a specific CPU architecture function, not a dedicated GPU.
However, if you use a tool such as GPU-Z (below, from our tests) to measure load, QuickSync load does show as GPU load:
QuickSync offloads processing to a "GPU", but it's actually a specific CPU architecture function, not a dedicated GPU.
Just to note, newer Nvidia cards also use fixed-function ASIC decoding, not traditional GPU cores. Apparently everything has come full circle (for decoding), from fixed-function to shader cores and back to fixed-function.
So basically Milestone found a way to make up for their inefficient CPU usage for server-side VMD. Reminds me of Blue Iris in terms of CPU load for VMD.
They could have just done it more efficiently, similar to how Nx Witness has handled server-side VMD with minimal CPU load for all these years. So in a way I figure it is more of a band-aid, throwing other technologies or brute force at the load.
Or Milestone can start specifying Quadro cards and saying, look how CUDA takes the load off the CPU. Or even incorporate other GPU technology, go the OpenCL route, and offload the work to a more parallel architecture.
Milestone XProtect will leverage NVIDIA GPUs and the CUDA parallel computing platform and programming model to provide parallel processing capabilities of recording servers, mobile servers and other video processing services.
Besides analyzing fewer pixels, which Nx Witness does by pulling a second, lower-resolution video stream, how do they do motion detection more efficiently?
Milestone defaults to analyzing 12% of the image, achieving a similar (but different) effect. I'm not sure how much overhead there is when decoding a subset of pixels of larger frames for motion detection compared to Nx Witness where presumably all pixels of the low resolution stream are analyzed, but from experience it is very rare to see high CPU utilization as a result of our motion detection.
If it happens, it is either because there are 200-300+ cameras on the server or the motion detection settings have been turned up such that 25-100% of the image is analyzed and/or analysis is done on more than just keyframes.
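To make the "analyze a subset of pixels" idea concrete, here is a minimal sketch of frame differencing on a sampled set of pixels. It is not Milestone's or Nx Witness's actual implementation; the function names, the thresholds, and the flat list-of-luma frame format are all assumptions for illustration, and decoding cost is ignored entirely.

```python
import random


def sample_positions(width, height, fraction=0.12, seed=0):
    """Pick a fixed set of pixel indices covering roughly `fraction` of the frame.

    Hypothetical helper: the 12% default mirrors the figure quoted above,
    but how a real VMS chooses its pixels is not known here.
    """
    total = width * height
    count = max(1, round(total * fraction))
    rng = random.Random(seed)  # seeded so the same pixels are compared every frame
    return rng.sample(range(total), count)


def motion_score(prev, curr, positions, pixel_threshold=25):
    """Fraction of sampled pixels whose luma changed by more than pixel_threshold."""
    changed = sum(1 for i in positions if abs(curr[i] - prev[i]) > pixel_threshold)
    return changed / len(positions)


def has_motion(prev, curr, positions, pixel_threshold=25, trigger_fraction=0.02):
    """Report motion when enough of the sampled pixels changed (assumed thresholds)."""
    return motion_score(prev, curr, positions, pixel_threshold) >= trigger_fraction


if __name__ == "__main__":
    width, height = 64, 36
    positions = sample_positions(width, height)

    still = [0] * (width * height)        # dark frame, flat list of luma values
    changed = [200] * (width * height)    # every pixel brightened

    print(len(positions), "of", width * height, "pixels are compared per frame")
    print("still vs still:  ", has_motion(still, still, positions))
    print("still vs changed:", has_motion(still, changed, positions))
```

The per-frame cost of the comparison scales with the number of sampled pixels, which is why raising the analyzed share from 12% toward 100%, or analyzing more than keyframes, multiplies the work per camera.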
The use of CUDA cores means even virtual machines could offload image processing tasks to hardware acceleration, which would enable higher-resolution and more accurate motion detection without impact to the CPU, or significantly higher camera/server density.
It's not something that has been a concern for most of our customers in the last 19 years, but it's an option which enables new system configurations or potential expansion of existing hardware with no additional cost to the software.