Intel Neural Compute Stick 2 / Movidius AI Test

By IPVM Team, Published Nov 21, 2018, 07:29am EST

AI is a major trend in video surveillance, with manufacturers paying significant attention to Intel's Movidius Myriad chips. Indeed, Avigilon has announced that its next-generation AI H5 cameras will use the Myriad X.

But how well do these Myriad X chips perform?

IPVM has done its first testing of the Myriad X chip (via the NCS2) and shares results inside this report.

Intel / Movidius / Neural Compute Stick Overview

Cameras like Avigilon's will use Intel / Movidius chips inside them to run deep learning / artificial intelligence. To the side is an image of a Myriad X chip. Video will be sent into the on-board chip, the chip will run deep learning models, and it will output categorizations of people, objects, faces, etc.

Just a week ago, Intel announced its second generation of 'Neural Compute Sticks' (NCS). While the first NCS had the original / older Movidius Myriad 2 chip, this new one has the newer Myriad X inside, which Intel claims "delivers 10X the performance compared to previous generations".

IPVM bought the NCS2 and tested it.

Areas Analyzed

Inside this report, we share results across:

  • Face detection using Intel's face-detection-retail-0004 model
  • Testing across Intel i7, Myriad 2 and Myriad X
  • FPS, CPU and RAM consumption variances across devices
  • Accuracy comparison of detection
  • YOLO support and issues

Key ********

*** *** ******** **** our ******* ***:

  • Myriad * ~= *.** ****** * ***: The NCS2 delivered roughly 2.7x the frames per second of the NCS1 on the Intel face detection model we tested. Intel says that different optimizations and measurement changes would increase this significantly (discussed in caveats below).
  • Faster **** *** (**): The NCS2 / Myriad X outperformed the i7-7700K chip tested here (a top-of-the-line chip in 2017), a notable advance given that the Myriad X costs significantly less and consumes far less power.
  • Little/no ******** ****: Our tests noted no differences in output video accuracy, in line with Intel's feedback that OpenVINO's model-optimization step should be considered "lossless." Intel pointed out that allowing lossy model compression (when converting a model to run on Myriad, e.g. FP32->FP16 data-type downsizing) would cause issues for vendors who train a model with certain expectations and then have a different (lossy) model for inference.
  • Model ************* ** ******: While Intel has mentioned YOLO support for OpenVino, the current YOLOv3 is not yet compatible on Myriad chips due to a particular network layer. ***** **************** ***** *** ******* converting ****** ** ** run ** ***, *** YOLOv3 ** *** *** compatible ** ****** ******* due ** * ********** network *****. **** ** & **: ***** ******* these *** ** *** on ******, *** *** tests *******.**********+***: *** ***** in **** ****** **** variety, ************ ***** ****-*********-******-**** (SqueezeNet ***** **** * single-shot ********).*********+***: *** ***** pending. ** **** ******* public ****** **** ****** to *****'* *** ******.
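The FP32->FP16 data-type downsizing mentioned in the findings above narrows each stored value to 16 bits. The sketch below is only a rough illustration of what that narrowing does to individual numbers. It does not reproduce OpenVINO's Model Optimizer and says nothing about detection accuracy, which is what the tests above measured. The weight values are made up for illustration, and `to_fp16` is a hypothetical helper built on Python's standard `struct` module.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The 'e' format character packs a 16-bit half-precision float.
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Illustrative values only (assumption); not taken from any real model.
weights = [0.1, 0.333333, 1.0, 1234.567]

for w in weights:
    w16 = to_fp16(w)
    print(f"{w!r:>12} -> {w16!r:<22} abs error {abs(w - w16):.3e}")
```

Values that are exactly representable in 16 bits (such as 1.0) survive unchanged, while others pick up a small rounding error that grows with magnitude.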

*** ****** ***** ******** performance *******:

[Chart: Intel performance comparison]

Caveat ** *** / *****

***********, *****'* *** ******* are **** ****** **** our ***, *** **** reasons. **, *****'* ****-*********-******-**** model ******* *** *** on ***, **.* ** Myriad2 (**** ***** ******** comparable ******* ******* ** will ******).

* *** ****** ** our ***** ***** ******* out: (*) ** *** Python, **** ******* **** their *++ ****; (*) our ***-******* ******** ******* code (** ****** *******/******* frames), *** ********* ***** code; (*) ** *** not ***** ******* ************ handling ** ***** ***** loading, ***** ** ***** into ********.

Video **********

***** ** * ***** screencast **** ******* *** testing:

Questions / ******** / **** ** **** ****?

** **** ******* ******* of *** **** / Myriad * ***** **** other ********* *** **** more ****** **** *** next *** *****. **** would *** **** ** see ******?

Comments (12)

I'd love to understand this more so I can play. Do you "plug" this stick into a computer and use a program and any camera to start classifying objects? Are these programs open source or freeware for education?

You just insert it into a USB slot like any other USB device. As I understand, these Neural Compute Sticks aren't really for production applications, but for developing/testing your neural network / program on the Myriad chip (to see how it performs, ensure it's compatible, etc). You'll be using an integrated Myriad chip in the real world.

Feel free to use our code in the repo above (I'll see about getting an open-source license in it), but it's just demo code for now. You'll be better served cloning OpenVINO's own sample C++ files. You can find these (after you install OpenVINO) at ~/intel/computer_vision_sdk/deployment_tools/inference_engine/samples. As to "are they open source / freeware", my educated guess is "yes" - they want to get you started. But you'll want to read the official license (lots of content) at ~/intel/computer_vision_sdk/licensing.

IPVM bought the NCS2...

Roughly speaking, are we talking $ or $$ or $$$?

 

NCS2 was $100. See here for purchasing, but they're all back-ordered currently.

Mouser says they have them in stock. I ordered one 2 days ago and it is arriving Friday.

 

Dumb question - does 622% CPU utilization on the i7-7700k mean that each processor was maxed out along with some of HT or is it relative against some other metric?

 

Also, how hot did the unit get during testing? Is there a significant amount of cooling on the stick? Just wondering what the impact will be on the camera's cooling/size.

Not a dumb Q at all, it's one I'm investigating myself. I think it's as you intuit, it considers 100% full utilization of 1 core out of i7's 8 logical cores (or comparable aggregate across them). The i7-7700K has 4 physical cores with Hyper-Threading, so 8 logical cores. So conceptually I'm thinking it used about 6.2 of 8 logical cores maxed.

I tried the stick now and felt it, no noticeable heat. I'll try to think on a better way of gauging. This might be a Q for Intel, temperature profile of Myriad X chip (on board rather than stick) compared to some common CPUs.

I think it's as you intuit, it considers 100% full utilization of 1 core out of i7's 8 logical cores (or comparable aggregate across them).

If that is indeed the case, I would suggest that the values need to be normalized before charting, e.g. 622%=78%.

Though what explains the 104%, since the X has 16 cores? Is it only using slightly more than 1?
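The normalization suggested above can be sketched as follows. This assumes the reported figure counts 100% per logical core (as tools like `top` do) and that the i7-7700K exposes 8 logical cores (4 physical cores with Hyper-Threading). `normalize_cpu_percent` is a hypothetical helper name, not part of any tool used in the test.

```python
def normalize_cpu_percent(raw_percent: float, logical_cores: int) -> float:
    """Convert a per-core-summed CPU figure to a share of total capacity."""
    if logical_cores <= 0:
        raise ValueError("logical_cores must be positive")
    return raw_percent / logical_cores


raw = 622.0        # figure reported for the i7-7700K in the chart
cores = 8          # assumption: 8 logical cores

print(f"cores' worth of load: {raw / 100:.2f}")
print(f"share of whole CPU:   {normalize_cpu_percent(raw, cores):.2f}%")
```

Under those assumptions, 622% corresponds to about 6.2 logical cores fully busy, or roughly 78% of the whole processor.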

I did some further digging on the thermal portion. Since it is Intel, the specs are available for TDP (Thermal Design Power). It’s very low draw and heat. I am not certain how it compares to the TDP of a standard chip used in cameras, as it appears the industry likes to obscure this data as much as possible... or may not even know. I haven’t even been able to find wattage or thermal info on Axis Artpec chips. Here is how Myriad X compares to other common processors:

Myriad X - 1.5 watt TDP

i7-7700k - 91 watt TDP

Atom processor - 2-10 watt TDP

Nvidia GM206 - up to 120 watt TDP <— this is what is on the GPU that Avigilon uses for server side appearance search analytics currently
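To put the figures listed above on one scale, here is a small sketch that expresses each TDP as a multiple of the Myriad X's. The numbers are the ones quoted in this comment. The Atom range is entered as two separate entries, and the GM206 value is the quoted upper bound. TDP is a design figure, not measured draw, so the ratios are only a rough guide.

```python
# TDP values in watts, as quoted in the list above.
tdp_watts = {
    "Myriad X": 1.5,
    "Atom (low end)": 2.0,
    "Atom (high end)": 10.0,
    "i7-7700K": 91.0,
    "Nvidia GM206 (upper bound)": 120.0,
}

baseline = tdp_watts["Myriad X"]

for name, watts in tdp_watts.items():
    ratio = watts / baseline
    print(f"{name:<28} {watts:>6.1f} W  {ratio:>5.1f}x Myriad X")
```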

 

I haven’t been able to find any wattage/thermal info for mobile processors like Snapdragon or Apple.  Apple tends to obscure everything they can.  I suspect they are sub 1 watt but throttle back heavily based on thermal constraints.

 

 

Btw, we are going to do 101 / introductory material on AI / deep learning in the near future as well. 

It has been brought to my attention, and rightfully so, that some readers may not know some of the terms / concepts described here. Upcoming materials will correct that.

Indeed, introductory material will help in understanding the concepts behind this.

I bought a pair of these to play with from Mouser on the Monday before Thanksgiving and was able to play with them over the holiday. In Windows 7 the OpenVINO tool SDK works, but the OS does not recognize the stick at all (WMI driver). Windows 10 recognized it and it worked, but not as well as Ubuntu.

(As far as I can tell, the "GPU" choice covers only Intel GPUs, not NVIDIA, etc.)

I am experimenting in C++, just getting started, but the samples look pretty good.

I have been building the samples and running them on the command line to see what performance is like. It is good to understand the way the layers feed each other and the impact that has on performance. For the license plate video the output of detecting cars is fed into the license plate detection which is fed into the license plate recognition. They make a distinction between detection and recognition.
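The layered flow described above (vehicle detection feeding plate detection feeding plate recognition) can be sketched roughly like this. It is not the OpenVINO sample code. The three stage functions are placeholders that return fixed values, and all names here are hypothetical. The point is only the shape of the cascade: each later stage runs once per hit from the stage before it.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def run_cascade(
    frame,
    detect_vehicles: Callable[[object], List[Box]],
    detect_plates: Callable[[object, Box], List[Box]],
    recognize_plate: Callable[[object, Box], str],
) -> List[str]:
    """Feed each stage's output into the next stage."""
    plate_texts = []
    for vehicle_box in detect_vehicles(frame):
        for plate_box in detect_plates(frame, vehicle_box):
            plate_texts.append(recognize_plate(frame, plate_box))
    return plate_texts


# Placeholder stages standing in for real models (assumption).
def fake_detect_vehicles(frame) -> List[Box]:
    return [(10, 10, 200, 100), (300, 40, 180, 90)]


def fake_detect_plates(frame, vehicle_box: Box) -> List[Box]:
    x, y, w, h = vehicle_box
    return [(x + w // 4, y + h // 2, w // 2, h // 4)]


def fake_recognize_plate(frame, plate_box: Box) -> str:
    return f"PLATE@{plate_box[0]},{plate_box[1]}"


result = run_cascade(
    None, fake_detect_vehicles, fake_detect_plates, fake_recognize_plate
)
print(result)
```

Because the inner stages run once per upstream detection, the total number of inferences per frame depends on how many vehicles and plates are found, which is one reason the way layers feed each other affects performance.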

So far, rather disappointed with performance, but it is really early to even guess why.

 

 

 
