Member Discussion

Deep Learning Approach: Why Does It Work, and How Does It Differ from Traditional Video Analytics?

Over 20 years ago, the first product we developed was a motion detector. We did not have enough CPU power to compress video or enough HDD capacity to record it, so we controlled tape recorders (VCRs) through IR signals emitted by computers whenever our AI detected motion; very clever! We saved many tapes! But from that moment on, we expected more from video analytics. We wanted it to work more accurately and to tell us more. However, 20 years later we are still using almost the same algorithm to detect motion! Why?
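That "same algorithm" is, at its core, simple frame differencing. A minimal sketch of the idea (the function name and thresholds here are illustrative, and synthetic NumPy arrays stand in for real video frames):

```python
import numpy as np

def motion_detected(prev_frame, frame, pixel_thresh=25, area_thresh=0.01):
    """Classic frame differencing: flag motion when enough pixels change.

    prev_frame / frame: 2-D uint8 grayscale arrays of equal shape.
    pixel_thresh: per-pixel intensity change that counts as "changed".
    area_thresh: fraction of changed pixels needed to report motion.
    (Both thresholds are illustrative, not tuned production values.)
    """
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed = np.count_nonzero(diff > pixel_thresh)
    return changed / diff.size > area_thresh

# Synthetic example: a static scene, then a bright "object" appears.
scene = np.full((120, 160), 80, dtype=np.uint8)
moved = scene.copy()
moved[40:80, 60:100] = 200            # simulated moving object
print(motion_detected(scene, scene))  # False - nothing changed
print(motion_detected(scene, moved))  # True  - object entered the scene
```

The appeal is obvious: it is cheap and needs no training. The limitation is equally obvious: it knows that pixels changed, not what changed.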

Two things. First, it is challenging to find a suitable algorithm that does more than just motion detection. Second, once you think you have a good idea, it immediately becomes apparent that you need 10 times more CPU resources to run it! And for the past 20 years, we have never had enough resources.

When we get a more powerful platform, we want to connect more cameras to it. So now we prefer to add 1,500 cameras to one server and do not want to allocate resources to an advanced motion detector. We want the cameras themselves to report motion, and this is an excellent idea. So, after 20 years, we finally do not need server-based motion detectors. This is good progress.


Deep learning approach. What is the difference?

Neural networks were invented 40 years ago. So what happened to give them such a huge impact on today's society? During NVidia's development of parallel computing to accelerate graphics, it was discovered almost by accident that this architecture works remarkably well for accelerating neural network calculations, which made these calculations affordable for everyone! Now almost anyone can train neural networks and get robust recognition. You do not need to invent algorithms anymore, and you do not need to compete for the same CPU resources; you have a very precise technique to train your network to detect anything, and then run that network on separate processors. Intelligent!


Why is deep learning better than traditional algorithms?

In the traditional approach, you need to invent an algorithm in order to detect something. Moreover, your algorithm cannot be perfect, because you cannot predict all possible conditions. The deep learning approach is different: you simply show pictures during the training process and tell the AI what they are. The neural network adjusts its neurons and is then able to recognize accordingly. Simple. This is much like how the human brain works. I'll give you an example using facial recognition: all traditional FR algorithms were based on different ideas of how to determine which faces are similar. Translated into computer language: which parameters of the faces correspond to similarity? Is it the distance between the eyes, or the nose shape? Or is it something else? We do not know. We do not actually know how to compare faces and decide which faces are the same, especially in different conditions! However, with deep learning and neural networks, we do not need to know - we can simply show the network different faces and tell it which faces are the same. That's it!
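In practical terms, such a trained network maps each face to an embedding vector, and "same person" becomes a distance comparison. A hedged sketch with made-up 4-dimensional embeddings and an illustrative threshold (real networks emit vectors of 128+ dimensions, and the cut-off is tuned per network):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two face embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, as if produced by a trained face network.
alice_photo_1 = np.array([0.9, 0.1, 0.30, 0.00])
alice_photo_2 = np.array([0.8, 0.2, 0.35, 0.05])  # same person, new angle
bob_photo     = np.array([0.1, 0.9, 0.00, 0.40])

SAME_PERSON_THRESHOLD = 0.8  # illustrative cut-off

print(cosine_similarity(alice_photo_1, alice_photo_2) > SAME_PERSON_THRESHOLD)  # True
print(cosine_similarity(alice_photo_1, bob_photo) > SAME_PERSON_THRESHOLD)      # False
```

Nobody hand-picked "distance between the eyes" here; the training process decides what the vector dimensions mean.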

It is important to note that everything depends on your data set. If you have a set with a wide range of different faces, views from different angles, different ages, different lighting environments and so on, your network will be trained to recognize faces in different conditions. Moreover, you can train with different resolutions. This is an area where humans are not trained at all. That means the neural network should perform better than the human brain - and we can already see evidence of this: a human cannot remember a million faces and immediately sort them by similarity. AI can!


What to expect next?

After NVidia's surprising results, we can see very good progress toward implementing neural acceleration in different hardware. Intel acquired Movidius, and it works great! HiSilicon announced new camera chips with embedded neural acceleration processors. What does this mean? Every camera will have a trained neural network to do whatever you want it to do. So, after 20 years of no progress in motion detection, we have made an incredible jump to being able to detect anything - even better than the human eye! Yes, it is hard to believe, but who, before the computer age, believed Turing's prediction that machines would be able to calculate better than a human? No one has a doubt now.

But still, the fundamental question has not been answered: what can we really do with this new technology? What benefits can we enjoy? What kinds of features can we explore? In which areas can the efficiency of security be improved? I'll give my thoughts in the next discussion.

Just got some interesting insider information from another manufacturer that is going to compete with HiSilicon, Intel and NVidia in the field of neural network acceleration.

Socionext Inc. is a company from Japan that is now developing SoCs for cameras! They presented a PCI card that is 10 times faster than the current Movidius from Intel and comparable with a top GeForce 1080Ti, but with much lower power consumption - there is not even a cooler on board! We'll get samples in May to evaluate it!

Definitely, AI is the future. The best thing about it is that it will progress much faster than the traditional video analysis approach. 

Does anyone know any other AI acceleration devices apart from Nvidia and Movidius? Any ideas on how to estimate the load?

The load depends not only on the hardware (CPU, GPU, Movidius, ...) but also on the neural network configuration (the number of neurons and the types of connections between them). For example, we have an optimized network that works pretty well on CCTV tasks but has nearly 10x lower inference time than the famous Inception v1 network. The load also depends on how often the network is used - to process every frame, or just a number of important frames. So load estimation is mostly experimental and should be reported by the network developers.

Regarding other acceleration devices: it is possible to use FPGA cards, for instance from Intel. Currently NVidia dominates the market, but I think (and hope) that Intel will release new devices and break this monopoly. Another good sign is that at ISC West several guys from China and Taiwan came to our booth and offered to let us try their accelerators. So it seems that accelerators are going to become a commodity in a couple of years.

Thank you Murat for your insight into this amazing technology. I know how advanced your company’s deep learning technology is and I can’t wait to see what wonderful new features you introduce to the market. Keep up the good work and deep insight!

Thank you, Mark!

Just started another discussion about forensic search here.