Deep learning has primarily been about giving devices a better understanding of the key elements of an image, in other words, object detection and classification.
I would say people are the objects most often targeted for detection, followed closely by vehicles. Thus, the most commonly requested feature for DNN-based video analytics is accurate detection and classification of people and vehicles in images.
Once a person or vehicle is accurately detected (and other objects, like animals, are accurately ignored), you can use that detection for behavior analysis; common examples are loitering, line-crossing, and wrong-way travel. Today, the behavior analysis rarely involves DNNs. It can, but that is nowhere near the "most used" applications right now. It is based more on simple measurements of where the object is in the image, what direction it is moving, and so forth.
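To make the "simple measurement" point concrete, here is a minimal sketch of how line-crossing and wrong-way checks could work without any DNN. It assumes an upstream detector and tracker already give you an object's position (for example, the center of its bounding box) in two consecutive frames. The function names and the coordinate convention are hypothetical, chosen only for illustration, and are not taken from any particular product.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position in the image, e.g. a bounding-box center


def side_of_line(a: Point, b: Point, p: Point) -> float:
    """Signed cross product: positive on one side of the line a->b,
    negative on the other, zero when p lies exactly on the line."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crossed_line(a: Point, b: Point, prev: Point, curr: Point) -> int:
    """Return +1 or -1 if the move prev->curr crosses the segment a-b
    (the sign tells you the crossing direction), otherwise 0."""
    s_prev = side_of_line(a, b, prev)
    s_curr = side_of_line(a, b, curr)
    if s_prev * s_curr >= 0:
        return 0  # both positions on the same side (or touching the line)
    # The object changed sides of the infinite line; make sure the path
    # actually passes between the endpoints a and b.
    if side_of_line(prev, curr, a) * side_of_line(prev, curr, b) > 0:
        return 0
    return 1 if s_curr > 0 else -1


def is_wrong_way(prev: Point, curr: Point, allowed_dir: Point) -> bool:
    """True if the displacement prev->curr points against allowed_dir
    (negative dot product)."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    return dx * allowed_dir[0] + dy * allowed_dir[1] < 0
```

For example, with a virtual line from (0, 0) to (10, 0), an object moving from (5, -1) to (5, 1) is reported as a crossing, while the same movement at x = 20 is not, because it passes beyond the end of the line. A real system would typically add smoothing or a minimum travel distance so that jitter in the detected position does not trigger false alarms.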