Random Video Recording Gaps - Why?

IPVM

A member commented, "I have seen in more than one project where only IP cameras are used and that too recording 24*7 at around 20fps and 2MP or 3MP resolution - lot of recording gaps occurring. These are random, and vary from camera to camera, no particular pattern is observed. These gaps range from few seconds to few minutes (say 2 minutes or 3)."

He then asked what might cause this.

The most common causes I have seen include (1) camera disconnects (the camera might reboot, the network connection could go down, or the PoE switch could be overloaded) and (2) server issues (e.g., the recording server could be overloaded).

What have you seen? Please share.

Undisclosed #1

IPVMU Certified

With ONSSI, classic at least, and possibly some versions of Milestone, the server does database administration once a day at a scheduled time. During this time no recording occurs.

On my system (small), this can last a couple of minutes before everything is back to normal. Worse, if you exceed the per-camera storage limit during the day, it will force the camera offline for 1-2 minutes or so while it frees up space in the active DB. If this is happening every day on several cameras, you will find all sorts of 2-3 minute gaps all over the place on different cameras.

John Honovich

In reply to Undisclosed #1

IPVM

You mean the live / archive database design?

Undisclosed #1

In reply to John Honovich

IPVMU Certified

Yes, that's the one. I think the forced intraday cleanup when you go over a single camera limit thing is just a NetDVR thing, I hope at least.

It can make Swiss cheese out of your database. Guessing SeeTec engine doesn't have it, but don't know.

John Honovich

In reply to Undisclosed #1

IPVM

"Milestone - it can make Swiss cheese out of your database"....

Mike Dotson

In reply to John Honovich

Formerly of Seneca • IPVMU Certified

The key to making this work well is to get the Archiving properly scheduled to match the input bitrate and to get the old data expired and deleted on a schedule. It is truly an 'AND' function.

If either one is not properly defined in the system...it will suffer.

The VMS was not mentioned in the query. Which one was it based on or was it a question about several VMSs?

All the VMSs I have tested in my lab have some sort of housekeeping chore where it goes out and looks to see how much data it has and initiates some sort of cleanup operation to make way for the new data.

John's hit list is a good one. In addition....here are a couple things I watch.

If you use perfmon, you can watch the NIC received data discarded counter; zero is the best answer here. A nonzero value means there was data that the system did not retrieve from the NIC buffers.
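
A minimal sketch of that check, assuming you have logged the cumulative discard counter at regular intervals (for example, perfmon's "Packets Received Discarded" counter exported to a log). The function name and the sample data are hypothetical; the idea is only to find the intervals in which the counter grew, so they can be lined up against the recording gaps.

```python
from datetime import datetime

def discard_increases(samples):
    """Return (start, end, delta) for each interval in which the cumulative
    discard counter grew. `samples` is a list of (timestamp, count) pairs
    sorted by time. Negative deltas (e.g. a counter reset) are ignored."""
    increases = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = c1 - c0
        if delta > 0:
            increases.append((t0, t1, delta))
    return increases

# Hypothetical one-minute samples of the cumulative counter.
samples = [
    (datetime(2015, 9, 30, 10, 0), 0),
    (datetime(2015, 9, 30, 10, 1), 0),
    (datetime(2015, 9, 30, 10, 2), 37),
    (datetime(2015, 9, 30, 10, 3), 37),
]

for start, end, delta in discard_increases(samples):
    print(f"{delta} packets discarded between {start:%H:%M} and {end:%H:%M}")
```

If the flagged intervals line up with the gaps in the recordings, the server's network receive path is a reasonable suspect.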

Perfmon also helps out when you watch how much data is being sent over to the storage subsystem. This MS article is a good read on the subject: Storage Performance Counters.

Lower quality switches that have limited internal max bandwidth will also play a role in missing data. You truly get what you pay for in this area.

Undisclosed Integrator #2

•Sep 30, 2015

One thing that I have noticed from experience is that the motion detection sensitivity needs to be set higher on certain regions.

Most decent VMS allow for different motion regions on the same camera.

Wanchai Siriwalothakul

Smart Entry Systems • IPVMU Certified

Bandwidth spikes caused by high-motion activity can push the total simultaneous disk writes from all cameras beyond the capability of the NVR.
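
As a rough illustration of that arithmetic, the sketch below sums per-camera bitrates and compares the total against the recorder's sustained write throughput. All figures (64 cameras, 4 Mbit/s average, 12 Mbit/s peak, 60 MB/s write capability) are made-up assumptions, not measurements from any particular NVR.

```python
def total_write_mbytes_per_s(camera_bitrates_mbps):
    """Convert a list of per-camera bitrates (Mbit/s) into the
    aggregate write load in MB/s (8 bits per byte)."""
    return sum(camera_bitrates_mbps) / 8

def exceeds_write_capacity(camera_bitrates_mbps, sustained_write_mbytes_per_s):
    """True if the aggregate camera load is more than the recorder can write."""
    return total_write_mbytes_per_s(camera_bitrates_mbps) > sustained_write_mbytes_per_s

# Hypothetical site: 64 cameras, assumed NVR sustained write rate of 60 MB/s.
average_load = [4] * 64    # 4 Mbit/s per camera on a quiet scene
peak_load = [12] * 64      # 12 Mbit/s per camera during heavy motion

print(total_write_mbytes_per_s(average_load), exceeds_write_capacity(average_load, 60))
print(total_write_mbytes_per_s(peak_load), exceeds_write_capacity(peak_load, 60))
```

A system sized on the average bitrate can look healthy most of the time and still drop video when many cameras see heavy motion at once.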

David Nelson-Gal

In the absence of data, it could be all or none of these problems. What makes these intermittent problems so unnerving is that if you haven't proven to yourself that you've identified the real issue, you are throwing darts in the dark, hoping that you've addressed it. Then you find yourself dreading the next phone call from the customer, yelling at you about not having fixed the problem.

What you need is continuous monitoring of your video surveillance system so you can see whether the gaps in video are correlated with dropped packets, queue depth to storage, PoE events or something else. In our data, a majority (>70%) of this kind of problem is configuration related.
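
A minimal sketch of that kind of correlation, assuming you already have a list of recording gaps and a list of timestamped events gathered from switches, servers and storage. The function name, the 30-second window and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta

def correlate(gaps, events, window=timedelta(seconds=30)):
    """For each gap (start, end), collect the labels of events whose
    timestamp falls inside the gap, widened by `window` on both sides."""
    result = []
    for start, end in gaps:
        matched = [label for ts, label in events
                   if start - window <= ts <= end + window]
        result.append(((start, end), matched))
    return result

# Hypothetical gaps found in one camera's recordings.
gaps = [
    (datetime(2015, 9, 30, 10, 15, 0), datetime(2015, 9, 30, 10, 17, 10)),
    (datetime(2015, 9, 30, 14, 2, 30), datetime(2015, 9, 30, 14, 2, 50)),
]
# Hypothetical events collected from monitoring.
events = [
    (datetime(2015, 9, 30, 10, 14, 50), "PoE port reset"),
    (datetime(2015, 9, 30, 12, 0, 0), "dropped packets"),
    (datetime(2015, 9, 30, 14, 2, 45), "storage queue depth high"),
]

for (start, end), labels in correlate(gaps, events):
    print(f"{start:%H:%M:%S}-{end:%H:%M:%S}: {labels or 'no matching event'}")
```

Gaps that never match any event suggest the monitoring is missing a data source, which is itself useful to know.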

Given that your cameras are configured to generate a significant amount of data even with compression, you need to check for load issues in your switches and servers. H.264 is a good codec but can be misleading if you've spec'ed the configuration assuming a level of motion that doesn't match actual experience.

Milestone (ONSSI) configurations have a particular problem in their Live-to-Archive migration regime if not configured properly. You need to measure how much video data is actually downloaded to the recorder and whether it is exceeding the capacity of the Live partition before it is migrated to the Archive partition. You may have it configured where it works most of the time but if you get more motion on some days, the amount of data may exceed the total capacity of the partition before migration. If that happens, the system will delete the video files from the Live partition prior to migration and you'll see these kinds of gaps.
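
The check described above can be sketched as a simple calculation: how long the Live partition lasts at a given aggregate bitrate, compared with the interval between archive runs. The partition size, bitrates and 24-hour interval below are illustrative assumptions, and the function names are hypothetical rather than anything from the VMS.

```python
def hours_until_live_full(live_capacity_gb, total_bitrate_mbps):
    """Hours of recording the Live partition can hold at an aggregate
    bitrate given in Mbit/s (decimal GB: 1 GB = 1000 MB)."""
    gb_per_hour = total_bitrate_mbps / 8 * 3600 / 1000
    return live_capacity_gb / gb_per_hour

def overflows_before_archive(live_capacity_gb, total_bitrate_mbps, archive_interval_h):
    """True if the Live partition fills before the next archive run."""
    return hours_until_live_full(live_capacity_gb, total_bitrate_mbps) < archive_interval_h

# Hypothetical setup: 2000 GB Live partition, archiving once every 24 hours.
for label, mbps in [("typical day", 150), ("high-motion day", 220)]:
    hours = hours_until_live_full(2000, mbps)
    risky = overflows_before_archive(2000, mbps, 24)
    print(f"{label}: full after {hours:.1f} h, overflows before archive: {risky}")
```

With these assumed numbers the typical day fits comfortably while the high-motion day fills the partition a few hours before the archive run, which matches the "works most of the time" pattern.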

Other times, we've seen situations where users are surprised to discover additional streams being pulled from a camera, recording to multiple servers unintentionally. Creating and managing additional streams can strain the camera's capacity.

Issues with fragmentation affecting storage performance develop over time. Look at write queue depth and write latency for clues to those issues. If the gaps appear soon after the initial install, fragmentation is less likely to be the problem.

If you are looking for some technology that can help you monitor your installations, let me know.

Undisclosed

I have seen cameras get stupid when network error rates get too high. Axis M1054s used to do this. It appeared that network glitches would eventually cause a camera to stop recording, and sometimes it would "get better" and restart. The thing that ended up being useful was to compare network stats from "good" and "bad" cameras. Look at the camera, look at the switch, etc. Also, vendors seem not to confess when they have network issues, so this can also be one of those "a firmware update might help, no the vendor didn't give a reason" situations.

I assume this is a closed network with no other network traffic, so there isn't something else to correlate with. If this were an enterprise network I would suggest you identify a specific time and date of a "glitch" and take that to the IT team and ask if they were doing scans at that time. And of course I assume you weren't messing with the network at the time (ooh! look! this cable has both ends plugged into this switch. I wonder if that could do anything?)

Undisclosed #1

IPVMU Certified

In reply to Undisclosed

...this cable has both ends plugged into this switch.

That's called a poor man's cable tester...

Undisclosed

And it creates routing loops if you're using a cheesy unmanaged switch instead of a real switch with spanning tree. Also, if you're lucky, IT will shut down your entire switch quickly so you'll learn of your error before you leave the job site. If you're unlucky, IT will make sure your badge is cancelled so that when you go for the next service call you'll find Guido the IT security dude in the front lobby, ready to break your kneecaps.

Undisclosed #1

•Oct 09, 2015

IPVMU Certified

In reply to Undisclosed

...and it creates routing loops if you're using a cheesy unmanaged switch instead of a real switch with spanning tree.

False.

You are confusing broadcast loops with routing loops. Broadcast loops are layer 2, frame based, and cause duplicate frames which flood the switch with traffic.

Routing loops are layer 3, packet based and typically involve 3 or more routers.

Plugging both ends of a cable into a switch will result in a broadcast storm, not a routing loop.

...you'll find Guido the IT security dude in the front lobby, ready to break your kneecaps.

In that case Guido would find his TTL shortened to 0.

Sean Chang

Rasilient Systems

Any resources that are shared are potential causes. Here is an example for storage.

One of our clients has ~150 cameras recording to a server with an 80TB volume. The random recording gaps appear after the volume is full. In other words, the file system fragmentation and VMS file creation/deletion come into play.

After close examination, we found that the recording gaps happened when the particular VMS was doing its housekeeping. During this period, the VMS scans through all the recorded directories, reading its particular index files, and updating the Windows NTFS Master File Table (MFT).

This problem becomes more challenging with a large volume (e.g., 80TB), as the MFT gets bigger (more reading and updating) and the recording directories are spread out. The default MFT zone is 12.5% of the volume.

Visibility is important for finding the cause.

John Honovich

IPVM

In reply to Sean Chang

Sean, excellent example! What is the VMS doing about this? Shouldn't the VMS be designed to handle this before allowing itself to become full?

Sean Chang

In reply to John Honovich

Rasilient Systems

We can only observe what different VMSs do. The sophisticated ones try to optimize around the Microsoft MFT algorithm to reduce the access. Also, a full volume is not an issue in itself; it is just that the storage access pattern changes afterward.

Undisclosed #1