How to Handle Hard Drive Failures

By Ethan Ace, Published on Feb 13, 2012

No matter how well maintained a surveillance system is, hard drives eventually fail. Indeed, a recent member poll showed that most respondents find hard drive failures to be 'a common and significant problem'. What happens when a hard drive fails can vary significantly depending on the system type used, the priority of the system and the concerns of the end user. In this note, we examine the most common real-world issues and provide guidance on how best to deal with hard drive failures.

Impact of Hard Drive Failure

How much of an impact a failed hard drive has on a system depends on the architecture of its storage. There are three typical scenarios:

  • Single drive for OS and archived video: In systems which use the same drive for the server or DVR/NVR OS and for video storage, failures are most critical. Loss of the drive takes the entire system down, leaving users unable to view live or archived video. The unit is essentially dead until it is replaced in its entirety, or until a new drive is installed and the OS reloaded.
  • OS and archives on separate drives: In many DVRs, the operating system is kept separate from video storage, either in flash or on a separate hard drive. This prevents video from being lost should the OS drive fail. Likewise, it prevents complete failure of the unit if the video drive fails: the system may still be used for live viewing, and archiving can resume once the drive is replaced.
  • RAID: When using RAID for storage, whether internal or external, the system, including archiving, may continue to run when a drive fails. Depending on the RAID level, one or more drives may fail without data loss. When a new drive is inserted, the RAID array begins rebuilding the lost data. During this process, performance (mainly throughput) may be reduced. Some high-end enterprise systems allow for full performance during rebuilds, though lower-end units do not.
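As a rough illustration of how tolerance varies by RAID level, the sketch below tabulates the standard usable capacity and guaranteed drive-failure tolerance for common levels. The function and class names are hypothetical, the figures ignore hot spares and vendor-specific overhead, and RAID 10 is listed at its guaranteed minimum of one failure even though it can survive more when the failed drives fall in different mirror pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidSummary:
    level: str
    usable_tb: float
    drive_failures_tolerated: int


def raid_summary(level: str, drives: int, drive_tb: float) -> RaidSummary:
    """Return usable capacity and guaranteed failure tolerance for one array."""
    if level == "0":      # striping only, no redundancy
        min_drives, data_drives, tolerated = 2, drives, 0
    elif level == "1":    # every drive holds a full copy
        min_drives, data_drives, tolerated = 2, 1, drives - 1
    elif level == "5":    # one drive's worth of parity
        min_drives, data_drives, tolerated = 3, drives - 1, 1
    elif level == "6":    # two drives' worth of parity
        min_drives, data_drives, tolerated = 4, drives - 2, 2
    elif level == "10":   # striped mirror pairs; one failure is guaranteed safe
        min_drives, data_drives, tolerated = 4, drives // 2, 1
    else:
        raise ValueError(f"unsupported RAID level: {level}")

    if drives < min_drives:
        raise ValueError(f"RAID {level} needs at least {min_drives} drives")
    if level == "10" and drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")

    return RaidSummary(level, data_drives * drive_tb, tolerated)


if __name__ == "__main__":
    for lvl in ("0", "1", "5", "6", "10"):
        print(raid_summary(lvl, drives=4, drive_tb=2.0))
```

For example, four 2 TB drives in RAID 6 give 4 TB of usable space and survive two drive failures, while the same drives in RAID 5 give 6 TB but survive only one.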

One key consideration is how and when users will know if archiving has stopped. For small systems especially, the recorder is often located in an unstaffed location, such as a wiring closet, and may go for extended periods without anyone checking whether it is working. As a result, the first time a user realizes the recorder is not working is often after an incident has occurred. Some DVRs and VMS platforms are able to send notifications if archiving stops for any reason, but we suspect these are often not set up.
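Where a recorder has no built-in notification, one possible stopgap is an external check that recording is still producing new files. The sketch below is only an illustration under several assumptions: the recorder writes ordinary files to a directory the script can read (many embedded DVRs do not), recording is continuous rather than motion-based, and the function names and the 10-minute threshold are hypothetical. How the alert is delivered (email, SMS, monitoring system) is left out.

```python
import os
import time
from typing import Optional


def newest_mtime(archive_dir: str) -> Optional[float]:
    """Return the most recent modification time of any file under archive_dir."""
    newest = None
    for root, _dirs, files in os.walk(archive_dir):
        for name in files:
            try:
                mtime = os.path.getmtime(os.path.join(root, name))
            except OSError:
                continue  # file rotated away or unreadable; skip it
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def archiving_stalled(archive_dir: str,
                      max_age_seconds: float = 600,
                      now: Optional[float] = None) -> bool:
    """True if no file has been written within max_age_seconds (or none exist)."""
    if now is None:
        now = time.time()
    newest = newest_mtime(archive_dir)
    return newest is None or (now - newest) > max_age_seconds
```

Run on a schedule from a separate machine, a check like this would flag a stalled archive within minutes rather than after an incident. A missing or unreadable directory is also reported as stalled, since that is what a failed video drive may look like from the outside.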

Client Response to HDD Failure

Response to an HDD failure will differ depending on the client and how critical they consider their surveillance system. For some, going without their system while it is repaired is acceptable, as they rarely have incidents which need review. For others, any downtime is unacceptable, and immediate response and repair of the hard drive is required.

Integrators may keep spares on hand for end users with stringent downtime requirements, as part of their service contract. Other users may simply receive a loaner unit, to provide basic recording functionality while the production unit is sent out for repair.

Written Procedures

Whichever approach is taken, response procedures should be agreed upon and put in writing as part of the warranty or service agreement, because users may react badly when their system is down. While users will still often direct their frustration at the integrator, a written procedure reduces the potential for contention or blame.

Recovery

While it is possible for data recovery services to read some, or even most, data from a failed hard drive, this is often a costly prospect. On average, recovery of a single hard drive can cost well over $1,000. Complex physical problems (caused by severe crashes) may cost upwards of $2,000 as more manual work is required. This makes data recovery an expensive process, reserved only for severe cases.

Potential Liability

No matter who the client is, the potential for liability issues is present when a surveillance system is in place but not recording. Incidents, no matter how infrequent, may occur at any time. Video of a critical incident may be key evidence in litigation (over accidents, vandalism, slips and falls, and so on), making hard drive failure a real risk to users.

For users whose video surveillance is governed by regulations, hard drive failure is potentially more of a risk. For those in government or gaming verticals, who are required to store video for certain durations, an unnoticed hard drive failure could place them in violation of these regulations. For this reason, these users nearly always use RAID storage, sometimes RAID 6 and beyond, to guard against multiple hard drive failures. 

Recommendations

For users of small surveillance systems, the best choice is to use DVRs with separate drives for OS and video. For redundancy, inexpensive NAS units which offer RAID 1 (disk mirroring) are available for under $300 USD. Not all DVRs are capable of using NAS storage, however, so users should ensure compatibility before purchasing.

For users of enterprise-level surveillance systems, RAID is very common. If a single drive fails, no data is lost, so the main concern is the RAID array's performance during a rebuild. Low-performance arrays can practically take down a system if read/write capability dips too low during a rebuild, which may take 6-10 hours or more. Users should be aware of this, and verify with vendors that performance will remain high during rebuilds.
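To see where rebuild times of this order come from, a back-of-the-envelope estimate is simply drive capacity divided by sustained rebuild rate. The sketch below assumes a constant rate, which real arrays rarely achieve while also recording video; the function name and the example figures (a 2 TB drive, 70 MB/s) are illustrative assumptions, not measurements from any particular product.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Estimate hours to rebuild one drive at a constant sustained rate.

    Uses decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes), as drive
    vendors do.
    """
    if drive_tb <= 0 or rebuild_mb_per_s <= 0:
        raise ValueError("capacity and rate must be positive")
    total_bytes = drive_tb * 1e12
    seconds = total_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    # A 2 TB drive at a sustained 70 MB/s takes roughly 7.9 hours.
    print(f"{rebuild_hours(2, 70):.1f} hours")
```

Halving the sustained rate doubles the rebuild time, which is why an array that throttles rebuilds to keep recording can stay in a degraded state for much longer than the raw drive speed suggests.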

For all users, if video is critical, selecting a recording platform which will notify them of hard drive failure is always recommended. Immediate notification is the best way to protect against lost video.
