Advantages of RAID6 over RAID5 For Video Surveillance

Author: Carl Lindgren, Published on Apr 15, 2009

For large-scale video surveillance deployments, such as casinos, the enhanced redundancy that RAID6 provides over RAID5 is critical to minimizing video loss and ensuring system performance. [Note: If you are not familiar with RAID, view a RAID tutorial and a general comparison between RAID5 and RAID6]

Background

We have been recording all of our cameras using an NVR system since late 2003.  Our original system consisted of 28 servers, each recording up to 32 cameras.  The servers originally used 16-bay RAIDs with 250GB drives in a RAID 5 configuration.  The majority of the RAIDs were SCSI/PATA, which means they used standard IDE desktop drives in the RAID enclosure.  These drives were not designed to handle continuous video recording and began to fail at an alarming rate within a year.  Our drive vendor replaced these with RAID Edition drives in early 2005, which resolved some of the issues.  At the time, we had a bit over 830 drives in use.

Drive Failures

Even after replacing all 830 drives, we still experienced drive failures.  This is normal for any large system.  It has been estimated that approximately 1% of installed hard drives will fail in the first year of operation, with that rate climbing as the drives age.  Hard drives can fail in many ways, and RAID systems can recover from most failures by rebuilding the contents of the failed drive from the parity information that is striped across the drives.
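To put the 1% figure in perspective, here is a rough back-of-the-envelope sketch in Python.  It assumes drive failures are independent and that the 1% first-year estimate applies evenly to every drive, which is a simplification.  The function names are illustrative only.

```python
def expected_failures(drive_count: int, annual_failure_rate: float) -> float:
    """Average number of drives expected to fail in one year."""
    return drive_count * annual_failure_rate


def prob_at_least_one_failure(drive_count: int, annual_failure_rate: float) -> float:
    """Chance that at least one drive fails in a year (assumes independent failures)."""
    return 1.0 - (1.0 - annual_failure_rate) ** drive_count


if __name__ == "__main__":
    fleet = 830   # drive count mentioned in the article
    afr = 0.01    # the approximate 1% first-year estimate
    print(f"expected failures per year: {expected_failures(fleet, afr):.1f}")
    print(f"chance of at least one failure: {prob_at_least_one_failure(fleet, afr):.4f}")
```

With 830 drives, even a 1% rate works out to roughly eight failed drives a year, so rebuilds become a routine event rather than a rare one.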

RAID 5 stores one set of parity data, striped across the drives, that can be used to reconstruct the contents of a failed drive onto a replacement drive.  So that a rebuild can start as soon as a drive fails, most RAID manufacturers recommend installing at least one global hot spare in each RAID chassis.  When a RAID encounters an error with a hard drive, it “rebuilds” the data that was on the failed drive onto the spare using the parity data.  The failed drive can then be replaced with a new drive, which is designated as the new hot spare.  This process can be repeated as drives fail and, in theory, will keep the RAID storage operating continuously with no data lost.
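The parity idea can be illustrated with a few lines of Python.  This is a minimal sketch of XOR parity, the scheme RAID 5 is generally described as using, and not the firmware of any particular RAID controller.  The block contents and function names are made up for the example.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks: Sequence[bytes]) -> bytes:
    """Rebuild the one missing block of a stripe from the survivors plus parity."""
    return xor_blocks(surviving_blocks)


# One stripe spread over three data drives, plus its parity block.
stripe = [b"cam1", b"cam2", b"cam3"]
parity = xor_blocks(stripe)

# Drive 1 fails; its block is recomputed from the other drives and the parity.
rebuilt = rebuild_missing([stripe[0], stripe[2], parity])
```

Note that the rebuild has to read every surviving drive in the stripe.  That detail matters for the failure scenario described later.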

Unfortunately, there are drive failure scenarios that cannot be accommodated by most RAID storage systems that are used for recording video.  This issue is unique to video recording and seldom surfaces in RAID systems used by other applications.  The key is that for most applications, written data is “verified” during the write process.  This means that after a piece of data is written, it is read back and compared to the original data before the next piece is written.  If the comparison fails, the area of the disk that failed is marked bad by the drive and the data is rewritten to another area of the disk reserved for that purpose.

This process works well when the system has the time to verify the write and repair any errors encountered.  For most applications, there is no requirement to write data continuously and the computer’s operating system can wait the relatively short period required to verify each write and relocate data if an error is encountered.
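The write-and-verify process described above can be sketched with a toy in-memory disk.  Everything here (the SimulatedDisk class, its sector numbers, the spare list) is hypothetical and exists only to illustrate the idea.  Real drives and controllers handle this in firmware.

```python
from typing import Dict, Iterable, Optional


class SimulatedDisk:
    """Toy disk: some sectors are unreadable, and a few spare sectors exist."""

    def __init__(self, bad_sectors: Iterable[int], spare_sectors: Iterable[int]):
        self.bad_sectors = set(bad_sectors)
        self.spare_sectors = list(spare_sectors)
        self.remap: Dict[int, int] = {}     # logical sector -> spare sector
        self.media: Dict[int, bytes] = {}   # physical sector -> stored data

    def _physical(self, sector: int) -> int:
        return self.remap.get(sector, sector)

    def raw_write(self, sector: int, data: bytes) -> None:
        """Write without checking; this always appears to succeed."""
        self.media[self._physical(sector)] = data

    def raw_read(self, sector: int) -> Optional[bytes]:
        """Return the stored data, or None if the sector cannot be read."""
        physical = self._physical(sector)
        if physical in self.bad_sectors:
            return None
        return self.media.get(physical)

    def write_verified(self, sector: int, data: bytes) -> None:
        """Write, read back, and relocate to a spare sector on a mismatch."""
        self.raw_write(sector, data)
        while self.raw_read(sector) != data:
            if not self.spare_sectors:
                raise IOError("no spare sectors left")
            self.remap[sector] = self.spare_sectors.pop(0)
            self.raw_write(sector, data)


disk = SimulatedDisk(bad_sectors={7}, spare_sectors=[1000, 1001])

disk.raw_write(7, b"frame")           # unverified write: no error is reported
unverified_result = disk.raw_read(7)  # yet the data cannot be read back

disk.write_verified(7, b"frame")      # verified write detects and relocates
verified_result = disk.raw_read(7)
```

The extra read after every write is what costs time, and it is the step that gets skipped when data must be written continuously.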

Video recording is a completely different animal.  It has been estimated that CCTV video recording is 90% write versus 10% read.  In my opinion, that is a conservative estimate.  An analysis of our system leads me to estimate that the ratio is somewhere between 99% write to 1% read and 99.9% write to 0.1% read.  RAID systems set up for video recording seldom, if ever, are set up to verify the data as it is written.

This sets up a possibly fatal scenario.  One of the failure modes of computer hard drives is something called “Read Element Failure”.  The best definition I can find of that is the drive is unable to read all or part of the data written to it.  This could be the result of a complete failure of one of the read heads, or just a bad area of a disk that has not been relocated by the drive’s automatic systems.

Since the drives in a video recording system don’t normally automatically read the data after it is written and the system operators only play back a very small fraction of the video being recorded, a drive could happily chug along writing data that is unreadable for a long time.  Neither the system nor the operators would ever know that there is a problem.  That is, until a drive fails with a problem that is recognized by the RAID system.

When the RAID system encounters a drive failure that it recognizes, it will attempt to rebuild the RAID set using the parity data recorded across all of the drives.  That is where the problem becomes acute.  If the RAID system also contains a drive that has a Read Element Failure, it is very possible that the bad area contains parity data.  If it does, the rebuild will fail.

On a RAID 5 system, if a rebuild fails because the parity data is corrupt or unreadable, the system now has two bad drives and the RAID set is lost.  This happened to us at least six times during the three years that we used our original RAID 5 systems.
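The failure mode can be shown with a small simulation.  This is only a sketch under simplifying assumptions: a stripe is modeled as a list of blocks, an unreadable block is represented by None, and single XOR parity stands in for RAID 5.  The RebuildError class and function names are invented for the example.

```python
from functools import reduce
from typing import Optional, Sequence


class RebuildError(Exception):
    """Raised when a stripe cannot be reconstructed."""


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_raid5(surviving: Sequence[Optional[bytes]]) -> bytes:
    """Rebuild the failed drive's block; every surviving block must be readable."""
    if any(block is None for block in surviving):
        raise RebuildError("a surviving drive returned an unreadable block")
    return xor_blocks(surviving)


stripe = [b"cam1", b"cam2", b"cam3"]
parity = xor_blocks(stripe)

# Drive 0 fails outright.  With healthy survivors, the rebuild succeeds.
healthy_rebuild = rebuild_raid5([stripe[1], stripe[2], parity])

# Same failure, but the parity block sits on an area that silently went bad.
try:
    rebuild_raid5([stripe[1], stripe[2], None])
    rebuild_failed = False
except RebuildError:
    rebuild_failed = True
```

With only one set of parity, there is nothing left to fall back on once a second block in the same stripe turns out to be unreadable.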

RAID 6

RAID 6 works a bit differently than RAID 5.  Although it can encounter the same drive failure scenarios as RAID 5, its ability to recover from them is greatly enhanced by the way RAID 6 records parity data.  Instead of writing one parity stripe across all drives in a RAID set, RAID 6 writes two completely independent parity stripes.  There are two advantages to this: RAID 6 is able to recover from the simultaneous failure of two drives in the enclosure, and its two parity stripes are in different areas, allowing the system to read parity even through multiple failures.
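To show how two independent parities make a double failure recoverable, here is a sketch of one commonly described construction: a plain XOR parity (P) plus a second parity (Q) computed with Galois field arithmetic.  This is an assumption for illustration.  The article does not say which scheme our RAIDs use, and vendors may implement it differently.  The field polynomial (0x11D), the generator (2), and all function names are choices made for this example.

```python
from typing import List, Optional, Tuple


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power in GF(2^8)."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse of a nonzero byte (a^254, since a^255 = 1)."""
    return gf_pow(a, 254)


def compute_pq(data: List[bytes]) -> Tuple[bytes, bytes]:
    """P is the XOR of all blocks; Q weights block i by 2^i before XORing."""
    p, q = bytearray(len(data[0])), bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data: List[Optional[bytes]], x: int, y: int,
                p: bytes, q: bytes) -> Tuple[bytes, bytes]:
    """Recover the blocks of two failed data drives x and y from P and Q."""
    pxy, qxy = bytearray(p), bytearray(q)
    for i, block in enumerate(data):
        if i in (x, y):
            continue
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            pxy[k] ^= byte                # leaves Dx ^ Dy
            qxy[k] ^= gf_mul(coeff, byte)  # leaves 2^x*Dx ^ 2^y*Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx = bytes(gf_mul(denom_inv, qxy[k] ^ gf_mul(gy, pxy[k])) for k in range(len(p)))
    dy = bytes(pxy[k] ^ dx[k] for k in range(len(p)))
    return dx, dy


stripe = [b"cam1", b"cam2", b"cam3", b"cam4"]
p, q = compute_pq(stripe)

# Drives 1 and 3 are both lost; the two parities are enough to get them back.
damaged: List[Optional[bytes]] = [stripe[0], None, stripe[2], None]
recovered = recover_two(damaged, 1, 3, p, q)
```

Because P and Q are computed differently, they give two independent equations per byte, which is what allows two unknown blocks to be solved for.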

Our own recording environment has borne this out.  In 2006, we replaced all of our servers and RAIDs.  Our new RAIDs were set up, at our insistence, as RAID 6.  Although we have experienced at least three instances where two drives failed in an enclosure, including at least two instances where the second drive failed during the rebuild process, we have never lost any data.  The systems rebuilt both failed drives and continued to run flawlessly.

Conclusion

For these reasons, I would never recommend using RAID 5 in a critical video recording environment.  The risks of data loss are too great.

 

Carl Lindgren is the Surveillance Technician Manager for the Sycuan Gaming Commission at Sycuan Casino in El Cajon, CA.  Carl can be emailed at clindgren@sycuan.com
