Hard drives actually returning incorrect data — are there any real stats on that? How long does it take for ZFS versions to move downstream to, say, Solaris and OpenIndiana-type variants? http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/ EDIT3: What I personally learned from the discussion below: for home usage, the risks of such events happening aren't really big enough to worry about.
There is never a case when RAID 5 is the best choice, ever. If you encounter a URE during a RAID 5 rebuild, there is no other copy of that information, nor any parity data from which to rebuild it. At least our math is tracking. We'll continue to watch this issue and continue to make backup reliable, but for now it's safe to say RAID 5 is alive and well.
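That single-parity limitation can be shown with a toy XOR sketch (hypothetical 4-byte "strips", not any real controller's layout): with one strip lost, XOR of the survivors recovers it, but a URE in a surviving strip during that reconstruction leaves no second equation to solve.

```python
from functools import reduce

def xor_strips(strips):
    """XOR equal-length byte strings together (RAID 5 parity math)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips)

# Three data strips plus one parity strip (RAID 5 style, toy sizes).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_strips(data)

# Disk 1 dies: rebuild its strip from the survivors plus parity.
rebuilt = xor_strips([data[0], data[2], parity])
assert rebuilt == b"BBBB"  # single failure: recoverable

# But if a URE also corrupts a surviving strip during the rebuild,
# the XOR silently produces wrong data -- there is nothing to check against.
corrupted = bytes([data[0][0] ^ 0x01]) + data[0][1:]
bad = xor_strips([corrupted, data[2], parity])
assert bad != b"BBBB"
```

RAID 6's second (diagonal) parity adds exactly that missing second equation, which is why it survives a lost disk plus a read error.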
I hypothesize a bell curve distribution; for all I know the centre of the bell is 10^16 reliable, and only when you start sliding towards the tail do you get worse rates. A sector on disk will contain both the data and the CRC for that data. Hitachi's bit error rate on their 4TB SATA drives is 10^15, and in my experience the two drives perform similarly from a reliability perspective.
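The data-plus-CRC idea can be sketched with `zlib.crc32` (a simplification: real drives use much stronger ECC codes, and the 512-byte field layout here is invented for illustration):

```python
import zlib

SECTOR_DATA = 512  # bytes of payload per sector in this toy layout

def write_sector(payload: bytes) -> bytes:
    """Store payload followed by a 4-byte CRC, standing in for on-disk ECC."""
    assert len(payload) == SECTOR_DATA
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def read_sector(raw: bytes) -> bytes:
    """Return payload if the stored CRC matches, else signal a read error."""
    payload = raw[:SECTOR_DATA]
    stored = int.from_bytes(raw[SECTOR_DATA:], "little")
    if zlib.crc32(payload) != stored:
        raise IOError("unrecoverable read error: CRC mismatch")
    return payload

sector = write_sector(b"x" * SECTOR_DATA)
assert read_sector(sector) == b"x" * SECTOR_DATA

# Flip one bit: the drive-level check catches the corruption on read.
damaged = bytes([sector[0] ^ 0x01]) + sector[1:]
try:
    read_sector(damaged)
except IOError as e:
    print(e)  # unrecoverable read error: CRC mismatch
```

The point is that the drive can *detect* a damaged sector this way; whether it can *recover* depends on the strength of the real ECC and the availability of spare sectors.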
Consumer SSD error rates are 10^16 bits, or an error every 1.25 PB. That means Seagate will not guarantee that you can fully read the entire drive twice before encountering a URE.
Also, as the stripe width grows, your minimum block size grows commensurately. I can understand that it may theoretically happen, but I'm very doubtful about the actual frequency of this scare. Because ZFS uses variable block sizes, you may end up in a situation where a freed block is too small for any new write to use.
No, I think a lot of what is said about RAID 5 is way overstated. With RAID 6 you'd have a second parity, usually diagonal parity, which you can then use to recover following a lost disk plus a read error.
Be careful with RAID-Z2 on newer 4Kn drives, particularly for OS images and the like. For example, here are Seagate's consumer-grade NAS drive specs (1 in 10^14, not 1 in 10^15 like the better enterprise-grade drives). Oracle contributes code to many open-source and free software projects.
Robin Harris did the calculation on a 12 TB array and got a whopping 62% chance of data loss during a RAID rebuild. If you have one error, you've lost data. Since the mirroring is done at the block level, the entire drive is ALWAYS read to create a new copy with its mirror partner. Disclaimer: I am an Oracle employee; my opinions do not necessarily reflect those of Oracle or its affiliates.
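That 62% figure follows directly from the 1-in-10^14 spec. A quick sanity check (assuming independent bit errors, which is itself a simplification):

```python
import math

URE_RATE = 1e-14      # unrecoverable errors per bit read (consumer datasheet)
ARRAY_BYTES = 12e12   # 12 TB that must be read during the rebuild

bits = ARRAY_BYTES * 8
# P(at least one URE) = 1 - (1 - p)^bits, well approximated by 1 - exp(-p * bits)
p_loss = 1 - math.exp(-URE_RATE * bits)
print(f"{p_loss:.0%}")  # 62%
```

The expected number of errors is 0.96 per rebuild, so losing the array is closer to a coin flip than a fluke — at least if you take the datasheet number at face value.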
That is one URE every 12.5 TB. Ask your vendor which drives they use. Oracle continues to follow the Solaris versioning scheme for zpools: increment the zpool version to define what features are supported. Now, there are some great, and not so great, articles out there which go into a lot of detail about this problem.
That 10^14 number is a worst-case scenario. I can speak to the "why" but not the rate (I love my job; my job depends on numerous NDAs with drive manufacturers, but I don't like NDAs. Know what I mean?).
The OpenZFS code appears to continue attempting writes to vdevs that are very full, but at a dramatically reduced rate until they start to balance out. If you've already had a drive fail, you MUST read all data without error to recover the RAID set. Solaris code is available for partners and licensees through the Oracle Technology Network (OTN). When it happens, the entire array is failed. Also, 12.5 TB is not a hard limit, just an average.
The error rates for some drives are appallingly high, for others strikingly low. I don't think it reflects real life that we see a URE every 12.5 TB, do you? I occasionally see a pool with a disk showing hundreds or even THOUSANDS of checksum errors.
And again, you can't add a statistic up to give you a smaller number. I love this kind of technical discussion :-) It's the meat and potatoes of what I do for a living... Ideally, firmware and OS work together hand in hand. Very often users or the operating system won't even know that a "URE" is present. Stick to stable releases for your production data.
Are the above Toshibas a good, cost-effective way to go for the money, or is something out there more appropriate for the bucks? Let's just take a stand-alone Seagate 3 TB drive and see what the probability is that we'll get a single non-recoverable read error if we fill and read the whole drive. Also, in each and every case, what was the real reason for the catastrophe? It's just statistics as to why they seem more reliable than the datasheet.
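Under the same independence assumption as the 12 TB rebuild math, a single full read of a 3 TB drive at 1 in 10^14 works out like this:

```python
import math

DRIVE_BYTES = 3e12        # 3 TB drive, filled and read end to end
bits = DRIVE_BYTES * 8    # 2.4e13 bits
# Expected UREs = p * bits = 0.24; P(at least one) = 1 - exp(-0.24)
p_ure = 1 - math.exp(-1e-14 * bits)
print(f"{p_ure:.1%}")  # 21.3%
```

So roughly a 1-in-5 chance per full read — alarming if taken literally, which is exactly why the "it's a worst-case spec, not an observed rate" objection above matters.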
TL;DR: An unrecoverable BIT in a SECTOR doesn't usually result in an error you can see from your operating system; the hard drive recovers the data and remaps it to a spare sector. ZFS will register a checksum error and attempt to reconstruct the data from available replicas if they exist. In order to rebuild one failed drive, the RAID controller must read all data from every surviving drive to recreate it.
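ZFS's self-healing read path can be sketched roughly like this (hypothetical helper names; the real pipeline lives in the ZIO layer, defaults to fletcher4 checksums, and is far more involved):

```python
import hashlib

def checksum(block: bytes) -> bytes:
    # sha256 stands in for ZFS's fletcher4/sha256 block checksums.
    return hashlib.sha256(block).digest()

def self_healing_read(replicas, expected):
    """Return the first replica matching the stored checksum, repairing bad copies."""
    good = None
    for block in replicas:
        if checksum(block) == expected:
            good = block
            break
    if good is None:
        raise IOError("all replicas failed checksum: unrecoverable")
    # Self-heal: overwrite any replica that failed verification.
    for i, block in enumerate(replicas):
        if checksum(block) != expected:
            replicas[i] = good
    return good

data = b"important block"
mirror = [b"bit-rotted junk", data]          # one corrupt copy, one good copy
result = self_healing_read(mirror, checksum(data))
assert result == data and mirror[0] == data  # corrupt replica rewritten
```

The key difference from a plain RAID controller: the checksum is stored in the parent block pointer, so ZFS knows *which* copy is wrong instead of just that the copies disagree.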
If it were so easy to predict when the read errors come up, wouldn't we just prevent them to begin with? Once two drives failed, assuming he is using enterprise drives (Dell calls them "near-line SAS", just enterprise SATA), there is a 33% chance the entire array fails if he tries a rebuild. Rather than try to walk you through it all, I will give you a link to an excellent forum post by user EarthwormJim.
Let's not even talk about RAID.