Here’s something interesting that showed up in my Google nets from Chris Mellor at Blocks and Files, an IT storage blog:
Yesterday RAID Inc. announced it was going to OEM NEC of America D-Series drive arrays because the array controller, amongst other things, carried out read integrity validation checks. This was necessary because RAID Inc. customers had reported ‘silent drive failures’ on SATA drives with not all the data on the drive being accessible by the RAID controller.
And a little more:
NEC’s release about the RAID Inc. OEM deal says this about one of the things its D-Series controller carries out: ‘SATA read verification to detect silent read errors that other arrays do not.’
Another simple statement. A ‘silent’ read error, meaning that the controller doesn’t return all the information it was asked to.
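To make the idea concrete, here is a minimal sketch of what read verification could look like in principle. I don't know how NEC's controller actually does it; the `VerifiedStore` class, the in-memory dictionaries standing in for a drive, and the choice of CRC32 are all my own assumptions for illustration. The store records a checksum when a block is written and recomputes it on every read, so a read that comes back short or altered is flagged instead of being passed along silently.

```python
import zlib


class VerifiedStore:
    """Toy block store (hypothetical) that keeps a CRC32 for every block."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes; stands in for the drive
        self.checksums = {}  # block number -> CRC32 recorded at write time

    def write(self, lba, data):
        # Store the data and remember its checksum.
        self.blocks[lba] = bytes(data)
        self.checksums[lba] = zlib.crc32(data)

    def read(self, lba):
        # Recompute the checksum on every read and compare it with the
        # value recorded at write time.
        data = self.blocks[lba]
        if zlib.crc32(data) != self.checksums[lba]:
            raise IOError(f"silent read error detected at block {lba}")
        return data


store = VerifiedStore()
store.write(7, b"hpc results")

# Simulate the drive quietly returning less data than was written.
store.blocks[7] = b"hpc resul"

try:
    store.read(7)
    detected = False
except IOError:
    detected = True
```

Without the checksum comparison, the truncated block would be handed back as if nothing were wrong, which is the sense of ‘silent’ in the quote above.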
I haven’t heard anything about such problems in my immediate HPC community, but would be interested in hearing from you if you have.
Is there a general problem here, or one that is only revealed in HPC configurations with hundreds or thousands of drives and a very low occurrence rate? Certainly there has been no whisper of a similar problem from other SATA drive array suppliers.
Of course, I suppose it could also be that RAID Inc. is creating a crisis to sell its gear. That seems unlikely, but I’m 40 now and less trusting than I used to be.