|
ABSTRACT:
Many RAID (redundant array of inexpensive disks) storage systems employ data replication or error correction coding to support automatic recovery of data when disk drives fail, but most still require drive maintenance. Most often, maintenance involves hot-plug drive replacement, which initiates data migration and restoration from replicated sources, or error correction recoding and recovery, after a single fault. Longer rebuild times increase the risk that a second fault occurs before recovery completes, causing data loss. To minimize rebuild time and reduce this risk, replacement disk drives must be kept on hand and arrays must be closely monitored. Given the cost of stocking replacement disk drives and of operator monitoring, Atrato, Inc. has researched the concept of building spare capacity into a SAID (self-maintaining array of identical disks) for fully automatic fail-in-place recovery that requires no monitoring and minimizes exposure to data loss. This article provides an overview of the Atrato system's approach to eliminating drive tending and minimizing the risk of data loss over a three-year operational lifetime. The design provides superior MTTDL (mean time to data loss), high service availability, lower cost of ownership, and minimal spare capacity requirements, and it enables deployments with mostly unattended operation.
|