6.2. Managing RAID
Redundant Arrays of Inexpensive Disks (RAID) is a technology for boosting storage performance and reducing the risk of data loss due to disk error. It works by storing data on multiple disk drives and is well supported by Fedora. It's a good idea to configure RAID on any system used for serious work.
6.2.1. How Do I Do That?
RAID can be managed by the kernel, by the kernel working with the motherboard BIOS, or by a separate computer on an add-in card. RAID managed by the BIOS is called dmraid; while supported by Fedora Core, it does not provide any significant benefits over RAID managed solely by the kernel on most systems, since all the work is still performed by the main CPU.
Add-in cards that contain their own CPU and battery-backed RAM can reduce the load of RAID processing on the main CPU. However, on a modern system, RAID processing takes at most 3 percent of the CPU time, so the expense of a separate, dedicated RAID processor is wasted on all but the highest-end servers. So-called RAID cards without a CPU simply provide additional disk controllers, which are useful because each disk in a RAID array should ideally have its own disk-controller channel.
There are six "levels" of RAID that are supported by the kernel in Fedora Core, as outlined in Table 6-3.
For many desktop configurations, RAID level 1 (RAID 1) is appropriate because it can be set up with only two drives. For servers, RAID 5 or 6 is commonly used.
Although Table 6-3 specifies the number of drives required by each RAID level, the Linux RAID system is usually used with disk partitions, so a partition from each of several disks can form one RAID array, and another set of partitions from those same drives can form another RAID array.
RAID arrays should ideally be set up during installation, but it is possible to create them after the fact. The mdadm command is used for all RAID administration operations; no graphical RAID administration tools are included in Fedora.
6.2.1.1. Displaying Information About the Current RAID Configuration
The fastest way to see the current RAID configuration and status is to display the contents of /proc/mdstat:
$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc1 hda1
      102144 blocks [2/2] [UU]

md1 : active raid1 hdc2 hda3
      1048576 blocks [2/2] [UU]

md2 : active raid1 hdc3
      77023232 blocks [2/1] [_U]
You can get more detailed information about RAID devices using the mdadm command with the -D (detail) option. Let's look at md0 and md2:
# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Aug 9 02:16:43 2004
     Raid Level : raid1
     Array Size : 102144 (99.75 MiB 104.60 MB)
    Device Size : 102144 (99.75 MiB 104.60 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Mar 28 04:04:22 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : dd2aabd5:fb2ab384:cba9912c:df0b0f4b
         Events : 0.3275

    Number   Major   Minor   RaidDevice State
       0       3        1        0      active sync   /dev/hda1
       1      22        1        1      active sync   /dev/hdc1
# mdadm -D /dev/md2
/dev/md2:
        Version : 00.90.03
  Creation Time : Mon Aug 9 02:16:19 2004
     Raid Level : raid1
     Array Size : 77023232 (73.46 GiB 78.87 GB)
    Device Size : 77023232 (73.46 GiB 78.87 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Mar 28 15:36:04 2006
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 31c6dbdc:414eee2d:50c4c773:2edc66f6
         Events : 0.19023894

    Number   Major   Minor   RaidDevice State
       0       0        0        -      removed
       1      22        3        1      active sync   /dev/hdc3
Note that md2 is marked as degraded because one of the devices is missing.
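If you have several arrays, it can be handy to pick out the degraded ones automatically. The following is a minimal sketch, not part of mdadm: the function name degraded_arrays is my own, and it assumes the /proc/mdstat layout shown in the listing above, where an underscore in the status map (such as [_U]) marks a missing element.

```shell
#!/bin/bash
# degraded_arrays (hypothetical helper): read /proc/mdstat-style text on
# stdin and print the name of every array whose status map contains "_".
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the current array name
        /\[[U_]+\]/ {                      # the line carrying [UU], [_U], ...
            if (name != "" && $NF ~ /_/) print name
            name = ""
        }
    '
}

# Sample input copied from the listing above.
# On a live system, use: degraded_arrays < /proc/mdstat
sample='Personalities : [raid1]
md0 : active raid1 hdc1 hda1
      102144 blocks [2/2] [UU]
md1 : active raid1 hdc2 hda3
      1048576 blocks [2/2] [UU]
md2 : active raid1 hdc3
      77023232 blocks [2/1] [_U]'

printf '%s\n' "$sample" | degraded_arrays
```

With the sample data, the only array reported is md2.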
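6.2.1.2. Creating a RAID array
To create a RAID array, use the mdadm command with the --create option:
# mdadm --create -n 2 -l raid1 /dev/md0 /dev/sdb1 /dev/sdc1
mdadm: array /dev/md0 started.
There are a lot of arguments used here:
--create: create a new array
-n 2: the number of active elements in the array
-l raid1: the RAID level
/dev/md0: the md device to be created
/dev/sdb1 /dev/sdc1: the partitions that will serve as the elements of the array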
/proc/mdstat shows the configuration of /dev/md0:
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1 sdb1
      63872 blocks [2/2] [UU]

unused devices: <none>
A RAID 5 array is created in the same way, here with three elements:
# mdadm --create -n 3 -l raid5 /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdf1
mdadm: largest drive (/dev/sdb1) exceed size (62464K) by more than 1%
Continue creating array? y
mdadm: array /dev/md0 started.
Note that RAID expects all of the devices to be the same size. If they are not, the array will use only the amount of storage equal to the smallest partition on each of the devices; for example, if given partitions that are 50 GB, 47.5 GB, and 52 GB in size, the RAID system will use 47.5 GB in each of the three partitions, wasting 7 GB of disk space (2.5 GB on the first partition and 4.5 GB on the third). If the variation between devices is more than 1 percent, as in this case, mdadm will prompt you to confirm that you're aware of the difference (and therefore the wasted storage space).
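The arithmetic is easy to check before you commit to a set of partitions. This is a rough sketch only: raid_sizes is a hypothetical helper of my own, sizes are whole megabytes, and it uses the standard capacity rules (RAID 0 uses every element, RAID 1 stores one copy, RAID 4 and 5 give up one element to parity, RAID 6 gives up two). It ignores the small amount of space taken by the RAID superblock.

```shell
#!/bin/bash
# raid_sizes LEVEL SIZE_MB... (hypothetical helper)
# Prints the space used on each element, the usable array capacity,
# and the space left unused on the larger elements, all in MB.
raid_sizes() {
    local level=$1; shift
    local n=$# min=$1 total=0 s usable
    for s in "$@"; do
        if (( s < min )); then min=$s; fi
        total=$(( total + s ))
    done
    case $level in
        0)   usable=$(( n * min )) ;;        # striping, no redundancy
        1)   usable=$min ;;                  # every element holds a full copy
        4|5) usable=$(( (n - 1) * min )) ;;  # one element's worth of parity
        6)   usable=$(( (n - 2) * min )) ;;  # two elements' worth of parity
        *)   echo "unsupported level: $level" >&2; return 1 ;;
    esac
    echo "member=$min usable=$usable wasted=$(( total - n * min ))"
}

# The three partition sizes from the example above, as a RAID 5 array
raid_sizes 5 50000 47500 52000
```

For the three partitions in the example, this prints member=47500 usable=95000 wasted=7000.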
Once the array has been created, you can create a filesystem on it:
# mkfs -t ext3 /dev/md0
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
16000 inodes, 63872 blocks
3193 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=65536000
8 block groups
8192 blocks per group, 8192 fragments per group
2000 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345

Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
Then mount it and use it:
# mkdir /mnt/raid
# mount /dev/md0 /mnt/raid
Alternatively, you can use the array as a physical volume under LVM:
# pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created
# vgcreate test /dev/md0
  Volume group "test" successfully created
# lvcreate test --name mysql --size 60M
  Logical volume "mysql" created
# mkfs -t ext3 /dev/test/mysql
mke2fs 1.38 (30-Jun-2005)
...(Lines skipped)...
This filesystem will be automatically checked every 36 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
# mkdir /mnt/mysql
# mount /dev/test/mysql /mnt/mysql
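6.2.1.3. Handling a drive failure
To simulate the failure of an array element, use the --fail option:
# mdadm --fail /dev/md0 /dev/sdc1
mdadm: set /dev/sdc1 faulty in /dev/md0
In /proc/mdstat, the failed element is marked with the symbol (F):
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1(F) sdb1
      63872 blocks [2/1] [U_]

unused devices: <none>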
To place the "failed" element back into the array, remove it and add it again:
# mdadm --remove /dev/md0 /dev/sdc1
mdadm: hot removed /dev/sdc1
# mdadm --add /dev/md0 /dev/sdc1
mdadm: re-added /dev/sdc1
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1 sdb1
      63872 blocks [2/1] [U_]
      [>....................]  recovery = 0.0% (928/63872) finish=3.1min speed=309K/sec

unused devices: <none>
If the drive had really failed (instead of being subject to a simulated failure), you would replace the drive after removing it from the array and before adding the new one.
If you check /proc/mdstat a short while after re-adding the drive to the array, you can see that the RAID system automatically rebuilds the array by copying data from the good drive(s) to the new drive:
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1 sdb1
      63872 blocks [2/1] [U_]
      [=============>.......]  recovery = 65.0% (42496/63872) finish=0.8min speed=401K/sec

unused devices: <none>
mdadm -D also reports the progress of the rebuild:
# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Mar 30 01:01:00 2006
     Raid Level : raid1
     Array Size : 63872 (62.39 MiB 65.40 MB)
    Device Size : 63872 (62.39 MiB 65.40 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Mar 30 01:48:39 2006
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 65% complete

           UUID : b7572e60:4389f5dd:ce231ede:458a4f79
         Events : 0.34

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      spare rebuilding   /dev/sdc1
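If you want to track a rebuild from a script, the percentage can be pulled out of /proc/mdstat. This is a sketch under the assumption that the recovery line looks like the ones shown above; rebuild_progress is a name I made up, not an mdadm feature.

```shell
#!/bin/bash
# rebuild_progress (hypothetical helper): read /proc/mdstat-style text on
# stdin and print "ARRAY PERCENT" for each array that is recovering.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the current array name
        / recovery = / {
            for (i = 1; i <= NF; i++)        # the percentage is the field ending in %
                if ($i ~ /%$/) print name, $i
        }
    '
}

# Sample input copied from the listing above.
# On a live system, use: rebuild_progress < /proc/mdstat
sample='Personalities : [raid1]
md0 : active raid1 sdc1 sdb1
      63872 blocks [2/1] [U_]
      [=============>.......]  recovery = 65.0% (42496/63872) finish=0.8min speed=401K/sec

unused devices: <none>'

printf '%s\n' "$sample" | rebuild_progress
```

For the sample data, the output is md0 65.0%. An array that is not rebuilding produces no output.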
6.2.1.4. Stopping and restarting a RAID array
A RAID array can be stopped anytime that it is not in use, which is useful if you have built an array incorporating removable or external drives that you want to disconnect. If you're using the RAID device as an LVM physical volume, you'll need to deactivate the volume group so the device is no longer considered to be in use:
# vgchange test -an
  0 logical volume(s) in volume group "test" now active
You can then stop the RAID device:
# mdadm --stop /dev/md0
The two steps above will automatically be performed when the system is shut down.
To restart the array, use the --assemble option:
# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
mdadm: /dev/md0 has been started with 2 drives.
mdadm -D confirms that the array is running again:
# mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Mar 30 02:09:14 2006
     Raid Level : raid1
     Array Size : 63872 (62.39 MiB 65.40 MB)
    Device Size : 63872 (62.39 MiB 65.40 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Mar 30 02:19:00 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 5fccf106:d00cda80:daea5427:1edb9616
         Events : 0.18

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
Arrays can also be described in the /etc/mdadm.conf file:
DEVICE partitions
MAILADDR root
ARRAY /dev/md0 uuid=c27420a7:c7b40cc9:3aa51849:99661a2e
In this file, the DEVICE line identifies the devices to be scanned (all partitions of all storage devices in this case), and the ARRAY lines identify each RAID array that is expected to be present. This ensures that the RAID arrays identified by scanning the partitions will always be assigned the same md device numbers, which is useful if more than one RAID array exists in the system. In the mdadm.conf files created during installation by Anaconda, the ARRAY lines contain optional level= and num-devices= entries (see the next section).
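Copying a UUID by hand is error-prone, so it may be worth generating the ARRAY line from the mdadm -D output. The helper below is a sketch of my own (array_line is not an mdadm command) and assumes the "UUID : value" layout shown in the listings above. mdadm can also print ARRAY lines itself with mdadm --detail --scan.

```shell
#!/bin/bash
# array_line DEVICE (hypothetical helper): read "mdadm -D" output on stdin
# and print a matching ARRAY line for mdadm.conf.
array_line() {
    awk -v dev="$1" '$1 == "UUID" && $2 == ":" { print "ARRAY " dev " uuid=" $3 }'
}

# Sample input: two lines taken from the mdadm -D listing above.
# On a live system, use: mdadm -D /dev/md0 | array_line /dev/md0
sample='          State : clean
           UUID : 5fccf106:d00cda80:daea5427:1edb9616'

printf '%s\n' "$sample" | array_line /dev/md0
```

Review the generated line before appending it to /etc/mdadm.conf, since a wrong UUID will prevent the array from being matched.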
If the array is being used as an LVM physical volume, reactivate the volume group:
# vgchange test -ay
  1 logical volume(s) in volume group "test" now active
6.2.1.5. Monitoring RAID arrays
mdadm can monitor RAID arrays and send email when an event occurs. The destination address is set by the MAILADDR line in /etc/mdadm.conf:
# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR raid-alert
ARRAY /dev/md0 level=raid1 num-devices=2 uuid=dd2aabd5:fb2ab384:cba9912c:df0b0f4b
ARRAY /dev/md1 level=raid1 num-devices=2 uuid=2b0846b0:d1a540d7:d722dd48:c5d203e4
ARRAY /dev/md2 level=raid1 num-devices=2 uuid=31c6dbdc:414eee2d:50c4c773:2edc66f6
When mdadm.conf is configured by Anaconda, the email address is set to root. It is a good idea to set this to an email alias, such as raid-alert, and configure the alias in the /etc/aliases file to send mail to whatever destinations are appropriate:
raid-alert: chris, firstname.lastname@example.org
In this case, email will be sent to the local mailbox chris, as well as to a cell phone.
When an event occurs, such as a drive failure, mdadm sends an email message like this:
From email@example.com Thu Mar 30 09:43:54 2006
Date: Thu, 30 Mar 2006 09:43:54 -0500
From: mdadm monitoring <firstname.lastname@example.org>
To: email@example.com
Subject: Fail event on /dev/md0:bluesky.fedorabook.com

This is an automatically generated mail message from mdadm
running on bluesky.fedorabook.com

A Fail event had been detected on md device /dev/md0.

It could be related to component device /dev/sdc1.

Faithfully yours, etc.
To have mdadm run a program when an event is detected, add a PROGRAM line to /etc/mdadm.conf:
# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR raid-alert
PROGRAM /usr/local/sbin/mdadm-event-handler
ARRAY /dev/md0 level=raid1 num-devices=2 uuid=dd2aabd5:fb2ab384:cba9912c:df0b0f4b
ARRAY /dev/md1 level=raid1 num-devices=2 uuid=2b0846b0:d1a540d7:d722dd48:c5d203e4
ARRAY /dev/md2 level=raid1 num-devices=2 uuid=31c6dbdc:414eee2d:50c4c773:2edc66f6
Only one program name can be given. When an event is detected, that program will be run with three arguments: the event, the RAID device, and (optionally) the RAID element. If you wanted a verbal announcement to be made, for example, you could use a script like this:
#!/bin/bash
#
# mdadm-event-handler :: announce RAID events verbally
#

# Set up the phrasing for the optional element name
E=""
if [ -n "$3" ]
then
    E=", element $3"
fi

# Separate words (RebuildStarted -> Rebuild Started)
T=$(echo "$1" | sed "s/\([A-Z]\)/ \1/g")

# Make the voice announcement and then repeat it
echo "Attention! RAID event: $T on $2 $E" | festival --tts
sleep 2
echo "Repeat: $T on $2 $E" | festival --tts
When a drive fails, this script will announce something like "Attention! RAID event: Fail on /dev/md0, element /dev/sdc1" using the Festival speech synthesizer. It will also announce the start and completion of array rebuilds and other important milestones (make sure you keep the volume turned up).
6.2.1.6. Setting up a hot spare
When a system with RAID 1 or higher experiences a disk failure, the data on the failed drive will be recalculated from the remaining drives. However, data access will be slower than usual, and if any other drives fail, the array will not be able to recover. Therefore, it's important to replace a failed disk drive as soon as possible.
When a server is heavily used or is in an inaccessible location, such as an Internet colocation facility, it makes sense to equip it with a hot spare. The hot spare is installed but unused until another drive fails, at which point the RAID system automatically uses it to replace the failed drive.
To create an array with a hot spare, use the -x option to specify the number of spare devices:
# mdadm --create -l raid1 -n 2 -x 1 /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdf1
mdadm: array /dev/md0 started.
$ cat /proc/mdstat
Personalities : [raid1] [raid5] [raid4]
md0 : active raid1 sdf1(S) sdc1 sdb1
      62464 blocks [2/2] [UU]

unused devices: <none>
Notice that /dev/sdf1 is marked with the symbol (S) indicating that it is the hot spare.
If an active element in the array fails, the hot spare will take over automatically:
$ cat /proc/mdstat
Personalities : [raid1] [raid5] [raid4]
md0 : active raid1 sdf1 sdc1(F) sdb1
      62464 blocks [2/1] [U_]
      [=>...................]  recovery = 6.4% (4224/62464) finish=1.5min speed=603K/sec

unused devices: <none>
When you remove, replace, and re-add the failed drive, it will become the hot spare:
# mdadm --remove /dev/md0 /dev/sdc1
mdadm: hot removed /dev/sdc1
...(Physically replace the failed drive)...
# mdadm --add /dev/md0 /dev/sdc1
mdadm: re-added /dev/sdc1
# cat /proc/mdstat
Personalities : [raid1] [raid5] [raid4]
md0 : active raid1 sdc1(S) sdf1 sdb1
      62464 blocks [2/2] [UU]

unused devices: <none>
Likewise, to add a hot spare to an existing array, simply add an extra drive:
# mdadm --add /dev/md0 /dev/sdh1
mdadm: added /dev/sdh1
Since hot spares are not used until another drive fails, it's a good idea to spin them down (stop the motors) to prolong their life. This command will program all of your drives to stop spinning after 15 minutes of inactivity (on most systems, only the hot spares will ever be idle for that length of time):
# hdparm -S 180 /dev/[sh]d[a-z]
To apply this setting each time the system boots, add the command to the end of /etc/rc.d/rc.local:
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local
hdparm -S 180 /dev/[sh]d[a-z]
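The value 180 is not a number of seconds or minutes. According to the hdparm manual page, -S values from 1 to 240 are multiples of 5 seconds, and values from 241 to 251 are multiples of 30 minutes, so 180 means 900 seconds, or 15 minutes. Here is a small sketch that converts a timeout in minutes to the corresponding value; spindown_value is a hypothetical helper, not part of hdparm, and it handles only timeouts that can be encoded exactly.

```shell
#!/bin/bash
# spindown_value MINUTES (hypothetical helper): print the "hdparm -S"
# value that encodes the given idle timeout.
spindown_value() {
    local minutes=$1
    if (( minutes >= 1 && minutes <= 20 )); then
        echo $(( minutes * 60 / 5 ))       # values 1..240: units of 5 seconds
    elif (( minutes >= 30 && minutes <= 330 && minutes % 30 == 0 )); then
        echo $(( 240 + minutes / 30 ))     # values 241..251: units of 30 minutes
    else
        echo "no exact hdparm -S value for $minutes minutes" >&2
        return 1
    fi
}

spindown_value 15    # the 15-minute timeout used above
```

For 15 minutes this prints 180, matching the command above; for a one-hour timeout it prints 242.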
6.2.1.7. Monitoring drive health
Fedora provides smartd for SMART disk monitoring. The configuration file /etc/smartd.conf is configured by the Anaconda installer to monitor each drive present in the system and to report only imminent (within 24 hours) drive failure to the root email address:
/dev/hda -H -m root
/dev/hdb -H -m root
/dev/hdc -H -m root
(I've left out the many comment lines that are in this file.)
It is a good idea to change the email address to the same alias used for your RAID error reports:
/dev/hda -H -m raid-alert
/dev/hdb -H -m raid-alert
/dev/hdc -H -m raid-alert
6.2.2. How Does It Work?
Fedora's RAID levels 4 and 5 use parity information to provide redundancy. Parity is calculated using the exclusive-OR function, as shown in Table 6-4.
Notice that the total number of 1 bits in each row is an even number. You can determine the contents of any column based on the values in the other two columns (A = B XOR C and B = A XOR C); in this way, the RAID system can determine the content of any one failed drive. This approach will work with any number of drives.
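You can try the recovery arithmetic yourself with the shell's built-in XOR operator (^). This is only an illustration of the principle, using three one-byte "data blocks" with arbitrary values that I chose; the kernel works on whole disk blocks.

```shell
#!/bin/bash
# Three data blocks, one byte each for brevity (arbitrary example values)
d0=0xA5 d1=0x3C d2=0x0F

# The parity block is the XOR of all of the data blocks
parity=$(( d0 ^ d1 ^ d2 ))

# Pretend the drive holding d1 has failed: XOR the surviving
# data blocks with the parity block to get the lost data back
recovered=$(( d0 ^ d2 ^ parity ))

printf 'parity=0x%02X recovered=0x%02X\n' "$parity" "$recovered"
```

The script prints parity=0x96 recovered=0x3C; the recovered value is the original content of d1. The same calculation works no matter which single block is lost.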
Parity calculations are performed using the CPU's vector instructions (MMX/3DNow/SSE/AltiVec) whenever possible. Even an old 400 MHz Celeron processor can calculate RAID 5 parity at a rate in excess of 2 GB per second.
RAID 6 uses a similar but more advanced error-correcting code (ECC) that takes two bits of data for each row. This code permits recovery from the failure of any two drives, but the calculations run about one-third slower than the parity calculations. In a high-performance context, it may be better to use RAID 5 with a hot spare instead of RAID 6; the protection will be almost as good and the performance will be slightly higher.
6.2.3. What About...
6.2.3.1. ...booting from a RAID array?
During the early stages of the boot process, no RAID driver is available. However, in a RAID 1 (mirroring) array, each element contains a full and complete copy of the data in the array and can be used as though it were a simple volume. Therefore, only RAID 1 can be used for the /boot filesystem.
6.2.3.2. ...mixing and matching USB flash drives, USB hard disks, SATA, SCSI, and IDE/ATA drives?
6.2.3.3. ...mirroring to a remote drive as part of a disaster-recovery plan?
Daily disk or tape backups can be up to 24 hours out of date, which can hamper recovery when your main server is subject to a catastrophic disaster such as fire, circuit-frying power-supply-unit failure, or theft. Up-to-the-minute data backup for rapid disaster recovery requires the use of a remote storage mirror.
iSCSI (SCSI over TCP/IP) is a storage area network technology that is an economical alternative to Fibre Channel and other traditional SAN technologies. Since it is based on TCP/IP, it is easy to route over long distances, making it ideal for remote mirroring.
Fedora Core includes an iSCSI initiator, the software necessary to remotely access a drive using the iSCSI protocol. The package name is iscsi-initiator-utils. Obviously, you'll need a remote iSCSI drive in order to do remote mirroring, and you'll need to know the portal IP address or hostname on the remote drive.
Create the file /etc/initiatorname.iscsi, containing one line:
This configures an iSCSI Qualified Name (IQN) that is globally unique. The IQN consists of the letters iqn, a period, the year and month in which your domain was registered (2006-04), a period, your domain name with the elements reversed, a colon, and a string that you make up (which must be unique within your domain).
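To make the naming rule concrete, here is a sketch that assembles an IQN from its parts. The function name make_iqn and the local name bluesky are illustrative choices of my own; substitute your own registration date, domain, and unique string. The InitiatorName= prefix is the keyword used in the initiator name file.

```shell
#!/bin/bash
# make_iqn YEAR-MONTH DOMAIN LOCAL_NAME (hypothetical helper)
# Builds an iSCSI Qualified Name: iqn.<date>.<reversed domain>:<local name>
make_iqn() {
    local date=$1 domain=$2 name=$3 reversed="" part
    local IFS=.                      # split the domain name on periods
    for part in $domain; do
        # prepend each element so that the order ends up reversed
        reversed="$part${reversed:+.$reversed}"
    done
    echo "iqn.$date.$reversed:$name"
}

# Example values only: domain registered in April 2006, local name "bluesky"
echo "InitiatorName=$(make_iqn 2006-04 fedorabook.com bluesky)"
```

With these example values, the line printed is InitiatorName=iqn.2006-04.com.fedorabook:bluesky.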
Next, start the iSCSI daemon:
# service iscsi start
You may see some error messages the first time you start the iscsi daemon; these can be safely ignored.
Now discover the targets available on the remote drive:
# iscsiadm -m discovery -tst -p 172.16.97.2
[f68ace] 172.16.97.2:3260,1 iqn.2006-04.com.fedorabook:remote1-volume1
The options indicate discovery mode, sendtargets (st) discovery type, and the portal address or hostname. The result that is printed shows the IQN of the remote target, including a node record ID at the start of the line (f68ace). The discovered target information is stored in a database for future reference, and the node record ID is the key to accessing this information.
To connect to the remote target, log in using the node record ID:
# iscsiadm -m node --record f68ace --login
The details of the connection are recorded in /var/log/messages:
Mar 30 22:05:18 blacktop kernel: scsi1 : iSCSI Initiator over TCP/IP, v.0.3
Mar 30 22:05:19 blacktop kernel: Vendor: IET Model: VIRTUAL-DISK Rev: 0
Mar 30 22:05:19 blacktop kernel: Type: Direct-Access ANSI SCSI revision: 04
Mar 30 22:05:19 blacktop kernel: SCSI device sda: 262144 512-byte hdwr sectors (134 MB)
Mar 30 22:05:19 blacktop kernel: sda: Write Protect is off
Mar 30 22:05:19 blacktop kernel: SCSI device sda: drive cache: write back
Mar 30 22:05:19 blacktop kernel: SCSI device sda: 262144 512-byte hdwr sectors (134 MB)
Mar 30 22:05:19 blacktop kernel: sda: Write Protect is off
Mar 30 22:05:19 blacktop kernel: SCSI device sda: drive cache: write back
Mar 30 22:05:19 blacktop kernel: sda: sda1
Mar 30 22:05:19 blacktop kernel: sd 14:0:0:0: Attached scsi disk sda
Mar 30 22:05:19 blacktop kernel: sd 14:0:0:0: Attached scsi generic sg0 type 0
Mar 30 22:05:19 blacktop iscsid: picking unique OUI for the same target node name iqn.2006-04.com.fedorabook:remote1-volume1
Mar 30 22:05:20 blacktop iscsid: connection1:0 is operational now
This shows that the new device is accessible as /dev/sda and has one partition (/dev/sda1).
You can now create a local LV that is the same size as the remote drive:
# lvcreate main --name database --size 128M
  Logical volume "database" created
Next, create a RAID 1 array that mirrors the local LV to the remote drive:
# mdadm --create -l raid1 -n 2 /dev/md0 /dev/main/database /dev/sda1
mdadm: array /dev/md0 started.
Then create a filesystem on the array and mount it:
# mkfs -t ext3 /dev/md0
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
32768 inodes, 130944 blocks
6547 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=67371008
16 block groups
8192 blocks per group, 8192 fragments per group
2048 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345, 73729

Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 27 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
# mkdir /mnt/database
# mount /dev/md0 /mnt/database
Any data you write to /mnt/database will be written to both the local volume and the remote drive.
To shut down the remote mirror, unmount the filesystem, stop the array, and log out of the remote target:
# umount /mnt/database
# mdadm --stop /dev/md0
# iscsiadm -m node --record f68ace --logout
The iSCSI configuration file, /etc/iscsid.conf, initially sets node.startup to automatic:
#
# Open-iSCSI default configuration.
# Could be located at /etc/iscsid.conf or ~/.iscsid.conf
#
node.active_cnx = 1
node.startup = automatic
#node.session.auth.username = dima
#node.session.auth.password = aloha
node.session.timeo.replacement_timeout = 0
node.session.err_timeo.abort_timeout = 10
node.session.err_timeo.reset_timeout = 30
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Wait = 0
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.MaxConnections = 0
node.cnx.iscsi.HeaderDigest = None
node.cnx.iscsi.DataDigest = None
node.cnx.iscsi.MaxRecvDataSegmentLength = 65536
#discovery.sendtargets.auth.authmethod = CHAP
#discovery.sendtargets.auth.username = dima
#discovery.sendtargets.auth.password = aloha
Change the node.startup line to read:
node.startup = manual
Once the remote mirror has been configured, you can create a simple script file with the setup commands:
#!/bin/bash
iscsiadm -m node --record f68ace --login
mdadm --assemble /dev/md0 /dev/main/database /dev/sda1
mount /dev/md0 /mnt/database
And another script file with the shutdown commands:
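#!/bin/bash
umount /mnt/database
mdadm --stop /dev/md0
iscsiadm -m node --record f68ace --logout
If you save these scripts as /usr/local/sbin/remote-mirror-start and /usr/local/sbin/remote-mirror-stop, enable read and execute permission on both of them:
# chmod u+rx /usr/local/sbin/remote-mirror-start
# chmod u+rx /usr/local/sbin/remote-mirror-stop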
6.2.3.4. ...using more than one RAID array, but configuring one hot spare to be shared between them?
To share a hot spare, add a spare-group option with the same value to the ARRAY line of each array in /etc/mdadm.conf:
# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR root
ARRAY /dev/md0 spare-group=red uuid=5fccf106:d00cda80:daea5427:1edb9616
ARRAY /dev/md1 spare-group=red uuid=aaf3d1e1:6f7231b4:22ca60f9:00c07dfe
The name of the spare-group does not matter as long as all of the arrays sharing the hot spare have the same value; here I've used red. Ensure that at least one of the arrays has a hot spare and that the size of the hot spare is not smaller than the largest element that it could replace; for example, if each device making up md0 was 10 GB in size, and each element making up md1 was 5 GB in size, the hot spare would have to be at least 10 GB in size, even if it was initially a member of md1.
6.2.3.5. ...configuring the rebuild rate for arrays?
Array rebuilds will usually be performed at a rate of 1,000 to 200,000 KB per second per drive, scheduled in such a way that the impact on application storage performance is minimized. Adjusting the rebuild rate lets you adjust the trade-off between application performance and rebuild duration.
The maximum and minimum rates are held in the pseudo-files /proc/sys/dev/raid/speed_limit_max and /proc/sys/dev/raid/speed_limit_min:
$ cat /proc/sys/dev/raid/speed_limit*
200000
1000
To change a setting, place a new number in the appropriate pseudo-file:
# echo 40000 >/proc/sys/dev/raid/speed_limit_max
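To get a feel for the trade-off, you can estimate how long a rebuild would take at a given rate. This is a back-of-the-envelope sketch: rebuild_seconds is a hypothetical helper, and it assumes the rebuild proceeds at a constant rate, which a busy system will not manage. The size used is that of the md2 array shown earlier (77023232 blocks of 1 KB).

```shell
#!/bin/bash
# rebuild_seconds SIZE_KB RATE_KB_PER_SEC (hypothetical helper)
# Prints the rebuild time in seconds, rounded up, at a constant rate.
rebuild_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

size_kb=77023232    # size of md2 from the earlier listing

echo "at 40000 KB/s: $(rebuild_seconds "$size_kb" 40000) seconds"
echo "at 1000 KB/s:  $(rebuild_seconds "$size_kb" 1000) seconds"
```

At the 40,000 KB per second maximum set above, the rebuild would need at least 1926 seconds (about 32 minutes); if it were held to the 1,000 KB per second minimum, it would take 77024 seconds, or more than 21 hours.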
6.2.3.6. ...simultaneous drive failure?
Sometimes, a drive manufacturer just makes a bad batch of disks, and this has happened more than once. For example, a few years ago, one drive maker used defective plastic to encapsulate the chips on the drive electronics; drives with the defective plastic failed at around the same point in their life cycles, so that several elements of RAID arrays built using these drives would fail within a period of days or even hours. Since most RAID levels provide protection against a single drive failure but not against multiple drive failures, data was lost.
For greatest safety, it's a good idea to buy disks of similar capacity from different drive manufacturers (or at least different models or batches) when building a RAID array, in order to reduce the likelihood of near-simultaneous drive failure.
6.2.4. Where Can I Learn More?