RAID (Redundant Array of Inexpensive Disks)
RAID stands for Redundant Array of Inexpensive Disks, a name that was later reinterpreted as Redundant Array of Independent Disks. The technology is now used by almost every IT organization looking for data redundancy and better performance. It combines multiple available disks into one or more logical drives and, depending on the RAID level used, gives you the ability to survive one or more drive failures.
RAID is a way of storing the same data in different places on multiple hard disks or solid-state drives (SSDs) to protect data in the case of a drive failure. There are different RAID levels, however, and not all have the goal of providing redundancy.
RAID organizes disks into groups, also called sets or arrays. A combination of drives forms a RAID array (or RAID set). An array needs a minimum of 2 disks connected to a RAID controller to form a logical volume, and more drives can be added to the group. Only one RAID level can be applied to a given group of disks. RAID is used when we need better performance, redundancy, or both than a single disk can offer.
Software RAID and Hardware RAID
Software RAID generally has lower performance because it consumes resources from the host. The RAID software must be loaded before data can be read from software RAID volumes, and the OS has to boot before it can load that software. Software RAID needs no dedicated physical hardware, so it requires no extra investment.
Hardware RAID has higher performance. It uses a dedicated RAID controller, usually built as a PCI Express card, so it does not use host resources. The controller has NVRAM to cache reads and writes. If power fails, for example during a rebuild, a battery backup preserves the contents of the cache. At large scale, hardware RAID requires a very costly investment.
Featured Concepts of RAID
1. Parity regenerates lost content from saved parity information. RAID 5 and RAID 6 are based on parity.
2. Striping splits data into pieces and spreads them evenly across multiple disks, so no single disk holds the full data. If we use 3 disks, a third of our data will be on each disk.
3. Mirroring is used in RAID 1 and RAID 10. Mirroring makes a copy of the same data. In RAID 1, the same content is saved to the other disk too.
4. A hot spare is a spare drive in the server that can automatically replace a failed drive. If any drive in the array fails, the hot spare is used in its place and the array rebuilds onto it automatically.
5. A chunk is the unit of data written to one disk before moving on to the next. The chunk size starts at a minimum of 4KB. Choosing a chunk size that suits the workload can increase I/O performance.
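To make striping and chunks more concrete, here is a minimal sketch in Python. It is only an illustration, not how a real controller is implemented: the function names `stripe` and `unstripe` are made up for this example, and the "disks" are plain Python lists. It assumes chunks are dealt out to the disks in round-robin order.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them out round-robin."""
    if num_disks < 2:
        raise ValueError("striping needs at least 2 disks")
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, and so on, wrapping around.
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading the chunks back in round-robin order."""
    out = bytearray()
    rows = max(len(disk) for disk in disks)
    for row in range(rows):
        for disk in disks:
            if row < len(disk):  # the last row may not reach every disk
                out += disk[row]
    return bytes(out)


disks = stripe(b"ABCDEFGHIJKL", num_disks=3, chunk_size=2)
print(disks)  # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF', b'KL']]
print(unstripe(disks))  # b'ABCDEFGHIJKL'
```

Each of the 3 disks ends up with a third of the data, and reading it back needs all of them, which is why striping alone gives no protection against a drive failure.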
RAID comes in various levels. Here we will look only at the RAID levels that are most used in real environments:
RAID 0 = Striping
RAID 1 = Mirroring
RAID 5 = Single Distributed Parity
RAID 6 = Double Distributed Parity
RAID 10 = Combination of Mirroring & Striping (Nested RAID)
RAID 0
This level stripes the data equally across all available drives, giving very high read and write performance but offering no fault tolerance or redundancy. Because it provides none of the redundancy that RAID is named for, it cannot be considered by an organization looking for redundancy. Instead, it is preferred where high performance is required.
Calculation:
No. of Disk: 5
Size of each disk: 100GB
Usable Disk size: 500GB
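The same arithmetic can be written as a small Python helper. This is only a sketch using the standard capacity formulas for the RAID levels covered in this article; the function name `usable_capacity` is made up for the example, and it assumes all disks in the array are the same size.

```python
def usable_capacity(level: str, num_disks: int, disk_size_gb: int) -> int:
    """Return the usable size in GB of an array of equal-sized disks."""
    minimum_disks = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level == "10" and num_disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks")

    if level == "0":
        return num_disks * disk_size_gb        # every disk holds data
    if level == "1":
        return disk_size_gb                    # every disk holds the same copy
    if level == "5":
        return (num_disks - 1) * disk_size_gb  # one disk's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_size_gb  # two disks' worth of parity
    return (num_disks // 2) * disk_size_gb     # RAID 10: half the disks are mirrors


print(usable_capacity("0", 5, 100))  # 500
print(usable_capacity("1", 2, 100))  # 100
```

For RAID 0 with 5 disks of 100GB this gives the 500GB shown above, because no space is set aside for redundancy.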
Pros | Cons |
Data is striped across multiple drives | No support for Data Redundancy |
Disk space is fully utilized | No support for Fault Tolerance |
Minimum 2 drives required | No error detection mechanism |
High performance | Failure of any disk results in complete data loss in the respective array |
RAID 1
This level mirrors the data on drive 1 to drive 2. It offers 100% redundancy, as the array will continue to work even if either disk fails. An organization looking for better redundancy can opt for this solution, but cost can become a factor because only half of the raw capacity is usable.
Calculation:
No. of Disk: 2
Size of each disk: 100GB
Usable Disk size: 100GB
Pros | Cons |
Performs mirroring of data, i.e. identical data from one drive is written to another drive for redundancy | Expense is higher (1 extra drive required per drive for mirroring) |
High read speed as either disk can be used if one disk is busy | Slow write performance as all drives have to be updated |
Array will function even if any one of the drives fails | |
Minimum 2 drives required | |
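A toy model in Python can show why a mirror keeps working after one drive fails. This is a simplified sketch, not a real driver: the class `MirroredPair` and its methods are invented for the example, and each "disk" is just a dictionary that maps a block number to its data.

```python
class MirroredPair:
    """Two disks that always receive the same writes (RAID 1 style)."""

    def __init__(self) -> None:
        self.disks: list[dict[int, bytes]] = [{}, {}]
        self.failed = [False, False]

    def fail(self, index: int) -> None:
        """Mark one disk as failed."""
        self.failed[index] = True

    def write(self, block: int, data: bytes) -> None:
        # Every healthy disk has to be updated, which is why writes are slower.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy disk can answer, because each holds a full copy.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both disks have failed, data is lost")


mirror = MirroredPair()
mirror.write(0, b"payroll")
mirror.fail(0)          # the first drive dies
print(mirror.read(0))   # b'payroll'
```

The read still succeeds after the failure because the second drive holds an identical copy. Only when both drives fail is the data gone.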
RAID 5 (or) Distributed Parity
RAID 5 is widely used at the enterprise level. It works by the distributed parity method: parity information is used to rebuild lost data from the information left on the remaining good drives, which protects our data from a drive failure. Assume we have 4 drives. If one drive fails, we replace it and rebuild the replacement drive from the parity information. The parity information is stored across all 4 drives. If we have 4 hard drives of 1TB each, about 256GB on each drive holds parity and the other 768GB on each drive is available to users, which gives 3TB of usable space. RAID 5 can survive a single drive failure. If more than 1 drive fails, data is lost.
* Excellent performance.
* Read speed is very good.
* Write speed is average, and slow if we don't use a hardware RAID controller.
* Rebuilds from the parity information spread across all drives.
* Fault tolerant against a single drive failure.
* 1 disk's worth of space is used for parity.
* Can be used in file servers, web servers, and very important backups.
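The rebuild from parity can be illustrated with XOR, which is the operation commonly used for single parity. The sketch below is a simplified model of one stripe across 4 drives (3 data blocks plus 1 parity block). The helper name `xor_blocks` is made up for the example, and it ignores how a real array rotates the parity block from drive to drive.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, and the parity block computed from them.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# The drive holding the second block fails. XOR the surviving blocks
# with the parity block to regenerate what was lost.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

If a second block were lost as well, the XOR would no longer have enough information to recover either one, which matches the single-drive limit of RAID 5.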
RAID 6 (or) Double Distributed Parity
RAID 6 is the same as RAID 5 but with two sets of distributed parity. It is mostly used in arrays with a large number of drives. We need a minimum of 4 drives, and even if 2 drives fail we can rebuild the data onto the replacement drives. Writes are much slower than with RAID 5, because two sets of parity have to be calculated and written for every write. Speed will be average when we use a hardware RAID controller. If we have 6 hard drives of 1TB each, the space of 4 drives will be used for data and the space of 2 drives will be used for parity.
* Poor performance.
* Read performance is good.
* Write performance is poor if we don't use a hardware RAID controller.
* Rebuilds from 2 sets of parity information.
* Fault tolerant against up to 2 drive failures.
* 2 disks' worth of space is used for parity.
* Can be used in large arrays.
* Can be used for backups, video streaming, and large-scale deployments.
RAID 10 (or) Mirror & Stripe
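RAID 10 is also written as RAID 1+0. It is often confused with RAID 0+1 (RAID 01), which combines the same two techniques in the opposite order. Both do the work of mirroring and striping. In RAID 10, mirroring comes first and striping second: drives are paired into mirrors, and data is striped across the pairs. In RAID 01, striping comes first and mirroring second: drives are grouped into stripe sets, and the sets mirror each other. RAID 10 is generally considered better than RAID 01 because it survives more combinations of drive failures.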
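The difference between the two layouts can be checked by counting which double drive failures each one survives. The Python sketch below is a simplified model, not a description of any particular controller: the function names are made up, drives are numbered from 0, and it assumes a RAID 01 stripe set becomes unusable as soon as any one of its drives fails.

```python
from itertools import combinations


def raid10_survives(failed: set[int], num_disks: int) -> bool:
    """Drives (0,1), (2,3), ... are mirror pairs; each pair needs one live drive."""
    return all(not {d, d + 1} <= failed for d in range(0, num_disks, 2))


def raid01_survives(failed: set[int], num_disks: int) -> bool:
    """The two halves are stripe sets that mirror each other; one must be intact."""
    half = num_disks // 2
    first_set = set(range(half))
    second_set = set(range(half, num_disks))
    return not (failed & first_set) or not (failed & second_set)


def count_survivable(survives, num_disks: int, failures: int = 2) -> tuple[int, int]:
    """Return (survivable cases, total cases) for the given number of failures."""
    cases = [set(c) for c in combinations(range(num_disks), failures)]
    return sum(survives(case, num_disks) for case in cases), len(cases)


print(count_survivable(raid10_survives, 4))  # (4, 6)
print(count_survivable(raid01_survives, 4))  # (2, 6)
```

With 4 drives, RAID 10 survives 4 of the 6 possible double failures under this model, while RAID 01 survives only 2 of them.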