RAID stands for Redundant Array of Inexpensive (or sometimes "Independent") Disks.
RAID is a method of combining several hard disk drives into one logical unit (two or more disks grouped together to appear as a single device to the host system).
RAID technology was developed to address the fault-tolerance and performance limitations of conventional disk storage. It can offer fault tolerance and higher throughput levels than a single hard drive or group of independent hard drives.
While arrays were once considered complex and relatively specialized storage solutions, today they are easy to use and essential for a broad spectrum of client/server applications.
RAID technology was first defined by a group of computer scientists at the University of California at Berkeley in 1987. The scientists studied the possibility of using two or more disks to appear as a single device to the host system.
Although the array's performance was better than that of large, single-disk storage systems, reliability was unacceptably low. To address this, the scientists proposed redundant architectures to provide ways of achieving storage fault tolerance. In addition to defining RAID levels 1 through 5, the scientists also studied data striping -- a non-redundant array configuration that distributes files across multiple disks in an array. Often known as RAID 0, this configuration actually provides no data protection. However, it does offer maximum throughput for some data-intensive applications such as desktop digital video production.
A number of factors are responsible for the growing adoption of arrays for critical network storage.
More and more organizations have created enterprise-wide networks to improve productivity and streamline information flow. While the distributed data stored on network servers provides substantial cost benefits, these savings can be quickly offset if information is frequently lost or becomes inaccessible.
As today's applications create larger files, network storage needs have increased proportionately.
In addition, accelerating CPU speeds have outstripped data transfer rates to storage media, creating bottlenecks in today's systems.
RAID storage solutions overcome these challenges by providing a combination of outstanding data availability, extraordinary and highly scalable performance, high capacity, and recovery with no loss of data or interruption of user access.
By integrating multiple drives into a single array -- which is viewed by the network operating system as a single disk drive -- organizations can create cost-effective, minicomputer-sized solutions of up to a terabyte or more of storage.
There are several different RAID "levels" or redundancy schemes, each with inherent cost, performance, and availability (fault-tolerance) characteristics designed to meet different storage needs. No individual RAID level is inherently superior to any other. Each of the five array architectures is well-suited for certain types of applications and computing environments. For client/server applications, storage systems based on RAID levels 1, 0/1, and 5 have been the most widely used. This is because popular NOSs such as Windows NT® Server and NetWare manage data in ways similar to how these RAID architectures perform.
RAID 0: Data striping without redundancy (no protection).
| DRIVE 1 | DRIVE 2 |
|---|---|
| Data A | Data A |
| Data B | Data B |
| Data C | Data C |
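As a rough illustration of the striping shown above, the following sketch distributes logical blocks round-robin across drives and reads them back in order. The function names `stripe` and `unstripe` and the block labels are hypothetical, chosen only for this example; a real array works on fixed-size stripe units rather than Python lists.

```python
def stripe(blocks, num_drives):
    """Distribute blocks round-robin across num_drives (RAID 0 style, no redundancy)."""
    drives = [[] for _ in range(num_drives)]
    for i, block in enumerate(blocks):
        drives[i % num_drives].append(block)
    return drives


def unstripe(drives):
    """Reassemble the original block order by reading the drives round-robin."""
    blocks = []
    depth = max(len(d) for d in drives)
    for row in range(depth):
        for d in drives:
            if row < len(d):
                blocks.append(d[row])
    return blocks


# Hypothetical blocks: file A, B and C each split into two stripe units.
original = ["A1", "A2", "B1", "B2", "C1", "C2"]
layout = stripe(original, 2)
print(layout)            # one list of blocks per drive
print(unstripe(layout))  # blocks back in their original order
```

Because each drive holds only part of every file, losing any one drive in this layout leaves no way to rebuild the data, which is why RAID 0 offers no protection.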
RAID 1: Disk mirroring.
RAID 2: No practical use.
RAID 3: Byte-level data striping with dedicated parity drive.
RAID 4: Block-level data striping with dedicated parity drive.
RAID 5: Block-level data striping with distributed parity.
| DRIVE 1 | DRIVE 2 | DRIVE 3 |
|---|---|---|
| Parity A | Data A | Data A |
| Data B | Parity B | Data B |
| Data C | Data C | Parity C |
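The rotation of the parity block in the table above can be sketched in a few lines. This assumes the simple scheme the table shows, where the parity for stripe number s lands on drive s modulo the drive count; the function name `raid5_layout` is hypothetical, and real controllers may rotate parity in a different order.

```python
def raid5_layout(num_stripes, num_drives):
    """Return one row per stripe, marking which drive holds that stripe's parity.

    Assumption: parity for stripe s is placed on drive (s % num_drives),
    matching the three-drive table above. Other rotation orders exist.
    """
    rows = []
    for s in range(num_stripes):
        label = chr(ord("A") + s)
        parity_drive = s % num_drives
        row = [
            f"Parity {label}" if d == parity_drive else f"Data {label}"
            for d in range(num_drives)
        ]
        rows.append(row)
    return rows


for row in raid5_layout(3, 3):
    print(" | ".join(row))
```

Spreading parity across all drives avoids making a single dedicated parity drive the bottleneck for every write.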
Combination of RAID 0 (data striping) and RAID 1 (mirroring). RAID 01 (0+1) is a mirrored configuration of two striped sets (mirror of stripes); RAID 10 (1+0) is a stripe across a number of mirrored sets (stripe of mirrors). RAID 10 provides better fault tolerance and rebuild performance than RAID 01. Both array types provide very good to excellent overall performance by combining the speed of RAID 0 with the redundancy of RAID 1 without requiring parity calculations.
| RAID 01 (0+1 mirror of stripes) | | | |
|---|---|---|---|
| DRIVE 1 | DRIVE 2 | DRIVE 3 | DRIVE 4 |
| Data A | Data A | mA | mA |
| Data B | Data B | mB | mB |
| Data C | Data C | mC | mC |
| Original Data | Original Data | Mirrored Data | Mirrored Data |
| RAID 10 (1+0 stripe of mirrors) | | | |
|---|---|---|---|
| DRIVE 1 | DRIVE 2 | DRIVE 3 | DRIVE 4 |
| Data A | mA | Data B | mB |
| Data C | mC | Data D | mD |
| Data E | mE | Data F | mF |
| Original Data | Mirrored Data | Original Data | Mirrored Data |
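One way to see why RAID 10 tolerates failures better than RAID 01 is to enumerate every two-drive failure in the four-drive layouts above. The sketch below uses a simplified model with hypothetical names: drives are numbered 0 to 3, a RAID 01 array is assumed to survive only while at least one whole striped set is untouched, and a RAID 10 array survives while no mirrored pair has lost both members.

```python
from itertools import combinations

# Assumed four-drive layouts (drive indices 0..3), following the tables above.
STRIPE_SETS_01 = [{0, 1}, {2, 3}]   # RAID 01: two striped sets, mirrored
MIRROR_PAIRS_10 = [{0, 1}, {2, 3}]  # RAID 10: two mirrored pairs, striped


def raid01_survives(failed):
    """Survives if at least one striped set has no failed drive."""
    return any(not (s & failed) for s in STRIPE_SETS_01)


def raid10_survives(failed):
    """Survives if no mirrored pair has lost both of its drives."""
    return all(not p <= failed for p in MIRROR_PAIRS_10)


def count_survivable(survives, num_drives=4, num_failed=2):
    """Count the failure combinations the array survives."""
    return sum(
        survives(set(c)) for c in combinations(range(num_drives), num_failed)
    )


print("RAID 01 survivable two-drive failures:", count_survivable(raid01_survives))
print("RAID 10 survivable two-drive failures:", count_survivable(raid10_survives))
```

Under this model both layouts survive any single drive failure, but RAID 10 survives more of the six possible two-drive failures than RAID 01 does.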
There are three primary array implementations: software-based arrays, bus-based array adapters/controllers, and subsystem-based external array controllers. As with the various RAID levels, no one implementation is clearly better than another -- although software-based arrays are rapidly losing favor as high-performance, low-cost array adapters become increasingly available. Each array solution meets different server and network requirements, depending on the number of users, applications, and storage requirements.
It is important to note that all RAID code is based on software. The difference among the solutions is where that software code is executed -- on the host CPU (software-based arrays) or offloaded to an on-board processor (bus-based and external array controllers).
| | Description | Advantages |
|---|---|---|
| Software-based RAID | Primarily used with entry-level servers, software-based arrays rely on a standard host adapter and execute all I/O commands and mathematically intensive RAID algorithms in the host server CPU. This can slow system performance by increasing host PCI bus traffic, CPU utilization, and CPU interrupts. Some NOSs such as NetWare and Windows NT include embedded RAID software. The chief advantage of this embedded RAID software has been its lower cost compared to higher-priced RAID alternatives. However, this advantage is disappearing with the advent of lower-cost, bus-based array adapters. | |
| Hardware-based RAID | Unlike software-based arrays, bus-based array adapters/controllers plug into a host bus slot [typically a 133 MByte (MB)/sec PCI bus] and offload some or all of the I/O commands and RAID operations to one or more secondary processors. Originally used only with mid- to high-end servers due to cost, lower-cost bus-based array adapters are now available specifically for entry-level server network applications. In addition to offering the fault-tolerant benefits of RAID, bus-based array adapters/controllers perform connectivity functions that are similar to standard host adapters. By residing directly on a host PCI bus, they provide the highest performance of all array types. Bus-based arrays also deliver more robust fault-tolerant features than embedded NOS RAID software. As newer, high-end technologies such as Fibre Channel become readily available, the performance advantage of bus-based arrays compared to external array controller solutions may diminish. | |
| External Hardware RAID Card | Intelligent external array controllers "bridge" between one or more server I/O interfaces and single- or multiple-device channels. These controllers feature an on-board microprocessor, which provides high performance and handles functions such as executing RAID software code and supporting data caching. External array controllers offer complete operating system independence, the highest availability, and the ability to scale storage to extraordinarily large capacities (up to a terabyte and beyond). These controllers are usually installed in networks of standalone Intel-based and UNIX-based servers as well as clustered server environments. | |
| | UDMA | SCSI | Fibre Channel |
|---|---|---|---|
| Best Suited For | Low-cost entry level server with limited expandability | Low to high-end server when scalability is desired | Server-to-Server campus networks |
| Advantages | | | |
The concept behind RAID is relatively simple. The fundamental premise is to be able to recover data on-line in the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity can be thought of as the sum of the data held on all the drives in an array. Recovery from a drive failure is achieved by reading the remaining good data and combining it with the parity data stored by the array to work out what is missing. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does not use parity because all data is completely duplicated (mirrored). RAID 0, used only to increase performance, offers no data redundancy at all.
$$
\begin{aligned}
A + B + C + D &= \text{PARITY} \\
1 + 2 + 3 + 4 &= 10 \\
1 + 2 + X + 4 &= 10 \\
7 + X &= 10 \\
X &= 10 - 7 = 3
\end{aligned}
$$

Here X is the missing data and 3 is the recovered data.
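The arithmetic above uses ordinary addition to keep the idea simple. In practice parity is commonly computed with bitwise XOR rather than a sum, because XOR is its own inverse and never overflows. The following is a minimal sketch of that idea, assuming equal-length byte blocks; the function name `xor_parity` and the one-byte sample blocks are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four one-byte data blocks standing in for drives A, B, C and D.
data = [b"\x01", b"\x02", b"\x03", b"\x04"]
parity = xor_parity(data)

# Simulate losing the third drive: XOR the survivors with the parity block.
surviving = data[:2] + data[3:]
recovered = xor_parity(surviving + [parity])

print(recovered == data[2])
```

The same function serves for both computing parity and rebuilding a lost block, which mirrors the "subtract the known values from the total" step in the worked example.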
RAID technology does not prevent drive failures. However, RAID does provide insurance against disk drive failures by enabling real-time data recovery without data loss.
The fault tolerance of arrays can also be significantly enhanced by choosing the right storage enclosure. Enclosures that feature redundant, hot-swappable drives, power supplies, and fans can greatly increase storage subsystem uptime based on a number of widely accepted measures: