Web Hosting Star support

 Post subject: What is RAID and how to configure it, types of RAIDs
PostPosted: January 12, 2006, 2:43 pm 
Site Admin

Joined: March 17, 2004, 9:55 pm
RAID - Redundant Array of Independent Disks

RAID is defined as a Redundant Array of Independent Disks or a Redundant Array of Inexpensive Disks. The key point about RAID is redundancy or fault tolerance, which protects stored data from loss should a hard drive or controller fail. RAID implementations range from simple mirrored hard drives, to multiple RAID 5 arrays striped together to create one or more large virtual hard disks, to storage shared across an enterprise level installation in a Storage Area Network (SAN). In order to implement RAID solutions, special hardware or software is often required.

A RAID implementation is only as good as its backbone. A fault tolerant drive array is only as fault tolerant as the components of which it is made. Unless all the components of the array are fault tolerant, redundant, or hot swappable, the array can still suffer a complete failure. A fault tolerant array should utilize hot swappable and redundant power supplies, hot swappable fans, controllers, and hard drives. In some cases it is even advisable to have fail-over servers.

Software based RAID solutions, like those built into Windows NT Workstation, Windows NT Server, and Windows 2000, draw precious system resources from the host processor(s) to control the RAID storage, while hardware based RAID solutions free the host processor(s) to handle other applications, such as a nonlinear editing program. Anytime RAID is implemented with redundancy or with parity and fault tolerance, maximum data throughput will be lower than RAID without fault tolerance or parity. In video applications, this may or may not be a problem, but there are also ways to increase throughput, if necessary.

Most nonlinear editing systems utilize RAID 0 for maximum performance, accepting the risk of losing all data should a drive fail, while other video systems utilize some form of fault tolerant RAID to prevent such loss. RAID 0 is striped disks only, without fault tolerance. Servers, on the other hand, require data to be available at all times. Servers utilize fault tolerant RAID like RAID 3, 4, 5, and 6, and many typically utilize mirrored disks in addition to the RAID 3, 4, 5, and 6 arrays. There are other "levels" of RAID too, but they tend to be variants of RAID levels 0, 1, 3, 4, 5, or 6. Many RAID controllers do not support RAID 6, and the majority of RAID controllers support either RAID 3 or RAID 4 but not both. In the following sections, RAID will be discussed in depth.

Solumedia follows the RAID Advisory Board definition of RAID which is based on a 1988 paper titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by David A. Patterson, Garth A. Gibson, and Randy H. Katz and presented during the ACM SIGMOD Conference on Management of Data in Chicago, Illinois. This paper has become known as the Berkeley Paper and the RAID types defined are known as Berkeley RAID levels. The consummate source of information on RAID technology is The RAIDbook published by the RAID Advisory Board. It and other sources were used as references in the development of this paper. The RAID Book costs approximately $49.



RAID 0
RAID 0 is not a Berkeley RAID level as discussed in the above 1988 paper because it doesn't offer any protection against hardware failure, but it is still an array of disks and is commonly considered a RAID level. RAID 0 is known as disk striping, in which data is spread, mapped, or interleaved across multiple disks in parallel to speed up the data transfer rate substantially. For the most part, the data transfer rate of a RAID 0 stripe set is the sum of the transfer rates of the drives included in the set, minus data and controller overhead. Utilizing SCSI or Fibre Channel architectures allows multiple commands to occur simultaneously on different disks, which increases data throughput. This is not so with EIDE hard drives, because they can only handle a single command or I/O request at a time.

Spreading data among several disks also increases data throughput, because data can be written to one disk of the stripe set while the platters of another disk in the stripe set rotate into position (disk latency) to write the next section. The same occurs when reading data from the hard disks in a stripe set. The only limit to the number of disks that can be striped together is the maximum bandwidth of the SCSI or Fibre Channel bus. However, since a RAID 0 stripe set has no redundancy, if a drive fails, all data in that stripe set is lost. Consequently, in most cases, RAID 0 stripe sets are limited to 3 to 5 hard drives in size.

A RAID 0 array can be of any size up to the maximum supported by the operating system, but arrays are typically composed of partitions from 2 or 3 hard drives rather than the entire drives striped together. The maximum size of a RAID 0 array is the smallest partition included in the stripe set multiplied by the number of partitions. In other words, if a 20 MB partition is included with two 100 MB partitions, the maximum stripe set size is 60 MB (20 + 20 + 20).
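The capacity arithmetic above can be sketched in a few lines of Python (a hypothetical helper for illustration, not part of any RAID tool):

```python
def raid0_capacity(partition_sizes_mb):
    """Usable size of a RAID 0 stripe set: the smallest member
    partition times the number of members, because striping cannot
    use space beyond the smallest partition on every member."""
    return min(partition_sizes_mb) * len(partition_sizes_mb)

# The example from the text: a 20 MB partition striped with two
# 100 MB partitions yields a 60 MB stripe set (20 + 20 + 20).
print(raid0_capacity([20, 100, 100]))  # 60
```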

RAID 0 arrays are used where data transfer rate is the primary factor, safety is not a factor, and most data is sequential. Therefore, RAID 0 is most often used for video applications, where sustained scalable data transfer rates of 18 MB per second for uncompressed D1 video are common. Video editors are used to redigitizing video from a video tape deck should a RAID 0 drive fail. Now, imagine if that same array contained time consuming finished video compositing from programs such as Adobe After Effects. If that data were to be lost and wasn't backed up, the loss of time in recreating the effect could be substantial. This is where a production house has to weigh maximum data transfer rate against a slightly slower transfer rate with a fault tolerant array. However, if the RAID 0 array was based on 20 MB/second Fast Wide SCSI-2, just upgrading to Ultra2 LVD SCSI at 80 MB/second in RAID 3 or Fibre Channel at 100 MB/second in RAID 5 would more than make up for any speed losses related to fault tolerant redundant RAID storage. Of course, another option would be to utilize multiple RAID arrays: RAID 0 for video and RAID 3 or 5 for rendered video and animations.

Striping Primer
How any hard drive array is physically striped has a major effect on the performance of the array, but its major impact is on AV drive arrays used for video streaming and nonlinear editing. Only AV drive arrays are being addressed in this section. Most modern hard drives, notably those from Seagate, use Zone Bit Recording, in which more data is packed on the outer cylinder tracks of each platter than on the inner cylinder tracks. By a simple law of physics, since the platter spins at a constant angular rate, the linear velocity of the outer tracks is the highest and decreases toward the center of the platter. Combined with the greater data density of the outer zones, this means the inner tracks of each platter have about a 35% slower internal transfer rate than the outer tracks. Until hard drives are developed that equalize the transfer rate across all the tracks, we have to work around this fact, which may or may not impact the application. Striping a sufficient number of hard drives together essentially negates the effect of the transfer rate dropping as the heads move toward the spindle.
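A rough model of this falloff, assuming the roughly linear 35% inner-track penalty quoted above (the numbers are illustrative, not measured from any real drive):

```python
def sustained_rate(outer_mb_s, fill_fraction, inner_penalty=0.35):
    """Approximate a ZBR drive's transfer rate as falling linearly
    from the outer tracks (fill_fraction 0.0) to about 35% lower on
    the innermost tracks (fill_fraction 1.0)."""
    return outer_mb_s * (1.0 - inner_penalty * fill_fraction)

def stripe_rate(outer_mb_s, n_drives, fill_fraction):
    """A stripe set's rate is roughly the sum of its members' rates,
    so striping more drives keeps even the worst-case inner-track
    rate above a target figure."""
    return n_drives * sustained_rate(outer_mb_s, fill_fraction)

# Two hypothetical 10 MB/s drives: 20 MB/s when empty, but only
# about 13 MB/s once the heads reach the innermost tracks.
print(stripe_rate(10, 2, 0.0), stripe_rate(10, 2, 1.0))
```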

A common way of striping hard drives is to split two drives equally into two partitions each and then stripe the outer partitions together and the inner partitions together. Of course, as discussed above, the throughput of the inner stripe set will be less than the throughput of the outer stripe set. Whether the difference in throughput is important or not is dependent upon the maximum throughput required. If 15 MB/second is required and the inner stripe set of two drives can only support a sustained 13 MB/second, one option is to restripe the array using 3 or 4 hard drives. Another option is to stripe the array in other ways. We like to call these other nonproprietary methods Striped Ape Technology because they have the power to equalize the throughput across the drives, while having the potential to make you act like a primate trying to set it up. But once set up, which really isn't difficult, the hard drive array will run like a striped ape.

Some companies are providing proprietary solutions to drive striping that lock you into their proprietary hardware, which also limits your alternatives. There is no reason to do this. Solumedia offers nonproprietary solutions to the aforementioned throughput problem that occurs as a hard drive gradually fills toward the zone of inner tracks. Rather than stripe the outer partitions of a pair of drives together and the inner partitions together, instead stripe the outer partition of one drive with the inner partition of the other and repeat this with the other drive pairs. What this does is lower the maximum throughput of the outer partition and increase the virtual maximum throughput of the inner partition and its tracks, to give a more sustained average throughput across the entire hard drive array stripe sets. This is what we term Striped Ape Technology (SAT).
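The effect of this cross-pairing can be sketched numerically (hypothetical per-partition rates, assuming a stripe set's throughput is roughly the sum of its members):

```python
def conventional_stripes(outer_mb_s, inner_mb_s):
    """Stripe the outer partitions of a drive pair together and the
    inner partitions together: one fast stripe set and one slow one."""
    return (outer_mb_s + outer_mb_s, inner_mb_s + inner_mb_s)

def striped_ape_stripes(outer_mb_s, inner_mb_s):
    """Pair each drive's outer partition with the other drive's
    inner partition: both stripe sets get the same blended rate."""
    return (outer_mb_s + inner_mb_s, inner_mb_s + outer_mb_s)

# A hypothetical drive doing 10 MB/s on its outer partition and
# 6.5 MB/s on its inner partition (the ~35% ZBR penalty).
print(conventional_stripes(10.0, 6.5))  # (20.0, 13.0)
print(striped_ape_stripes(10.0, 6.5))   # (16.5, 16.5)
```

The cross-paired layout trades the outer stripe set's peak rate for a throughput that stays flat as the array fills, which is exactly what a sustained video stream wants.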

This technique can be expanded, by striping 3 drives together in 3 partitions each, to further increase the sustained throughput of the outer partitions in the above two drive array and give nearly the same maximum or average throughput across the entire array. In this case, the middle partitions are striped together, and the outer and inner partitions are striped together as in the above example. However, if striping the outer partitions of 3 drives together, the middle partitions together, and the inner partitions together provides sufficient throughput on each stripe set for the video application, including dual stream real time effects, then there is no reason to get creative with drive striping using Striped Ape Technology. A prime example here is the Pinnacle TARGA 2000 RTX dual stream video capture board. In single stream, the board can support up to about 450 KB/frame, but due to its chipset architecture it can only support a maximum of 220 KB/frame in dual stream mode, which translates to 13.2 MB/second (two 30 frame/second streams at 220 KB per frame). Two striped AV optimized Seagate Cheetah hard drives have no problem supporting that rate even on the inner cylinders.

Another technique used to increase striped drive throughput is to stripe across host bus adapters. This can be as simple as having the 2 hard drives connected to controller A striped with the 2 hard drives connected to controller B or as complex as striping whole RAID arrays together across two RAID controllers. In either case, due to the increase in performance, there would be no reason to be creative in which partitions are striped together.

RAID 1
RAID level 1 is data redundancy only, obtained by disk mirroring, or duplexing when each mirror drive has its own controller. In disk mirroring, data is written in duplicate to a second set of disks that mirror the primary set, so reliability is high. Unfortunately, it is also the most expensive RAID level to implement because it requires two of everything. RAID 1 cannot be implemented in the software RAID built into Microsoft Windows NT Workstation, but it is available in Windows NT Server. RAID 1 can, however, be implemented in either operating system by utilizing a hardware RAID controller, either installed in the computer in a PCI slot or via an external RAID controller.

Hard drives can be striped into a RAID 0 stripe set, and by utilizing disk mirroring with identical hard drives, a RAID 0+1 array can be created that offers fault tolerance and the performance of a RAID 0 stripe set. But once again, the costs of such configurations are double the cost of a single array. In the case of Microsoft Windows NT Workstation, the cost of a hardware RAID controller must be added. However, if high performance maximum data transfer rates are required, along with a safety net in case of a drive failure, RAID 0+1 is appropriate. Controller failures bring up another point of failure, and in such cases redundant controllers are often used. RAID 0+1 is an excellent choice for nonlinear video editing if cost is not an object and redigitizing from the original media is not a viable option.

RAID 2
RAID levels 2 and 3 fall under a broad classification of parallel access arrays. RAID 2, however, utilizes an error correction algorithm often used in memory chips, known as Hamming code. Unfortunately, Hamming code, when used in a RAID array, limits the size of the array, and because of this it is rare to find a RAID adapter that supports it.

RAID 3
RAID level 3 is the fault tolerant RAID class most supported for use with nonlinear editing systems, because in general it offers the next highest level of data throughput when both reading from and writing to the hard disk array. RAID level 3 uses byte striping of data and is optimized for high data transfer rates, unlike RAID 4 and 5, which are optimized for transaction processing and small file transfers. Both RAID 4 and 5 are designed more for reading from the disks, which is the primary activity in databases. Digital video utilizes large sequential data files, for the most part, and requires high sustained data transfer rates as well as a close balance between reading and writing. It does no good to digitize video at 240 KB per frame if all that can be read back is a rate of 200 KB per frame. RAID 3 under SCSI is the best RAID level when high transfer rates and fault tolerance are required and cost is a consideration. Companies such as Avid Technology utilize RAID 3 in some of their nonlinear editing storage options.

RAID 3 adds parity data to RAID technology. Parity is a type of checksum based on the Boolean exclusive OR function that is written to one or more disks in the array as error correction information. Parity data allows damaged information to be regenerated from the remaining disks of the array. As such, it allows for the rebuilding of the information lost during a disk drive failure, but it also requires a substantial part of one or more hard drives for itself. RAID 3, 4, and 5 utilize parity.

RAID 3 and 4, however, store the parity information on a single dedicated disk, which can create a write bottleneck, while RAID 5 spreads or stripes it equally among the disks in the RAID array and has no such bottleneck. The bottleneck exists in RAID 3 and 4 because each time a write is made, an additional write must be done to the parity disk. RAID 5 is a more cost effective solution because less available storage on each drive is lost to parity information, especially when 5 or more drives create the array. There is also less of a write penalty in RAID 5. If the parity disk in RAID 3 fails, the stored data is still available but it is no longer protected from another disk failure until the parity disk is replaced and the array is rebuilt. During the rebuilding time, the array is usually available but its operation may be slowed.

RAID 3, through its parallel access, splits each disk block equally among all the disks used to create the RAID 3 virtual disk. On the other hand, RAID 4 and 5 arrays map each block in the virtual disks they create to their individual disks in an independent fashion and do not require accessing each disk for every read and write.
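The exclusive OR property described above can be demonstrated in a few lines of Python (a toy sketch, with byte strings standing in for disk blocks):

```python
def xor_parity(blocks):
    """Compute a parity block as the byte-wise XOR of equal-sized
    data blocks, as parity RAID does for each stripe."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def rebuild_lost_block(surviving_blocks, parity):
    """Regenerate a lost block: XOR the parity with every surviving
    block. Whatever XORs out is the missing disk's data."""
    return xor_parity(list(surviving_blocks) + [parity])

# Three "disk blocks" and their parity; lose d1, then rebuild it.
d0, d1, d2 = b"video!", b"frames", b"stream"
p = xor_parity([d0, d1, d2])
print(rebuild_lost_block([d0, d2], p))  # b'frames'
```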

RAID 3 is more efficient if the spindles of the members of the array are synchronized to eliminate latency. Latency is the time it takes the hard drive platters to make one rotation to position the heads at the proper sector to read or write. By synchronizing spindles, latency is essentially eliminated because the heads on each drive are always in position to simultaneously read or write at the correct sector without additional platter rotation.
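The latency being eliminated is easy to quantify (simple arithmetic, not tied to any particular drive):

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one platter revolution,
    since on average the target sector is half a turn away."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

# A 10,000 RPM drive turns once every 6 ms, so an unsynchronized
# member waits 3 ms on average before the sector comes around.
print(avg_rotational_latency_ms(10_000))  # 3.0
```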

As stated in the RAID 2 section, RAID 3 is a parallel access array, which means all disks in the array must be accessed for every read and write, and consequently only one I/O request can be handled at a time. For video editing on shared storage, this can prove to be a configuration challenge. Also as mentioned previously, RAID 3 utilizes byte striping on the disks. Fibre Channel is a serial architecture and is less efficient when using parallel access arrays and byte striping, but its high bandwidth can overshadow the parallel access penalty it suffers with RAID 3. Software RAID 3 is not integrated into Microsoft Windows NT Server or Windows NT Workstation.

RAID 4
RAID 4 is considered an independent access array rather than a parallel access array, so it is a better choice than RAID 3 for Fibre Channel. RAID 4 requires the use of a dedicated parity disk like RAID 3, but unlike RAID 3 it does not require synchronized hard drive spindles. RAID 4 also favors disk reads over writes, to the extent that writes are even slower than they would be with a single non-RAID disk. Since RAID 4 disks operate independently, multiple I/O requests can be executed simultaneously, which greatly increases I/O request performance over a RAID 3 array. RAID 4 arrays are appropriate for transaction processing, with its high rate of I/O requests for small chunks of data, and not for video editing applications. However, if the hard drives in the array are fast enough, such as the newer 10,000 RPM SCSI or Fibre Channel hard drives, even the data transfer rate in RAID 4 can support sustained video streams and nonlinear editing. RAID 4 is not available as an integrated software RAID solution within Windows NT Server or Windows NT Workstation.

RAID 5
RAID 5 is probably the most common implementation of RAID on business servers, because of its fault tolerance and its cost effectiveness over RAID 1, 3, or 4. RAID 5 distributes the parity information across all the hard drives in the array, which increases the percentage of each hard drive that is available for user data. After a hard drive failure, the lost data is regenerated on the replacement drive, during the array rebuild process, from the parity and data on the remaining drives. Once again, the array has no fault protection until the failed drive is replaced and regenerated.
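The rotation of the parity block can be sketched as follows; the left-symmetric layout shown is one common arrangement, and real controllers differ in the exact rotation:

```python
def raid5_parity_disk(stripe, n_disks):
    """Which disk holds parity for a given stripe in a left-symmetric
    RAID 5 layout: the parity block shifts one disk per stripe, so
    parity writes are spread across all members instead of
    bottlenecking on one dedicated parity disk."""
    return (n_disks - 1 - stripe) % n_disks

# With 4 disks, parity rotates through disks 3, 2, 1, 0, 3, ...
print([raid5_parity_disk(s, 4) for s in range(5)])  # [3, 2, 1, 0, 3]
```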

Like RAID 4, RAID 5 array read transfer rate is comparable to disk striping, but is considerably less than even a single disk for writes. However, much of the write penalty can be made up with memory cache on the RAID controller. Such cache is generally not available with software RAID implementations.

RAID 6
RAID level 6 is even more fault tolerant than RAID 5, but is also slower than even RAID 5 in disk writes. RAID level 6 protects data from two simultaneous failures and is basically a RAID 5 array with a second set of parity information. Once again, the disk write penalty is due to having to update the parity information. Solumedia RAID controllers do not support RAID 6.

Cache
The write penalty that occurs with parity RAID can be minimized by using cache on the RAID controller, but caching introduces a new risk: a failure can occur before the cached write is completed to the hard drive. In addition, with the large sequential files generated in digital video, any cache is rapidly filled and has limited effect.

Write-behind cache (also called write-back cache) is quite fast, but it writes the actual data to the hard drives after the host is notified that the data has been written. This can create a problem if a drive fails before data is written to it and the parity information is updated. Likewise, if the RAID controller fails before the data is written to the hard drives, the host believes no data was lost. The key points here are to have backup uninterruptible power supplies and a battery backup for the cache.

Write-through cache, by contrast, notifies the host that the data has been written only after it has physically been written to the drives. It is therefore the safer caching algorithm.

The bus speed of the cache is critical in that it determines the maximum speed most controllers can support, regardless of the transfer protocol. Cache used to store only command queues has a minimal effect on the controller's maximum throughput, but cache used to store data and commands can become a significant bottleneck. A 66 MHz data memory cache bus can only support up to 66 MB/second, while a 100 MHz cache bus can support up to 100 MB/second. An Ultra2 SCSI controller with a 66 MHz cache can therefore only transfer data at up to 66 MB/second, substantially less than the protocol's 80 MB/second bandwidth. Many RAID controller manufacturers are currently upgrading to 100 MHz cache bus speeds to eliminate this bottleneck.
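The bottleneck arithmetic can be written out directly; the one-byte-per-clock assumption below matches the article's figures but is an assumption of this sketch, not a property of any specific controller:

```python
def effective_throughput_mb_s(protocol_mb_s, cache_bus_mhz,
                              bytes_per_clock=1):
    """The controller is capped by the slower of the host protocol
    and the data cache bus (modeled here as clock rate times bytes
    moved per clock)."""
    cache_mb_s = cache_bus_mhz * bytes_per_clock
    return min(protocol_mb_s, cache_mb_s)

# Ultra2 SCSI (80 MB/s) behind a 66 MHz data cache is capped at 66;
# a 100 MHz cache lets the protocol run at its full 80 MB/s.
print(effective_throughput_mb_s(80, 66))   # 66
print(effective_throughput_mb_s(80, 100))  # 80
```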

Choosing a RAID Controller

Choosing a RAID controller is not as easy as one might think, especially with the newer host interfaces like Ultra2 SCSI and Fibre Channel. Many RAID controllers have been upgraded from SCSI-2 and Fast Wide SCSI-2 versions and do not support the maximum bandwidth of Ultra2 SCSI or Fibre Channel. These controllers tend to be advertised as having 80 MB/second throughput for Ultra2 SCSI or 100 MB/second for Fibre Channel, but the fine print states that these values are from the controller to the hard drives or from the controller to the host computer. And often these controllers use Fast Wide SCSI-2 or Ultra SCSI interfaces to the hard drives, with Ultra2 SCSI or Fibre Channel to the host computer. Fast Wide SCSI-2 or Ultra SCSI suffer from the 1.5 or 3 meter maximum cable length per channel, while Ultra2 SCSI supports a cable length of 12 meters per channel and Fibre Channel supports 25-30 meters or more depending on the type of cable. The Fibre Channel bus can extend to 10 kilometers between nodes by utilizing certain types of fiber optic cable.

Solumedia has carefully matched its RAID controllers to maximize the capabilities of the interface used, which means Ultra2 SCSI to the host and hard drives or Fibre Channel to the host and hard drives. However, using Ultra2 SCSI to the hard drives and Fibre Channel to the host computer is acceptable, because Ultra2 SCSI speed on two striped channels can meet or exceed Fibre Channel speed and the combination leverages currently owned SCSI drives.

Some RAID controllers support only RAID 3 or 5, while others support RAID 0, 0+1, 3 or 4, and 5. Proper selection depends on the application to be supported and on whether the RAID controller is optimized to offset any differences between RAID levels, such as between RAID 3, 4, and 5.

Some RAID controllers are available in a hot swap configuration to keep the system up at all times, while others require taking the entire system offline while a replacement is made. Once again, proper choice depends on the application.

Some Fibre Channel RAID controllers and host adapters do not support switched fabric technology. Consequently, choosing an appropriate adapter or controller is critical, especially when using switched fabric for large scale installations of up to 16 million nodes.

Fibre Channel and Ultra2 SCSI offer the added benefit of being able to move hard drive arrays far enough from the work area to eliminate the noise these arrays cause. Additionally, both topologies can support shared storage solutions with the proper hardware and software. Shared storage often requires server clustering and cluster-aware software, as well as good array management. While it is possible to use shared storage, such as a SAN, to concurrently store media files from several nonlinear editing systems, those systems cannot access each other's data for collaborative editing unless their software is designed to support that feature through networking. It is possible, however, to import media from another workstation, provided the editor is given access permission to that data by the editor of the system that owns it.

Summary
Every level of RAID has its drawbacks, and an appropriate RAID level must be selected based on the applications to be used. Layering or striping multiple arrays together can increase performance substantially over a single RAID array. It is possible to combine disks of different capacities into a single RAID array, but with mirroring the two sets must be of identical capacity. Choose RAID controllers and hard disks carefully.

As the areal density of hard drive platters continues to increase, so too do hard drive storage capacities. Some time in the not too distant future, the areal density of hard drive platters will hit a physical wall and other types of physical media technology will be required. However, as drive manufacturers continue to increase storage capacity, they will phase out many of the lower capacity hard drives. While this is not an issue with single and dual hard drive systems, it is a problem with RAID systems. For example, the maximum capacity a customer may need is 72 GB, but 36 GB hard drives may be the smallest available. If the optimal RAID 5 configuration utilizes five hard drives, the majority of the storage capacity available on each hard drive will never be used. Consequently, storage costs per megabyte will actually increase over using 9 GB or 18 GB hard drives.
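The capacity mismatch in that example can be made concrete with a rough model that ignores formatting overhead:

```python
def usable_gb(raid_level, n_drives, drive_gb):
    """Rough usable capacity: RAID 0 uses every drive, RAID 1 halves
    the total, RAID 5 loses one drive's worth to parity, and RAID 6
    loses two."""
    drives_lost = {0: 0, 1: n_drives // 2, 5: 1, 6: 2}[raid_level]
    return (n_drives - drives_lost) * drive_gb

# Five 36 GB drives in RAID 5 give 144 GB usable; a customer who
# needs only 72 GB leaves half of that capacity idle.
print(usable_gb(5, 5, 36))  # 144
```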

On the other hand, companies that currently have large data centers to house their storage systems will be able to cut their costs by using fewer hard drives and fewer tower or rackmount enclosures.

