Ciosjune 20 TH RAID
What is RAID?
RAID (redundant array of independent disks) is a way of storing the same data in
different places on multiple hard disks or solid-state drives (SSDs) to protect data
in the case of a drive failure. There are different RAID levels, however, and not
all have the goal of providing redundancy.
How RAID works
RAID works by placing data on multiple disks and allowing input/output (I/O)
operations to overlap in a balanced way, improving performance. Because using
multiple disks increases the mean time between failures, storing data redundantly
also increases fault tolerance.
RAID arrays appear to the operating system (OS) as a single logical drive.
RAID employs the techniques of disk mirroring or disk striping. Mirroring copies
identical data onto more than one drive. Striping partitions the data and spreads it
over multiple disk drives. Each drive's storage space is divided into units ranging
from a sector of 512 bytes up to several megabytes. The stripes of all the disks
are interleaved and addressed in order. Disk mirroring and disk striping can also
be combined in a RAID array.
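As a rough illustration of how interleaved stripes are addressed in order, the sketch below maps a logical block number to a drive and a position on that drive. The function name locate_block and its parameters are hypothetical, and the model assumes plain striping with no parity or mirroring.

```python
def locate_block(logical_block: int, num_disks: int, blocks_per_stripe_unit: int = 1):
    """Map a logical block to (disk index, block offset on that disk) for plain striping."""
    if logical_block < 0 or num_disks < 1 or blocks_per_stripe_unit < 1:
        raise ValueError("arguments must be non-negative / positive")
    # Which stripe unit the block falls in, counting across the whole array.
    stripe_unit = logical_block // blocks_per_stripe_unit
    # Position of the block inside that stripe unit.
    within_unit = logical_block % blocks_per_stripe_unit
    # Stripe units are dealt out to the disks round-robin.
    disk = stripe_unit % num_disks
    row = stripe_unit // num_disks
    return disk, row * blocks_per_stripe_unit + within_unit


# Consecutive logical blocks land on consecutive disks.
for block in range(6):
    print(block, locate_block(block, num_disks=3))
```

With three disks and one block per stripe unit, blocks 0, 1 and 2 go to disks 0, 1 and 2, and block 3 wraps back to disk 0.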
[Figure: a hard drive in a RAID array]
In a single-user system where large records are stored, the stripes are typically set
up to be small (512 bytes, for example) so that a single record spans all the disks
and can be accessed quickly by reading all the disks at the same time.
RAID controller
A RAID controller is a device used to manage hard disk drives in a storage array.
It can be used as a level of abstraction between the OS and the physical disks,
presenting groups of disks as logical units. Using a RAID controller can improve
performance and help protect data in case of a crash.
Firmware-based RAID controller chips are located on the motherboard, and all
operations are performed by the central processing unit (CPU), similar to
software-based RAID. However, with firmware, the RAID system is only
implemented at the beginning of the boot process. Once the OS has loaded, the
controller driver takes over RAID functionality. A firmware RAID controller is
not as pricey as a hardware option, but it puts more strain on the computer's CPU.
Firmware-based RAID is also called hardware-assisted software RAID, hybrid
model RAID and fake RAID.
RAID levels
RAID devices use different versions, called levels. The original paper that coined
the term and developed the RAID setup concept defined five levels of RAID, 1
through 5. RAID 0, which stripes data without redundancy, was added to the
numbering later. This numbered system enabled those in IT to differentiate RAID
versions. The number of levels has since expanded and has been broken into
three categories: standard, nested and nonstandard RAID levels.
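RAID 0. This configuration uses striping but no mirroring or parity, so it offers
the best performance but no fault tolerance. If one drive fails, the data in the
array is lost.
[Figure: visualization of RAID 0]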
RAID 1. Also known as disk mirroring, this configuration consists of at least two
drives that duplicate the storage of data. There is no striping. Read performance
is improved, since either disk can be read at the same time. Write performance is
the same as for single disk storage.
[Figure: visualization of RAID 1]
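A minimal sketch of mirroring, assuming each drive can be modeled as an in-memory dictionary of blocks. The class name MirrorSet and its methods are hypothetical and only show the idea that every write goes to all drives while a read can be served by any surviving drive.

```python
class MirrorSet:
    """Toy RAID 1 model: every drive holds a full copy of the data."""

    def __init__(self, num_drives: int = 2):
        if num_drives < 2:
            raise ValueError("mirroring needs at least two drives")
        self.drives = [dict() for _ in range(num_drives)]
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # A write is duplicated onto every drive that is still working.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def fail(self, index: int) -> None:
        # Simulate a drive failure: its contents are no longer available.
        self.failed.add(index)
        self.drives[index].clear()

    def read(self, block: int) -> bytes:
        # Any surviving drive can answer the read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise IOError("all mirrors have failed")


mirror = MirrorSet()
mirror.write(0, b"payroll")
mirror.fail(0)
print(mirror.read(0))  # still readable from the second drive
```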
RAID 2. This configuration uses striping across disks, with some disks storing
error checking and correcting (ECC) information. RAID 2 also uses a
dedicated Hamming code parity, a linear form of ECC. RAID 2 has no advantage
over RAID 3 and is no longer used.
[Figure: visualization of RAID 2]
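To give a feel for Hamming-code ECC, here is a small sketch of the classic Hamming(7,4) code, which protects four data bits with three parity bits and can correct any single flipped bit. It is an illustration of the coding idea only, not a description of how any particular RAID 2 product laid out its disks. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[5] ^= 1  # simulate a single-bit error
print(hamming74_correct(word))  # [1, 0, 1, 1]
```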
RAID 3. This technique uses striping and dedicates one drive to
storing parity information. The embedded ECC information is used to detect
errors. Data recovery is accomplished by calculating the exclusive OR (XOR) of
the information recorded on the other drives. Because an I/O operation addresses
all the drives at the same time, RAID 3 cannot overlap I/O. For this reason,
RAID 3 is best for single-user systems with long record applications.
[Figure: visualization of RAID 3]
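Parity-based recovery rests on the exclusive OR (XOR) operation: the parity block is the XOR of the data blocks, and any one missing block equals the XOR of everything that remains. The sketch below shows this with short byte strings standing in for drive contents. The helper name xor_blocks is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of bytes per position; XOR each tuple together.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"RAID", b"disk", b"data"]   # contents of three data drives
parity = xor_blocks(data)            # contents of the parity drive

# Pretend the second data drive failed and rebuild it from the survivors.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered)  # b'disk'
```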
RAID 4. This level uses large stripes, which means a user can read records from
any single drive. Overlapped I/O can then be used for read operations. Because
all write operations are required to update the parity drive, no I/O overlapping is
possible for writes.
[Figure: visualization of RAID 4]
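One way to see why every write touches the parity drive is the usual small-write parity update: the new parity is the old parity XOR the old data XOR the new data. The sketch below assumes XOR parity over equal-length blocks. The function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after one data block changes, without reading the other data drives."""
    # Removing the old data from the parity and adding the new data are both XORs.
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


drive0, drive1 = b"\x01\x02", b"\x10\x20"
parity = xor_bytes(drive0, drive1)

new_drive1 = b"\xff\x00"
parity = updated_parity(parity, drive1, new_drive1)

# The incrementally updated parity matches parity recomputed from scratch.
print(parity == xor_bytes(drive0, new_drive1))  # True
```

Each write therefore needs a read and a write on the parity drive as well as on the data drive, which is why a single dedicated parity drive serializes writes.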
RAID 5. This level is based on block-level striping with distributed parity. The
parity information is striped across all the drives, enabling the array to function
even if one drive were to fail. The array's architecture enables read and write
operations to span multiple drives. This results in performance better than that of
a single drive, but not as high as a RAID 0 array. RAID 5 requires at least three
disks, but it is often recommended to use at least five disks for performance
reasons.
RAID 5 arrays are generally considered to be a poor choice for use on write-
intensive systems because of the performance impact associated with writing
parity data. When a disk fails, it can take a long time to rebuild a RAID 5 array.
[Figure: visualization of RAID 5]
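The sketch below prints one possible RAID 5 layout to show how parity moves from drive to drive instead of living on a single dedicated drive. The rotation order used here is an assumption chosen for simplicity, since real implementations use several different rotation schemes. The function name raid5_layout is hypothetical.

```python
def raid5_layout(num_disks: int, num_stripes: int):
    """Return rows of block labels, one row per stripe; 'P' marks the parity block."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    rows = []
    next_data_block = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and moves left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{next_data_block}")
                next_data_block += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_disks=3, num_stripes=3):
    print(row)
# ['D0', 'D1', 'P']
# ['D2', 'P', 'D3']
# ['P', 'D4', 'D5']
```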
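RAID 6. This level is similar to RAID 5, but it adds a second set of parity
information distributed across the drives. The array can continue to function even
if two drives fail at the same time. Write performance is slower than with RAID 5
because of the additional parity calculations.
[Figure: visualization of RAID 6]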
Nested RAID levels
Some RAID levels that are based on a combination of RAID levels are referred
to as nested RAID. Here are some examples of nested RAID levels.
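RAID 10 (RAID 1+0). This level combines RAID 1 and RAID 0. Data is mirrored
first, and the mirrors are then striped. It offers redundancy through mirroring and
improved performance through striping.
[Figure: visualization of RAID 10]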
RAID 01 (RAID 0+1). RAID 0+1 is similar to RAID 1+0, except the data
organization method is slightly different. Rather than creating a mirror and then
striping it, RAID 0+1 creates a stripe set and then mirrors the stripe set.
RAID 03 (RAID 0+3, also known as RAID 53 or RAID 5+3). This level uses
striping in RAID 0 style for RAID 3's virtual disk blocks. This offers higher
performance than RAID 3, but at a higher cost.
RAID 50 (RAID 5+0). This configuration combines RAID 5 distributed parity
with RAID 0 striping to improve RAID 5 performance without reducing data
protection.
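The difference between striping mirrors (RAID 1+0) and mirroring stripes (RAID 0+1) shows up when two drives fail. The sketch below counts which two-drive failures a four-drive array survives under a deliberately simple model: in RAID 1+0 the array survives as long as each mirrored pair keeps one drive, and in RAID 0+1 a whole stripe set is treated as lost when any of its members fails. The function names and the drive numbering are hypothetical.

```python
from itertools import combinations


def survives_raid10(failed: set, num_pairs: int) -> bool:
    """Drives 2k and 2k+1 form mirrored pair k; every pair must keep one drive."""
    return all(not {2 * k, 2 * k + 1} <= failed for k in range(num_pairs))


def survives_raid01(failed: set, drives_per_stripe: int) -> bool:
    """Drives 0..n-1 form stripe set A and n..2n-1 its mirror B; one set must be intact."""
    set_a = set(range(drives_per_stripe))
    set_b = set(range(drives_per_stripe, 2 * drives_per_stripe))
    return not (failed & set_a) or not (failed & set_b)


def count_survivable(survives, total_drives: int, num_failures: int, layout_arg: int) -> int:
    """Count the failure combinations of the given size that the array survives."""
    return sum(
        survives(set(combo), layout_arg)
        for combo in combinations(range(total_drives), num_failures)
    )


print(count_survivable(survives_raid10, 4, 2, 2))  # 4 of the 6 two-drive failures
print(count_survivable(survives_raid01, 4, 2, 2))  # 2 of the 6 two-drive failures
```

Under this model both layouts survive any single failure, but RAID 1+0 survives more of the possible double failures.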
Nonstandard RAID levels
RAID 7. A nonstandard RAID level based on RAID 3 and RAID 4 that adds
caching. It includes a real-time embedded OS as a controller, caching via a high-
speed bus and other characteristics of a standalone computer.
Adaptive RAID. This level enables the RAID controller to decide how to store
the parity on disks. It will choose between RAID 3 and RAID 5. The choice
depends on what RAID set type will perform better with the type of data being
written to the disks.
Software RAID
Software RAID uses some of the system's computing power to manage the RAID
configuration instead of a dedicated controller. As an example, Windows supports
software RAID 0, 1 and 5, while Apple's macOS supports RAID 0, 1 and 1+0.
Benefits of RAID
Advantages of RAID include the following:
With RAID 0, reads and writes can be performed faster than with a single
drive. This is because the data is split up and distributed across drives that
work together on the same file.
RAID 5 increases availability and resiliency through distributed parity. With
mirroring (RAID 1), two drives contain the same data, ensuring one will
continue to work if the other fails.
Downsides of using RAID
RAID does have its limitations, however. Some of these include:
Nested RAID levels are more expensive to implement than traditional RAID
levels, because they require more disks.
The cost per gigabyte for storage devices is higher for nested RAID because
many of the drives are used for redundancy.
When a drive fails, the probability that another drive in the array will also
soon fail rises, which would likely result in data loss. This is because all the
drives in a RAID array are installed at the same time, so all the drives are
subject to the same amount of wear.
Some RAID levels -- such as RAID 1 and 5 -- can only sustain a single drive
failure.
RAID arrays, and the data in them, are vulnerable until a failed drive is
replaced and the new disk is populated with data.
Because drives have much greater capacity now than when RAID was first
implemented, it takes a lot longer to rebuild failed drives.
If a disk failure occurs, there is a chance the remaining disks may contain bad
sectors or unreadable data, which may make it impossible to fully rebuild the
array.
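Two of the points above, long rebuild times and the risk of hitting unreadable data during a rebuild, can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model only. The sustained rebuild speed and the unrecoverable-read-error rate are illustrative assumptions, not figures from any vendor, and errors are treated as independent. The function names are hypothetical.

```python
import math


def rebuild_hours(drive_capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite one whole drive at a constant rebuild speed."""
    total_bytes = drive_capacity_tb * 1e12
    seconds = total_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


def probability_of_read_error(data_read_tb: float, errors_per_bit: float = 1e-15) -> float:
    """Chance of at least one unreadable bit while reading the given amount of data."""
    bits_read = data_read_tb * 1e12 * 8
    # 1 - (1 - p) ** bits, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-errors_per_bit))


# Assumed example: rebuilding a 10 TB drive at 100 MB/s.
print(f"{rebuild_hours(10, 100):.1f} hours")

# Assumed example: reading 12 TB from the surviving drives during the rebuild.
print(f"{probability_of_read_error(12, errors_per_bit=1e-14):.0%} chance of a read error")
```

Both numbers grow with drive capacity, which is the reason larger drives make single-parity arrays riskier to rebuild.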
When to use RAID
RAID is useful in the following situations:
When a large amount of data needs to be restored. If a drive fails and data
is lost, that data can be restored quickly, because it is also stored on
other drives.
When uptime and availability are important business factors. If data needs
to be restored, it can be done quickly without downtime.
When working with large files. RAID provides speed and reliability when
working with large files.
When cost is a factor. The cost of a RAID array is lower than it was in the
past, and lower-priced disks are used in large numbers, making it cheaper.
History of RAID
The term RAID was coined in 1987 by David Patterson, Randy Katz and Garth
A. Gibson. In their 1988 technical report, "A Case for Redundant Arrays of
Inexpensive Disks (RAID)," the three argued that an array of inexpensive drives
could beat the performance of the top expensive disk drives of the time. By using
redundancy, a RAID array could be more reliable than any one disk drive.
While this report was the first to put a name to the concept, the use of redundant
disks was already being discussed by others. Geac Computer Corp.'s Gus
German and Ted Grunau first referred to this idea as MF-100. IBM's Norman
Ken Ouchi filed a patent in 1977 for the technology, which was later named
RAID 4. In 1983, Digital Equipment Corp. shipped the drives that would become
RAID 1, and in 1986, another IBM patent was filed for what would become
RAID 5. Patterson, Katz and Gibson also looked at what was being done by
companies such as Tandem Computers, Thinking Machines and Maxstor to
define their RAID taxonomies.
While the levels of RAID listed in the 1988 report essentially put names to
technologies that were already in use, creating common terminology for the
concept helped stimulate the data storage market to develop more RAID array
products.
The future of RAID
The rise of SSDs is seen as alleviating the need for RAID. SSDs have no
moving parts and do not fail as often as hard disk drives. SSD arrays often use
techniques such as wear leveling instead of relying on RAID for data protection.
Modern SSDs are fast enough that modern servers may not need the slight
performance boost that RAID offers. However, RAID may still be used with SSDs
to prevent data loss.
Hyperscale computing also removes the need for RAID by using redundant
servers instead of redundant drives.
Still, RAID remains an ingrained part of data storage and major technology
vendors continue to release RAID products. For example:
IBM offers IBM Distributed RAID, or DRAID, with its Spectrum Virtualize V8.3,
which promises to boost RAID performance.
The latest version of Intel Rapid Storage Technology supports RAID 0, RAID 1,
RAID 5 and RAID 10.
Questions:
Answers:
1. When RAID is used:
RAID (Redundant Array of Independent Disks) is typically used in scenarios where data storage and
reliability are critical. It is commonly employed in server environments, database systems, and
high-demand applications that require a balance between performance, data redundancy, and
fault tolerance. RAID offers increased data protection, improved performance, and enhanced
storage capacity.
2. Relation between RAID and servers:
RAID is closely associated with servers as it provides fault tolerance and data redundancy, which
are crucial for server systems. Servers often handle large amounts of data and serve multiple
clients simultaneously. By implementing RAID, servers can ensure that data remains accessible
even in the event of disk failures, minimizing downtime and maintaining data integrity. RAID also
enhances server performance by distributing data across multiple drives, allowing for parallel data
access and improved read/write speeds.
3. Combining different types of RAID:
Yes, it is possible to combine different types of RAID within a single system, commonly referred to
as nested or hybrid RAID configurations. For example, RAID 10 (RAID 1+0) combines elements of
RAID 1 (mirroring) and RAID 0 (striping) to achieve both data redundancy and improved
performance. Similarly, RAID 50 and RAID 60 are combinations of RAID 5 and RAID 0 or RAID 6 and
RAID 0, respectively. These combinations allow for a customized balance between performance,
fault tolerance, and storage efficiency, depending on the specific requirements of the system.
4. Configuring RAID:
The configuration process of RAID depends on the specific hardware or software implementation
being used. Here are the general steps involved in configuring a hardware-based RAID controller:
a. Install the RAID controller: Insert the RAID controller card into an available expansion slot on the
motherboard and connect the necessary power and data cables.
b. Connect the hard drives: Connect the hard drives to the RAID controller, ensuring they are
properly seated and connected via the appropriate cables (usually SATA or SAS).
c. Access the RAID controller settings: During the server startup process, access the RAID
controller's configuration utility by pressing a specific key combination (often displayed on the
screen) or through the server's BIOS/UEFI settings.
d. Create a RAID array: Within the RAID controller settings, create the desired RAID level (e.g.,
RAID 0, RAID 1, RAID 5, etc.), select the appropriate hard drives to include in the array, and define
any additional parameters, such as stripe size or cache settings.
e. Save and exit: Save the RAID configuration, exit the RAID controller settings, and allow the
server to continue the boot process.
It's worth noting that software-based RAID configurations, which are managed through the
server's operating system, follow a different setup process that typically involves using built-in
tools or third-party software.
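As one example of a software-based setup, Linux systems commonly use the mdadm tool, which is not covered in this document and is mentioned here only as an assumption. The sketch below just builds the argument list for an mdadm create command and does not run it. The device names are placeholders, the function name build_mdadm_create is hypothetical, and the minimum drive counts are the conventional ones for each level.

```python
def build_mdadm_create(array_device, level, member_devices, chunk_kib=None):
    """Build (but do not run) an mdadm command line that would create a software RAID array."""
    conventional_minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in conventional_minimum:
        raise ValueError(f"unsupported RAID level in this sketch: {level}")
    if len(member_devices) < conventional_minimum[level]:
        raise ValueError(
            f"RAID {level} needs at least {conventional_minimum[level]} drives"
        )
    argv = [
        "mdadm",
        "--create", array_device,
        f"--level={level}",
        f"--raid-devices={len(member_devices)}",
    ]
    if chunk_kib is not None:
        # Stripe (chunk) size in KiB, one of the optional parameters mentioned in step d.
        argv.append(f"--chunk={chunk_kib}")
    argv.extend(member_devices)
    return argv


# Placeholder device names; check them carefully before running anything like this.
print(" ".join(build_mdadm_create("/dev/md0", 5, ["/dev/sdb", "/dev/sdc", "/dev/sdd"])))
```

Creating an array destroys existing data on the member drives, so a command like this should only be run after verifying the device names.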
5. Advantages of the main RAID levels:
RAID 0: Offers enhanced performance by striping data across multiple drives, enabling parallel
data access and increased throughput. It provides high storage efficiency since there is no data
redundancy overhead. However, it has no data redundancy, so if one drive fails, data loss occurs.
RAID 1: Provides complete data redundancy by mirroring data across multiple drives. If one drive
fails, the mirrored drive ensures data integrity and availability. It has excellent read performance,
but write performance is no better than that of a single drive because every write must go to both
drives.
RAID 5: Offers a balance between performance, storage efficiency, and data redundancy. It
distributes data and parity information across multiple drives, allowing for improved read
performance and fault tolerance. RAID 5 can withstand the failure of a single drive without data
loss. However, its write performance is slower due to the need to calculate and update parity
information.
RAID 6: Similar to RAID 5, but with dual parity, RAID 6 provides increased fault tolerance. It can
withstand the failure of two drives simultaneously without data loss. The additional parity
information offers enhanced data redundancy. However, RAID 6 has slower write performance
than RAID 5 due to the additional parity calculations.
RAID 10 (RAID 1+0): Combines the benefits of RAID 1 and RAID 0. It offers excellent data
redundancy through mirroring and improved performance through striping. RAID 10 can tolerate
multiple drive failures as long as they do not occur in the same mirrored pair.
6. Disadvantages of the main RAID levels:
RAID 0: Lacks data redundancy, so the failure of a single drive leads to complete data loss. There is
no fault tolerance or recovery mechanism.
RAID 1: Requires double the storage capacity since data is mirrored on each drive. It has reduced
storage efficiency compared to other RAID levels.
RAID 5: The write performance is slower due to the need to calculate and update parity
information. Rebuilding a failed drive can also be resource-intensive and time-consuming.
RAID 6: Similar to RAID 5, the write performance is slower due to the dual parity calculations.
Rebuilding a failed drive can take longer than in RAID 5.
RAID 10 (RAID 1+0): Requires a higher number of drives, making it more expensive to implement
compared to other RAID levels. It offers good fault tolerance, but the loss of both drives in the
same mirrored pair results in data loss.
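The storage-efficiency trade-offs above can be put into numbers. The sketch below computes usable capacity for the levels discussed, assuming all drives are the same size and that RAID 1 mirrors one drive's worth of data across every member. The function name usable_capacity is hypothetical.

```python
def usable_capacity(level: int, num_drives: int, drive_size_tb: float) -> float:
    """Usable capacity in TB for an array of identical drives."""
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level in this sketch: {level}")
    if num_drives < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")
    if level == 0:
        data_drives = num_drives          # no redundancy overhead
    elif level == 1:
        data_drives = 1                   # every drive holds the same copy
    elif level == 5:
        data_drives = num_drives - 1      # one drive's worth of parity
    elif level == 6:
        data_drives = num_drives - 2      # two drives' worth of parity
    else:  # level 10
        if num_drives % 2:
            raise ValueError("RAID 10 needs an even number of drives")
        data_drives = num_drives // 2     # half the drives are mirror copies
    return data_drives * drive_size_tb


for level in (0, 1, 5, 6, 10):
    print(f"RAID {level}: {usable_capacity(level, 4, 2.0)} TB usable from four 2 TB drives")
```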
References
https://www.gartner.com/en/information-technology/glossary/raid-redundant-array-of-independent-disks
https://www.techtarget.com/searchstorage/definition/RAID
https://www.spiceworks.com/tech/data-management/articles/what-is-raid-storage/
(pending, review)
https://docs.oracle.com/cd/E19236-01/817-3337-18/appa_raid_basic.html
https://www.geeksforgeeks.org/raid-redundant-arrays-of-independent-disks/
https://www.slideteam.net/raid-storage-it-powerpoint-presentation-slides.html
(capture the important slides if they cannot be downloaded)