How to Configure RAID in Linux Step by Step Guide

This tutorial explains how to view, list, create, add, remove, delete, resize, format, mount and configure RAID Levels (0, 1 and 5) in Linux step by step with practical examples. Learn basic concepts of software RAID (Chunk, Mirroring, Striping and Parity) and essential RAID device management commands in detail.

RAID stands for Redundant Array of Independent Disks. There are two types of RAID: Hardware RAID and Software RAID.

Hardware RAID

Hardware RAID is a physical storage device built from multiple hard disks. When it is connected to a system, all of its disks appear as a single SCSI disk. From the system's point of view there is no difference between a regular SCSI disk and a hardware RAID device, so the system can use a hardware RAID device as a single SCSI disk.

Hardware RAID has its own independent disk subsystem and resources. It does not use system resources such as RAM and CPU, so it does not put any extra load on the system. Since it has its own dedicated resources, it provides high performance.

Software RAID

Software RAID is a logical storage device built from the disks attached to the system. It uses the system's own resources, such as CPU and RAM. It is generally slower than hardware RAID but costs nothing. In this tutorial we will learn how to create and manage software RAID in detail.

This tutorial is the last part of our article “Linux Disk Management Explained in Easy Language with Examples”. The other parts of this article are listed below.

Linux Disk Management Tutorial

This is the first part of this article. This part explains basic concepts of Linux disk management such as BIOS, UEFI, MBR, GPT, SWAP, LVM, RAID, primary partition, extended partition and Linux file system type.

Manage Linux Disk Partition with fdisk Command

This is the second part of this article. This part explains how to create primary, extended and logical partitions with the fdisk command in Linux step by step with examples.

Manage Linux Disk Partition with gdisk Command

This is the third part of this article. This part explains how to create GPT (GUID partition table) partitions with the gdisk command in Linux step by step with examples.

Linux Disk Management with parted command

This is the fourth part of this article. This part explains how to create primary, extended, logical and GPT partitions with the parted command in Linux step by step with examples.

How to create SWAP partition in Linux

This is the fifth part of this article. This part explains how to create swap partition in Linux with examples including basic swap management tasks such as how to increase, mount or clear swap memory.

Learn how to configure LVM in Linux step by step

This is the sixth part of this article. This part explains basic concepts of LVM in detail with examples including how to configure and manage LVM in Linux step by step.

Basic concepts of RAID

A RAID device can be configured in multiple ways. Depending on the configuration, it can be categorized into ten different levels. Before we discuss RAID levels in more detail, let’s have a quick look at some important terminology used in RAID configuration.

Chunk: - This is the size of the data block used in a RAID configuration. If the chunk size is 64KB, then 1MB of RAID array space holds 16 chunks (1024KB/64KB).

Hot Spare: - This is an additional disk in the RAID array. If any disk fails, the data from the faulty disk is automatically rebuilt on this spare disk.

Mirroring: - If this feature is enabled, a copy of the same data is also saved on another disk. It is like keeping an additional copy of the data for backup purposes.

Striping: - If this feature is enabled, data is split into chunks and written across all available disks in turn, not to one disk at a time. The data is shared between all disks, so all of them fill equally.

Parity: - This is a method of regenerating lost data from saved parity information.
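
The terms above can be made concrete with a little shell arithmetic. This is only an illustrative sketch, not how the md driver is implemented: it assumes a 64KB chunk, a three-disk stripe, and two made-up byte values (d0 and d1) standing in for data chunks. It counts chunks, shows the order in which chunks land on disks, and rebuilds a lost value from XOR parity.

```shell
#!/bin/sh
# Chunk: how many 64KB chunks fit in 1MB (1024KB)?
chunk_kb=64
total_kb=1024
chunks=$(( total_kb / chunk_kb ))
echo "chunks in 1MB: $chunks"

# Striping: chunk i goes to disk (i mod number_of_disks), in order.
disks=3
i=0
layout=""
while [ "$i" -lt 6 ]; do
    layout="$layout$(( i % disks ))"
    i=$(( i + 1 ))
done
echo "disk index for chunks 0-5: $layout"

# Parity: XOR of the data values. d0 and d1 are made-up sample bytes.
d0=165
d1=60
parity=$(( d0 ^ d1 ))

# If the disk holding d0 is lost, XOR of the survivors restores it.
rebuilt_d0=$(( parity ^ d1 ))
echo "parity=$parity rebuilt_d0=$rebuilt_d0"
```

XOR is used here because applying it twice with the same value gives back the original, which is what lets parity stand in for any one missing disk.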

Different RAID levels are defined by how mirroring, striping and parity are combined. Among these levels, Level 0, Level 1 and Level 5 are the most commonly used in Red Hat Linux.

RAID Level 0

This level provides striping without parity. Since it does not store any parity data and performs read and write operations on all disks simultaneously, it is much faster than the other levels. This level requires at least two hard disks. All hard disks in this level are filled equally. You should use this level only if read and write speed is the main concern. If you decide to use this level, always deploy an alternative data backup plan, as any single disk failure in the array will result in total data loss.

RAID Level 1

This level provides mirroring without striping or parity. It writes all data on two disks. If one disk fails or is removed, we still have all the data on the other disk. This level requires double the number of hard disks: to get the capacity of two hard disks you have to deploy four, and to get the capacity of one hard disk you have to deploy two. The first hard disk stores the original data while the other disk stores an exact copy of the first disk. Since data is written twice, write performance is reduced. You should use this level only if data safety matters at any cost.

RAID Level 5

This level provides both parity and striping. It requires at least three disks. It distributes parity data equally across all disks. If one disk fails, its data can be reconstructed from the data and parity available on the remaining disks. This provides a combination of integrity and performance. Wherever possible you should use this level.
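
The space cost of each level can be worked out with simple arithmetic. The sketch below is an approximation that assumes equal-sized disks and ignores the small amount of space RAID metadata takes; the function name usable_gb is made up for this example.

```shell
#!/bin/sh
# usable_gb LEVEL DISK_COUNT DISK_SIZE_GB
usable_gb() {
    level=$1; n=$2; size=$3
    case "$level" in
        0) echo $(( n * size )) ;;          # striping only: every disk holds data
        1) echo "$size" ;;                  # mirroring: every disk holds the same copy
        5) echo $(( (n - 1) * size )) ;;    # one disk's worth of space goes to parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

raid0=$(usable_gb 0 2 2)
raid1=$(usable_gb 1 2 2)
raid5=$(usable_gb 5 3 2)
echo "RAID 0 with 2 x 2GB disks: ${raid0}GB usable"
echo "RAID 1 with 2 x 2GB disks: ${raid1}GB usable"
echo "RAID 5 with 3 x 2GB disks: ${raid5}GB usable"
```

With the 2GB lab disks used in this tutorial, RAID 1 gives the capacity of one disk while RAID 5 gives the capacity of two out of three.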

If you want to use a hardware RAID device, use a hot-swappable one with spare disks. If any disk fails, its data will be reconstructed on the first available spare disk without any downtime, and since the device is hot swappable, you can replace the failed disk while the server is still running.

If a RAID device is properly configured, there is no difference between software RAID and hardware RAID from the operating system’s point of view. The operating system accesses a RAID device as a regular hard disk, no matter whether it is software RAID or hardware RAID.

Linux provides the md kernel module for software RAID configuration. To use software RAID we have to configure a RAID md device, which is a composite of two or more storage devices.

How to configure software RAID step by step

For this tutorial I assume that you have un-partitioned disk space or additional hard disks for practice. If you are following this tutorial in virtualization software such as VMware Workstation, add three additional hard disks to the system. To learn how to add an additional hard disk to a virtual system, please see the first part of this tutorial. If you are following this tutorial on a physical machine, attach an additional hard disk. You can also use a USB stick or pen drive for practice. For demonstration purposes I have attached three additional hard disks to my lab system.

Each disk is 2GB in size. We can list all attached hard disks with the fdisk -l command.

fdisk -l command

We can also use lsblk command to view a structured overview of all attached storage devices.

lsblk command

As we can see in the above output, there are three un-partitioned disks available, each 2G in size.

The mdadm package is used to create and manage software RAID. Make sure it is installed before we start working with software RAID. To learn how to install and manage packages in Linux, see the following tutorials.

How to configure YUM Repository in RHEL
RPM Command Explained with Example

For this tutorial I assume that mdadm package is installed.

rpm -qa mdadm

Creating RAID 0 Array

We can create a RAID 0 array from disks or partitions. To understand both options we will create two separate RAID 0 arrays: one from disks and the other from partitions. A RAID 0 array requires at least two disks or partitions. We will use the /dev/sdc and /dev/sdd disks to create a RAID 0 array from disks. We will create two partitions on /dev/sdb and later use them to create another RAID 0 array from partitions.

To create a RAID 0 array from disks, use the following command

#mdadm --create --verbose /dev/[ RAID array Name or Number] --level=[RAID Level] --raid-devices=[Number of storage devices] [Storage Device] [Storage Device]

Let’s understand this command in detail

mdadm:- This is the main command

--create:- This option is used to create a new md (RAID) device.

--verbose:- This option is used to print more detail about what the command is doing.

/dev/[ RAID array Name or Number]:- This argument is used to provide the name and location of the RAID array. The md device should be created under the /dev/ directory.

--level=[RAID Level]:- This option and argument are used to define the RAID level which we want to create.

--raid-devices=[Number of storage devices]:- This option and argument are used to specify the number of storage devices or partitions which we want to use in this device.

[Storage Device]:- This argument is used to specify the name and location of a storage device. Repeat it once for each device.

Following command will be used to create a RAID 0 array from disks /dev/sdc and /dev/sdd with md0 name.

mdadm create raid array
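
Filling in the syntax above for this example gives the command below. The sketch only prints the command instead of running it, because mdadm --create needs root and overwrites the listed devices. The helper name mdadm_create_cmd is made up for this example; it counts the devices so that --raid-devices always matches the list.

```shell
#!/bin/sh
# mdadm_create_cmd ARRAY LEVEL DEVICE...
# Prints the command only; --raid-devices is counted from the device list.
mdadm_create_cmd() {
    array=$1; level=$2; shift 2
    echo "mdadm --create --verbose $array --level=$level --raid-devices=$# $*"
}

cmd=$(mdadm_create_cmd /dev/md0 0 /dev/sdc /dev/sdd)
echo "$cmd"
```

On a lab system, the printed command can be run as root once you are sure that /dev/sdc and /dev/sdd hold no valuable data.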

To verify the array we can use following command

cat /proc/mdstat command

Above output confirms that RAID array md0 has been successfully created from two disks (sdd and sdc) with RAID level 0 configurations.

Creating RAID 0 Array with partitions

Create a 1GiB partition with fdisk command

fdisk create new partition

By default all partitions are created as Linux standard. Change partition type to RAID and save the partition. Exit from fdisk utility and run partprobe command to update the run time kernel partition table.

fdisk command change partition type

To learn fdisk command and its sub-command in detail please see the second part of this tutorial which explains how to create and manage partitions with fdisk command step by step.

Let’s create one more partition but this time use parted command.

create new partition with parted

To learn the parted command in detail, please see the fourth part of this tutorial, which explains how to manage disks with the parted command step by step.

We have created two partitions. Let’s build another RAID (Level 0) array but this time use partitions instead of disks.

Same command will be used to create RAID array from partitions.

mdadm create command

When we use the mdadm command to create a new RAID array, it puts its signature on the provided device or partition. It means we can create a RAID array from any partition type, or even from a disk which does not contain any partition at all. So which partition type we use here is not important. The important point which we should always consider is that the partition should not contain any valuable data. During this process all data on the partition will be wiped out.

Creating File system in RAID Array

We cannot use RAID array for data storage until it contains a valid file system. Following command is used to create a file system in array.

#mkfs -t [File system type] [RAID Device]

Let’s format md0 with ext4 file system and md1 with xfs file system.

format md device
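
For the two arrays in this example, the syntax above expands to the commands printed by the sketch below. It is a dry run that only prints the commands, since mkfs needs root and erases whatever is on the device.

```shell
#!/bin/sh
# Array-to-filesystem pairs from the text: md0 -> ext4, md1 -> xfs.
format_plan=$(
    for pair in /dev/md0:ext4 /dev/md1:xfs; do
        dev=${pair%%:*}     # text before the colon
        fstype=${pair##*:}  # text after the colon
        echo "mkfs -t $fstype $dev"
    done
)
echo "$format_plan"
```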

RAID 0 Arrays are ready to use. In order to use them we have to mount them somewhere in the Linux file system. The Linux file system (primary directory structure) starts with the root (/) directory and everything goes under it or its subdirectories. We have to mount partitions somewhere under this directory tree. We can mount partitions temporarily or permanently.

Temporarily mounting RAID 0 Array

Following command is used to mount the array temporarily.

#mount [what to mount] [where to mount]

Mount command accepts several options and arguments which I will explain separately in another tutorial. For this tutorial this basic syntax is sufficient.

what to mount :- This is the array.

where to mount :- This is the directory which will be used to access the mounted resource.

Once mounted, whatever action we will perform in mounted directory will be performed in mounted resources. Let’s understand it practically.

  • Create a mount directory in / directory
  • Mount /dev/md0 array
  • List the content
  • Create a test directory and file
  • List the content again
  • Un-mount the /dev/md0 array and list the content again
  • Now mount the /dev/md1 array and list the content
  • Again create a test directory and file. Use different name for file and directory
  • List the content
  • Un-mount the /dev/md1 array and list the content again

Following figure illustrates this exercise step by step

temporary mount
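
The exercise above can be sketched as a command sequence. This is a dry run: the run helper only prints each command, because mount and umount need root. The mount point /raid-mount and the test file and directory names are hypothetical placeholders, not the names used in the original figure.

```shell
#!/bin/sh
# Dry run: "run" only prints each command instead of executing it.
run() { echo "$*"; }

# mount_exercise DEVICE NAME: one mount / create / unmount round.
mount_exercise() {
    dev=$1; name=$2
    run mount "$dev" /raid-mount
    run ls -l /raid-mount
    run mkdir "/raid-mount/$name-dir"
    run touch "/raid-mount/$name-file"
    run ls -l /raid-mount
    run umount /raid-mount
    run ls -l /raid-mount      # the test directory and file are no longer visible
}

steps=$(
    run mkdir /raid-mount
    mount_exercise /dev/md0 test0
    mount_exercise /dev/md1 test1
)
echo "$steps"
```

The last listing in each round shows an empty directory, because the files were written to the array and not to the mount directory itself.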

As above figure shows whatever action we performed in mount directory was actually performed in respective array.

The temporary mount option is good for an array which we access occasionally. If we access an array on a regular basis, this approach is not helpful. Each time we reboot the system, all temporarily mounted resources get un-mounted automatically. So if we have an array which is going to be used regularly, we should mount it permanently.

Mounting RAID Array permanently

Each resource in file system has a unique ID called UUID. When mounting an array permanently we should use UUID instead of its name. From version 7, RHEL also uses UUID instead of device name.

The UUID stands for Universally Unique Identifier. It is a 128-bit number, expressed in hexadecimal (base 16) format.

If you have a static environment, you may use the device name. But if you have a dynamic environment, you should always use the UUID. In a dynamic environment the device name may change each time the system boots. For example, suppose we attached an additional SCSI disk to the system and it was named /dev/sdb. We mounted this disk permanently with its device name. Now suppose someone else removed this disk and attached a new SCSI disk in the same slot. The new disk will also be named /dev/sdb. Since the name of the old disk and the new disk is the same, the new disk will be mounted in place of the old disk. This way, the device name could create a serious problem in a dynamic environment. This issue can be solved with the UUID. No matter how we attach the resource to the system, its UUID always remains fixed.

To know the UUID of all partitions we can use blkid command. To know the UUID of a specific partition we have to use its name as argument with this command.

blkid command

Once we know the UUID, we can use it instead of device name. We can also use copy and paste option to type the UUID.

  • Use blkid command to print the UUID of array.
  • Copy the UUID of array.
  • Use mount command to mount the array. Use paste option instead of typing UUID.

Following figure illustrates above steps

temporary mount with uuid command
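
Instead of copying and pasting by hand, the UUID can also be read into a variable with blkid -s UUID -o value. The sketch below is a dry run under stated assumptions: the UUID is a made-up placeholder, /raid-mount is a hypothetical mount point, and the helper name is_uuid is invented for this example. It checks that the value has the usual hexadecimal layout before building the mount command.

```shell
#!/bin/sh
# is_uuid STRING: succeeds if STRING has the usual 8-4-4-4-12 hexadecimal
# layout (32 hex digits x 4 bits each = 128 bits).
is_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# On a real system the value would come from:
#   uuid=$(blkid -s UUID -o value /dev/md0)
# The value below is a made-up placeholder.
uuid="0a1b2c3d-1111-2222-3333-444455556666"

if is_uuid "$uuid"; then
    mount_cmd="mount UUID=$uuid /raid-mount"
else
    mount_cmd=""
fi
echo "$mount_cmd"
```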

When the system boots, it looks in the /etc/fstab file to find the devices (partitions, LVs, swap or arrays) which need to be mounted in the file system automatically. By default this file has entries for the partitions, logical volumes and swap space which were created during the installation. To mount any additional device (array) automatically, we have to make an entry for that device in this file. Each entry in this file has six fields.

default fstab file

Number | Field | Description
1 | What to mount | Device which we want to mount. We can use device name, UUID or label in this field to represent the device.
2 | Where to mount | The directory in main Linux File System where we want to mount the device.
3 | File system | File system type of device.
4 | Options | Just like mount command we can also use supported options here to control the mount process. For this tutorial we will use default options.
5 | Dump support | To enable the dump on this device use 1. Use 0 to disable the dump.
6 | Automatic check | Whether this device should be checked while mounting or not. To disable use 0, to enable use 1 (for root partition) or 2 (for all partitions except root partition).

Let’s make some directories to mount the arrays which we have created recently

mkdir command

Take the backup of fstab file and open it for editing

etc/fstab backup

Make entries for arrays and save the file.

fstab entries
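
As a sketch of what such entries can look like, the block below builds one line with a device name and one with a UUID, then checks that each line has exactly six fields. The mount points, the UUID and the helper name fstab_line are hypothetical placeholders. Nothing is written to /etc/fstab; the lines are only printed.

```shell
#!/bin/sh
# fstab_line WHAT WHERE FSTYPE OPTIONS DUMP CHECK -> one six-field entry
fstab_line() {
    [ "$#" -eq 6 ] || { echo "need exactly six fields" >&2; return 1; }
    printf '%s\t%s\t%s\t%s\t%s\t%s\n' "$@"
}

# Mount points and the UUID are hypothetical placeholders.
entries=$(
    fstab_line /dev/md0 /raid0-disks ext4 defaults 0 0
    fstab_line UUID=0a1b2c3d-1111-2222-3333-444455556666 /raid0-parts xfs defaults 0 0
)
echo "$entries"

# Count entries that do not have exactly six whitespace-separated fields.
bad=$(printf '%s\n' "$entries" | awk 'NF != 6 { n++ } END { print n + 0 }')
echo "malformed entries: $bad"
```

A field-count check like this catches only the simplest mistakes, so the entries should still be tested on the real system before rebooting.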

For demonstration purposes I used both device name and UUID to mount the partitions. After saving, always check the entries with the mount -a command. This command mounts everything listed in the /etc/fstab file. So if we made any mistake while updating this file, we will get an error in the output of this command.

If you get any error in the output of the mount -a command, correct it before rebooting the system. If there is no error, reboot the system.

mount -a command

The df -h command is used to check the available space in all mounted partitions. We can use this command to verify that all partitions are mounted correctly.

df -h command

Above output confirms that all partitions are mounted correctly. Let’s list both RAID devices.

list md device

How to delete RAID Array

We cannot delete a mounted array. Un-mount all arrays which we created in this exercise

umount command

Use following command to stop the RAID array

#mdadm --stop /dev/[Array Name]

mdstop command

Remove the mount directory and copy the original fstab file back.

If you haven’t taken the backup of original fstab file, remove all entries from this file which you made.

restore fstab file

Finally reset all disks used in this practice.

dd command linux

The dd command is the easiest way to reset the disk. Disk utilities store their configuration parameters in the super block. Usually the super block size is defined in KB, so we just overwrote the first 10MB of each disk with null bytes. To learn the dd command in detail, see the fifth part of this tutorial, which explains this command in detail.
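
The reset step can be rehearsed safely on a scratch file before it is tried on a real disk. The sketch below is an illustration, not the exact command from the figure: it assumes a 12MB temporary file with a fake signature at the start, overwrites the first 10MB with null bytes, and confirms that the signature is gone while the file size is unchanged.

```shell
#!/bin/sh
# Safe rehearsal on a scratch file instead of a real disk.
img=$(mktemp)

# Build a 12MB image and plant a fake signature at the very start.
dd if=/dev/zero of="$img" bs=1048576 count=12 2>/dev/null
printf 'FAKE-SUPERBLOCK' | dd of="$img" conv=notrunc 2>/dev/null
before=$(head -c 15 "$img")

# The reset step: overwrite the first 10MB with null bytes.
# conv=notrunc keeps the rest of the file in place.
dd if=/dev/zero of="$img" bs=1048576 count=10 conv=notrunc 2>/dev/null

# Count the non-null bytes left in the first 10MB.
leftover=$(head -c 10485760 "$img" | tr -d '\000' | wc -c)
size=$(wc -c < "$img")
echo "before: $before"
echo "non-null bytes left in first 10MB: $leftover"
echo "file size in bytes: $size"
rm -f "$img"
```

On a real disk the output file would be the device itself, for example /dev/sdX where X is a placeholder, so double-check the device name before running it as root.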

Now reboot the system and use the df -h command again to verify that all RAID devices which we created in this exercise are gone.

df -h command

How to create RAID 1 and RAID 5 array

We can create RAID 1 or RAID 5 array by following same procedure. All steps and commands will be same except the mdadm --create command. In this command you have to change the RAID level, number of disks and location of associated disks.

To create RAID 1 array from /dev/sdd and /dev/sdb disks use following command

raid 1 array create

To create RAID 1 array from /dev/sdb1 and /dev/sdb2 partitions use following command

raid 1 partition

You may get a metadata warning if you have used the same disks and partitions to create a RAID array previously and those disks or partitions still contain metadata information. Remember, we cleaned only the first 10MB, leaving the remaining space untouched. You can safely ignore this message or clean the entire disk before using it again.

To create RAID 5 array from /dev/sdb, /dev/sdc and /dev/sdd disks use following command.

raid 5 from disks

RAID 5 Configuration requires at least 3 disks or partitions. That’s why we used three disks here.

To create RAID 5 array from /dev/sdb1, /dev/sdb2 and /dev/sdb3 partitions use following command

raid 5 from partition
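
The four variations described above differ only in level and device list. The sketch below prints each command without running it. The array name /dev/md0 and the helper name mdadm_create_cmd are assumptions made for this example, and the four commands are alternatives that reuse the same devices, so they are not meant to be run together.

```shell
#!/bin/sh
# mdadm_create_cmd ARRAY LEVEL DEVICE...
# Prints the command only, after checking the minimum device count.
mdadm_create_cmd() {
    array=$1; level=$2; shift 2
    case "$level" in
        5) min=3 ;;     # RAID 5 needs at least three devices
        *) min=2 ;;     # RAID 0 and RAID 1 need at least two
    esac
    if [ "$#" -lt "$min" ]; then
        echo "RAID $level needs at least $min devices, got $#" >&2
        return 1
    fi
    echo "mdadm --create --verbose $array --level=$level --raid-devices=$# $*"
}

raid1_disks=$(mdadm_create_cmd /dev/md0 1 /dev/sdd /dev/sdb)
raid1_parts=$(mdadm_create_cmd /dev/md0 1 /dev/sdb1 /dev/sdb2)
raid5_disks=$(mdadm_create_cmd /dev/md0 5 /dev/sdb /dev/sdc /dev/sdd)
raid5_parts=$(mdadm_create_cmd /dev/md0 5 /dev/sdb1 /dev/sdb2 /dev/sdb3)
printf '%s\n' "$raid1_disks" "$raid1_parts" "$raid5_disks" "$raid5_parts"
```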

To avoid unnecessary errors, always reset disks before using them in a new practice.

So far in this tutorial we have learnt how to create, mount and remove RAID array. In following section we will learn how to manage and troubleshoot a RAID Array. For this section I assume that you have at least one array configured. For demonstration purpose I will use last configured (RAID 5 with 3 partitions) example. Let’s create file system in this array and mount it.

temporary mount md device

Let’s put some dummy data in this directory.

dummy data

I redirected the manual page of ls command in /testingdata/manual-of-ls-command file. Later, to verify that file contains actual data I used wc command which counts line, word and characters of file.

How to view the detail of RAID device

Following command is used to view the detailed information about RAID device.

#mdadm --detail /dev/[RAID Device Name]

This information includes RAID level, array size, used size out of the total available size, devices used in creating this array, devices currently used, spare devices, failed devices, chunk size, UUID of the array and much more.

mdadm detail

How to add additional disk or partition in RAID

There are several situations where we have to increase the size of a RAID device. For example, a RAID device might be filled up with data, or a disk from the array might have failed. To increase the space of a RAID device we have to add an additional disk or partition to the existing array.

In running example we used /dev/sdb disk to create three partitions. The /dev/sdc and /dev/sdd are still available to use. Before we add them in this Array make sure they are cleaned. Last time we used dd command to clean the disks. We can use that command again or use following command

#mdadm --zero-superblock /dev/[Disk name]

To check a disk whether it contains superblock or not we can use following command

#mdadm --examine /dev/[Disk name]

Following figure illustrates the use of both commands on both disks

mdadm examine

Now both disks are ready for RAID Array. Following command is used to add additional disk in existing array.

#mdadm --manage /dev/[RAID Device] --add /dev/[disk or partition]

Let’s add /dev/sdc disk in this array and confirm the same.

mdadm add additional space

Right now this disk has been added as a spare disk. This disk will not be used until any disk fails from existing array or we manually force RAID to use this disk.

If any disk fails and spare disks are available, RAID will automatically select the first available spare disk to replace the faulty disk. Spare disks are the best backup plan in RAID device.

We will add another disk to the array as a backup later. For now, let’s use this disk to increase the size of the array. Following command is used to grow the size of a RAID device.

#mdadm --grow --raid-devices=[Number of Device] /dev/[RAID Device]

RAID arranges all devices in a sequence. This sequence is built from the order in which disks are added to the array. When we use this command, RAID adds the next working device to the active devices.

Following figure illustrates this command

mdadm grow array
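
Put together, the steps from cleaning the disk to growing the array can be sketched as below. This is a dry run that only prints the commands. The array name /dev/md0 is an assumption, since the name used in the original figure is not known; the device count follows from the running example, which has three active partitions before the new disk is used.

```shell
#!/bin/sh
array=/dev/md0      # hypothetical array name
new_disk=/dev/sdc   # disk being added
active=3            # RAID 5 built from three partitions

grow_plan=$(
    echo "mdadm --zero-superblock $new_disk"
    echo "mdadm --examine $new_disk"
    echo "mdadm --manage $array --add $new_disk"
    echo "mdadm --grow --raid-devices=$(( active + 1 )) $array"
    echo "mdadm --detail $array"
)
echo "$grow_plan"
```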

As we can see in above output disk has been added in array and the size of array has been successfully increased.

Removing faulty device

If a spare device is available, RAID will automatically replace the faulty device with the spare device. End users will not see any change. They will be able to access the data as usual. Let’s understand it practically.

Right now there is no spare disk available in array. Let’s add one spare disk.

mdadm command spare disk

When a disk fails, RAID marks that disk as a failed device. Once marked, it can be removed safely. If we want to remove a working device from the array for maintenance or troubleshooting, we should always mark it as a failed device before removing it. When a device is marked as failed, its data is reconstructed on a spare device if one is available.

To mark a disk as failed device following command is used.

#mdadm --manage --set-faulty /dev/[Array Name] /dev/[Faulty Disk]

We recently increased the size of this array. So before doing this practice let’s verify once again that array still contains the valid data.

wc command

Above output confirms that the array still contains valid data. Now let’s mark the device /dev/sdc in the array as faulty and confirm the operation.

mdadm set faulty disk

Above output confirms that device sdc, which is number four in the array sequence, has been marked as a failed [F] device.

As we know if spare disk is available, it will be used as the replacement of faulty device automatically. No manual action is required in this process. Let’s confirm that spare disk has been used as the replacement of faulty disk.

mdadm remove faulty device
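
The whole fail-and-replace sequence can be sketched as below. It is a dry run that only prints the commands. The array name /dev/md0 is an assumption, and the helper name replace_plan is made up for this example. The --remove step detaches the failed disk from the array once the spare has taken over.

```shell
#!/bin/sh
# replace_plan ARRAY FAULTY_DISK: print the commands for this exercise.
replace_plan() {
    array=$1; faulty=$2
    echo "mdadm --manage --set-faulty $array $faulty"
    echo "cat /proc/mdstat"
    echo "mdadm --manage $array --remove $faulty"
    echo "mdadm --detail $array"
}

plan=$(replace_plan /dev/md0 /dev/sdc)
echo "$plan"
```

Checking /proc/mdstat between the two mdadm steps shows whether the rebuild onto the spare disk has finished.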

Finally let’s verify that data is still present in array.

verify data

Above output confirms that array still contains valid data.

That’s all for this tutorial.
