How to resize a soft RAID using mdadm (by replacing the old disks)


15 min read | by Jordi Prats

If you have a Linux software RAID (in this case RAID 5) and you want to resize it, you can do it by replacing the underlying disks with larger ones. This is a step-by-step guide on how to do it without losing data and without having to copy the data across to a separate set of disks.

The RAID device (md0) that we are going to use is configured as a RAID 5 array with 4 disks of 3TB each, which we are going to replace with 8TB disks. On top of the array we are using LVM to manage the space, so we will also need to grow the physical volume so that the volume group can use the full capacity of the new disks.

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               raid
  PV Size               <8.19 TiB / not usable 5.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              2146093
  Free PE               0
  Allocated PE          2146093
  PV UUID               h68yNc-5MPU-5lWd-3p7m-RqCi-weLs-18yhsY

# vgdisplay
  --- Volume group ---
  VG Name               raid
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  25
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                7
  Open LV               5
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <8.19 TiB
  PE Size               4.00 MiB
  Total PE              2146093
  Alloc PE / Size       2146093 / <8.19 TiB
  Free  PE / Size       0 / 0
  VG UUID               HRKcbj-VoNI-w1jr-Ksu0-cbp9-lFAe-MkuiAn

Initial check

Let's assume we have a RAID 5 array with 4 disks of 3TB each, and we want to replace them with 8TB disks. This is the device:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sdd1[2] sde1[3] sdb1[0]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>

Using lsblk we can see their serial numbers, so that we can identify the physical disks when we replace them later:

# lsblk -do +VENDOR,MODEL,SERIAL
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT                   VENDOR   MODEL                  SERIAL
(...)
sdb      8:16   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N4CZNE48
sdc      8:32   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N5LNHRP8
sdd      8:48   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N5LNHFN6
sde      8:64   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N2AHPJ7J

To make sure the array is consistent before touching anything, we can trigger a data-check:

echo check > /sys/block/md0/md/sync_action

Depending on the size of the array, this can take a while. You can check the progress using cat /proc/mdstat:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sdd1[2] sde1[3] sdb1[0]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [===>.................]  check = 17.5% (513533708/2930134016) finish=275.1min speed=146361K/sec
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>

As soon as this is completed, we should check for any errors in dmesg. If we find any, we should replace that disk first to make sure it doesn't fail while we are replacing some other disk.
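
A quick way to review both the result of the check and the kernel log is the following minimal sketch (md exposes a mismatch counter in sysfs; the grep pattern is only a rough filter, adjust it to your controller and driver):

# number of mismatches found by the last check, 0 is what we want
cat /sys/block/md0/md/mismatch_cnt

# scan the kernel log for anything suspicious
dmesg | grep -iE 'error|fail|exception'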

Assuming we don't find any errors, we can proceed to replace the disks.

Replace each disk

Now we can start replacing the disks. We will replace them one by one, using mdadm to mark each disk as faulty and remove it from the array. This will allow us to replace the disk without losing any data.

We can now use smartctl to check the health of the disks. This is good practice before replacing any disk, as it helps us identify potential issues and decide which disk to replace first.

# smartctl -a /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-208-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD30EFRX-68EUZN0
Serial Number:    WD-WCC4N5LNHRP8
LU WWN Device Id: 5 0014ee 2b7b9458e
Firmware Version: 82.00A82
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Mar 24 19:14:12 2025 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
(...)

There are two things we should check:

  • The overall health of the disk. If it says "PASSED" we are good to go.
# for i in sdc sdd sde sdb; do echo $i; smartctl -a /dev/$i | grep overall; done
sdc
SMART overall-health self-assessment test result: PASSED
sdd
SMART overall-health self-assessment test result: PASSED
sde
SMART overall-health self-assessment test result: PASSED
sdb
SMART overall-health self-assessment test result: PASSED
  • The number of errors. We should check, at least, the "Raw_Read_Error_Rate" and "Seek_Error_Rate" values. If they are high, we should consider replacing that disk first.
# for i in sdc sdd sde sdb; do echo $i; smartctl -a /dev/$i | grep _Error; done
sdc
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       14
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
sdd
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       28
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
sde
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       20
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
sdb
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       74
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

Each brand reports these attributes in a different way, and in some cases a high number of read errors does not necessarily mean that the disk is failing: a good example is Seagate, which encodes the number of actual errors together with the total number of operations, making the raw value very difficult to interpret. In this case we are using Western Digital disks, which are easier to read. We can also use the SMART parser to simplify interpreting the SMART data: we just need to paste the output of smartctl -a and it will show us the most relevant values.
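
To make the comparison across disks quicker, a small loop like the following (a rough sketch that assumes, as with these WD disks, that the raw value is the last column of smartctl's attribute table) prints the raw read error count per disk sorted in descending order:

for d in sdb sdc sdd sde; do
  printf '%s %s\n' "$d" "$(smartctl -A /dev/$d | awk '/Raw_Read_Error_Rate/ {print $NF}')"
done | sort -rn -k2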

Without any other data to use, we can just pick the disk with the highest number of read errors and replace it first. In this case, we can see that sdb has the highest number of read errors (74), so we will replace it first and continue in descending order.

To do so, we are going to use mdadm to mark the disk as faulty and then remove it from the array. This will allow us to replace the disk without losing any data.

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sde1[3] sdd1[2] sdb1[0]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>
# mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sde1[3] sdd1[2] sdb1[0](F)
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>
# mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sde1[3] sdd1[2]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>

At this point, since we are doing this on a consumer-grade machine, we can simply shut it down and replace the disk: in this case, physically replacing sdb with a new 8TB disk.
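
(If the machine had hot-swap bays, an alternative to powering it off would be to detach the old device from the kernel before pulling it and to rescan the bus after inserting the new one. This is just a sketch of that approach, not what we did here, and the device names are examples:)

# tell the kernel to forget the old disk before physically pulling it
echo 1 > /sys/block/sdb/device/delete

# after inserting the new disk, rescan the SCSI/SATA hosts so it shows up
for h in /sys/class/scsi_host/host*; do echo '- - -' > "$h/scan"; done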

When the system comes back up, we can add the new disk to the array. Using lsblk we can see that the new disk has been detected as sdb again; note that this time we are adding the whole disk rather than a partition:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdc1[1] sde1[3] sdd1[2]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      bitmap: 2/22 pages [8KB], 65536KB chunk

unused devices: <none>
# lsblk -do +VENDOR,MODEL,SERIAL
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT                   VENDOR   MODEL                  SERIAL
(...)
sda      8:0    0 447.1G  0 disk                              ATA      SanDisk_SSD_PLUS_480GB 2014J6460214
sdb      8:16   0   7.3T  0 disk                              ATA      ST8000DM004-2U9188     WSC2RGHW
sdc      8:32   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N5LNHRP8
sdd      8:48   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N5LNHFN6
sde      8:64   0   2.7T  0 disk                              ATA      WDC_WD30EFRX-68EUZN0   WD-WCC4N2AHPJ7J
# mdadm --add /dev/md0 /dev/sdb
mdadm: added /dev/sdb
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[4] sdc1[1] sde1[3] sdd1[2]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery =  0.0% (318076/2930134016) finish=307.0min speed=159038K/sec
      bitmap: 2/22 pages [8KB], 65536KB chunk

unused devices: <none>

Notice how the array will change from [4/3] back to [4/4] once the recovery completes. After the recovery, it automatically proceeded to a data-check:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[4] sdc1[1] sde1[3] sdd1[2]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [=================>...]  check = 86.7% (2542327552/2930134016) finish=76.4min speed=84543K/sec
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>

Again, we'll have to check dmesg for any errors. If we find any pointing to the newly added disk, we should swap it out before moving on:

# dmesg
(...)
[   22.599212] aufs 5.4.3-20200302
[  113.986883] md: recovery of RAID array md0
[27856.422639] md: md0: recovery done.
[34228.300717] md: data-check of RAID array md0
[55830.808107] md: md0: data-check done.
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[4] sdc1[1] sde1[3] sdd1[2]
      8790402048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>

At this point we can move on to the next disk, repeating the same process for each of the remaining members (summarized in the sketch below).
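
In short, the cycle for each member looks like this (a sketch using /dev/sdX1 and /dev/sdX as placeholders for the old partition and the new disk):

mdadm --manage /dev/md0 --fail /dev/sdX1      # mark the old member as faulty
mdadm --manage /dev/md0 --remove /dev/sdX1    # remove it from the array
# shut down, physically swap the disk, boot again
mdadm --manage /dev/md0 --add /dev/sdX        # add the new, larger disk
watch -n 60 cat /proc/mdstat                  # wait until the array is back to [4/4]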

Extending the RAID

Once we have replaced all the disks, we can extend the RAID array to use the full capacity of the new disks. Using mdadm with the --detail option, we can see the current size of the array:

# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri May 13 23:41:22 2016
        Raid Level : raid5
        Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
     Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sun Apr 13 11:23:05 2025
             State : clean
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : shuvak:0  (local to host shuvak)
              UUID : 799bf904:a38acec7:d86bfde7:39011a4a
            Events : 325722

    Number   Major   Minor   RaidDevice State
       4       8       16        0      active sync   /dev/sdb
       5       8       32        1      active sync   /dev/sdc
       7       8       48        2      active sync   /dev/sdd
       6       8       64        3      active sync   /dev/sde

Array Size is the total usable size of the array, while Used Dev Size is the amount of space used on each member disk. In this case, we can see that the array is still only using 3TB per disk, so the total usable size is still roughly 8.2 TiB (9 TB).
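
For a 4-disk RAID 5, one disk's worth of space goes to parity, so Array Size = 3 × Used Dev Size: 3 × 2930134016 KiB = 8790402048 KiB, which is exactly the figure reported above.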

Since all the disks have now been replaced with bigger ones, we can grow the array to use their full capacity. To do so, we use mdadm with the --grow option and set a new component size (the amount of space the array uses on each member disk). Rather than calculating it, we can simply pass --size max to use all the available space on each disk:

# mdadm --grow /dev/md0 --size max
mdadm: component size of /dev/md0 has been set to 7813895168K
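
The same arithmetic as before tells us what to expect for the new array size: 3 × 7813895168 KiB = 23441685504 KiB, roughly 21.8 TiB.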

If we recheck the size of the array, we can see that it has been updated, but the newly exposed space needs to be resynced before it is fully usable:

# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri May 13 23:41:22 2016
        Raid Level : raid5
        Array Size : 23441685504 (22355.73 GiB 24004.29 GB)
     Used Dev Size : 7813895168 (7451.91 GiB 8001.43 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sun Apr 13 11:53:37 2025
             State : clean, resyncing
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

     Resync Status : 37% complete

              Name : shuvak:0  (local to host shuvak)
              UUID : 799bf904:a38acec7:d86bfde7:39011a4a
            Events : 325731

    Number   Major   Minor   RaidDevice State
       4       8       16        0      active sync   /dev/sdb
       5       8       32        1      active sync   /dev/sdc
       7       8       48        2      active sync   /dev/sdd
       6       8       64        3      active sync   /dev/sde

We can also keep an eye on the progress of the resync using cat /proc/mdstat:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[4] sde[6] sdc[5] sdd[7]
      23441685504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [=======>.............]  resync = 37.7% (2953028220/7813895168) finish=680.1min speed=119102K/sec
      bitmap: 10/15 pages [40KB], 262144KB chunk

unused devices: <none>
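
If the resync is too slow (or too aggressive for a machine that is in use), the md rebuild speed limits can be tuned through sysctl; the value below is just an example:

# current limits, in KiB/s per device
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max

# raise the minimum so the resync gets more bandwidth
sysctl -w dev.raid.speed_limit_min=100000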

Once it is finished, we can run the same commands again to validate the final size of the array and to confirm that none of the new disks has been marked as faulty:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[4] sdc[5] sde[6] sdd[7]
      23441685504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/15 pages [0KB], 262144KB chunk

unused devices: <none>
# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri May 13 23:41:22 2016
        Raid Level : raid5
        Array Size : 23441685504 (22355.73 GiB 24004.29 GB)
     Used Dev Size : 7813895168 (7451.91 GiB 8001.43 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Mon Apr 14 08:55:49 2025
             State : clean
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : shuvak:0  (local to host shuvak)
              UUID : 799bf904:a38acec7:d86bfde7:39011a4a
            Events : 333143

    Number   Major   Minor   RaidDevice State
       4       8       16        0      active sync   /dev/sdb
       5       8       32        1      active sync   /dev/sdc
       7       8       48        2      active sync   /dev/sdd
       6       8       64        3      active sync   /dev/sde

Next, using pvresize we can instruct LVM to resize the physical volume to the new size of the array, making the extra capacity available to the volume group:

# pvresize /dev/md0
  Physical volume "/dev/md0" changed
  1 physical volume(s) resized or updated / 0 physical volume(s) not resized
# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               raid
  PV Size               21.83 TiB / not usable 0
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              5723067
  Free PE               3576974
  Allocated PE          2146093
  PV UUID               h68yNc-5MPU-5lWd-3p7m-RqCi-weLs-18yhsY

# vgdisplay
  --- Volume group ---
  VG Name               raid
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  26
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                7
  Open LV               6
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               21.83 TiB
  PE Size               4.00 MiB
  Total PE              5723067
  Alloc PE / Size       2146093 / <8.19 TiB
  Free  PE / Size       3576974 / <13.65 TiB
  VG UUID               HRKcbj-VoNI-w1jr-Ksu0-cbp9-lFAe-MkuiAn

Finally, we can extend whichever logical volumes need the extra space. In this case, we are using lvextend with the -r flag (so the filesystem is resized online at the same time) to grow the logical volume to just 10TB rather than allocating all the free space:

# lvextend -L10TB -r /dev/raid/crap
  Size of logical volume raid/crap changed from <7.69 TiB (2015533 extents) to 10.00 TiB (2621440 extents).
  Logical volume raid/crap successfully resized.
resize2fs 1.45.5 (07-Jan-2020)
Filesystem at /dev/mapper/raid-crap is mounted on /var/crap/data; on-line resizing required
old_desc_blocks = 493, new_desc_blocks = 640
The filesystem on /dev/mapper/raid-crap is now 2684354560 (4k) blocks long.

# df -hP
Filesystem                         Size  Used Avail Use% Mounted on
(...)
/dev/mapper/raid-crap              9.9T  6.6T  3.0T  69% /var/crap/data
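
If instead we wanted the logical volume to take all of the remaining free space in the volume group, lvextend can also work with free extents rather than an absolute size (keeping -r so the filesystem is resized online). A sketch:

# grow the LV by 100% of the free extents left in the VG and resize the filesystem
lvextend -l +100%FREE -r /dev/raid/crap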

Posted on 14/04/2025
