Software Raid

Create partitions with the type fd (Linux raid autodetect), e.g. /dev/sda1 and /dev/hda1

mdadm --create /dev/md0 --verbose --level=raid1 --raid-devices=2 /dev/hda1 /dev/sda1
mdadm --create /dev/md1 --verbose --level=raid5 --raid-devices=3 /dev/hda5 /dev/sda5 /dev/sdb5
mdadm --create /dev/md1 --verbose --level=raid5 --chunk=128 --raid-devices=3 /dev/hda5 /dev/sda5 /dev/sdb5

If not all of the hard disks you want in the raid are available yet, you can use the word missing in place of the partitions on the absent disk.

mdadm --create /dev/md0 -vv --level=raid1 --raid-devices=2 /dev/sda1 missing

If you want to mix traditional HDs and SSDs, every device listed after --write-mostly will be avoided for reads where possible, so put the slow HDs there.

mdadm --create /dev/md0 --verbose --level=raid1 --raid-devices=2 /dev/sda5 --write-mostly /dev/sdb5

If you get this error

mdadm: error opening /dev/md0: No such device or address

and the device is really there, check whether the md kernel module is loaded.

Do not forget to add the output of

mdadm --detail --scan

to the file /etc/mdadm/mdadm.conf.
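
Appending the scan output repeatedly leaves duplicate ARRAY lines in the file. The following is a minimal sketch of one way to avoid that, assuming each ARRAY line carries a UUID= token; the function name append_new_arrays is made up for this example and is not part of mdadm.

```shell
#!/bin/sh
# Append ARRAY lines read from stdin to a config file, skipping arrays
# whose UUID is already mentioned there. Lines without a UUID= token are
# always appended (assumption: mdadm normally prints one).
append_new_arrays() {
    conf=$1
    while IFS= read -r line; do
        case $line in
            ARRAY*) ;;          # only ARRAY lines are of interest
            *) continue ;;
        esac
        # Pick out the UUID=... token, if there is one
        uuid=$(printf '%s\n' "$line" | tr ' ' '\n' | grep '^UUID=' | head -n 1)
        if [ -n "$uuid" ] && grep -qF -- "$uuid" "$conf" 2>/dev/null; then
            continue            # this array is already recorded
        fi
        printf '%s\n' "$line" >> "$conf"
    done
}

# Typical use (as root):
#   mdadm --detail --scan | append_new_arrays /etc/mdadm/mdadm.conf
```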

If you lose one of your hard disks, this is how you can see which partitions of your raid are still there

cat /proc/mdstat
mdadm --detail /dev/md0
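
In /proc/mdstat a missing member shows up as an underscore in the status field, e.g. [U_] instead of [UU]. As a rough sketch, a small awk filter can list only the arrays that are degraded; the function name degraded_arrays is hypothetical, and the optional file argument is there only so the function can be tried on a saved copy.

```shell
#!/bin/sh
# Print the name of every array in an mdstat-style file that has a
# missing member, i.e. an underscore in its [UU_] status field.
degraded_arrays() {
    awk '
        /^md[^ ]* *:/ { name = $1 }          # remember the current array
        match($0, /\[[U_]+\]/) {             # status field such as [UU_]
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    ' "${1:-/proc/mdstat}"
}

# Typical use:
#   degraded_arrays
```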

You can check if a partition was previously part of a raid

mdadm -Q /dev/hda1
/dev/hda1: device 0 in 2 device mismatch raid1 /dev/md0. Use mdadm --examine for more detail.

If you have a new hard disk you can add partitions to a raid like this

mdadm --manage --add /dev/md0 /dev/hda1

One of these errors

mdadm: add new device failed for /dev/hda1 as 2: No space left on device
mdadm: add new device failed for /dev/hda1 as 2: Invalid argument

probably indicates that your partition is too small for the raid. Check the partition sizes yourself with

cat /proc/partitions
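
As a rough check, the size of the new partition can be compared with that of an existing member. The sketch below only reads the #blocks column of /proc/partitions; the function names part_blocks and big_enough are invented for this example, and the comparison is an approximation because the raid metadata also needs some space.

```shell
#!/bin/sh
# Print the size in blocks of a partition listed in a
# /proc/partitions-style file (columns: major minor #blocks name).
part_blocks() {
    awk -v n="$1" '$4 == n { print $3 }' "${2:-/proc/partitions}"
}

# Succeed if the candidate partition ($1) is at least as large as the
# reference partition ($2), e.g. an existing member of the raid.
big_enough() {
    new=$(part_blocks "$1" "${3:-/proc/partitions}")
    ref=$(part_blocks "$2" "${3:-/proc/partitions}")
    [ -n "$new" ] && [ -n "$ref" ] && [ "$new" -ge "$ref" ]
}

# Typical use:
#   big_enough hda1 sda1 || echo "hda1 is smaller than sda1"
```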

The size of the raid can also be changed, either by increasing the number of member partitions or by using more of the space on each member partition

mdadm /dev/md1 --grow --raid-devices=4
mdadm /dev/md1 --grow --size=max
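
To estimate what growing will gain, the usual rule of thumb is that raid1 offers the size of one member, raid5 the size of all members but one, and raid6 all but two. A minimal sketch of that arithmetic follows; raid_capacity is a hypothetical helper, it assumes all members are as large as the smallest one and ignores metadata overhead.

```shell
#!/bin/sh
# Approximate usable capacity of a raid, given its level, the number of
# members and the size of the smallest member (result is in the same unit).
raid_capacity() {
    level=$1 n=$2 size=$3
    case $level in
        raid1) echo "$size" ;;                      # all members are mirrors
        raid5) echo $(( (n - 1) * size )) ;;        # one member's worth of parity
        raid6) echo $(( (n - 2) * size )) ;;        # two members' worth of parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

# Growing a raid5 from 3 to 4 members of 500 GB each:
#   raid_capacity raid5 3 500
#   raid_capacity raid5 4 500
```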

One day I noticed that one device of a raid (sdb in my case) started to have problems such as read errors. After getting a replacement device, connect it in addition to the failing one (in my case it came up as sdd). Then tell the raid system to replace the failing device:

mdadm /dev/md0 --add /dev/sdd1
mdadm /dev/md0 --replace /dev/sdb1 --with /dev/sdd1

This did not work at first

mdadm: Marked /dev/sdb1 (device 1 in /dev/md0) for replacement
mdadm: Failed to set /dev/sdd1 as preferred replacement.

For me the reason was that a sync was already running in the background because of the failing device (check with cat /proc/mdstat). This is how to stop the sync so that you can try again

echo "idle" > /sys/block/md0/md/sync_action
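
The same file can also be read to see what the array is currently doing. The following sketch wraps both steps; stop_sync is a made-up name, and the second argument (an alternative sysfs root) is an assumption added only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Interrupt a running sync on an array unless it is already idle, and
# print the state that was found. $1 is the array name (e.g. md0),
# $2 an optional sysfs root (defaults to /sys).
stop_sync() {
    f="${2:-/sys}/block/$1/md/sync_action"
    [ -r "$f" ] || { echo "no such array: $1" >&2; return 1; }
    state=$(cat "$f")
    if [ "$state" != "idle" ]; then
        echo idle > "$f"        # same effect as the echo command above
    fi
    echo "$state"               # report what was running before
}

# Typical use (as root):
#   stop_sync md0
```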

Another time I temporarily lost a disk of a raid5 and had trouble restarting the raid. This is what I did to recover from this situation (written down from memory)

  • Boot from CD (e.g. Knoppix), if necessary.
/etc/init.d/mdadm start
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
/etc/init.d/mdadm-raid restart
  • Check where the problem is
dmesg
mdadm --detail /dev/mdX
  • Although I only lost one drive, the raid refused to start again
raid5: cannot start dirty degraded array for md1
  • I got it back with assemble and the force option
mdadm --assemble --scan --run --force

Watch the progress of the recovery like this

watch -n5 cat /proc/mdstat
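
If only the progress line is of interest, it can be filtered out of /proc/mdstat. This is a sketch that assumes the usual line layout ("recovery = 12.6% (...) finish=127.5min speed=..."); the function name resync_progress is hypothetical.

```shell
#!/bin/sh
# Print one line per array that is currently rebuilding:
#   <array> <operation> <percent> <estimated finish>
resync_progress() {
    awk '
        /^md[^ ]* *:/ { name = $1 }          # remember the current array
        $2 ~ /^(recovery|resync|reshape|check)$/ && $3 == "=" {
            print name, $2, $4, $6
        }
    ' "${1:-/proc/mdstat}"
}

# Typical use:
#   watch -n5 'cat /proc/mdstat'     # full view
#   resync_progress                  # short view
```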

You can raise the guaranteed minimum sync speed, e.g. to about 50 MB/s (the value is given in KiB/s)

echo 50000 >/proc/sys/dev/raid/speed_limit_min

You can remove partitions from a raid like this

mdadm --fail /dev/md0 /dev/hda1
mdadm --remove /dev/md0 /dev/hda1

Stop a running raid

mdadm --stop /dev/md1

Remove raid markers from a partition

mdadm --zero-superblock /dev/hda5

If you want to boot from a raid1, it is important to write the boot manager to both hard disks. Just act as if the two partitions of your raid were regular partitions and not raid1 devices. First add both hard disks to the file

/boot/grub/device.map

Call one hd0 and the other hd1. Then start grub, specify the device where /boot is installed (,0 is the first partition) and write the boot sector of each hard disk

root (hd0,0)
setup (hd0)
root (hd1,0)
setup (hd1)

Now try to boot with only one of the hard disks attached. Do not forget to repair the raid in between.

Software Raid on SSDs

Trim

It is important that trim works for your filesystem. Create a filesystem, mount it at e.g. /mnt/ and then run

fstrim -v /mnt/

If you deleted files since the last run, it should report that it trimmed something. This should work with a plain ext4 filesystem, ext4 on raid1 and even ext4 on lvm on raid1.
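
If fstrim reports that the operation is not supported, it can help to check whether the block device underneath advertises discard at all. The sketch below reads the discard_granularity and discard_max_bytes attributes from sysfs and treats a zero in either as "no discard"; supports_discard is a made-up name, and the optional second argument (sysfs root) is an assumption added so the function can be tried on a scratch directory.

```shell
#!/bin/sh
# Succeed if the kernel reports non-zero discard limits for a block
# device such as sda or md0. $2 is an optional sysfs root (default /sys).
supports_discard() {
    q="${2:-/sys}/block/$1/queue"
    [ -r "$q/discard_granularity" ] && [ -r "$q/discard_max_bytes" ] || return 1
    [ "$(cat "$q/discard_granularity")" -gt 0 ] &&
        [ "$(cat "$q/discard_max_bytes")" -gt 0 ]
}

# Typical use:
#   for d in sda sdb md0; do
#       supports_discard "$d" && echo "$d: discard ok" || echo "$d: no discard"
#   done
```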