Hot-swapping non-hot-swap SATA disks on Linux
Once I had a project where I needed to swap two SATA disks every week. Shutting down, swapping, and booting up again took a lot of time and, to be honest, was a bit inconvenient, so I decided to try making the swap a little bit hot.
As far as I know, my disks and motherboard were not hot-swap-compatible, but with a few commands sent to the Linux SCSI layer (which also handles SATA disks) I was able to get behavior very similar to hot-swapping.
The first part of hot-swapping is of course the hardware. I used mobile racks with a separate power switch for this purpose, but you can simply unplug and plug your disks as well if you are brave enough to do so. I think the mobile rack is far safer (I have seen hard drives killed by an improperly connected power connector).
The magic behind the hot-swapping is basically two commands written to /proc/scsi/scsi: remove-single-device and add-single-device. With these you can tell the SCSI driver to release a disk and to grab it again.
Be sure to unmount the drive and make sure it is completely unused before you tell the driver to release it. The driver will not check whether the drive is in use, and your kernel may get confused by the disappearing drive. (Believe me, I tried 🙂)
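Since the driver does not check this for you, a small guard can refuse to continue while the disk still shows up in the mount table. This is only a minimal sketch: `disk_is_mounted` is a hypothetical helper name, the optional second argument exists only so the function can be tried against a sample file, and the check covers mounted file systems only (not swap, LVM, or md arrays that may also hold the disk). The prefix match is deliberately cautious, so `/dev/sdb` also matches a name like `/dev/sdba1`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given disk or any of its partitions
# (e.g. /dev/sdb, /dev/sdb1) is listed as a mounted device.
# $2 defaults to the kernel's live mount table.
disk_is_mounted() {
    local disk="$1" mounts="${2:-/proc/mounts}"
    # The first field of every line in /proc/mounts is the mounted device.
    awk -v d="$disk" '
        index($1, d) == 1 { found = 1 }   # device name starts with the disk name
        END { exit !found }               # exit status 0 means "still mounted"
    ' "$mounts"
}

# Example use before releasing the drive (device name is an assumption):
#   disk_is_mounted /dev/sdb && { echo "still mounted, aborting"; exit 1; }
```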
One more thing you need to know before using those commands: the host, channel, ID, and LUN numbers of your drives. You can find these numbers easily: simply cat /proc/scsi/scsi, or ls /dev/disk/by-path/*.
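If you would rather look the numbers up from a script, sysfs exposes the same address: the `device` link under `/sys/block/<disk>` resolves to a directory named `host:channel:id:lun`. The following is a sketch under that assumption. `disk_hctl` is a hypothetical helper name, and its second argument is there only so it can be pointed at a test directory instead of `/sys`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "host channel id lun" for a disk name such as sdb.
disk_hctl() {
    local disk="$1" sysroot="${2:-/sys}" target hctl
    # The last path component of the resolved link is the address,
    # for example 0:0:1:0.
    target=$(readlink -f "$sysroot/block/$disk/device") || return 1
    hctl=$(basename "$target")
    case "$hctl" in
        [0-9]*:[0-9]*:[0-9]*:[0-9]*) echo "$hctl" | tr ':' ' ' ;;
        *) return 1 ;;   # not a SCSI-style address, give up
    esac
}

# Example use (sdb is an assumption about your setup):
#   echo "scsi remove-single-device $(disk_hctl sdb)" > /proc/scsi/scsi
```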
Okay, enough chatter, here is the Bash script (assuming you have a proper fstab entry for /mnt/swaptest, and that the drive's SCSI address is 0:0:1:0):
# unmount the file system
umount /mnt/swaptest || exit 1
# synchronize the remaining data to the disk if any (this will
# flush the cache on all disks! we might not need this as the
# umount should sync, but it will never hurt)
sync
# give time for sync to finish
sleep 5
# remove specific host, channel, ID, LUN (this is the magic)
echo "scsi remove-single-device 0 0 1 0" > /proc/scsi/scsi
# give a little time for the driver to do its job
sleep 5
echo "Please change disks and press enter."
read tmp
# wait a bit to get the drive initialized
sleep 15
# and here comes the new disk:
echo "scsi add-single-device 0 0 1 0" >> /proc/scsi/scsi
# give a little time for the driver, udev, ...
sleep 15
mount /mnt/swaptest || exit 2
echo "Swap was successful."
Basically that’s all.
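The fixed `sleep 15` after add-single-device is a guess about how long the driver and udev need. One possible refinement, sketched here as an assumption rather than as part of the original script, is to poll for the device node with a timeout. `wait_for_path` is a hypothetical helper name and `/dev/sdb1` is only an example device.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until a path exists, checking once per second.
# Returns 0 as soon as the path appears, 1 after $2 seconds (default 30).
wait_for_path() {
    local path="$1" timeout="${2:-30}" waited=0
    until [ -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Example use in place of the second "sleep 15":
#   wait_for_path /dev/sdb1 30 || exit 2
```

A fixed sleep is still simpler, and on slow-spinning drives a generous timeout is needed either way.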
You might use mdadm with a RAID-1 array so that you can keep a copy of your system in a safe place. That case can be handled just as easily:
# mark the disk as faulty for further remove from the array
mdadm /dev/md0 --fail /dev/sdb1
# sleep a bit - sometimes mdadm might need it...
sleep 5
# remove the disk from the array
mdadm /dev/md0 --remove /dev/sdb1
# synchronize the remaining data to the disk if any
# (this will flush cache on all disks!)
sync
# give time for sync to finish
sleep 5
# remove specific host,channel,ID,LUN (this is the magic)
echo "scsi remove-single-device 0 0 1 0" > /proc/scsi/scsi
# give a little time for the driver to do its job
sleep 5
echo "Please change disks and press enter."
read tmp
# wait a bit to get the drive initialized
sleep 15
# and here comes the new disk:
echo "scsi add-single-device 0 0 1 0" >> /proc/scsi/scsi
# sleep a bit, to give a little time for the driver, udev, ...
sleep 15
mdadm /dev/md0 --add /dev/sdb1 || exit 2
echo "Swap was successful."
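Note that `--add` only starts the rebuild. The new disk does not hold a complete copy until the array has finished resyncing, so it should not be pulled again before then. A minimal sketch of a check, assuming the usual /proc/mdstat layout (one stanza per array, with a `recovery =` or `resync =` progress line while rebuilding), could look like this. `md_is_rebuilding` is a hypothetical helper name and its second argument exists only for trying it on a sample file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds while the named array (e.g. md0, without
# the /dev/ prefix) reports a resync or recovery in mdstat.
md_is_rebuilding() {
    local md="$1" mdstat="${2:-/proc/mdstat}"
    awk -v md="$md" '
        $1 == md && $2 == ":" { in_array = 1; next }   # stanza header "md0 : ..."
        NF == 0               { in_array = 0 }         # blank line ends the stanza
        in_array && /(recovery|resync) *=/ { busy = 1 }
        END { exit !busy }
    ' "$mdstat"
}

# Example use after the --add step:
#   while md_is_rebuilding md0; do sleep 60; done
```

mdadm also has a `--wait` option that blocks until resync or recovery activity on the given array finishes, which may be simpler if your version supports it.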
Of course these swaps can become more complex, involving more disks, more RAID arrays, or LVM volumes, but these are the basics. I assume you can figure out the rest 🙂
Oh, and the disclaimer: always have a backup of your data before trying these steps and commands. I have read that not all drives are capable of this. Personally I do not think it can lead to hardware failure, but I cannot be certain about it. Please try anything at your own risk.