Well, I have just received the RMA replacement of my failed Samsung HD204UI 2TB drive. Now it is time to put the drive back into my Linux server and rebuild the RAID-1 (mirror) array so that the data resumes its highly-available state. Here are the steps I took:
1) Install the hard drive. Since I am not using hot-swap drive bays, I will have to shut down the server to attach the drive to the SATA-II controller.
2) Locate the disk (the new disk is /dev/sdd). I found this by looking at the output of "fdisk -l". My replacement is the unpartitioned drive that has assumed the device name of my failed unit.
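If it is not obvious which device is the new drive, two quick checks can help: print the (empty) partition table of the candidate device, and, if your distribution provides /dev/disk/by-id, match the serial number on the drive's label to its device node. The device name below is from my system, so adjust it for yours:
# fdisk -l /dev/sdd
# ls -l /dev/disk/by-id/ | grep -w sdd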
3) A Linux RAID (fd) partition must be created on the new drive. Create the partition table on the new drive, /dev/sdd, identical to that of the surviving drive in the array, using the "sfdisk" command:
# sfdisk -d /dev/sdc > sdc_partition.out
# sfdisk /dev/sdd < sdc_partition.out
Checking that no-one is using this disk right now ... OK
Disk /dev/sdd: 243201 cylinders, 255 heads, 63 sectors/track
sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sdd: unrecognized partition table type
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start        End    #sectors  Id  System
/dev/sdd1            63  3907024064  3907024002  fd  Linux raid autodetect
/dev/sdd2             0           -           0   0  Empty
/dev/sdd3             0           -           0   0  Empty
/dev/sdd4             0           -           0   0  Empty
Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
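As a side note, the same dump-and-restore can be done in a single step by piping one sfdisk into another. This is equivalent to the two commands above; I used the two-step version so that a copy of the good drive's layout (sdc_partition.out) stays on disk:
# sfdisk -d /dev/sdc | sfdisk /dev/sdd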
4) Verify that the disk was partitioned properly and identically to the surviving drive. In the example, I am comparing the partition table of the drive currently in the RAID array with that of my new drive. Notice that the newly created partition is /dev/sdd1 on my replacement drive /dev/sdd.
# sfdisk -l /dev/sdc
Disk /dev/sdc: 243201 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
Device Boot Start End #cyls #blocks Id System
/dev/sdc1 0+ 243200 243201- 1953512001 fd Linux raid autodetect
/dev/sdc2 0 - 0 0 0 Empty
/dev/sdc3 0 - 0 0 0 Empty
/dev/sdc4 0 - 0 0 0 Empty
# sfdisk -l /dev/sdd
Disk /dev/sdd: 243201 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
Device Boot Start End #cyls #blocks Id System
/dev/sdd1 0+ 243200 243201- 1953512001 fd Linux raid autodetect
/dev/sdd2 0 - 0 0 0 Empty
/dev/sdd3 0 - 0 0 0 Empty
/dev/sdd4 0 - 0 0 0 Empty
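Instead of eyeballing the two listings, you can also let diff do the comparison. This is a rough sketch assuming a bash shell; the sed substitutions only normalize the device names so that real layout differences stand out, and no output means the partition tables match:
# diff <(sfdisk -d /dev/sdc | sed 's|/dev/sdc|DISK|g') <(sfdisk -d /dev/sdd | sed 's|/dev/sdd|DISK|g')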
My degraded raidset is /dev/md3. Here is its current status.
# mdadm -v -D /dev/md3
/dev/md3:
Version : 0.90
Creation Time : Sat Jun 4 21:34:09 2011
Raid Level : raid1
Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Tue Jul 12 20:27:30 2011
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 5de5eb25:d02318e7:da699fd5:65330895
Events : 0.13444
Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 0 0 1 removed
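A quicker way to glance at the same degraded state is /proc/mdstat. A one-of-two RAID-1 shows up there with a [2/1] device count and an underscore in the [U_] status field:
# cat /proc/mdstat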
5) Now I need to reconstruct the degraded RAID array with the partition on the new drive. Since the replacement drive is now properly partitioned, it can simply be added to the array, /dev/md3, using the mdadm --manage command.
# mdadm -v --manage /dev/md3 --add /dev/sdd1
mdadm: added /dev/sdd1
Taking another look at the /dev/md3 raidset, I see that the new partition has been added as a spare and the array has automatically started recovering. This is a 2TB drive, so it will take a while to get back in sync.
# mdadm -D /dev/md3
/dev/md3:
Version : 0.90
Creation Time : Sat Jun 4 21:34:09 2011
Raid Level : raid1
Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Tue Jul 12 20:27:30 2011
State : clean, degraded, recovering
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
Rebuild Status : 0% complete
UUID : 5de5eb25:d02318e7:da699fd5:65330895
Events : 0.13444
Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
2 8 49 1 spare rebuilding /dev/sdd1
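While the rebuild runs, I prefer to watch the progress and estimated finish time in /proc/mdstat rather than re-running mdadm -D over and over; something along these lines works (the 60-second refresh interval is arbitrary):
# watch -n 60 cat /proc/mdstat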
Here is what the healthy mdadm status of the 2TB mirror raidset looks like (after about 8-10 hours with my /proc/sys/dev/raid/ configurations):
# mdadm -D /dev/md3
/dev/md3:
Version : 0.90
Creation Time : Sat Jun 4 21:34:09 2011
Raid Level : raid1
Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Wed Jul 13 08:20:22 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 5de5eb25:d02318e7:da699fd5:65330895
Events : 0.13498
Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 8 49 1 active sync /dev/sdd1
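For reference, the /proc/sys/dev/raid/ settings mentioned above are the kernel's resync speed limits (in KB/s). Raising the minimum can shorten a rebuild at the cost of normal I/O performance; the value below is only an example, not a recommendation, and it does not persist across reboots unless you also set dev.raid.speed_limit_min in /etc/sysctl.conf:
# cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
# echo 50000 > /proc/sys/dev/raid/speed_limit_min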
That was all there was to it. If you follow these steps on your own server, your mileage may vary, so be careful. Make sure that you take care of your data first and make a consistent, recoverable backup before you start. Remember, a backup that has never been restored is not a backup.