
After a power cycle I found my RAID 5 array no longer working. I tried various methods to reassemble the array but nothing has worked so far. I believe I need to recreate the superblocks and UUIDs somehow, but I was reluctant to barrel into anything so as not to lose a bunch of data. Thanks for reading.
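
Before I try anything else I want to snapshot whatever state is still readable. Something along these lines (read-only, device names taken from the listings below) should be enough:

# read-only: save whatever metadata each disk still reports, plus the current mdstat
for d in /dev/sda /dev/sdc /dev/sdd /dev/sde; do
    mdadm --examine "$d" > "/root/examine-$(basename "$d").txt" 2>&1
done
cp /proc/mdstat /root/mdstat-before.txt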



cat /etc/mdadm/mdadm.conf:



DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=4 metadata=0.90 UUID=fd522a0f:2de72d76:f2afdfe9:5e3c9df1
MAILADDR root


Which is normal. The array should have 4 x 2 TB drives (sda, sdc, sde, sdd).



cat /proc/mdstat:



Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdd[1](S)
1953514496 blocks

unused devices: <none>


This is a problem. It only shows one drive in the array, and that drive is inactive. The array should have sda, sdc, and sde in there as well. When I run mdadm --examine /dev/sdd everything looks fine; on the other drives, --examine reports no RAID superblock on /dev/sdX.
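
I'm thinking a quick blkid on the whole disks might confirm what is (or isn't) actually on them; it's read-only, so it should be safe:

# read-only: report whatever filesystem/RAID signature each whole disk still carries
blkid /dev/sda /dev/sdc /dev/sdd /dev/sde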



mdadm --examine --scan:



ARRAY /dev/md0 level=raid5 num-devices=4 metadata=0.90 UUID=fd522a0f:2de72d76:f2afdfe9:5e3c9df1


No help there.



mdadm --assemble --scan -v:



mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: cannot open device /dev/sdd: Device or resource busy
mdadm: /dev/sdd has wrong uuid.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: cannot open device /dev/sdb5: Device or resource busy
mdadm: /dev/sdb5 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb2
mdadm: /dev/sdb2 has wrong uuid.
mdadm: cannot open device /dev/sdb1: Device or resource busy
mdadm: /dev/sdb1 has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.


From this it looks like I have no UUIDs and no superblocks for sda, sdc, and sde.
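
Rather than re-creating anything straight away, my plan is to first try a forced assemble with the members listed explicitly, since that is far less risky than --create (the device order here is just my best guess):

# stop the half-assembled array so sdd is no longer busy, then try a forced assemble
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sda /dev/sdc /dev/sdd /dev/sde
cat /proc/mdstat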



sudo fdisk -l



Disk /dev/sda: 2000.4 GB, 2000397852160 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 250.1 GB, 250058268160 bytes
255 heads, 63 sectors/track, 30401 cylinders, total 488395055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x353cf669

Device Boot Start End Blocks Id System
/dev/sdb1 63 476327249 238163593+ 83 Linux
/dev/sdb2 476327250 488392064 6032407+ 5 Extended
/dev/sdb5 476327313 488392064 6032376 82 Linux swap / Solaris

Disk /dev/sdc: 2000.4 GB, 2000397852160 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 2000.4 GB, 2000397852160 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table


So from this it looks like none of my RAID disks have a partition table or UUID. The closest thing I found to my problem was this thread, which suggested running mdadm --create /dev/md0 -v -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd and checking for a valid filesystem with fsck -fn /dev/md0. However, the first command spat out mdadm: no raid-devices specified. I retried the command using sda1, sdc1, etc., but then I got this:



mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: super1.x cannot open /dev/sda1: No such file or directory
mdadm: ddf: Cannot open /dev/sda1: No such file or directory
mdadm: Cannot open /dev/sda1: No such file or directory
mdadm: device /dev/sda1 not suitable for any style of array


If I run the create with "missing" in place of sda1, it just says the same thing about sdc1.
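
From what I have read, if re-creating really is the only option it has to use the whole-disk names (there are no sda1-style partitions on these disks), and the metadata version, chunk size and device order all have to match the original array. Everything in the sketch below is an assumption on my part, so I am not running it until I am more confident:

# DANGEROUS if any parameter differs from the original array -- every value here is a guess
mdadm --create /dev/md0 --assume-clean --metadata=0.90 --level=5 --raid-devices=4 \
      --chunk=64 /dev/sda /dev/sdc /dev/sde /dev/sdd
fsck -fn /dev/md0    # read-only check; stop immediately if this reports nonsense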



I am sure that I am making this more complicated than it needs to be. Can someone with experience please help me? Thanks for your time in advance.



*edit*
When I run dumpe2fs /dev/sda I get:



dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: bbe6fb91-d37c-414a-8c2b-c76a30b9b5c5
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366288896
Block count: 1465135872
Reserved block count: 73256793
Free blocks: 568552005
Free inodes: 366066972
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 674
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Filesystem created: Wed Oct 28 12:23:09 2009
Last mount time: Tue Oct 18 13:59:36 2011
Last write time: Tue Oct 18 13:59:36 2011
Mount count: 17
Maximum mount count: 26
Last checked: Fri Oct 14 17:04:16 2011
Check interval: 15552000 (6 months)
Next check after: Wed Apr 11 17:04:16 2012
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 17e784d8-012e-4a29-9bbd-c312de282588
Journal backup: inode blocks
Journal superblock magic number invalid!


So stuff is still there. Still researching...
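
Out of curiosity, I suppose something like this (read-only) would show whether the other supposedly blank members report an ext superblock as well:

# read-only: print just the superblock summary from the other two members
for d in /dev/sdc /dev/sde; do
    echo "== $d =="
    dumpe2fs -h "$d" 2>&1 | head -n 5
done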


Answers

Yikes! What a pickle. Let's see if we can get you sorted. Starting with a recap of your disks and partition tables:



sda - no partition table
sdb - sdb1 [Linux] sdb2 [Linux extended] sdb5 [swap]
sdc - no partition table
sdd - no partition table
sde - no partition table



  1. None of these are marked fd (Linux raid autodetect), the default type for MD RAID members

  2. You're not using partitions to organize your disk space [0]

  3. You appear to have formatted the entire disk for ext2/3 and to be using the whole disk as part of the raidset



The last point is where I think you came undone. The initscripts probably thought you were due for an fsck, sanity-checked the volumes, and wiped out the MD superblock in the process. dumpe2fs should return nothing for volumes that are members of a RAID set.
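
If you want to confirm exactly which signatures are still present on each member, wipefs with no erase options simply lists what it finds (read-only, assuming your util-linux ships it):

# read-only when given no erase options: list every signature found on each device
wipefs /dev/sda /dev/sdc /dev/sdd /dev/sde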



Take my RAID for example:



root@mark21:/tmp/etc/udev# fdisk -l /dev/sda

Disk /dev/sda: 640.1 GB, 640135028736 bytes
255 heads, 63 sectors/track, 77825 cylinders, total 1250263728 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000ffc4

Device Boot Start End Blocks Id System
/dev/sda1 2048 1240233983 620115968 fd Linux raid autodetect

root@mark21:/tmp/etc/udev# dumpe2fs /dev/sda1
dumpe2fs 1.41.14 (22-Dec-2010)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda1
Couldn't find valid filesystem superblock.


That you were able to recreate the RAID set at all is extremely lucky, but that doesn't change the fundamental flaws in your deployment. This will happen again.



What I would recommend is:




  1. Back up everything on that RAID set

  2. Destroy the array and erase the md superblock from each device (man mdadm)

  3. Zero out the start of those disks: dd if=/dev/zero of=/dev/sdX bs=1M count=100

  4. Create partitions on sda, sdc, sdd, & sde that span 99% of the disk [0]

  5. Tag those partitions as type fd (Linux raid autodetect; see the linux-raid wiki)

  6. Never, ever format these partitions with any sort of filesystem

  7. Create a new RAID 5: mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1

  8. Update /etc/mdadm/mdadm.conf with the new array UUID (see the command sketch after this list)

  9. Live happily ever after
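
Roughly, the sequence looks something like this. It's a sketch to adapt to your setup, not something to paste blindly:

# steps 2-3: stop the array, clear the md superblocks, scrub the start of each disk
mdadm --stop /dev/md0
for d in /dev/sda /dev/sdc /dev/sdd /dev/sde; do
    mdadm --zero-superblock "$d"
    dd if=/dev/zero of="$d" bs=1M count=100
done

# steps 4-5: partition each disk (see the parted example further down) and tag it for RAID

# step 7: build the new array from the partitions, never the raw disks
mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1

# step 8: record the new array definition
mdadm --detail --scan >> /etc/mdadm/mdadm.conf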



I presume from your description that sdb is your system disk, and that's fine. Just make sure you don't accidentally include that in your raid set creation. After this, you should be on the right track and will never encounter this problem again.



[0] I once encountered a very nasty fault on SATA disks that had lots of bad blocks. After using the vendor tool to reconstitute the disk, my once-identical set of disks was no longer identical: the bad drive now had a few blocks fewer than before the low-level format began, which of course ruined my partition table and prevented the drive from rejoining the MD RAID set.



Hard drives usually have a "free list" of backup blocks used for just such an occasion. My theory is that that list must have been exhausted, and since this wasn't an enterprise disk, instead of failing safe and giving me the opportunity to send it off for data recovery, it decided to truncate my data and re-size the entire disk.



Therefore, I never use the entire disk anymore when creating a RAID set; instead I use anywhere from 95-99% of the available space for the partition that would normally span the entire disk. This also gives you some additional flexibility when replacing failed members. For example, not all 250 GB disks have the same number of blocks, so if you undershoot the maximum by a comfortable margin, you can use almost any disk brand to replace a failed member.
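
On the partitioning side, with parted the deliberate under-shoot might look like this (98% is just the margin described above; on GPT the raid flag plays the role of the old fd type):

# one RAID partition per disk, deliberately leaving ~2% of the disk unused
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart primary 1MiB 98%
parted -s /dev/sda set 1 raid on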

