
After a power cycle I found my RAID 5 array no longer working. I tried various methods to reassemble the array but nothing has worked so far. I believe I need to recreate the superblocks and UUIDs somehow, but I was reluctant to barrel into anything so as not to lose a bunch of data. Thanks for reading.
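
Before I try anything else I want to snapshot whatever state is still readable. Something along these lines (read-only, device names taken from the listings below) should be enough:

# read-only: save whatever metadata each disk still reports, plus the current mdstat
for d in /dev/sda /dev/sdc /dev/sdd /dev/sde; do
    mdadm --examine "$d" > "/root/examine-$(basename "$d").txt" 2>&1
done
cp /proc/mdstat /root/mdstat-before.txt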



cat /etc/mdadm/mdadm.conf:



DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=4 metadata=0.90 UUID=fd522a0f:2de72d76:f2afdfe9:5e3c9df1
MAILADDR root


Which is normal. The array should have 4 x 2 TB drives (sda, sdc, sde, sdd).



cat /proc/mdstat:



Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sdd[1](S)
1953514496 blocks

unused devices: <none>


This is a problem. It only shows one drive in the array, and that drive is inactive. The array should have sda, sdc, and sde in there as well. When I run mdadm --examine /dev/sdd everything looks fine; on the other drives, --examine reports no RAID superblock on /dev/sdX.
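
I'm thinking a quick blkid on the whole disks might confirm what is (or isn't) actually on them; it's read-only, so it should be safe:

# read-only: report whatever filesystem/RAID signature each whole disk still carries
blkid /dev/sda /dev/sdc /dev/sdd /dev/sde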



mdadm --examine --scan:



ARRAY /dev/md0 level=raid5 num-devices=4 metadata=0.90 UUID=fd522a0f:2de72d76:f2afdfe9:5e3c9df1


No help there.



mdadm --assemble --scan -v:



mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/sde
mdadm: /dev/sde has wrong uuid.
mdadm: cannot open device /dev/sdd: Device or resource busy
mdadm: /dev/sdd has wrong uuid.
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdc has wrong uuid.
mdadm: cannot open device /dev/sdb5: Device or resource busy
mdadm: /dev/sdb5 has wrong uuid.
mdadm: no RAID superblock on /dev/sdb2
mdadm: /dev/sdb2 has wrong uuid.
mdadm: cannot open device /dev/sdb1: Device or resource busy
mdadm: /dev/sdb1 has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sda has wrong uuid.


From this it looks like I have no UUIDs and no superblocks for sda, sdc, and sde.
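
Rather than re-creating anything straight away, my plan is to first try a forced assemble with the members listed explicitly, since that is far less risky than --create (the device order here is just my best guess):

# stop the half-assembled array so sdd is no longer busy, then try a forced assemble
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sda /dev/sdc /dev/sdd /dev/sde
cat /proc/mdstat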



sudo fdisk -l



Disk /dev/sda: 2000.4 GB, 2000397852160 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 250.1 GB, 250058268160 bytes
255 heads, 63 sectors/track, 30401 cylinders, total 488395055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x353cf669

Device Boot Start End Blocks Id System
/dev/sdb1 63 476327249 238163593+ 83 Linux
/dev/sdb2 476327250 488392064 6032407+ 5 Extended
/dev/sdb5 476327313 488392064 6032376 82 Linux swap / Solaris

Disk /dev/sdc: 2000.4 GB, 2000397852160 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 2000.4 GB, 2000397852160 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907027055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table


So from this it looks like none of my RAID disks have a partition table or UUID. The closest thing I found to my problem was this thread, which suggested running mdadm --create /dev/md0 -v -l 5 -n 4 /dev/sda /dev/sdc /dev/sde /dev/sdd and checking for a valid filesystem with fsck -fn /dev/md0. However, the first command spat out mdadm: no raid-devices specified. I retried the command using sda1, sdc1, etc., but then I got this:



mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: super1.x cannot open /dev/sda1: No such file or directory
mdadm: ddf: Cannot open /dev/sda1: No such file or directory
mdadm: Cannot open /dev/sda1: No such file or directory
mdadm: device /dev/sda1 not suitable for any style of array


If I run the create with "missing" in place of sda1, it just says the same thing about sdc1.
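
From what I have read, if re-creating really is the only option it has to use the whole-disk names (there are no sda1-style partitions on these disks), and the metadata version, chunk size and device order all have to match the original array. Everything in the sketch below is an assumption on my part, so I am not running it until I am more confident:

# DANGEROUS if any parameter differs from the original array -- every value here is a guess
mdadm --create /dev/md0 --assume-clean --metadata=0.90 --level=5 --raid-devices=4 \
      --chunk=64 /dev/sda /dev/sdc /dev/sde /dev/sdd
fsck -fn /dev/md0    # read-only check; stop immediately if this reports nonsense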



I am sure that I am making this more complicated than it needs to be. Can someone with experience please help me? Thanks for your time in advance.



*edit*
When I run dumpe2fs /dev/sda I get:



dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: bbe6fb91-d37c-414a-8c2b-c76a30b9b5c5
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366288896
Block count: 1465135872
Reserved block count: 73256793
Free blocks: 568552005
Free inodes: 366066972
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 674
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Filesystem created: Wed Oct 28 12:23:09 2009
Last mount time: Tue Oct 18 13:59:36 2011
Last write time: Tue Oct 18 13:59:36 2011
Mount count: 17
Maximum mount count: 26
Last checked: Fri Oct 14 17:04:16 2011
Check interval: 15552000 (6 months)
Next check after: Wed Apr 11 17:04:16 2012
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 17e784d8-012e-4a29-9bbd-c312de282588
Journal backup: inode blocks
Journal superblock magic number invalid!


So stuff is still there. Still researching...
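
Out of curiosity, I suppose something like this (read-only) would show whether the other supposedly blank members report an ext superblock as well:

# read-only: print just the superblock summary from the other two members
for d in /dev/sdc /dev/sde; do
    echo "== $d =="
    dumpe2fs -h "$d" 2>&1 | head -n 5
done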


Answers

Yikes! What a pickle. Let's see if we can get you sorted. Starting with a recap of your disks and partition tables:



sda - no partition table
sdb - sdb1 [Linux] sdb2 [Linux extended] sdb5 [swap]
sdc - no partition table
sdd - no partition table
sde - no partition table



  1. None of these are marked fd (Linux raid autodetect), the default type for MD RAID members

  2. You're not using partitions to organize your disk space [0]

  3. You appear to have formatted the entire disk for ext2/3 and to be using the whole disk as part of the raidset



The last point is where I think you came undone. The initscripts probably thought you were due for an fsck, sanity-checked the volumes, and wiped out the MD superblock in the process. dumpe2fs should return nothing for volumes that are members of a RAID set.
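
If you want to confirm exactly which signatures are still present on each member, wipefs with no erase options simply lists what it finds (read-only, assuming your util-linux ships it):

# read-only when given no erase options: list every signature found on each device
wipefs /dev/sda /dev/sdc /dev/sdd /dev/sde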



Take my RAID for example:



root@mark21:/tmp/etc/udev# fdisk -l /dev/sda

Disk /dev/sda: 640.1 GB, 640135028736 bytes
255 heads, 63 sectors/track, 77825 cylinders, total 1250263728 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000ffc4

Device Boot Start End Blocks Id System
/dev/sda1 2048 1240233983 620115968 fd Linux raid autodetect

root@mark21:/tmp/etc/udev# dumpe2fs /dev/sda1
dumpe2fs 1.41.14 (22-Dec-2010)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda1
Couldn't find valid filesystem superblock.


That you were able to recreate the RAID set at all is extremely lucky, but that doesn't change the fundamental flaws in your deployment. This will happen again.



What I would recommend is:




  1. Back up everything on that RAID set

  2. Destroy the array and erase the md superblock from each device (man mdadm)

  3. Zero out the start of those disks: dd if=/dev/zero of=/dev/sdX bs=1M count=100

  4. Create partitions on sda, sdc, sdd, & sde that span 99% of the disk [0]

  5. Tag those partitions as type fd (Linux raid autodetect; see the linux-raid wiki)

  6. Never, ever format these partitions with any sort of filesystem

  7. Create a new RAID 5: mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1

  8. Update /etc/mdadm/mdadm.conf with the new array UUID (see the command sketch after this list)

  9. Live happily ever after
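
Roughly, the sequence looks something like this. It's a sketch to adapt to your setup, not something to paste blindly:

# steps 2-3: stop the array, clear the md superblocks, scrub the start of each disk
mdadm --stop /dev/md0
for d in /dev/sda /dev/sdc /dev/sdd /dev/sde; do
    mdadm --zero-superblock "$d"
    dd if=/dev/zero of="$d" bs=1M count=100
done

# steps 4-5: partition each disk (see the parted example further down) and tag it for RAID

# step 7: build the new array from the partitions, never the raw disks
mdadm --create /dev/md0 -v -f -l 5 -n 4 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1

# step 8: record the new array definition
mdadm --detail --scan >> /etc/mdadm/mdadm.conf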



I presume from your description that sdb is your system disk, and that's fine. Just make sure you don't accidentally include that in your raid set creation. After this, you should be on the right track and will never encounter this problem again.



[0] I once encountered a very nasty fault on SATA disks that had lots of bad blocks. After using the vendor tool to reconstitute the disk, my once-identical set of disks was no longer identical: the bad drive now had a few blocks fewer than before the low-level format began, which of course ruined my partition table and prevented the drive from rejoining the MD RAID set.



Hard drives usually have a "free list" of backup blocks used for just such an occasion. My theory is that that list must have been exhausted, and since this wasn't an enterprise disk, instead of failing safe and giving me the opportunity to send it off for data recovery, it decided to truncate my data and re-size the entire disk.



Therefore, I never use the entire disk anymore when creating a RAID set; instead I use anywhere from 95-99% of the available space for the partition that would normally span the entire disk. This also gives you some additional flexibility when replacing failed members. For example, not all 250 GB disks have the same number of blocks, so if you undershoot the maximum by a comfortable margin, you can use almost any disk brand to replace a failed member.
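
On the partitioning side, with parted the deliberate under-shoot might look like this (98% is just the margin described above; on GPT the raid flag plays the role of the old fd type):

# one RAID partition per disk, deliberately leaving ~2% of the disk unused
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart primary 1MiB 98%
parted -s /dev/sda set 1 raid on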

