RAID 1E recovery the hard way

Manual IBM ServeRAID recovery of a broken RAID1E setup – using Linux, hexedit, custom scripts – and coffee!

Yesterday I was called out to a customer with a broken-down IBM xSeries 232 server. The power supply was gone, and the machine was rebooting every two minutes. It also had 6 drives in a RAID setup, with one (id 2) marked as faulty. Oh – and did I mention the last backup was from sometime last year?

The machine had been running Linux with VMware Server 2.0.

I had brought an older IBM workstation, a ServeRAID 6m controller, and a disk shelf to put the U320 drives in. I was careful to move the drives in the correct order; even though it doesn’t matter to the controller, it’s nice to keep things organized.

Things became interesting when the ServeRAID 6m controller didn’t find any drives in the shelf. Fiddling with the cables and moving the 6 drives to another location in the 14-bay shelf made the drives appear.

The controller complained, of course, but you can import the configuration by pressing Ctrl-I, entering the Advanced menu and selecting “Import configuration from drives”. After more complaining about possible data loss due to the BBU and the new controller, everything should be dandy for data retrieval.

But no – SCSI bus resets, hung drives etc. still plagued me, and this made the controller fail two more drives (id 4 and id 5). Those familiar with RAID1E will know that two failed drives next to each other in an even-numbered RAID1E are a bad thing – and two failed drives in an odd-numbered RAID1E are a bad thing too.
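To make the "next to each other" part concrete, here is a minimal sketch in Python (not the PHP used further down). It assumes the commonly described RAID1E layout in which every stripe is written to one drive and mirrored onto the following drive, wrapping around at the end. The function name is made up for illustration, and the positions are member positions inside the array, not SCSI ids.

```python
def raid1e_survives(num_drives, failed):
    """Return True if every stripe still has at least one readable copy.

    Assumption: the two copies of a stripe live on member i and on
    member (i + 1) % num_drives. Data is lost as soon as two cyclically
    adjacent members are both gone.
    """
    failed = set(failed)
    for i in range(num_drives):
        if i in failed and (i + 1) % num_drives in failed:
            return False
    return True
```

Under that assumption a 5-drive array shrugs off members 0 and 2 failing together, but not 3 and 4, and not 4 and 0 (the wrap-around case).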

So now I was stuck with a bad thing. I figured the SCSI bus problems were due to weird grounding on the power cables, and changed those. Then I booted up the ServeRAID CD and looked in the ServeRAID manager.

Now I had the following drives:

id 0, 1 and 3 = online
id 2, 4 and 5 = failed

I forced drive 5 online, and found out I was dealing with an odd-numbered RAID1E: a 5-drive setup. The drive originally marked as failed was no longer an active member, just the leftover from an automatic rebuild onto a hot spare.

Then I chose to rebuild drive 4, as that drive could not be forced online. After the 5-hour rebuild (oooooooooooold system – even with 15K drives), the RAID status went from DEGRADED to OKAY.

So now things should be fine, right? Noooooo, this is when things went from bad to worse. The controller completely messed up the drive numbering, and it interpreted the stripes in the wrong order. So the partition table (first sector of the drive) was okay, but everything else was in a state of … hmm, disarray? 🙂

Back in the lab, I plugged all drives into a plain SCSI controller so I could read everything from them.

That left me with 4 known working datasets – drive id 0, 1, 3 and 5.

But what was what? RAID1E duplicates data so I knew that each stripe must exist two times. Also I knew the stripe size was 8192 bytes (ServeRAID told me).

I quickly found that the RAID controller header was at the end of the drives, so I didn’t bother with that. I also found out that the first stripe of id 1 held the partition table, and that the second stripe of id 0 was the second stripe of the array (judging by the GRUB code in it). But that still didn’t give me an overview of the layout …
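Spotting the partition table by eye works, but it can also be automated. The sketch below is Python rather than PHP, and it relies on one well-known fact: an MBR sector ends with the bytes 55 AA at offset 510. The function names and the idea of only scanning the first two stripes per image are assumptions made for illustration.

```python
STRIPE_SIZE = 8192  # stripe size reported by ServeRAID


def looks_like_mbr(stripe):
    """True if the first 512-byte sector ends with the 0x55 0xAA signature."""
    return len(stripe) >= 512 and stripe[510:512] == b"\x55\xaa"


def find_mbr_candidates(images, stripes_to_check=2):
    """Scan the first stripes of each image for an MBR signature.

    images maps a label (e.g. 'id1') to a binary file object.
    Returns a list of (label, stripe_index) hits.
    """
    hits = []
    for label, handle in images.items():
        handle.seek(0)
        for index in range(stripes_to_check):
            if looks_like_mbr(handle.read(STRIPE_SIZE)):
                hits.append((label, index))
    return hits
```

Because RAID1E stores every stripe twice, the mirror copy of the partition table should show up as a second hit on another drive. The signature alone is a weak test, so any hit still deserves a look in hexedit.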

I mashed up this PHP script (yeah, so sue me, I know I should have used C, but I could do it much faster with PHP).

<?php
// Raw images of the four readable drives
$drives=array(
'id0'=>fopen('original-id0.raw','rb'),
'id1'=>fopen('original-id1.raw','rb'),
'id3'=>fopen('original-id3.raw','rb'),
'id5'=>fopen('original-id5.raw','rb'));

// Read the first 16 stripes (8192 bytes each) of every drive
$data=array();
foreach ($drives as $id=>$handle) {
  for ($block=0; $block<16; $block++) {
    $data[$id][$block]=bin2hex(fread($handle, 8192));
  }
  fclose($handle);
}

// Compare every stripe with every stripe on the other drives
foreach ($drives as $id=>$handle) {
  for ($block=0; $block<16; $block++) {
    foreach ($drives as $id2=>$handle2) {
      for ($block2=0; $block2<16; $block2++) {
        if (($id!==$id2) and ($data[$id][$block]==$data[$id2][$block2])) echo "Drive $id block $block matches drive $id2 block $block2\n";
      }
    }
  }
}

That produced the following output:

Drive id0 block 1 matches drive id3 block 0
Drive id0 block 3 matches drive id3 block 2
Drive id0 block 5 matches drive id3 block 4
Drive id0 block 7 matches drive id3 block 6
Drive id0 block 9 matches drive id3 block 8
Drive id0 block 11 matches drive id3 block 10
Drive id0 block 13 matches drive id3 block 12
Drive id0 block 15 matches drive id3 block 14
Drive id1 block 0 matches drive id3 block 1
Drive id1 block 1 matches drive id5 block 0
Drive id1 block 2 matches drive id3 block 3
Drive id1 block 3 matches drive id5 block 2
Drive id1 block 4 matches drive id3 block 5
Drive id1 block 5 matches drive id5 block 4
Drive id1 block 6 matches drive id3 block 7
Drive id1 block 7 matches drive id5 block 6
Drive id1 block 8 matches drive id3 block 9
Drive id1 block 9 matches drive id5 block 8
Drive id1 block 10 matches drive id3 block 11
Drive id1 block 11 matches drive id5 block 10
Drive id1 block 12 matches drive id3 block 13
Drive id1 block 13 matches drive id5 block 12
Drive id1 block 14 matches drive id3 block 15
Drive id1 block 15 matches drive id5 block 14
Drive id3 block 0 matches drive id0 block 1
Drive id3 block 1 matches drive id1 block 0
Drive id3 block 2 matches drive id0 block 3
Drive id3 block 3 matches drive id1 block 2
Drive id3 block 4 matches drive id0 block 5
Drive id3 block 5 matches drive id1 block 4
Drive id3 block 6 matches drive id0 block 7
Drive id3 block 7 matches drive id1 block 6
Drive id3 block 8 matches drive id0 block 9
Drive id3 block 9 matches drive id1 block 8
Drive id3 block 10 matches drive id0 block 11
Drive id3 block 11 matches drive id1 block 10
Drive id3 block 12 matches drive id0 block 13
Drive id3 block 13 matches drive id1 block 12
Drive id3 block 14 matches drive id0 block 15
Drive id3 block 15 matches drive id1 block 14
Drive id5 block 0 matches drive id1 block 1
Drive id5 block 2 matches drive id1 block 3
Drive id5 block 4 matches drive id1 block 5
Drive id5 block 6 matches drive id1 block 7
Drive id5 block 8 matches drive id1 block 9
Drive id5 block 10 matches drive id1 block 11
Drive id5 block 12 matches drive id1 block 13
Drive id5 block 14 matches drive id1 block 15
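The all-against-all comparison is fine for 16 stripes per drive, but it grows quadratically. A possible alternative, sketched here in Python instead of PHP, is to hash every stripe once and group identical hashes; each match is then reported once rather than in both directions. The function name and parameters are invented for this sketch.

```python
import hashlib
from collections import defaultdict


def find_duplicate_stripes(images, stripe_size=8192, count=16):
    """Return sorted pairs ((label, block), (label, block)) of identical stripes.

    images maps a label (e.g. 'id0') to a binary file object.
    Only matches between different images are reported.
    """
    seen = defaultdict(list)
    for label, handle in images.items():
        handle.seek(0)
        for block in range(count):
            stripe = handle.read(stripe_size)
            if len(stripe) < stripe_size:
                break  # ran out of data on this image
            seen[hashlib.sha256(stripe).digest()].append((label, block))

    pairs = []
    for locations in seen.values():
        for i in range(len(locations)):
            for j in range(i + 1, len(locations)):
                if locations[i][0] != locations[j][0]:
                    pairs.append((locations[i], locations[j]))
    return sorted(pairs)
```

One caveat: stripes with identical filler (all zeroes, for instance) match each other everywhere, so a region with real data, like the start of the disk here, gives the cleanest picture.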

So that gave me the following drive layout:

Drive   Layout
id 0    a/b
id 1    c/d
id 2    missing
id 3    b/c
id 4    missing
id 5    d/e

So I had each of my 5 stripes at least once … now for the ordering. From the previous examination I knew the order started c/b/?/?/?. That left three stripes (a, d and e) to place, so 3! = 6 possible permutations. But time was critical, so I couldn’t afford to try them all and just see which one was the right one.
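A possible shortcut, which is not how the order was actually found, is to read the drive order straight off the duplicate list. This Python sketch assumes the usual RAID1E arrangement: even-numbered blocks on a drive are primary stripes, odd-numbered blocks are mirrors, and the mirror of a stripe sits on the next drive in the array. It also assumes exactly one member is missing. All names are hypothetical.

```python
def order_from_mirror_chain(pairs, first_drive):
    """Derive the array order of the drives from duplicate-stripe pairs.

    pairs: iterable of ((drive, block), (drive, block)) duplicates.
    first_drive: the drive known to hold the first stripe of the array.
    Returns the drive order with None marking the single missing member.
    """
    successor = {}
    for (drive_a, block_a), (drive_b, block_b) in pairs:
        if block_a % 2 == 0 and block_b % 2 == 1:
            successor[drive_a] = drive_b   # b holds the mirror of a's stripe
        elif block_a % 2 == 1 and block_b % 2 == 0:
            successor[drive_b] = drive_a   # a holds the mirror of b's stripe

    drives = set(successor) | set(successor.values())
    starts = drives - set(successor.values())
    if len(starts) != 1:
        raise ValueError("expected exactly one gap in the mirror chain")

    chain = [starts.pop()]
    while chain[-1] in successor:
        chain.append(successor[chain[-1]])
    if len(chain) != len(drives):
        raise ValueError("drives do not form a single chain")

    cycle = chain + [None]                 # the gap closes the ring
    pivot = cycle.index(first_drive)
    return cycle[pivot:] + cycle[:pivot]


pairs = [(("id0", 1), ("id3", 0)),
         (("id1", 0), ("id3", 1)),
         (("id1", 1), ("id5", 0))]
print(order_from_mirror_chain(pairs, "id1"))
# ['id1', 'id3', 'id0', None, 'id5']
```

Taking the first stripe of each drive in that order gives c, b, a, a gap, then d. Under the same assumption the gap is the stripe whose only surviving copy is the mirror on the following drive, which is e on id 5. Treat this as a cross-check to verify against the data, not as a guarantee about how every ServeRAID firmware lays things out.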

Browsing through with hexedit, I found an interesting pattern of 02 80 00 00 02 80 01 00 etc. and 03 80 00 00 04 80 01 00 … it looks like some sort of file allocation or inode table. If I could line those patterns up so that they counted up incrementally, I might have the correct order? I tried c/b/a/e/d, with this (also PHP code, sorry!)

<?php
$drives=array(
  'id0'=>fopen('original-id0.raw','rb'),
  'id1'=>fopen('original-id1.raw','rb'),
  'id3'=>fopen('original-id3.raw','rb'),
  'id5'=>fopen('original-id5.raw','rb'));

$output=fopen('joined.raw','wb');

// Each pass reads two stripes per drive and writes one row of five stripes
while (!feof($drives['id0'])) {
  // id0 holds a and b
  $a=fread($drives['id0'], 8192);
  $b=fread($drives['id0'], 8192);

  // id1 holds c and d
  $c=fread($drives['id1'], 8192);
  $d=fread($drives['id1'], 8192);

  // id3 holds only duplicates (b and c), so it is not read at all

  // id5 holds a duplicate of d, then e
  fread($drives['id5'], 8192); // d dupe, skipped
  $e=fread($drives['id5'], 8192);

  // Write the stripes in the order c/b/a/e/d
  fwrite($output, $c);
  fwrite($output, $b);

  fwrite($output, $a);
  fwrite($output, $e);
  fwrite($output, $d);
}

foreach ($drives as $handle) fclose($handle);
fclose($output);
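When several candidate orders need trying, it helps to pass the order in as data instead of hard-coding it. Here is a rough Python equivalent of the loop above; the function name and the plan format are made up for this sketch. Each plan entry names an image and which of its two stripes per row (0 or 1) to emit.

```python
def reassemble(sources, plan, out, stripe_size=8192):
    """Interleave stripes from several drive images into one output stream.

    sources: dict label -> binary file object (two stripes are read per row).
    plan: list of (label, slot) in output order, slot being 0 or 1.
    out: writable binary file object.
    Returns the number of complete rows written; stops at the first short read.
    """
    rows = 0
    while True:
        row = {}
        for label, handle in sources.items():
            chunk = handle.read(2 * stripe_size)
            if len(chunk) < 2 * stripe_size:
                return rows
            row[label] = (chunk[:stripe_size], chunk[stripe_size:])
        for label, slot in plan:
            out.write(row[label][slot])
        rows += 1


# The c/b/a/e/d order, taking each stripe from the same image as the PHP loop does
PLAN_CBAED = [("id1", 0), ("id0", 1), ("id0", 0), ("id5", 1), ("id1", 1)]
```

With `sources` holding the open images for id0, id1 and id5 and `out` opened as `joined.raw` in `'wb'` mode, `reassemble(sources, PLAN_CBAED, out)` would produce the same ordering. Like the PHP version, it copies whatever sits at the end of the images too, controller header included.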

I then imported this dump into VirtualBox with:

VBoxManage convertfromraw -format VDI joined.raw recovered.vdi

Then I booted VirtualBox with Recovery is Possible Linux (http://rip.7bf.de/current/) and guess what – my image worked, I could mount the partitions and extract the data.
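A cheaper sanity check than converting and booting each candidate image is to see whether the partition table in its first sector parses into something believable. The Python sketch below uses the standard MBR layout: four 16-byte entries starting at offset 446, with the type byte at offset 4 of each entry, the start LBA at offset 8 and the sector count at offset 12. The function name and the returned field names are invented here.

```python
import struct


def read_mbr_partitions(sector0):
    """Parse the four primary partition entries of an MBR sector.

    Returns a list of dicts for the non-empty entries.
    Raises ValueError if the 0x55 0xAA boot signature is missing.
    """
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature")
    partitions = []
    for index in range(4):
        entry = sector0[446 + 16 * index:446 + 16 * (index + 1)]
        # status byte, 3 CHS bytes, type byte, 3 CHS bytes, start LBA, sector count
        status, ptype, start_lba, sectors = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:
            partitions.append({
                "index": index,
                "bootable": status == 0x80,
                "type": ptype,
                "start_lba": start_lba,
                "sectors": sectors,
            })
    return partitions
```

Feeding it the first 512 bytes of `joined.raw` only proves that the first stripe is in place. A wrong stripe order further in would still need the file system itself to reveal it.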

Things learned today:
* never assume anything
* have an open mind
* never mess around with live data, work on a copy
* backups are nice to have
