Thursday, June 14, 2012

How to Repair a RAID with a Spare Drive

RAID Repair with a Spare Drive:

I had set up one of our RAID arrays with a spare drive. A disk in this array failed last night and the spare took over immediately. Unfortunately, I forgot that this array had a spare, so I spent a while trying to figure out why the replacement disk would not rebuild. Eventually I got things working again. Here’s my log:

[root@cps1 ~]# cd tw_cli
[root@cps1 tw_cli]# ./tw_cli
//cps1> info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     1862.61   ON     OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCANU2126397
p1     OK               u0     465.76 GB   976773168     WD-WCANU2051520
p2     OK               u0     465.76 GB   976773168     WD-WCANU2030999
p3     DEVICE-ERROR     u?     465.76 GB   976773168     WD-WCANU2021246
p4     OK               u0     465.76 GB   976773168     WD-WCANU2114264
p5     OK               u0     465.76 GB   976773168     WD-WCANU2051215
p6     NOT-PRESENT      -      -           -             -
p7     NOT-PRESENT      -      -           -             -

The first thing I should have noticed is that unit u0 was still reporting OK. Normally, a unit with a bad drive shows up as DEGRADED. I completely missed that and tried to rebuild it anyway.

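The check that was missed here can be sketched in Python. This is only an illustration: the function names are made up, the parser assumes the `info c0` output has exactly the column layout shown in this log, and treating every port status other than OK and NOT-PRESENT as a failure is an assumption.

```python
# Sample text copied from the first "info c0" output in this post.
SAMPLE = """\
Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     1862.61   ON     OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p2     OK               u0     465.76 GB   976773168     WD-WCANU2030999
p3     DEVICE-ERROR     u?     465.76 GB   976773168     WD-WCANU2021246
p6     NOT-PRESENT      -      -           -             -
"""


def parse_info(text):
    """Split `info c0` output into unit rows and port rows (hypothetical helper)."""
    units, ports = {}, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank lines and dashed separators
        name = fields[0]
        if name[0] == "u" and name[1:].isdigit():
            # Unit row: name, type, status, ...
            units[name] = {"type": fields[1], "status": fields[2]}
        elif name[0] == "p" and name[1:].isdigit():
            # Port row: name, status, owning unit, ...
            ports[name] = {"status": fields[1], "unit": fields[2]}
    return units, ports


def failed_ports_with_healthy_units(units, ports):
    """Return failed ports when every unit still reports OK.

    A non-empty result suggests a spare has already replaced the bad disk,
    so there is nothing left to rebuild.
    """
    all_ok = bool(units) and all(u["status"] == "OK" for u in units.values())
    bad = [p for p, info in ports.items()
           if info["status"] not in ("OK", "NOT-PRESENT")]
    return sorted(bad) if all_ok else []


units, ports = parse_info(SAMPLE)
print(failed_ports_with_healthy_units(units, ports))  # ['p3']
```

A non-empty list alongside an OK unit is the hint that the failed disk only needs to be swapped out and re-registered as the new spare.
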
//cps1> maint remove c0 p3
Removing port /c0/p3 ... Done.

//cps1> info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     1862.61   ON     OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCANU2126397
p1     OK               u0     465.76 GB   976773168     WD-WCANU2051520
p2     OK               u0     465.76 GB   976773168     WD-WCANU2030999
p3     NOT-PRESENT      -      -           -             -
p4     OK               u0     465.76 GB   976773168     WD-WCANU2114264
p5     OK               u0     465.76 GB   976773168     WD-WCANU2051215
p6     NOT-PRESENT      -      -           -             -
p7     NOT-PRESENT      -      -           -             -

//cps1> rescan
Rescanning controller /c0 for units and drives ...Done.
Found the following unit(s): [/c0/u1].
Found the following drive(s): [none].

//cps1> info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     1862.61   ON     OFF
u1    RAID-5    INOPERABLE     -       -       64K     1862.61   OFF    OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCANU2126397
p1     OK               u0     465.76 GB   976773168     WD-WCANU2051520
p2     OK               u0     465.76 GB   976773168     WD-WCANU2030999
p3     OK               u1     465.76 GB   976773168     WD-WCAS84739115
p4     OK               u0     465.76 GB   976773168     WD-WCANU2114264
p5     OK               u0     465.76 GB   976773168     WD-WCANU2051215
p6     NOT-PRESENT      -      -           -             -
p7     NOT-PRESENT      -      -           -             -

//cps1> /c0/u0 start rebuild disk=3 ignoreecc
The following drive(s) cannot be used [3].
Error: (CLI:144) Invalid drive(s) specified.
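
I don’t need to list all the other dumb commands I ran. The rescan had picked up the replacement disk as part of a separate unit, u1, in the INOPERABLE state, and u0 was already OK, so there was nothing to rebuild. All I needed to do was delete u1 and add the disk back as a spare:
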
//cps1> /c0/u1 del
Deleting /c0/u1 will cause the data on the unit to be permanently lost.
Do you want to continue ? Y|N [N]: Y
Deleting unit c0/u1 ...Done.

//cps1> info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     1862.61   ON     OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCANU2126397
p1     OK               u0     465.76 GB   976773168     WD-WCANU2051520
p2     OK               u0     465.76 GB   976773168     WD-WCANU2030999
p3     OK               -      465.76 GB   976773168     WD-WCAS84739115
p4     OK               u0     465.76 GB   976773168     WD-WCANU2114264
p5     OK               u0     465.76 GB   976773168     WD-WCANU2051215
p6     NOT-PRESENT      -      -           -             -
p7     NOT-PRESENT      -      -           -             -

//cps1> /c0 add type=spare disk=3
Creating new unit on controller /c0 ... Done. The new unit is /c0/u1.
WARNING: This Spare unit may replace failed drive of same interface type only.

//cps1> info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       64K     1862.61   ON     OFF
u1    SPARE     OK             -       -       -       465.753   -      OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCANU2126397
p1     OK               u0     465.76 GB   976773168     WD-WCANU2051520
p2     OK               u0     465.76 GB   976773168     WD-WCANU2030999
p3     OK               u1     465.76 GB   976773168     WD-WCAS84739115
p4     OK               u0     465.76 GB   976773168     WD-WCANU2114264
p5     OK               u0     465.76 GB   976773168     WD-WCANU2051215
p6     NOT-PRESENT      -      -           -             -
p7     NOT-PRESENT      -      -           -             -
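
The fix that worked above (delete the INOPERABLE unit, then add its disk as a spare) can be summarised as a small Python sketch. It only builds the command strings seen in this log and runs nothing. The function name and the dictionary inputs are hypothetical, and it assumes that every INOPERABLE unit is a stale leftover on a replacement disk, which has to be confirmed by hand because deleting a unit destroys its data.

```python
def recovery_commands(controller, unit_status, port_owner):
    """Build the tw_cli commands used in this log for each INOPERABLE unit.

    unit_status maps unit name -> status, e.g. {"u1": "INOPERABLE"}.
    port_owner maps port name -> owning unit, e.g. {"p3": "u1"}.
    Nothing is executed; the caller decides whether to run the commands.
    """
    commands = []
    for unit in sorted(unit_status):
        if unit_status[unit] != "INOPERABLE":
            continue
        # Delete the stale unit first so its disk becomes unassigned.
        commands.append(f"/{controller}/{unit} del")
        for port in sorted(port_owner):
            if port_owner[port] == unit:
                # "p3" -> disk number 3
                commands.append(f"/{controller} add type=spare disk={port[1:]}")
    return commands


# State taken from the "info c0" output right after the rescan.
unit_status = {"u0": "OK", "u1": "INOPERABLE"}
port_owner = {"p0": "u0", "p1": "u0", "p2": "u0",
              "p3": "u1", "p4": "u0", "p5": "u0"}

for command in recovery_commands("c0", unit_status, port_owner):
    print(command)
# /c0/u1 del
# /c0 add type=spare disk=3
```

Printing the commands instead of running them keeps the destructive `del` step under manual control.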
