Dedicated Servers should be using one of the following types of RAID. If you are unsure which one you are using, please run the following from the command line:
lspci | grep RAID
If the response contains:
- 3ware — you’re probably using 3ware RAID.
- Hewlett-Packard — you’re probably using HP RAID.
- megaRAID — you’re probably using MegaRAID.
- anything else (or no output) — you’re probably using software RAID.
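The decision table above can be sketched as a small shell function. It simply classifies a line of `lspci` output, so the vendor strings it matches are the only assumptions:

```shell
# Classify a line of `lspci` output into one of the RAID types above.
# Usage: classify_raid "$(lspci | grep -i raid)"
classify_raid() {
  case "$1" in
    *3ware*)               echo "3ware" ;;
    *Hewlett-Packard*)     echo "HP" ;;
    *MegaRAID*|*megaRAID*) echo "MegaRAID" ;;
    *)                     echo "software" ;;  # no match, or no output at all
  esac
}

# Only call lspci if it is actually available on this machine.
if command -v lspci >/dev/null 2>&1; then
  classify_raid "$(lspci | grep -i raid)"
fi
```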
3ware RAID
To check the status of a 3ware RAID you need to run tw_cli (as root); the binary might be named tw_cli.amd64 or tw_cli.i386 on some systems. If it isn't installed, you can install it using apt-get install 3ware-tools-32 or apt-get install 3ware-tools-64 on Debian/Ubuntu servers.
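Since the binary name varies, a small sketch to pick up whichever variant is installed (the three candidate names are the only assumption):

```shell
# Find whichever tw_cli variant is on the PATH, if any.
TW_CLI=""
for cmd in tw_cli tw_cli.amd64 tw_cli.i386; do
  if command -v "$cmd" >/dev/null 2>&1; then
    TW_CLI=$cmd
    break
  fi
done
echo "tw_cli binary: ${TW_CLI:-not found}"
```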
First, you’ll need to find the number of the controller you’re using by inputting:
root@tmp:~# tw_cli show
Ctl Model (V)Ports Drives Units NotOpt RRate VRate BBU
------------------------------------------------------
c2 9650SE-4LPML 4 3 1 1 1 1 OK
Next, check the status of the appropriate controller. In this case, it’s c2.
root@tmp:~# tw_cli /c2 show
Unit UnitType Status V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------
u0 RAID-10 DEGRADED - - 64K 931.303 OFF OFF
Port Status Unit Size Blocks Serial
------------------------------------------------------
p0 OK u0 465.76 GB 976773168 9QM8VD88
p1 NOT-PRESENT - - - -
p2 OK u0 465.76 GB 976773168 9QM8VD9L
p3 OK u0 465.76 GB 976773168 9QM8QZC8
Name OnlineState BBUReady Status Volt Temp Hours LastCapTest
------------------------------------------------------
bbu On Yes OK OK OK 0 xx-xxx-xxxx
The important part to look at is the Status column; if everything shows as OK then your RAID is healthy. If you see anything else, please raise a support ticket including the output from tw_cli show and tw_cli /c2 show.
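If you want to script this check, one rough approach (a hypothetical helper, not part of 3ware's tooling) is to scan the `tw_cli /c2 show` output for known bad states; the particular list of states matched here is an assumption:

```shell
# Scan `tw_cli /cX show` output (on stdin) for unhealthy unit/port states.
# Usage: tw_cli /c2 show | tw_check
tw_check() {
  if grep -qE 'DEGRADED|REBUILDING|NOT-PRESENT|ECC-ERROR'; then
    echo "attention needed"
  else
    echo "healthy"
  fi
}
```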
HP RAID
To check the status of an HP RAID you need to use hpacucli (as root). If it isn't installed, you can install it using apt-get install hpacucli ia32-libs on Debian/Ubuntu systems.
root@tmp:~$ hpacucli ctrl all show config
Smart Array P410 in Slot 6 (sn: PACCR9SYN0CX )
array A (SATA, Unused Space: 0 MB)
logicaldrive 1 (931.5 GB, RAID 1+0, OK)
physicaldrive 2I:0:5 (port 2I:box 0:bay 5, SATA, 500 GB, OK)
physicaldrive 2I:0:6 (port 2I:box 0:bay 6, SATA, 500 GB, OK)
physicaldrive 2I:0:7 (port 2I:box 0:bay 7, SATA, 500 GB, OK)
physicaldrive 2I:0:8 (port 2I:box 0:bay 8, SATA, 500 GB, OK)
If they all show as OK then your RAID is healthy; if you see anything else, please raise a support ticket including the output from hpacucli ctrl all show config.
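As with 3ware, the check can be scripted. This hypothetical helper flags any logical or physical drive line whose trailing status isn't OK; it reads the `hpacucli` output on stdin, so the line format shown above is the only assumption:

```shell
# Flag any drive line from `hpacucli ctrl all show config` (on stdin)
# that does not end in ", OK)".
# Usage: hpacucli ctrl all show config | hp_check
hp_check() {
  if grep -E 'logicaldrive|physicaldrive' | grep -qv 'OK)$'; then
    echo "attention needed"
  else
    echo "healthy"
  fi
}
```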
MegaRAID
To check the status of a MegaRAID, you will need to use the storcli tool (as root). If it is not installed, it may be downloaded by searching for storcli at http://www.avagotech.com/support/download-search.
root@tmp:~$ storcli show
Status Code = 0
Status = Success
Description = None
Number of Controllers = 1
Host Name = unknown.sparebox.please.change.vlan
Operating System = Linux3.17.4-bytemark-amd64
System Overview :
===============
------------------------------------------------------
Ctl Model Ports PDs DGs DNOpt VDs VNOpt BBU sPR DS EHS ASOs Hlth
------------------------------------------------------
0 MegaRAID 8 2 1 0 1 0 Opt On 1&2 Y 3 Opt
------------------------------------------------------
Ctl=Controller Index|DGs=Drive groups|VDs=Virtual drives|Fld=Failed
PDs=Physical drives|DNOpt=DG NotOptimal|VNOpt=VD NotOptimal|Opt=Optimal
Msng=Missing|Dgd=Degraded|NdAtn=Need Attention|Unkwn=Unknown
sPR=Scheduled Patrol Read|DS=DimmerSwitch|EHS=Emergency Hot Spare
Y=Yes|N=No|ASOs=Advanced Software Options|BBU=Battery backup unit
Hlth=Health|Safe=Safe-mode boot
The important column is Hlth (Health). If this indicates anything other than Opt (Optimal) then there may be an issue. You should then raise a support ticket with the full details of the affected controller (controller 0 in this example) by including the output of the following command:
root@tmp:~$ storcli /c0 show
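For scripting, a rough sketch (a hypothetical helper; the awk pattern assumes the System Overview row format shown above) that pulls the Hlth column out of `storcli show`:

```shell
# Print the controller index and Hlth column from the System Overview
# row of `storcli show` output (on stdin).
# Usage: storcli show | megaraid_health
megaraid_health() {
  awk '/^[0-9]+ +MegaRAID/ { print "controller " $1 " health: " $NF }'
}
```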
Software RAID
Software RAID is found in the FSx range and the earlier value boxes. To check its status, run:
cat /proc/mdstat
Your output should look something like this:
root@tmp:~$ cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1]
116414912 blocks [2/1] [_U]
md0 : active raid1 sda1[0] sdb1[1]
803136 blocks [2/2] [UU]
unused devices: <none>
- [UU] — a healthy RAID partition.
- [_U] or [U_] — a failed drive.
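The underscore check can be automated. A minimal sketch (the helper name is ours) that reads mdstat-style output on stdin:

```shell
# Report whether any array in mdstat-style input has a missing member,
# i.e. an underscore inside the [UU]-style status brackets.
# Usage: mdstat_check < /proc/mdstat
mdstat_check() {
  if grep -q '\[[U_]*_[U_]*\]'; then
    echo "degraded array found"
  else
    echo "all arrays healthy"
  fi
}
```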
If you have a failed drive, please open a support ticket and include the outputs from the following commands:
cat /proc/mdstat
smartctl -a /dev/sda
smartctl -a /dev/sdb
(smartctl can be installed using apt-get install smartmontools on Debian/Ubuntu)
or:
cat /proc/mdstat
lshw -c disk (as root)
(lshw can be installed using apt-get install lshw on Debian/Ubuntu)
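To save round-trips, you could gather everything requested above into a single file to attach to the ticket. This is only a sketch: the report file name and the /dev/sda and /dev/sdb device names are assumptions to adjust for your system.

```shell
# Collect RAID/disk diagnostics into one file for the support ticket.
# Run as root; errors from missing tools are captured in the report
# rather than stopping the script.
{
  echo '--- cat /proc/mdstat ---';     cat /proc/mdstat 2>&1
  echo '--- smartctl -a /dev/sda ---'; smartctl -a /dev/sda 2>&1
  echo '--- smartctl -a /dev/sdb ---'; smartctl -a /dev/sdb 2>&1
  echo '--- lshw -c disk ---';         lshw -c disk 2>&1
} > raid-report.txt
echo "report written to raid-report.txt"
```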
This will tell us which disk has failed and the serial numbers of the disks, so that we change the correct one. Some downtime will be required for the disk swap to take place, which we will arrange with you once we've received the support request containing the name of the disk that needs swapping. It's important to note that once the disk has been replaced it must be added back into the RAID; we will do this step for you when we replace the disk.