Wednesday, January 23, 2013

Disabling BBU Auto Learn with megacli

We are currently facing performance problems with mysql and I remember reading about RAID BBU Learning causing huge write performance drops. So I wanted to check if the RAID controller on our Master database had this configured.

Step#1: Find out what is the brand/model of the RAID controller installed on the server

I can of course ask accounting to pull-up the delivery receipt to find the brand/model but where is the fun in that. Googling led me to the following commands:


sudo lspci | grep -i raid
04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 04)

sudo lshw -class storage
description: RAID bus controller
       product: MegaRAID SAS 2108 [Liberator]
       vendor: LSI Logic / Symbios Logic
       physical id: 0
       bus info: pci@0000:04:00.0
       logical name: scsi4
       version: 04
       width: 64 bits
       clock: 33MHz
       capabilities: storage pm pciexpress vpd msi msix bus_master cap_list rom
       configuration: driver=megaraid_sas latency=0
       resources: irq:26 ioport:d800(size=256) memory:fae7c000-fae7ffff memory:faec0000-faefffff memory:fae80000-faebffff

Step#2: Install the megacli package to be able to query the raid card for its status

Downloaded the megacli package from the LSI website. But they only provide RPMs so I had to convert them with:

sudo alien -k MegaCli-8.07.06-1.noarch.rpm

Then installed with:

sudo dpkg -i megacli_8.07.06-1_all.deb

To find out where the files got installed do:

sudo dpkg -c megacli_8.07.06-1_all.deb

Step#3: Use megacli to probe for BBU status and information

Running the command results to something unexpected:

./MegaCli64 -adpCount
Controller Count: 0.
Exit Code: 0x00

megacli cant find the raid adapter. It seems that megacli has an issue with kernels >= 3.0 and can be remedied with:

sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -adpCount
Controller Count: 1.
Exit Code: 0x01

So to find out about the BBU:

sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL

BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 3972 mV
Current: 0 mA
Temperature: 24 C
Battery State: Optimal
BBU Firmware Status:

  Charging Status              : None
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                  : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No


GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : Yes
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No
  Relative State of Charge: 97 %
  Charger System State: 49168
  Charger System Ctrl: 0
  Charging current: 0 mA
  Absolute state of charge: 53 %
  Max Error: 2 %

Exit Code: 0x00

sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuCapacityInfo -aALL

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 97 %
  Absolute State of charge: 53 %
  Remaining Capacity: 641 mAh
  Full Charge Capacity: 664 mAh
  Run time to empty: Battery is not being discharged.  
  Average time to empty: Battery is not being discharged.  
  Estimated Time to full recharge: Battery is not being charged.  
  Cycle Count: 37
Max Error = 2 %
Remaining Capacity Alarm = 120 mAh
Remining Time Alarm = 10 Min

Exit Code: 0x00

sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuProperties -aALL

BBU Properties for Adapter: 0
  Auto Learn Period: 30 Days
  Next Learn time: Sun Feb 17 19:37:09 2013
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled
Exit Code: 0x00

Auto learn mode should be disabled and scheduled during off-peak time.

#!/bin/bash
TMPFILE=$(mktemp -p /tmp bbu.relearn.XXXXXXXXXX) || exit 1
echo "autoLearnMode=1" > $TMPFILE
setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -SetBbuProperties -f $TMPFILE -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuProperties -aALL
rm $TMPFILE

script for safe write back:


#!/bin/bash
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp ADRA -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -Cached -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp DisDskCache -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NoCachedBadBBU -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL


script to force write back without BBU protection:

#!/bin/bash
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp ADRA -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -Cached -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp DisDskCache -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp CachedBadBBU -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -aALL
sudo setarch x86_64 --uname-2.6 /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL


Useful Links:
http://linux.dell.com/files/whitepapers/solaris/Managing_PERC6_0714.pdf
http://hwraid.le-vert.net/wiki/LSIMegaRAIDSAS
http://yo61.com/dell-drac-bbu-auto-learn-tests-kill-disk-performance.html


I hope you found the post useful. You can subscribe via email or subscribe via a feed reader to get relevant updates from this blog. Have a nice day.

2 comments:

  1. Updating with the latest technology and implementing it is the only way to survive in our niche. Thanks for making me this article. You have done a great job by sharing this content in here. Keep writing article like this.
    SAS Training in Chennai | SAS Course in Chennai

    ReplyDelete
  2. The strategy you have posted on this technology helped me to get into the next level and had lot of information in it. The python programming language is very popular and most widely used.
    Python Training in Chennai | Python Course in Chennai

    ReplyDelete