Friday, November 26, 2010

Linux Storage Multipathing in a Mixed-Vendor (NetApp/EMC) Environment

So, recently, I've been tasked with coming up with documentation for the junior staff who have to work with Red Hat Enterprise Linux 5.x servers in a multi-vendor storage environment. In our particular environment, the two array types most likely to be seen on a Linux server are NetApp and EMC CLARiiON.

Previously, I've covered how to set up a RHEL 5.x system to use the Linux multipath service with NetApp Filer-based Fibre Channel storage. Below, I'll expand on that a bit by explaining how to deal with a multi-vendor storage environment where separate storage subsystems use separate multipathing solutions. In this particular case, the NetApp Filer-based Fibre Channel storage will continue to be managed with the native Linux multipathing solution (multipathd), and the EMC CLARiiON-based storage will use the EMC-provided multipathing solution, PowerPath. I'm not saying such a configuration will be normal in your environment or mine; it's just an "edge-case scenario" I explored in my testing environment in case someone asked for it. It's almost a given that if you haven't tested or documented the edge cases, someone will invariably want to know how to do them (and, conversely, if you test and document them, no one ever bothers you about how to do them in real life).

Prior to presentation of EMC CLARiiON-based storage to your mixed-storage system, you will want to ensure that:
  • CLARiiON-based LUNs are excluded from your multipathd setup
  • PowerPath software has been installed

To explicitly exclude CLARiiON-based storage from multipathd's management, modify the blacklist stanza in your system's /etc/multipath.conf file to resemble the following:

blacklist {
        wwid DevId
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z]"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"

        #############################################################################
        # Comment out the next four lines if management of CLARiiON LUNs is *WANTED*
        #############################################################################
        device {
                vendor "DGC"
                product "*"
        }

}

The lines we're most interested in are the four that begin with the device { directive. These lines tell the blacklist interpreter to ignore any device whose SCSI inquiry returns a Vendor ID of "DGC".
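
Before relying on that match, it can help to see which vendor string each disk actually reports. The following is a minimal sketch, assuming a kernel that exposes the cached inquiry data at /sys/block/sdX/device/vendor. The function name list_scsi_vendors and its optional root-directory argument are made up for illustration.

```shell
# Print "<device> <vendor>" for every sd* disk the kernel knows about.
# The optional first argument (an assumption, handy for dry runs against
# a copied directory tree) overrides the sysfs root, which defaults to /sys.
list_scsi_vendors() {
    sysroot="${1:-/sys}"
    for f in "$sysroot"/block/sd*/device/vendor; do
        [ -r "$f" ] || continue          # glob matched nothing readable
        dev="${f#"$sysroot"/block/}"     # drop the leading path
        dev="${dev%%/*}"                 # keep only the sdX component
        read -r vendor < "$f"            # read trims the space padding
        printf '%s %s\n' "$dev" "$vendor"
    done
}
```

Piping the result through grep, for example `list_scsi_vendors | grep DGC`, would then list only the disks that the blacklist entry above is meant to catch.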

I should note that I'd driven myself slightly nuts working out the above. I'd tried simply placing the 'vendor' entry directly in the 'blacklist' block. However, I found that, if I didn't wrap it in a 'device' sub-block, the multipathd service would pretty much just flip me the bird and ignore the directive (without spitting out errors to tell me that it was doing so or why). Thus, it would continue to grab my CLARiiON LUNs until I nested my directives properly. The 'product' definition is also probably overkill, but it works.
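
Since a mis-nested directive fails silently, a quick check of the file itself can save some head-scratching. What follows is only a rough, line-oriented sketch, not a real multipath.conf parser: it assumes one directive per line, laid out as in the stanza above, and the function name blacklist_has_vendor is hypothetical.

```shell
# Succeed only if FILE ($1) has a blacklist stanza containing a
# device { } sub-block whose vendor matches VENDOR ($2) exactly.
blacklist_has_vendor() {
    awk -v want="$2" '
        /^[ \t]*#/                        { next }              # skip comment lines
        /^[ \t]*blacklist[ \t]*[{]/       { in_bl = 1; next }   # entering blacklist
        in_bl && /^[ \t]*device[ \t]*[{]/ { in_dev = 1; next }  # entering device block
        in_dev && $1 == "vendor" {
            gsub(/"/, "", $2)                                   # strip the quotes
            if ($2 == want) found = 1
        }
        /[}]/ { if (in_dev) in_dev = 0; else in_bl = 0 }        # leave innermost block
        END   { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `blacklist_has_vendor /etc/multipath.conf DGC || echo "DGC is not blacklisted in a device block"` would flag the un-nested layout that gave me trouble.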

Once these changes are in place, restart the multipath daemon to get it to reread its configuration files. Afterwards, request storage (that is, have the CLARiiON LUNs presented to the host) and do the usual PowerPath tasks to bring the CLARiiON devices under PowerPath's control. Properly set up, this will result in a configuration similar to the following:
# multipath -l
360a98000486e2f34576f2f51715a714d dm-7 NETAPP,LUN
[size=25G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 0:0:0:1 sda 8:0   [active][undef]
 \_ 0:0:1:1 sdb 8:16  [active][undef]
 \_ 1:0:0:1 sde 8:64  [active][undef]
 \_ 1:0:1:1 sdf 8:80  [active][undef]

# powermt display dev=all
Pseudo name=emcpowera
CLARiiON ID=APM00034000388 [stclnt0001u]
Logical device ID=600601F8440E0000DBE5E2FEF1E1DF11 [LUN 13]
state=alive; policy=BasicFailover; priority=0; queued-IOs=0;
Owner: default=SP A, current=SP B       Array failover mode: 1
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
   0 qla2xxx                  sdc       SP A1     unlic   alive       0      0
   0 qla2xxx                  sdd       SP B1     unlic   alive       0      0
   1 qla2xxx                  sdg       SP A0     active  alive       0      0
   1 qla2xxx                  sdh       SP B0     active  alive       0      0

As can be seen from the above, the NetApp LUN(s) are showing up under multipathd's control and the CLARiiON LUNs are showing up under PowerPath's control. Neither multipathing solution is seeing the other's devices.
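
With more than a handful of LUNs, eyeballing the two listings gets tedious. Here is a minimal sketch of an automated comparison, assuming the output of each command has been saved to a file and that path devices appear as whole words of the form sdX, as they do above. The function name find_shared_paths is invented for this example.

```shell
# Compare a saved "multipath -l" listing ($1) with a saved
# "powermt display dev=all" listing ($2). Prints any sdX device named
# in both and returns 1; returns 0 when the two sets do not overlap.
find_shared_paths() {
    mp_devs=$(grep -Eow 'sd[a-z]+' "$1" | sort -u)
    pp_devs=$(grep -Eow 'sd[a-z]+' "$2" | sort -u)
    # Each list is already de-duplicated, so any line that repeats in the
    # combined list must have come from both files.
    shared=$(printf '%s\n' "$mp_devs" "$pp_devs" | sort | uniq -d)
    if [ -n "$shared" ]; then
        printf '%s\n' "$shared"
        return 1
    fi
    return 0
}
```

A typical use would be to run `multipath -l > /tmp/mp.txt` and `powermt display dev=all > /tmp/pp.txt`, then `find_shared_paths /tmp/mp.txt /tmp/pp.txt`; the /tmp file names are arbitrary.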

You'll also note that the Array failover mode is set to "1". In my test environment, the only CLARiiON I have access to is in dire need of a firmware upgrade. Its firmware doesn't support mode "4" (ALUA). Since I'm using this test configuration to test both PowerPath and native multipathing, I had to set the LUN to a mode that both the array and multipathd supported to stop my logs from getting polluted with "invalid mode" messages. Oh well, hopefully a hardware refresh is coming to my lab.

Lastly, you'll also likely note that I'm running PowerPath in unlicensed mode. Again, this is a lab scenario where I'm tearing stuff down and rebuilding, frequently. Were it a production system, the licensing would be in place to enable all of the PowerPath functionality.

1 comment:

  1. Thank you for this nice post! I've been tasked to migrate data from EMC to Netapp and I was able to configure Netapp with multipath while
    I had to keep EMC luns intact, all on a production server!
