
Understanding Device Mapper Multipath

Posted in Linux, Uncategorized on May 29, 2012 by theanswriz42

I wasn’t initially sure of the best way to kick off this blog, but after following a particular thread on the Oracle RAC Sig forum (www.oracleracsig.com), I figured an introduction to DM-Multipath might be the way to go, since many people seem confused about exactly where to begin.

For those who aren’t terribly familiar, DM-Multipath is the native multipathing utility for Linux. Based on performance tests I’ve run over the past couple of years, it definitely holds its own against offerings such as Veritas VxDMP and EMC PowerPath. In fact, I was able to get better performance than VxDMP and only marginally (about 1%) slower performance than PowerPath on both EMC CLARiiON and Symmetrix VMAX arrays, using both the Orion and Fio I/O testing utilities. In my book, it’s hard to argue against free, especially when the performance is right up there with the big boys.

Anyway, onto the basics.

The DM-Multipath config file is located at /etc/multipath.conf and is broken up into 5 major sections: defaults, blacklist_exceptions, blacklist, devices, and multipaths. The defaults section is where you explicitly set or override any of the defaults listed in /usr/share/doc/device-mapper-multipath-<version>/multipath.conf.defaults. For Symmetrix arrays, the defaults section of multipath.conf might look something like this:

defaults {
    user_friendly_names   no
    no_path_retry         5
}

Or similarly on a CLARiiON:

defaults {
    user_friendly_names   no
    path_selector         "queue-length 0"
    prio                  alua
    path_checker          emc_clariion
    no_path_retry         5
    hardware_handler      "1 alua"
}

As you can see in the above examples, I’ve opted to turn off user friendly names, and I recommend you do the same. This tells DM-Multipath to name each device in /dev/mapper (or /dev/mpath) by its WWID instead of automatically assigning names such as mpatha, mpathb, mpathc, etc. At first this may seem counterintuitive, but I’d highly recommend coming up with your own device alias standards, since I’ve found the automatic naming to be harder to manage than a numerical system.

If you happened to notice in my example above, I’ve specified ALUA for the CLARiiONs. ALUA is an acronym for “Asymmetric Logical Unit Access” and, without going into too much (or any) real detail, it lets an array tell the host which paths to a LUN are optimized (those through the storage processor that owns the LUN) and which are not, so the host can prefer the optimized paths.

The next section of the multipath.conf file I mentioned above is blacklist_exceptions, which is exactly what it sounds like. In the blacklist section, you can exclude specific devices from multipathing, either directly or with a regex match, and in blacklist_exceptions you can override that exclusion for a particular disk if necessary. Probably the coolest thing about the blacklist section is that you can use a regex match similar to:

blacklist {
    devnode "^sda[0-9]*$"
}

You would generally do something like this when /dev/sda is your root disk holding the OS. A benefit of using a regex pattern as in the example is that the device sda and any partitions on it (sda1, sda2, etc.) are excluded, while disks such as sdaa, sdab, etc. are not blacklisted.

Keep in mind that you can spare yourself much of this trouble if you blacklist devices by WWID, such as:
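Before putting a devnode pattern into production, it can help to try it against a handful of device names. The snippet below is only a rough sketch: it assumes an anchored pattern of the form ^sda[0-9]*$ (my own choice for matching sda plus its partitions), uses grep -E as a stand-in for the regex matching multipath does, and the helper name is_blacklisted is made up for illustration.

```shell
#!/bin/sh
# Assumed pattern: sda itself, or sda followed only by digits (partitions).
pattern='^sda[0-9]*$'

# Hypothetical helper: succeeds if the device node name matches the pattern.
is_blacklisted() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Try the pattern against the root disk, its partitions, and lookalikes.
for dev in sda sda1 sda2 sdaa sdab sdb; do
    if is_blacklisted "$dev"; then
        echo "$dev: blacklisted"
    else
        echo "$dev: not blacklisted"
    fi
done
```

With this pattern, sda, sda1 and sda2 are reported as blacklisted, while sdaa, sdab and sdb are not.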

blacklist {
    wwid <wwid>
}

The next section, devices, is for the arrays a host is attached to. Here you put any array-specific options you’d like. For example, on a Symmetrix configuration, you might see something like:

devices {
    device {
        vendor                "EMC"
        product               "SYMMETRIX"
        features              "0"
        path_selector         "round-robin 0"
        path_grouping_policy  multibus
        failback              immediate
        rr_weight             uniform
        no_path_retry         5
        rr_min_io             1
    }
}

And on a CLARiiON:

devices {
    device {
        vendor                "DGC"
        product               ".*"
        product_blacklist     "LUNZ"
        features              "0"
        path_selector         "queue-length 0"
        path_grouping_policy  group_by_prio
        failback              immediate
        rr_weight             uniform
        no_path_retry         5
        rr_min_io             1
        path_checker          emc_clariion
        prio                  alua
    }
}

A word of advice: in my testing, setting rr_min_io to 1 gave better performance than any other value I tried, across the board. Keep in mind the default for rr_min_io is 1000 up through RHEL 6.3 (if I recall correctly).

The last section I’m going to cover is for the multipath devices themselves:

multipaths {
    multipath {
        wwid   <wwid>
        alias  multipath_d1
    }
    multipath {
        wwid   <wwid>
        alias  multipath_d2
    }
    multipath {
        wwid   <wwid>
        alias  multipath_d3
    }
}

As you can see, I’ve standardized on an alias scheme such as multipath_d[number]. From a management perspective, this is an effective way to handle disks: I find numbers simply easier to deal with than letters, especially when it comes to regex in scripts and to sorting.

One thing to note: you can override options from the defaults section here. This particularly comes in handy if you’d like to change the ownership of a given disk, which you can do by setting the uid and gid options:
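Since the aliases are just a counter, a multipaths section like this is easy to generate. Here’s a minimal sketch of how that might be scripted; the function name gen_multipaths is my own invention, and the WWIDs in the usage line are made-up placeholders rather than real LUNs.

```shell
#!/bin/sh
# Hypothetical helper: read one WWID per line from stdin and print a
# multipaths section with aliases multipath_d1, multipath_d2, ...
gen_multipaths() {
    n=1
    echo "multipaths {"
    while IFS= read -r wwid; do
        [ -n "$wwid" ] || continue    # skip blank lines
        echo "    multipath {"
        echo "        wwid   $wwid"
        echo "        alias  multipath_d$n"
        echo "    }"
        n=$((n + 1))
    done
    echo "}"
}

# Usage: the WWIDs below are placeholders for illustration only.
printf '%s\n' EXAMPLE_WWID_1 EXAMPLE_WWID_2 | gen_multipaths
```

The output can be reviewed and then pasted into /etc/multipath.conf; in practice you’d feed the function a list of WWIDs collected from your own hosts.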

multipath {
    wwid   <wwid>
    alias  multipath_d1
    uid    34
    gid    34
}

NOTE: Setting the UID and GID is only supported on RHEL versions 5.4 and later.

That’s pretty much all there is to the multipath config. Personally, I would skip the template and write (or script) your own multipath.conf from scratch, since it’ll be easier to read and understand in the long run. Also, once you have DM-Multipath running, make sure it starts on boot. Generally, if you’re using RHEL, chkconfig is the preferred method:

chkconfig --level 2345 multipathd on

Hopefully this serves as a decent primer for anyone interested in using DM-Multipath. I’d also highly recommend looking at the Red Hat docs to further understand all the available options in the various sections, as they can prove to be an invaluable resource when it comes to tuning.