Tuesday, June 17, 2014

UDEV Friendly-Names to Support ASM Under VMware-hosted Linux Guest

This past month or so, we've been setting up a new vSphere hosting environment for a new customer. Our first guinea-pig tenant is now being brought into the virtualized hosting environment. This tenant has a mix of Windows and Linux systems running a multi-layer data-processing application backed by an Oracle database.

As part of our tenancy process, we'd come up with a standard build-request form. In general, we prefer a model that separates application data from OS data, so in addition to the usual "how much RAM and CPU do you need" information, the form captures the storage configuration for the applications hosted on the VMs. The table has inputs for both the requested supplemental storage sizes and where/how to mount those chunks.

This first tenant simply filled in a sum of their total additional storage request with no indication as to how they expected to use it. After several iterations of "if you have specific requirements, we need you to detail them" emails, I sent a final "absent the requested configuration specifications, the storage will be added but left unconfigured" notice. Only at that point did the tenant respond with "there's a setup guide at this URL - please read that and configure accordingly".

Normally, this is not how we do things. The solution we offer tends to be more of an extended IaaS model: in addition to providing a VM container, we provide a basic, hardened OS configuration (installing an OS and patching it to a target state), configure basic networking and name resolution, and perform basic storage configuration tasks.

This first tenant was coming from a physical Windows/Red Hat environment and was testing the waters of virtualization. As a result, most of their configuration expectations were based on physical servers (SAN-based storage with native multipathing support). The reference documents they pointed us to were written for implementing Oracle on a physical system using ASM on top of Linux dm-multipath storage objects - not something normally done within an ESX-hosted Red Hat Linux configuration.

We weren't going to layer on dm-multipath support, but the tenant still expected to use "friendly" storage-object names for ASM. The easy path to "friendly" storage-object names is LVM. However, Oracle generally recommends against using ASM in conjunction with third-party logical volume management systems, so LVM was off the table. How best, then, to provide the desired storage configuration?

I opted to let udev do the work for me. Unfortunately, because we hadn't anticipated this particular set of requirements, the VM templates we'd created didn't have some of the hooks available that would allow udev to do its thing. Specifically, no disk UUIDs were being presented into the Linux guests. Further complicating things, the hardened Linux build we furnish omits most of the udev tools and the various hardware-information tools. The downside is that this made things more difficult than they strictly needed to be. The upside is that the following procedure should be portable across a fairly wide variety of Linux implementations:
  1. To have VMware provide serial-number information - from which UUIDs can be generated by the guest operating system - it's necessary to modify the VM's advanced configuration options. Ensure that the “disk.EnableUUID” parameter has been created for the VM and that its value is set to “TRUE”. The specific method for doing so varies depending on whether you use the vSphere web UI or the VPX client (or even the vmcli or direct editing of the config files) to do your configuration tasks; Google for the specifics of your preferred management method.
  2. If you had to create or change the value in the prior step, reboot the VM so that the config change takes effect
  3. Present the disks to be used by ASM to the Linux guest – if adding SCSI controllers, this step will need to be done while the guest is powered off.
  4. Verify that the VM is able to see the new VMDKs. If the supplemental disk presentation was done while the VM was running, initiate a SCSI-bus rescan (e.g., `echo "- - -" > /sys/class/scsi_host/host1/rescan`)
  5. Lay down an aligned, full-disk partition with the `parted` utility for each presented VMDK/disk. For example, if one of the newly-presented VMDKs is seen by the Linux OS as /dev/sdb:

    # parted -s /dev/sdb -- mklabel msdos mkpart primary ext3 1024s 100%

    Specifying an explicit starting sector (at 1024 or 2048 sectors) and using a relative ending location, as above, helps ensure that the partition is created on an even storage boundary. Google around for discussions of storage alignment and its positive impact in virtualization environments for details on why this is usually a Good Thing™.
  6. Ensure that the “options=” line in the “/etc/scsi_id.config” file contains the “-g” option
  7. For each newly-presented disk, execute the command `/sbin/scsi_id -g -s /block/{sd_device}` and capture the output (a scripted version of steps 7 through 9 appears after this list)
  8. Copy each disk’s serial number (obtained in the prior step) into the “/etc/udev/rules.d/99-oracle-udev.rules” file
  9. Edit the “/etc/udev/rules.d/99-oracle-udev.rules” file, ensuring that each serial number has an entry similar to:

    KERNEL=="sd*",BUS=="scsi",ENV{ID_SERIAL}=="{scsi_id}", NAME="ASM/disk1", OWNER="oracle", GROUP="oinstall", MODE="660"

    The "{scsi_id}" shown above is a variable name: substitute with the values previously captured via the `/sbin/scsi_id` command. The "NAME=" field should be similarly edited to suite and should be unique for each SCSI serial number.

    Note: If attempting to create per-disk friendly names (e.g., “/dev/db1p1”, “/dev/db2p1”, “/dev/frap1”, etc.), it will be necessary to match LUNs by size to the appropriate ‘NAME=’ entries

  10. Reboot the system so that the udev service can process the new rule entries
  11. Verify that the desired “/dev/ASM/<NAME>” entries exist
  12. Configure storage-consumers (e.g., “ASM”) to reference the aligned udev-defined device-nodes.
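
For those who'd rather script steps 7 through 9 than hand-edit the rules file, the following is a rough sketch. The sd-device/ASM-name pairings and the temporary map file are strictly illustrative - adjust them to match your own disk presentation order - but the `scsi_id` invocation and the rule format are the same ones shown above:

RULEFILE=/etc/udev/rules.d/99-oracle-udev.rules

# Map each presented sd-device to the friendly name ASM should see
# (example pairings only - edit to match your environment)
cat > /tmp/asm-disk-map.txt <<'EOF'
sdb ASM/disk1
sdc ASM/disk2
sdd ASM/disk3
EOF

while read DEV NAME
do
   # Capture the serial number exactly as in step 7
   SERIAL=$(/sbin/scsi_id -g -s /block/${DEV})
   # Emit one rule per serial, matching the format shown in step 9
   printf 'KERNEL=="sd*", BUS=="scsi", ENV{ID_SERIAL}=="%s", NAME="%s", OWNER="oracle", GROUP="oinstall", MODE="660"\n' \
      "${SERIAL}" "${NAME}" >> ${RULEFILE}
done < /tmp/asm-disk-map.txt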
If your Linux system has the various hardware-information tools, udev management interfaces and sg3_utils installed, some of the information-gathering tasks become much easier and some of the reboot steps specified in this document become unnecessary.
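
For example, on a build where `udevadm` is available, something like the following can stand in for the `scsi_id` lookups in step 7 and the reboot in step 10 (exact option names vary a bit between udev versions, so treat this as a sketch rather than gospel):

# Query a disk's serial number via udev itself
udevadm info --query=property --name=/dev/sdb | grep '^ID_SERIAL='

# Re-read the rules and re-process the block devices without a reboot
udevadm control --reload-rules
udevadm trigger --subsystem-match=block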

Thursday, June 12, 2014

Template-Deployed VMs and the "When Was I Built" Problem

For the past number of years, I have been supporting Linux systems hosted within various virtualization environments. Most of these environments have made use of template-based VM deployment.

In a large, dynamic, enterprise-scale environment, the question often comes up, "when was this host built?" In such environments, there may be a number of methods for deriving that information - hypervisor management-server logs, service-automation engine logs, etc. However, such data can be somewhat ephemeral, vulnerable to anything from log truncation up through wholesale replacement of the service-automation and configuration-management tools/frameworks.

Fortunately, the Enterprise Linux family of Linux distributions (Red Hat, CentOS, Scientific Linux, etc.) offers a fairly stable method for determining when a system was first provisioned. Whenever you first build an ELx-based system, one of the packages that gets installed - and then never gets updated - is the "basesystem" RPM. So, if you look at the install date for this RPM (and the system time was correctly set at installation time), you will have an accurate representation of when the system was built.

That said, it occurred to me a while ago that the “deploy from template” method of building Linux VMs prevents you from using the RPM database to determine system build time. Unlike with a KickStarted system - where you can always run `rpm -q --qf '%{installtime:date}\n' basesystem` and it will give you the install date for the system - doing so on a template-built system will mislead you. When deployed from a template, that method returns when the template VM was built, not when the running VM was deployed from that template.

This had been bugging me for several years. I'd even posed the question of how to solve it on a few forums, to no avail (a number of respondents hadn't been aware of the "use basesystem to show my system install-date" trick, so hadn't investigated how to solve a problem they didn't know existed). One day, while I was at our engineering lab waiting for some other automated tasks to run, I had one of those "I wonder if this will work" moments that finally let me figure out how to “massage” the RPM database so that the basesystem RPM reflects a newer install date:

# rpm -q --qf '%{installtime:date}\n' basesystem
Tue 12 Jul 2011 11:24:06 AM EDT
# rpm -i --force --justdb basesystem-10.0-4.el6.noarch.rpm
# rpm -q --qf '%{installtime:date}\n' basesystem
Wed 11 Jun 2014 09:21:13 PM EDT

Thus, if you drop something similar to the above into your VM's system-prep/cloud-init/etc. scripts, the resultant VM will have its instantiation date captured and not just its template-build date.
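
For what it's worth, the same trick can be wired into a first-boot hook so that nobody has to remember to run it by hand. The following is a minimal sketch that assumes yum-utils (for `yumdownloader`) is baked into the template and that the freshly-deployed guest can reach a yum repository at first boot:

#!/bin/sh
# Refresh the basesystem install date so that
#    rpm -q --qf '%{installtime:date}\n' basesystem
# reports deployment time rather than template-build time
TMPDIR=$(mktemp -d)
yumdownloader --destdir "${TMPDIR}" basesystem && \
   rpm -i --force --justdb "${TMPDIR}"/basesystem-*.rpm
rm -rf "${TMPDIR}"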