Friday, January 8, 2016

Solving Root VG Collisions in LVM-Enabled Virtualization Templates

The nature of template-based Linux OS deployments means that, if a template uses LVM for its root filesystems, every system built from that template will have non-unique volume group (VG) names. In most situations, non-unique VG names are not a problem. However, if you ever need to repair a broken instance by attaching its root disks to another host and correcting a problem within the instance's root filesystems, non-unique VG names can make that task considerably more difficult.
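
For example, if the broken instance's root disk is attached to a recovery host built from the same template, LVM will see two VGs with the same name. A quick sketch of the symptom (output paraphrased; exact warning text varies by LVM version):

# On the recovery host, after attaching the broken instance's disk:
vgs
#   WARNING: Duplicate VG name VolGroup00 ...
#   VG         #PV #LV #SN Attr   VSize  VFree
#   VolGroup00   1   2   0 wz--n- 29.51g    0
#   VolGroup00   1   2   0 wz--n- 29.51g    0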

To avoid this eventuality, the template user can easily modify each launched instance by executing a script similar to the following:
#!/bin/sh

# Find the interface that carries the default route...
DEFIF=$(ip route show | awk '/^default/{print $5}')

# ...then convert that interface's primary IPv4 address into an
# eight-character hexadecimal string (e.g., 10.0.0.5 -> 0A000005)
BASEIP=$(printf '%02X' \
         $(ip addr show ${DEFIF} | \
           awk '/inet /{print $2}' | \
           sed -e 's#/.*$##' -e 's/\./ /g' \
          ))

# Rename the root VG, then update fstab and grub.conf to match
vgrename -v VolGroup00 VolGroup00_${BASEIP}
sed -i 's/VolGroup00/&_'${BASEIP}'/' /etc/fstab
sed -i 's/VolGroup00/&_'${BASEIP}'/g' /boot/grub/grub.conf

# Rebuild the initramfs for each installed kernel so that boot-time
# LVM activation looks for the renamed VG
for KRNL in $(awk '/initrd/{print $2}' /boot/grub/grub.conf | \
              sed -e 's/^.*initramfs-//' -e 's/\.img$//')
do
   mkinitrd -f -v /boot/initramfs-${KRNL}.img ${KRNL}
done

# Reboot onto the renamed VG
init 6
Note that the above script assumes that the current root VG name is "VolGroup00". If your current root VG name is different, change the value in the script above as appropriate.
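
If you are unsure what the current root VG name is, something like the following will report it (a quick sketch; it assumes the root filesystem lives on an LVM logical volume):

# Print the VG name backing the root ("/") filesystem
lvs --noheadings -o vg_name "$(df -P / | awk 'NR==2 {print $1}')"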

This script may be executed either at instance launch-time or at any point in the life-cycle of an instance. It takes the existing root VG name and tacks on a uniqueness component. In this case, uniqueness is achieved by taking the IP address of the instance's primary interface and converting it to a hexadecimal string. So long as no two systems in a deployment group share a primary IP address, this should provide a sufficient level of uniqueness across the deployed systems.
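
For example, an instance whose primary address is 172.16.4.20 (a made-up address for illustration) would end up with a root VG named VolGroup00_AC100414:

# Each octet is rendered as two uppercase hex digits: 172.16.4.20 -> AC100414
printf '%02X' 172 16 4 20 ; echo
# AC100414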

Note: renaming the root VG will _not_ solve the problems caused by PV UUID non-uniqueness. Currently, there is no known-good solution to that issue. The general recommendation is to avoid the problem by building your recovery-host from a different template than the one used to build your broken host.
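
To check whether two attached disks carry cloned PVs, pvs can display the UUID column (a quick check, assuming standard LVM tooling):

# List each PV with its UUID; duplicate UUIDs indicate disks cloned
# from the same template image
pvs -o pv_name,vg_name,pv_uuid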