Monday, December 29, 2014

Custom EL6 AMIs With a Root LVM

One of the customers I work for is a security-focused organization. As such, they try to follow the security guidelines laid out within SCAP for the operating systems they deploy. This particular customer is also engaged in several "cloud" initiatives - a couple privately-hosted and one publicly-hosted. For the publicly-hosted cloud initiative, they make use of Amazon Web Services EC2 services.

The current SCAP guidelines for Red Hat Enterprise Linux (RHEL) 6 draw the bulk of their content straight from the DISA STIGs for RHEL 6. There are a few differences, here and there, but the commonality between the SCAP and STIG guidance - at least as of SCAP XCCDF 1.1.4 and STIG Version 1, Release 5, respectively - is probably just shy of 100% when measured on the recommended tests and fixes. In turn, automating the guidance in these specifications allows you to quickly crank out predictably-secure Red Hat, CentOS, Scientific Linux or Amazon Linux systems.

For the privately-hosted cloud initiatives, supporting this guidance was a straight-forward matter. The solutions my customer uses all support the capability to network-boot and provision a virtual machine (VM) from which to create a template. Amazon didn't provide similar functionality to my customer, which limited what could be done to create a fully-customized instance or resulting template (Amazon Machine Image - or "AMI" - in EC2 terminology).

For the most part, this wasn't a problem for my customer. Perhaps the biggest sticking-point was that it meant that, at least initially, partitioning schemes used on the privately-hosted VMs couldn't be easily replicated on the EC2 instances.

Section 2.1.1 of the SCAP guidance calls for "/tmp", "/var", "/var/log", "/var/log/audit", and "/home" to each be on their own, dedicated partitions, separate from the "/" partition. On the privately-hosted cloud solutions, a common, network-based KickStart was used to carve the boot-disk into a "/boot" partition and an LVM volume-group (VG). That volume-group was then carved up to create the SCAP-mandated partitions.
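
For reference, the KickStart partitioning stanza used on the privately-hosted clouds looked something like the following. This is a representative sketch only - the "VolGroup00" volume-group name, the logical-volume names and the sizes are illustrative, not the customer's actual values - and "/tmp" is omitted because, as noted further down, it gets handled as a tmpfs filesystem:

    part /boot --fstype=ext4 --size=500
    part pv.01 --size=1 --grow
    volgroup VolGroup00 pv.01
    logvol /              --vgname=VolGroup00 --name=rootVol  --size=4096 --fstype=ext4
    logvol /home          --vgname=VolGroup00 --name=homeVol  --size=1024 --fstype=ext4
    logvol /var           --vgname=VolGroup00 --name=varVol   --size=2048 --fstype=ext4
    logvol /var/log       --vgname=VolGroup00 --name=logVol   --size=2048 --fstype=ext4
    logvol /var/log/audit --vgname=VolGroup00 --name=auditVol --size=512  --fstype=ext4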

The lack of network-booting/provisioning support meant we didn't have the capability to extend our KickStart methodologies to the EC2 environment. Further, at least initially, Amazon didn't provide support for use of LVM on boot disks. The combination of the two limitations meant my customer couldn't easily meet the SCAP partitioning requirements. Lack of LVM meant that the boot disk had to be carved up using bare /dev/sdX devices. Lack of console access defeated the ability to repartition an already-built system to create the requisite partitions on the boot disk. Initially, this meant that the AMIs we could field were limited to "/boot" and "/" partitions. The result was config-drift between the hosting environments and the need to get security-waivers for the Amazon-hosted environment.

Not being one who well-tolerates these kinds of arbitrary-feeling deviances, I got to cracking with my Google searches. Most of what I found were older documents that focused on how to create LVM-enabled, S3-backed AMIs. These weren't at all what I wanted - they were a pain in the ass to create, were stoopidly time-consuming to transfer into EC2 and the resultant AMIs hamstrung me on the instance-types I could spawn from them. So, I kept scouring around. In the comments section of one of the references for S3-backed AMIs, I saw a comment about doing a chroot() build. So, I used that as my next branch of Googling about.

Didn't find a lot for RHEL-based distros - mostly Ubuntu and some others. That said, it gave me the starting point that I needed to find my ultimate solution. Basically, that solution comes down to:

  1. Pick an EL-based AMI from the Amazon Marketplace (I chose a CentOS one - I figured that using an EL-based starting point would ease creating my EL-based AMI, since I'd already have all the tools I needed, in package names/formats I was already familiar with)
  2. Launch the smallest instance-size possible from the Marketplace AMI (8GB when I was researching the problem)
  3. Attach an EBS volume to the running instance - I set mine to the minimum size possible (8GB) figuring I could either grow the resultant volumes or, once I got my methods down/automated, use a larger EBS for my custom AMI.
  4. Carve the attached EBS up into two (primary) partitions. I like using `parted` for this, since I can specify the desired, multi-partition layout (and all the offsets, partition types/labels, etc.) in one long command-string. (A consolidated command-sketch covering this and the subsequent disk/chroot steps follows this list.)
    • I kept "/boot" in the 200-400MB range. It could probably be kept smaller, since the plan wasn't so much to patch instantiations as to periodically use automated build tools to launch instances from updated AMIs and re-deploy the applications onto the new/updated instances.
    • I gave the rest of the disk to the partition that would host my root VG.
  5. I `vgcreate`d my root volume group, then carved it up into the SCAP-mandated partitions (minus "/tmp" - we do that as a tmpfs filesystem since the A/V tools that SCAP wants you to have tend to kill system performance if "/tmp" is on disk - probably not relevant in EC2, but consistency across environments was a goal of the exercise)
  6. Create ext4 filesystems on each of my LVs and my "/boot" partition.
  7. Mount all of the filesystems under "/mnt" to support a chroot-able install (i.e., "/mnt/root", "/mnt/root/var", etc.)
  8. Create base device-nodes within my chroot-able install-tree (you'll want/need "/dev/console", "/dev/null", "/dev/zero", "/dev/random", "/dev/urandom", "/dev/tty" and "/dev/ptmx" - modes, ownerships and major/minor numbers should match what's on your live OS)
  9. Set up loopback mounts for "/proc", "/sys", "/dev/pts" and "/dev/shm".
  10. Create "/etc/fstab" and "/etc/mtab" files within my chroot-able install-tree (should resemble the mount-scheme you want in your final AMI - dropping the "/mnt/root" from the paths)
  11. Use `yum` to install the same package-sets to the chroot that our normal KickStart processes would install.
  12. The `yum` install should have created all of your "/boot" files with the exception of your "grub.conf" type files. 
    • Create a "/mnt/boot/grub.conf" file with vmlinuz/initramfs references matching the ones installed by `yum`.
    • Create links to your "grub.conf" file:
      • You should have an "/mnt/root/etc/grub.conf" file that's a sym-link to your "/mnt/root/boot/grub.conf" file (be careful how you create this sym-link so you don't create an invalid link)
      • Similarly, you'll want a "/mnt/root/boot/grub/grub.conf" linked up to "/mnt/root/boot/grub.conf" (not always necessary, but it's a belt-and-suspenders solution to some issues related to creating PVM AMIs)
  13. Create a basic eth0 config file at "/mnt/root/etc/sysconfig/network-scripts/ifcfg-eth0". EC2 instances require the use of DHCP for networking to work properly. A minimal network config file should look something like:
    DEVICE=eth0
    BOOTPROTO=dhcp
    ONBOOT=yes
    IPV6INIT=no
    
  14. Create a basic network-config file at "/mnt/root/etc/sysconfig/network". A minimal network config file should look something like:
    NETWORKING=yes
    NETWORKING_IPV6=no
    HOSTNAME=localhost.localdomain
    
  15. Append "UseDNS no" and "PermitRootLogin without-password" to the end of your "/mnt/root/etc/ssh/sshd_config" file. The former fixes connect-speed problems related to EC2's use of private IPs on their hosted instances. The latter allows you to SSH in as root for the initial login - but only with a valid SSH key (don't want to make newly-launched instances instantly ownable!)
  16. Assuming you want instances started from your AMI to use SELinux:
    • Do a `touch /mnt/root/.autorelabel`
    • Make sure that the "SELINUX" value in "/mnt/root/etc/selinux/config" is set to either "permissive" or "enforcing"
  17. Create an unprivileged login user within the chroot-able install-tree. Make sure a password is set and the user is able to use `sudo` to access root (since I recommend setting root's password to a random value).
  18. Create a boot init script that will download your AWS public key into the root and/or maintenance user's "${HOME}/.ssh/authorized_keys" file. At its most basic, this should be a run-once script that looks like:
    # KEYDESTDIR is illustrative - point it at the root and/or maintenance user's ~/.ssh
    KEYDESTDIR=/root/.ssh
    curl -f http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key > /tmp/pubkey
    install --mode 0700 -d ${KEYDESTDIR}
    install --mode 0600 /tmp/pubkey ${KEYDESTDIR}/authorized_keys
    Because "/tmp" is an ephemeral filesystem, the next time the instance is booted, the "/tmp/pubkey" will self-clean. Note that an appropriate destination-directory will need to exist
  19. Clean up the chroot-able install-tree:
    yum --installroot=/mnt/root/ -y clean packages
    rm -rf /mnt/root/var/cache/yum
    rm -rf /mnt/root/var/lib/yum
    cat /dev/null > /mnt/root/root/.bash_history
    
  20. Unmount all of the chroot-able install-tree's filesystems.
  21. Use `vgchange` to deactivate the root VG
  22. Using the AWS console, create a snapshot of the attached EBS.
  23. Once the snapshot completes, you can then use the AWS console to create an AMI from the EBS-snapshot using the "Create Image" option. It is key that you set the "Root Device Name", "Virtualization Type" and "Kernel ID" parameters to appropriate values.
    • The "Root Device Name" value will auto-populate as "/dev/sda1" - change this to "/dev/sda"
    • The "Virtualization Type" should be set as "Paravirtual".
    • The appropriate value for the "Kernel ID" parameter will vary from AWS availability-region to AWS availability-region (for example, the value for "US-East (N. Virginia)" will be different from the value for "EU (Ireland)"). In the drop-down, look for a description field that contains "pv-grub-hd00". There will be several. Look for the highest-numbered option that matches your AMI's architecture (for example, I would select the kernel with the description "pv-grub-hd00_1.04-x86_64.gz" for my x86_64-based EL 6.x custom AMI).
    The other parameters can be tweaked, but I usually leave them as is.
  24. Click the "Create" button, then wait for the AMI-creation to finish.
  25. Once the "Create" finishes, the AMI should be listed in your "AMIs" section of the AWS console.
  26. Test the new AMI by launching an instance. If the instance successfully completes its launch checks and you are able to SSH into it, you've successfully created a custom, PVM AMI (HVM AMIs are fairly easily created, as well, but require some slight deviations that I'll cover in another document).
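
Pulled together, the disk-preparation and chroot-population steps look roughly like the following when run from the instance launched off the Marketplace AMI. This is a sketch rather than my production script: the "/dev/xvdf" device name, the "VolGroup00" VG and LV names, the sizes, the package list and the kernel version in the grub.conf stanza are all illustrative and will need to be adjusted to your environment. The fstab/mtab, network-file and sshd tweaks are omitted here since the list above already shows their contents.

    # Step 4: carve the attached EBS into a /boot partition and an LVM partition
    parted -s /dev/xvdf -- mklabel msdos \
       mkpart primary ext4 2048s 300MiB \
       mkpart primary 300MiB 100% \
       set 2 lvm on

    # Step 5: create the root VG and the SCAP-mandated logical volumes
    pvcreate /dev/xvdf2
    vgcreate VolGroup00 /dev/xvdf2
    lvcreate -L 4G -n rootVol  VolGroup00
    lvcreate -L 1G -n homeVol  VolGroup00
    lvcreate -L 2G -n varVol   VolGroup00
    lvcreate -L 2G -n logVol   VolGroup00
    lvcreate -L 1G -n auditVol VolGroup00

    # Step 6: ext4 filesystems on "/boot" and each LV
    mkfs.ext4 /dev/xvdf1
    for LV in rootVol homeVol varVol logVol auditVol ; do
       mkfs.ext4 /dev/VolGroup00/${LV}
    done

    # Step 7: mount everything under /mnt/root to create the chroot-able tree
    mkdir -p /mnt/root
    mount /dev/VolGroup00/rootVol /mnt/root
    mkdir -p /mnt/root/{boot,home,var}
    mount /dev/xvdf1 /mnt/root/boot
    mount /dev/VolGroup00/homeVol /mnt/root/home
    mount /dev/VolGroup00/varVol  /mnt/root/var
    mkdir -p /mnt/root/var/log
    mount /dev/VolGroup00/logVol  /mnt/root/var/log
    mkdir -p /mnt/root/var/log/audit
    mount /dev/VolGroup00/auditVol /mnt/root/var/log/audit

    # Step 8: base device-nodes (modes and major/minor numbers match a live EL6 system)
    mkdir -p /mnt/root/dev
    mknod -m 600 /mnt/root/dev/console c 5 1
    mknod -m 666 /mnt/root/dev/null    c 1 3
    mknod -m 666 /mnt/root/dev/zero    c 1 5
    mknod -m 666 /mnt/root/dev/random  c 1 8
    mknod -m 666 /mnt/root/dev/urandom c 1 9
    mknod -m 666 /mnt/root/dev/tty     c 5 0
    mknod -m 666 /mnt/root/dev/ptmx    c 5 2

    # Step 9: loopback mounts so yum/rpm behave sanely inside the chroot
    mkdir -p /mnt/root/{proc,sys,dev/pts,dev/shm}
    mount -o bind /proc    /mnt/root/proc
    mount -o bind /sys     /mnt/root/sys
    mount -o bind /dev/pts /mnt/root/dev/pts
    mount -o bind /dev/shm /mnt/root/dev/shm

    # Step 11: install the base package-set into the chroot-able tree (list illustrative)
    yum --installroot=/mnt/root -y groupinstall core
    yum --installroot=/mnt/root -y install kernel grub openssh-server openssh-clients dhclient

    # Step 12: skeleton grub.conf - kernel/initramfs versions must match what yum installed
    {
       echo "default=0"
       echo "timeout=1"
       echo "title CentOS 6"
       echo "        root (hd0,0)"
       echo "        kernel /vmlinuz-2.6.32-504.el6.x86_64 ro root=/dev/VolGroup00/rootVol rd_LVM_LV=VolGroup00/rootVol"
       echo "        initrd /initramfs-2.6.32-504.el6.x86_64.img"
    } > /mnt/root/boot/grub.conf
    ln -s ../boot/grub.conf /mnt/root/etc/grub.conf
    ln -s ../grub.conf /mnt/root/boot/grub/grub.conf
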
I've automated most of the above tasks using some simple shell scripts and the Amazon EC2 tools. Use of the EC2 tools is well documented by Amazon. Their use allows me to automate everything within the instance launched from the Marketplace AMI (I keep all my scripts in Git, so prepping a Marketplace AMI for building custom AMIs takes maybe two minutes on top of launching the generic Marketplace AMI). When automated as I have done, you can go from launching your Marketplace AMI to having a launchable custom AMI in as little as twenty minutes.

Properly automated, generating updated AMIs as security fixes or other patch bundles come out is as simple as kicking off a script, hitting the vending machine for a fresh Mountain Dew, then coming back to launch new, custom AMIs.

Wednesday, November 26, 2014

Converting to EL7: Solving The "Your Favorite Service Isn't Systemd-Enabled" Problem

Having finally gotten off my butt and knocked out my RHCE (for EL6) before the 2014-12-19 drop-dead date, I'm now ready to start focusing on migrating my personal systems to EL7-based distros.

My personal VPS is currently running CentOS 6.6. I use my VPS to host a couple of personal websites and email for family and a few friends. Yes, I realize that it would probably be easier to offload all of this to providers like Google. However, while Google is very good at SPAM-stomping and provides me a very generous amount of space for archiving emails, one area where they do fall short is email aliases: whenever I have to register with a new web-site, I use a custom email address to do so. At my last pruning, I still had 300+ per-site aliases. So, for me, the number of available aliases ("unlimited" is best) and the ease of creating them trump all other considerations.

Since I don't have Google handling my mail for me, I have to run my own A/V and anti-spam engines. Being a good Internet Citizen, I also like to make use of Sender Policy Framework (via OpenSPF) and DomainKeys (currently via DKIMproxy).

I'm only just into the process of sorting out what I need to do to make the transition as quick and painless a process as possible (more for my family and friends than for me). I hate outages. And, with a week off for the Thanksgiving holidays, I've got time to do things in a fairly orderly fashion.

At any rate, one of the things I discovered is that my current DomainKeys solution hasn't been updated to "just work" within the systemd framework used within EL7. This isn't terribly surprising, as it appears that the DKIMproxy SourceForge project may have gone dormant in 2013 (so I'll have to see if there are alternatives that appear to still be a going concern - in the meantime...). Fortunately, the DKIMproxy source code does come with a `chkconfig`-compatible SysV-init script. Even more fortunately, converting from SysV-init to systemd-compatible service control is a bit more straightforward than when I was dealing with moving from Solaris 9's legacy init to Solaris 10's SMF.

If you've already got a `chkconfig` style init script, moving to systemd management is fairly trivial. Your `chkconfig` script can be copied, pretty much "as is", into "/usr/lib/systemd". My (current) preference is to create a "scripts" subdirectory and put it in there. I haven't read deeply enough into systemd to see if this is the "Best Practices" method, however. Also, where I work has no established conventions ...because they only started migrating to EL6 in the fall of 2013 - so I can't exactly crib anything EL7-related from how we do it at work.
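
The copy itself is a one-liner; something like the following (the source path is just a placeholder for wherever your chkconfig-style script currently lives - it is not a DKIMproxy-defined location):

    # Source path is a placeholder - point it at your existing SysV-init script
    mkdir -p /usr/lib/systemd/scripts
    install -p -m 0755 /path/to/your/dkimproxy.init /usr/lib/systemd/scripts/dkim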

Once you have your SysV-init style script placed where it's going to live (e.g., "/usr/lib/systemd/scripts"), you need to create associated service-definition files. In my particular case, I had to create two, as the DKIMproxy software actually has an inbound and an outbound function. Launched from normal SysV-init, it all gets handled as one piece. However, one of the nice things about systemd is that it's not only a launcher framework, it's a service-monitor framework, as well. To take full advantage, I wanted one monitor for the inbound service and one for the outbound service. The legacy init script that DKIMproxy ships with makes this easy enough as, in addition to the normal "[start|stop|restart|status]" arguments, it has per-direction subcommands (e.g., "start-in" and "stop-out"). The service-definition for my "dkim-in.service" looks like:
     [Unit]
     Description=Manage the inbound DKIM service
     After=postfix.service

     [Service]
     Type=forking
     PIDFile=/usr/local/dkimproxy/var/run/dkimproxy_in.pid
     ExecStart=/usr/lib/systemd/scripts/dkim start-in
     ExecStop=/usr/lib/systemd/scripts/dkim stop-in

     [Install]
     WantedBy=multi-user.target

To break down the above:

  • The "Unit" stanza tells systemd a bit about your new service:
    • The "Description" line is just ASCII text that allows you to provide a short, meaningful of what the service does. You can see your service's description field by typing `systemctl -p Description show <SERVICENAME>`
    • The "After" parameter is a space-separated list of other services that you want to have successfully started before systemd attempts to start your new service. In my case, since DKIMproxy is an extension to Postfix, it doesn't make sense to try to have DKIMproxy running until/unless Postfix is running.
  • The "Service" stanza is where you really define how your service should be managed. This is where you tell systemd how to start, stop, or reload your service and what PID it should look for so it knows that the service is still notionally running. The following parameters are the minimum ones you'd need to get your service working. Other parameters are available to provide additional functionality:
    • The "Type" parameter tells systemd what type of service it's managing. Valid types are: simpleforking, oneshot, dbus, notify or idle. The systemd.service man page more-fully defines what each option is best used for. However, for a traditional daemonized service, you're most likely to want "forking".
    • The "PIDFile" parameter tells systemd where to find a file containing the parent PID for your service. It will then use this to do a basic check to monitor whether your service is still running (note that this only checks for presence, not actual functionality).
    • The "ExecStart" parameter tells systemd how to start your service. In the case of a SysV-init script, you point it to the fully-qualified path you installed your script to and then any arguments necessary to make that script act as a service-starter. If you don't have a single, chkconfig-style script that handles both stop and start functions, you'd simply give the path to whatever starts your service. Notice that there are no quotations surrounding the parameter's value-section. If you put quotes - in the mistaken belief that the starter-command and it's argument need to be grouped, you'll get a path error when you go to start your service the first time.
    • The "ExecStop" parameter tells systemd how to stop your service. As with the "ExecStart" parameter, if you're leveraging a fully-featured SysV-init script, you point it to the fully-qualified path you installed your script to and then any arguments necessary to make that script act as a service-stopper. Also, the same rules about white-space and quotation-marks apply to the "ExecStop" parameter as do the "ExecStart" parameter.
  • The "Install" stanza is where you tell systemd the main part of the service dependency-tree to put your service. You have two main dependency-specifiers to choose: "WantedBy" and "RequiredBy". The former is a soft-dependency while the latter is a hard-dependency. If you use the "RequiredBy" parameter, then the service unit-group (e.g., "mult-user.target") enumerated with the "RequiredBy" parameter will only be considered to have successfully onlined if the defined service has successfully launched and stayed running.  If you use the "WantedBy" parameter, then the service unit-group (e.g., "mult-user.target") enumerated with the "WantedBy" parameter will still be considered to have successfully onlined whether the defined service has successfully launched or stayed running. It's most likely you'll want to use "WantedBy" rather than "RequiredBy" as you typically won't want systemd to back off the entire unit-group just because your service failed to start or stay running (e.g., you don't want to stop all of the multi-user mode related processes just because one network service has failed.)

Tuesday, June 17, 2014

UDEV Friendly-Names to Support ASM Under VMware-hosted Linux Guest

This past month or so, we've been setting up a new vSphere hosting environment for a new customer. Our first guinea-pig tenant is being brought into the virtualized hosting-environment. This first tenant has a mix of Windows and Linux systems running a multi-layer data-processing system based on a back-end Oracle database.

As part of our tenancy process, we'd come up with a standard build-request form. In general, we prefer a model that separates application data from OS data. In addition to the usual "how much RAM and CPU do you need" information, the form includes configuration-capture items for storage for applications hosted on the VMs. The table has inputs for both the requested supplemental storage sizes and where/how to mount those chunks.

This first tenant simply filled in a sum of their total additional storage request with no indication as to how they expected to use it. After several iterations of "if you have specific requirements, we need you to detail them" emails, I sent a final "absent the requested configuration specifications, the storage will be added but left unconfigured". It was finally at this point that the tenant responded back saying "there's a setup guide at this URL - please read that and configure accordingly".

Normally, this is not how we do things. The solution we offer tends to be more of an extended IaaS model: in addition to providing a VM container, we provide a basic, hardened OS configuration (installing the OS and patching it to a target state), configure basic networking and name-resolution, and perform basic storage-configuration tasks.

This first tenant was coming from a physical Windows/Red Hat environment and was testing the waters of virtualization. As a result, most of their configuration expectations were based on physical servers (SAN-based storage with native multipathing support). The reference documents they pointed us to were designed for implementing Oracle on a physical system using ASM on top of Linux dm-multipath storage objects ...not something normally done within an ESX-hosted Red Hat Linux configuration.

We weren't going to layer on dm-multipath support, but the tenant still had the expectation of using "friendly" storage-object names for ASM. The easy path to "friendly" storage-object names is to use LVM. However, Oracle generally recommends against using ASM in conjunction with third-party logical volume management systems. So, LVM was off the table. How best to provide the desired storage configs?

I opted to let udev do the work for me. Unfortunately, because we hadn't anticipated this particular requirements-set, the VM templates we'd created didn't have some of the hooks available that would allow udev to do its thing. Specifically, no UUIDs were being presented into the Linux guests. Further complicating things, with the hardened Linux build we furnish, most of the udev tools and the various hardware-information tools are not present. The down side is that this made things more difficult than they probably needed to be. The up side is that the following procedures should be portable across a fairly wide variety of Linux implementations:
  1. To have VMware provide serial-number information - from which UUIDs can be generated by the guest operating system - it's necessary to make a modification to the VM's advanced configuration options. Ensure that the “disk.EnabledUUID” parameter has been created for the VM and its value set to “TRUE”. The specific method for doing so varies depending on whether you use the vSphere web UI or the VPX client (or even the vmcli or direct editing of config files) to do your configuration tasks. Google for the specifics of your preferred management method.
  2. If you had to create/change the value in the prior step, reboot the VM so that the config change takes effect
  3. Present the disks to be used by ASM to the Linux guest – if adding SCSI controllers, this step will need to be done while guest is powered off.
  4. Verify that the VM is able to see the new VMDKs. If supplemental disk presentation was done while the VM was running, initiate a SCSI-bus rescan (e.g., `echo "- - -" > /sys/class/scsi_host/host1/rescan`)
  5. Lay down an aligned, full-disk partition with the `parted` utility for each presented VMDK/disk. For example, if one of the newly-presented VMDKs was seen by the Linux OS as /dev/sdb:

    # parted -s /dev/sdb -- mklabel msdos mkpart primary ext3 1024s 100%

    Specifying an explicit starting-block (at 1024 or 2048 blocks) and using the relative ending-location, as above, will help ensure that your partition is created on an even storage-boundary. Google around for discussions on storage alignment and positive impact on virtualization environments for details on why the immediately-prior is usually a Good Thing™.
  6. Ensure that the “options=” line in the “/etc/scsi_id.config” file contains the “-g” option
  7. For each newly-presented disk, execute the command `/sbin/scsi_id -g -s /block/{sd_device}` and capture the output (a scripted sketch of this and the next two steps appears at the end of this post)
  8. Copy each disk’s serial number (obtained in the prior step) into the “/etc/udev/rules.d/99-oracle-udev.rules” file
  9. Edit the “/etc/udev/rules.d/99-oracle-udev.rules” file, ensuring that each serial number has an entry similar to:

    KERNEL=="sd*",BUS=="scsi",ENV{ID_SERIAL}=="{scsi_id}", NAME="ASM/disk1", OWNER="oracle", GROUP="oinstall", MODE="660"

    The "{scsi_id}" shown above is a variable name: substitute with the values previously captured via the `/sbin/scsi_id` command. The "NAME=" field should be similarly edited to suite and should be unique for each SCSI serial number.

    Note: If attempting to make per disk friendly-names (e.g., “/dev/db1p1”, “/dev/db2p1”, “/dev/frap1”, etc.) it will be necessary to match LUNs by size to appropriate ‘NAME=’ entries

  10. Reboot the system so that the udev service can process the new rule entries
  11. Verify that the desired “/dev/ASM/<NAME>” entries exist
  12. Configure storage-consumers (e.g., “ASM”) to reference the aligned udev-defined device-nodes.
If your Linux system has the various hardware-information tools, udev management interfaces and sg3_utils installed, some of the information-gathering tasks become much easier and some of the reboot steps specified in this document become unnecessary.
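
A scripted version of the scsi_id/rules steps might look something like the following. This is a sketch only: the device list, the "ASM/diskN" naming and the "oracle"/"oinstall" ownership are illustrative and need to be matched to your actual LUN layout.

    #!/bin/sh
    # Append one friendly-name rule per newly-presented disk
    RULEFILE=/etc/udev/rules.d/99-oracle-udev.rules
    DISKNUM=1
    for DEV in sdb sdc sdd ; do
       SERIAL=$(/sbin/scsi_id -g -s /block/${DEV})
       printf 'KERNEL=="sd*",BUS=="scsi",ENV{ID_SERIAL}=="%s", NAME="ASM/disk%s", OWNER="oracle", GROUP="oinstall", MODE="660"\n' \
          "${SERIAL}" "${DISKNUM}" >> ${RULEFILE}
       DISKNUM=$(( DISKNUM + 1 ))
    done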

Thursday, June 12, 2014

Template-Deployed VMs and the "When Was I Built" Problem

For the past number of years, I have been supporting Linux systems hosted within various virtualization environments. Most of these environments have made use of template-based VM deployment.

In a large, dynamic, enterprise-scale environment, the question often comes up, "when was this host built?" In such environments, there may be a number of methods to derive such information - hypervisor management-server logs, service-automation engine logs, etc. However, such data can also be somewhat ephemeral, due to things as small as log-truncation up through replacement of service-automation and configuration-management tools/frameworks.

Fortunately, the Enterprise Linux family of Linux distributions (Red Hat, CentOS, Scientific Linux, etc.) offers a fairly stable method for determining when a system was first provisioned. Whenever you first build an ELx-based system, one of the files that gets installed - and then never gets updated - is the "basesystem" RPM. So, if you look at the install date for this RPM (and the system time was correctly set at installation time), you will have an accurate representation of when the system was built.

That said, it had previously occurred to me (a while ago, actually) that the “deploy from template” method of building Linux VMs precludes using the RPM database to determine system build-time. Unlike with a KickStarted system - where you can always run `rpm -q --qf '%{installtime:date}\n' basesystem` and it will give you the install-date for the system - doing so on a template-built system will mislead you. When deployed from a template, that method returns when the template VM was built, not when the running VM was deployed from that template.

This had been bugging me for several years. I'd even posed the question of how to solve it on a few forums, to no avail (a number of respondents hadn't been aware of the "use basesystem to show my system install-date" trick, so hadn't investigated how to solve a problem they didn't know existed). One day, while I was at our engineering lab waiting for some other automated tasks to run, I had one of those "I wonder if this will work" moments that allowed me to finally figure out how to “massage” the RPM database so that the basesystem RPM can reflect a newer install date:

# rpm -q --qf '%{installtime:date}\n' basesystem
Tue 12 Jul 2011 11:24:06 AM EDT
# rpm -i --force --justdb basesystem-10.0-4.el6.noarch.rpm
# rpm -q --qf '%{installtime:date}\n' basesystem
Wed 11 Jun 2014 09:21:13 PM EDT

Thus, if you drop something similar to the above into your VM's system prep/cloudinit/etc. scripts, your resultant VM will have its instantiation-date captured and not just its template-build date.
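
As a concrete (if minimal) example of dropping it into a first-boot mechanism - the staged-RPM location and the use of a run-once script are my own assumptions, not requirements - something like this works:

    #!/bin/sh
    # Run-once, first-boot task: stamp the RPM database so basesystem's install-date
    # reflects this VM's deployment date rather than the template's build date
    rpm -i --force --justdb /root/basesystem-10.0-4.el6.noarch.rpm \
       && rm -f /root/basesystem-10.0-4.el6.noarch.rpm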