Wednesday, November 23, 2016

Manually Mirroring Git Repos

If you're like me, you've had occasions where you need to replicate Git-managed projects from one Git service to another. If you're not like me, you use paid services that make such mundane tasks a matter of clicking a few buttons in a GUI. If, however, you need to copy projects from one repository service to another and no one has paid to make a GUI button/config-page available to you, then you need to find other methods to get things done.

The following assumes that you have a Git project hosted in one repository service (e.g., GitHub) that you wish to mirror to another repository service (e.g., BitBucket, AWS CodeCommit, etc.). The basic workflow looks like the following:

Procedure Outline:

  1. Log in to a git-enabled host
  2. Create a copy of your "source-of-truth" repository, depositing its contents into a staging-directory:
    git clone --mirror \
       <REPOSITORY_USER>@<REPOSITORY1.DNS.NAME>:<PROJECT_USER_OR_GROUP>/<PROJECT_NAME>.git \
       stage
  3. Navigate into the staging-directory:
    cd stage
  4. Set the push-destination to the copy-repository:
    git remote set-url --push origin \
       <REPOSITORY_USER>@<REPOSITORY2.DNS.NAME>:<PROJECT_USER_OR_GROUP>/<PROJECT_NAME>.git
  5. Ensure the staging-directory's data is still up to date:
    git fetch -p origin
  6. Push the copied source-repository's data to the copy-repository:
    git push --mirror
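
Taken together, the steps above wrap up neatly into a small helper-script. The following is a minimal sketch - the SRC and DST values are placeholders for your own repositories' URLs:

#!/bin/bash
# Minimal mirroring-script sketch: substitute your own repository
# URLs for the SRC and DST placeholder-values
SRC="<REPOSITORY_USER>@<REPOSITORY1.DNS.NAME>:<PROJECT_USER_OR_GROUP>/<PROJECT_NAME>.git"
DST="<REPOSITORY_USER>@<REPOSITORY2.DNS.NAME>:<PROJECT_USER_OR_GROUP>/<PROJECT_NAME>.git"

git clone --mirror "${SRC}" stage          # copy the source-of-truth repo
cd stage || exit 1
git remote set-url --push origin "${DST}"  # point pushes at the copy-repo
git fetch -p origin                        # ensure staged data is current
git push --mirror                          # push everything to the copy-repo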

Using an example configuration (the AMIgen6 project):
$ git clone --mirror git@github.com:ferricoxide/AMIgen6.git stage && \
  cd stage && \
  git remote set-url --push origin git@bitbucket.org:ferricoxide/amigen6-copy.git && \
  git fetch -p origin && \
  git push --mirror
Cloning into bare repository 'stage'...
remote: Counting objects: 789, done.
remote: Total 789 (delta 0), reused 0 (delta 0), pack-reused 789
Receiving objects: 100% (789/789), 83.72 MiB | 979.00 KiB/s, done.
Resolving deltas: 100% (409/409), done.
Checking connectivity... done.
Counting objects: 789, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (369/369), done.
Writing objects: 100% (789/789), 83.72 MiB | 693.00 KiB/s, done.
Total 789 (delta 409), reused 789 (delta 409)
To git@bitbucket.org:ferricoxide/amigen6-copy.git
 * [new branch]      ExtraRPMs -> ExtraRPMs
 * [new branch]      SELuser-fix -> SELuser-fix
 * [new branch]      master -> master
 * [new branch]      refs/pull/38/head -> refs/pull/38/head
 * [new branch]      refs/pull/39/head -> refs/pull/39/head
 * [new branch]      refs/pull/40/head -> refs/pull/40/head
 * [new branch]      refs/pull/41/head -> refs/pull/41/head
 * [new branch]      refs/pull/42/head -> refs/pull/42/head
 * [new branch]      refs/pull/43/head -> refs/pull/43/head
 * [new branch]      refs/pull/44/head -> refs/pull/44/head
 * [new branch]      refs/pull/52/head -> refs/pull/52/head
 * [new branch]      refs/pull/53/head -> refs/pull/53/head
 * [new branch]      refs/pull/54/head -> refs/pull/54/head
 * [new branch]      refs/pull/55/head -> refs/pull/55/head
 * [new branch]      refs/pull/56/head -> refs/pull/56/head
 * [new branch]      refs/pull/57/head -> refs/pull/57/head
 * [new branch]      refs/pull/62/head -> refs/pull/62/head
 * [new branch]      refs/pull/64/head -> refs/pull/64/head
 * [new branch]      refs/pull/65/head -> refs/pull/65/head
 * [new branch]      refs/pull/66/head -> refs/pull/66/head
 * [new branch]      refs/pull/68/head -> refs/pull/68/head
 * [new branch]      refs/pull/71/head -> refs/pull/71/head
 * [new branch]      refs/pull/73/head -> refs/pull/73/head
 * [new branch]      refs/pull/76/head -> refs/pull/76/head
 * [new branch]      refs/pull/77/head -> refs/pull/77/head

Updating (and Automating)

To keep your copy-repository's project in sync with your source-repository's project, periodically do:
cd stage && \
  git fetch -p origin && \
  git push --mirror
This can be accomplished by logging into a host and executing the steps manually, or by placing them into a cron job like the one sketched below.
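
As a minimal sketch of the cron approach - assuming the staging-directory lives at /var/tmp/stage (a made-up path) and that the invoking user can authenticate to both repositories non-interactively - an hourly sync entry might look like:

# crontab entry: re-sync the mirror at the top of every hour
# (/var/tmp/stage is a placeholder for wherever you staged the clone)
0 * * * * cd /var/tmp/stage && git fetch -p origin && git push --mirror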

Thursday, November 3, 2016

Automation in a Password-only UNIX Environment

Occasionally, my organization has need to run ad hoc queries against a large number of Linux systems. Usually, this is a great use-case for an enterprise CM tool like HPSA, Satellite, etc. Unfortunately, the organization I consult to is between solutions (their legacy tool burned down and their replacement has yet to reach a working state). The upshot is that one needs to do things a bit more on-the-fly.

My preferred method for accessing systems is some kind of token-based authentication framework. When hosts are AD-integrated, you can often use Kerberos for this. Failing that, you can sub in key-based logins (if all of your targets have your key as an authorized key). While my customer's systems are AD-integrated, their security controls preclude both AD/Kerberos's single sign-on capabilities and SSH key-based logins (and, to be honest, almost none of the several hundred targets I needed to query had my key configured as an authorized key).

Because (tunneled) password-based logins are forced, I was initially looking at the prospect of having to write an expect script to avoid having to type in my password several hundred times. Fortunately, there's an alternative to this in the tool "sshpass".

sshpass lets you supply your password by a number of methods: a command-line argument, a password-containing file, an environment variable's value, or even a read from STDIN. I'm not a fan of text files containing passwords (they've a bad tendency to be forgotten and left on a system - bad juju). I'm not particularly a fan of command-line arguments, either - especially on a multi-user system where others might see your password if they `ps` at the wrong time (which increases in probability as the length of time your job runs goes up). The STDIN method is marginally less awful than the command-arg method (for similar reasons). At least with an environment variable, the value really only sits in memory (especially if you've set your HISTFILE location to someplace non-persistent).
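
As a quick sketch of the environment-variable method - reading the password with a silent prompt so it never lands on the command line or in your shell history (the user, host, and remote command are placeholders):

# Prompt silently, so the password never echoes or lands in history
read -s -p "Query-account password: " SSHPASS
export SSHPASS

# sshpass's -e flag tells it to read the password from ${SSHPASS}
sshpass -e ssh -o PubkeyAuthentication=no <USER>@<HOST> 'uname -r'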

The particular audit I was doing was an attempt to determine the provenance of a few hundred VMs. Over time, the organization has used templates authored by several groups - and different personnel within one of the groups. I needed to scan all of the systems to see which template they might have been using since the template information had been deleted from the hosting environment. Thus, I needed to run an SSH-encapsulated command to find the hallmarks of each template. Ultimately, what I ended up doing was:

  1. Pushed my query-account's password into the environment variable used by sshpass, "SSHPASS"
  2. Generated a file containing the IPs of all the VMs in question.
  3. Set up a for loop to iterate over that list
  4. Looped `sshpass -e ssh -o PubkeyAuthentication=no -o StrictHostKeyChecking=no <USER>@${HOST} <AUDIT_COMMAND_SEQUENCE> 2>&1`
  5. Jammed STDOUT through a sed filter to strip off the crap I wasn't interested in and put CSV-appropriate delimiters into each queried host's string.
  6. Captured the lot to a text file (see the sketch below)
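
A minimal sketch of that loop - the host-list file, user, audit command, output file, and the sed expression are all placeholders (the real sed filter depends on what your audit command emits):

#!/bin/bash
# Assumes SSHPASS was already exported (step 1) and that hosts.txt
# contains one target IP per line (step 2)
for HOST in $(cat hosts.txt)
do
   sshpass -e ssh -o PubkeyAuthentication=no \
      -o StrictHostKeyChecking=no \
      "<USER>@${HOST}" '<AUDIT_COMMAND_SEQUENCE>' 2>&1 | \
      sed "s/^/${HOST},/"   # tag each output-line with its source host
done > audit.csv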
The "PubkeyAuthentication=no" option was required because I pretty much always have SSH-agent (or agent-forwarding) enabled. This causes my key to be passed. With the targets' security settings, this causes the connection to be aborted unless I explicitly suppress the passing of my agent-key.

The "StrictHostKeyChecking=no" option was required because I'd never logged into these hosts before. Our standard SSH client config is to require confirmation before accepting the remote key (and shoving it into ${HOME}/.ssh/known_hosts). Without this option, I'd be required to confirm acceptance of each key ...which is just about as automation-breaking as having to retype your password hundreds of times is.

Once the above was done, I had a nice CSV that could be read into Excel, and a spreadsheet was turned over to the person asking "who built/owns these systems".

This method also meant that for the hosts that refused the audit credentials, it was easy enough to report "...and this lot aren't properly configured to work with AD".