ssh deploy key for continuous delivery

One pattern I see over and over again when looking at continuous delivery pipelines is the use of an ssh client and a private key to connect to a remote ssh endpoint. Triggering scripts, restarting services, or moving files around could all be part of your deployment process. Keeping a private ssh key “secured” is critical to limiting access to the resources reachable by ssh. Whether you use your own in-house application (read “unreliable mess of shell scripts”), Travis CI, Bitbucket Pipelines, or some other CD solution, you may find yourself wanting to store an ssh private key for use during deployment.

Bitbucket Pipelines already has a built-in way to store and provide ssh deploy keys; however, this is an example of how to roll your own. The steps are pretty simple. We create an encrypted ssh private key, its corresponding public key, and a 64 character passphrase for the private key. The encrypted private key and public key get checked in to the repository, and the passphrase gets stored as a “secured” Bitbucket Pipelines variable. At build time, the private key is decrypted into a file using the Bitbucket Pipelines passphrase variable. The ssh client can then use that key to connect to whatever resources you need it to.

set -e

if [ ! `which openssl` ] || [ ! `which ssh-keygen` ] || [ ! `which jq` ] || [ ! `which curl` ]; then
  echo "need ssh-keygen, openssl, jq, and curl to continue"
  exit 1
fi

# generate random string of 64 characters
echo "generating random string for ssh deploy key passphrase..."
DEPLOY_KEY_PASSPHRASE=`< /dev/urandom LC_CTYPE=C tr -dc A-Za-z0-9#^%@ | head -c ${1:-64}`

# save passphrase in file to be used by openssl
echo "saving passphrase for use with openssl..."
echo -n "${DEPLOY_KEY_PASSPHRASE}" >passphrase.txt

# generate encrypted ssh rsa key using passphrase
echo "generating encrypted ssh private key with passphrase..."
openssl genrsa -out id_rsa_deploy.pem -passout file:passphrase.txt -aes256 4096
chmod 600 id_rsa_deploy.pem

# decrypt ssh rsa key using passphrase
echo "decrypting ssh private key with passphrase to temp file..."
openssl rsa -in id_rsa_deploy.pem -out id_rsa_deploy.tmp -passin file:passphrase.txt
chmod 600 id_rsa_deploy.tmp

# generate public ssh key for use on target deployment server
echo "generating public key from private key..."
ssh-keygen -y -f id_rsa_deploy.tmp > id_rsa_deploy.pub

# remove unencrypted ssh rsa key
echo "removing unencrypted temp file..."
rm id_rsa_deploy.tmp

# ask user for bitbucket credentials
echo -n "enter bitbucket username: "
read BBUSER
echo -n "enter bitbucket password: "
read -s BBPASS
echo

# bitbucket API doesn't have "UPSERT" capability for creating(if not exists) or updating(if exists) variables
# pipelines variables endpoint; BB_WORKSPACE and BB_REPO_SLUG are placeholders, set them for your repository
BB_API="https://api.bitbucket.org/2.0/repositories/${BB_WORKSPACE}/${BB_REPO_SLUG}/pipelines_config/variables/"

# get variable if exists
echo "getting variable uuid if variable exists"
DEPLOY_KEY_PASSPHRASE_UUID=`curl -s --user ${BBUSER}:${BBPASS} -X GET -H "Content-Type: application/json" "${BB_API}" | jq -r '.values[]|select(.key=="DEPLOY_KEY_PASSPHRASE").uuid'`

if [ -z "${DEPLOY_KEY_PASSPHRASE_UUID}" ]; then
  # create bitbucket pipeline variable
  echo "DEPLOY_KEY_PASSPHRASE variable does not exist... creating DEPLOY_KEY_PASSPHRASE"
  curl -s --user ${BBUSER}:${BBPASS} -X POST -H "Content-Type: application/json" -d '{"key":"DEPLOY_KEY_PASSPHRASE","value":"'"${DEPLOY_KEY_PASSPHRASE}"'","secured":"true"}' "${BB_API}"
else
  # update existing bitbucket pipeline variable by uuid
  # use --globoff to avoid curl interpreting curly braces in the variable uuid
  echo "DEPLOY_KEY_PASSPHRASE variable exists... updating DEPLOY_KEY_PASSPHRASE"
  curl --globoff -s --user ${BBUSER}:${BBPASS} -X PUT -H "Content-Type: application/json" -d '{"key":"DEPLOY_KEY_PASSPHRASE","value":"'"${DEPLOY_KEY_PASSPHRASE}"'","secured":"true","uuid":"'"${DEPLOY_KEY_PASSPHRASE_UUID}"'"}' "${BB_API}${DEPLOY_KEY_PASSPHRASE_UUID}"
fi


# after passphrase is stored in bitbucket remove passphrase file
echo "DEPLOY_KEY_PASSPHRASE successfully stored on bitbucket, removing passphrase file..."
rm passphrase.txt

echo "add, commit, and push encrypted private key and corresponding public key, update ssh targets with new public key"
echo "  ->  git add id_rsa_deploy.pem && git commit -m 'deploy ssh key roll' && git push"

We use the bitbucket username and password to authenticate the person running the script; they need access to insert the new ssh deploy key passphrase as a “secured” variable using the bitbucket API. The person running the script never sees the passphrase and doesn’t need to know what it is. The script can be re-run easily to roll the key pair and passphrase. That matters, because when you need to roll a compromised key, you should never have to remember that damn openssl command that you have used your entire career but somehow have never memorized.
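That openssl/ssh-keygen round trip is the part worth sanity-checking on its own. A minimal throwaway sketch (file names and the passphrase here are placeholders, written to /tmp for demonstration only):

```shell
# write a demo passphrase, then generate an encrypted RSA key with it
echo -n "example-passphrase" > /tmp/pp.txt
openssl genrsa -out /tmp/key.pem -passout file:/tmp/pp.txt -aes256 2048

# decrypt it back; this step fails if the passphrase file is wrong
openssl rsa -in /tmp/key.pem -out /tmp/key.dec -passin file:/tmp/pp.txt
chmod 600 /tmp/key.dec

# derive the public half from the decrypted private key
ssh-keygen -y -f /tmp/key.dec > /tmp/key.pub
cut -d' ' -f1 /tmp/key.pub
# prints: ssh-rsa
```

If the decrypt step succeeds and the derived public key matches what you deployed, the encrypted key checked into the repository is usable.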

Here’s how you could use the key in a Bitbucket Pipelines build container:

set -e

# store passphrase from BitBucket secure variable into file
# file is on /dev/shm tmpfs in memory (don't put secrets on disk)
echo "creating passphrase file from BitBucket secure variable DEPLOY_KEY_PASSPHRASE"
echo -n "${DEPLOY_KEY_PASSPHRASE}" >/dev/shm/passphrase.txt

# use passphrase to decrypt ssh key into tmp file (again in memory backed file system)
echo "writing decrypted ssh key to tmp file"
openssl rsa -in id_rsa_deploy.pem -out /dev/shm/id_rsa_deploy.tmp -passin file:/dev/shm/passphrase.txt
chmod 600 /dev/shm/id_rsa_deploy.tmp

# invoke ssh-agent to manage keys
echo "starting ssh-agent"
eval `ssh-agent -s`

# add ssh key to ssh-agent
echo "adding key to ssh-agent"
ssh-add /dev/shm/id_rsa_deploy.tmp

# remove tmp ssh key and passphrase now that the key is in ssh-agent
echo "cleaning up decrypted key and passphrase file"
rm /dev/shm/id_rsa_deploy.tmp /dev/shm/passphrase.txt

# get ssh host key
echo "getting host keys"
ssh-keyscan -H >> $HOME/.ssh/known_hosts

# test the key
echo "testing key"
ssh "uptime"

It uses a tmpfs memory backed file system to store the key and passphrase, and ssh-agent to add the key to the session. How secure is secure enough? Whether you use the built-in Pipelines ssh deploy key, roll your own with this method and store a passphrase in a variable, or store the ssh key as a base64 encoded blob in a variable, you essentially have to trust the provider to keep your secrets secret.

There are some changes you could make to all of this, but it’s good boilerplate. Other things to think about:

    rewrite this in python and do automated key rolls once a day with Lambda, storing the dedicated bitbucket user/pass and git key in KMS.
    do you really need ssh-agent?
    you could turn off strict host key checking instead of using ssh-keyscan
    could this be useful for x509 TLS certs?
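On the strict host key checking point: instead of pre-seeding known_hosts with ssh-keyscan, you could disable the check for a dedicated, disposable build target. A sketch of the ssh_config fragment involved (the Host alias is a placeholder); note this trades away man-in-the-middle protection, so it only makes sense for throwaway hosts:

```
# ~/.ssh/config fragment for a throwaway build target only
Host deploy-target
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
```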

make Makefile target for help or usage options

Using make and Makefiles with a docker based application development strategy is a great way to track shortcuts and let team members easily run common docker or application tasks without having to remember the syntax specifics. When invoked without arguments, make attempts to run the first target in the file (the default goal). This may be desirable in some cases, but I find it useful to have make just print out a usage message and require the operator to specify the exact target they need.

DC=docker-compose
DE=docker-compose exec app

.PHONY: help
help:
	@sh -c "echo ; echo 'usage: make <target> ' ; cat Makefile | grep ^[a-z] | sed -e 's/^/            /' -e 's/://' -e 's/help/help (this message)/'; echo"

up:
	$(DC) up -d

stop:
	$(DC) stop

rm:
	$(DC) rm -v

ps:
	$(DC) ps

logs:
	$(DC) logs

test:
	$(DE) sh -c "vendor/bin/phpunit"

Now without any arguments make outputs a nice little usage message:

$ make 

usage: make <target> 
            help (this message)
            up
            stop
            rm
            ps
            logs
            test

This assumes a bunch of things, like calling make from the correct directory, but it’s a good working proof of concept.

use tmpfs for docker images

For i/o intensive Docker builds, you may want to configure Docker to use memory backed storage for images and containers. Ephemeral storage has several applications, but in this case our Docker engine is on a temporary EC2 spot instance participating in a continuous delivery pipeline. In other words, it’s ok to lose the instance and all of the Docker images it has on it. This is for a systemd based system, in this case Ubuntu 16.04.

Create the tmpfs, then reconfigure the Docker systemd unit to use it:

mkdir /mnt/docker-tmp
mount -t tmpfs -o size=25G tmpfs /mnt/docker-tmp
sed -i 's|/mnt/docker|/mnt/docker-tmp|' /etc/systemd/system/docker.service.d/docker-startup.conf
systemctl daemon-reload
systemctl restart docker

This could be part of a bootstrapping script for build instances, or more effectively translated into config management or rolled into an AMI.
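For reference, after the sed the drop-in unit might look roughly like this; a sketch assuming dockerd’s --data-root flag (older versions used -g), since the exact contents of docker-startup.conf vary by install:

```
# /etc/systemd/system/docker.service.d/docker-startup.conf (illustrative)
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --data-root /mnt/docker-tmp
```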

percentile apache server request response times

I needed a hack to quickly find the 95th percentile of apache request response times; for example, to be able to say that “95% of our apache requests are served in X milliseconds or less.” In the apache2 config the LogFormat directive had %D (the time taken to serve the request, in microseconds) as the last field, meaning the last field of each log line is the time it took to serve the request. This makes it easy to pull out with $NF in awk:

# PCT=.95; NR=`cat /var/log/apache2/access.log | wc -l`; cat /var/log/apache2/access.log | awk '{print $NF}' | sort -rn | tail -n+$(echo "$NR-($NR*$PCT)" |bc | cut -d. -f1) |head -1

In this case 95% of the apache requests were served in 938 milliseconds or less (WTF?!). You can then run this against an aggregated group of logs, restrict the date/time range to a particular day, or repeat it for multiple time periods.

Note: I couldn’t get scale to work here in bc for some reason.
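To see the arithmetic work on known data, here is the same pipeline against 100 fake response times (1 through 100) instead of a real log; with this count-from-the-top method the 95th percentile lands at 96:

```shell
# hypothetical data: 100 fake response times, one per line, standing in for $NF from the log
seq 1 100 > /tmp/times.txt

PCT=.95
NR=$(wc -l < /tmp/times.txt)
# sort descending, skip past the top 5%, and the first remaining line is the 95th percentile
sort -rn /tmp/times.txt | tail -n+$(echo "$NR-($NR*$PCT)" | bc | cut -d. -f1) | head -1
# prints 96
```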

wget use gzip header to receive compressed output

This test endpoint returns Content-Type: application/json

Without gzip enabled header:

$ wget -qO test https://testendpoint
$ file test
test: ASCII text, with very long lines, with no line terminators
$ du -b test
7307    test

Setting the gzip enabled header:

$ wget --header="accept-encoding: gzip" -qO test.gz https://testendpoint
$ file test.gz
test.gz: gzip compressed data, from Unix
$ du -b test.gz
1694    test.gz

Telling the server that wget can accept gzip compressed content results in a 77% reduction in bytes transferred.

list tables in mysql dump

$ zgrep -o '^CREATE.*' database_backup.sql.gz

CREATE TABLE `aa_migrations` (
CREATE TABLE `abcdefg_bar` (
CREATE TABLE `abcdefg_foo` (
CREATE TABLE `abcdefg_images` (
CREATE TABLE `abcdefg_table12` (
CREATE TABLE `abcdefg_table13` (
CREATE TABLE `abcdefg_table14` (
CREATE TABLE `abcdefg_table15` (
CREATE TABLE `abcdefg_users` (

You could also just count them:

$ zgrep -o '^CREATE.*' database_backup.sql.gz | wc -l
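A slightly tighter variant pulls out just the table names. Here is a self-contained demo against a tiny fake dump (the table names are made up):

```shell
# build a tiny gzipped "dump" to run the pattern against
printf 'CREATE TABLE `aa_migrations` (\nINSERT INTO `aa_migrations` VALUES (1);\nCREATE TABLE `users` (\n' | gzip > /tmp/dump.sql.gz

# extract only the backticked table names
zgrep -o '^CREATE TABLE `[^`]*`' /tmp/dump.sql.gz | cut -d'`' -f2
# prints:
# aa_migrations
# users
```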

AWS Lambda function to call bash

Use this node.js snippet to get Lambda execution of a bash command. In this case bash is just doing a curl to a URL. You could also write an entire bash script and call it.

var child_process = require('child_process');

exports.handler = function(event, context) {
  var child = child_process.spawn('/bin/bash', [ '-c', 'curl --silent --output - --max-time 1 || true' ], { stdio: 'inherit' });

  child.on('close', function(code) {
    if(code !== 0) {
      return context.done(new Error("non zero process exit code"));
    }
    context.done(null);
  });
};


AWS find unused instance reservations

AWS Trusted Advisor has some good metrics on Cost Optimization when it comes to looking at your infrastructure portfolio and potential savings; however, it doesn’t offer a good way to see underutilized reservations. Using awscli you can check your reservations against what is actually running in order to find them:

# super quick hack to see difference between reserved and actual usage for RI

first=true
for az in us-east-1b us-east-1d; do
  if [ "$first" == "true" ]; then
    echo "AvailabilityZone InstanceType Reserved Actual Underprovisioned"
    first=false
  fi
  for itype in `aws ec2 describe-reserved-instances --filters Name=state,Values=active Name=availability-zone,Values=$az | grep InstanceType | sed -e 's/ //g' | cut -d'"' -f 4 | sort -u`; do
    rcount=$(aws ec2 describe-reserved-instances --filters Name=state,Values=active Name=availability-zone,Values=$az Name=instance-type,Values=$itype | egrep 'InstanceCount' | awk '{print $NF}' | awk 'BEGIN{i=0}{i=i+$1}END{print i}')
    icount=$(aws ec2 describe-instances --filter Name=availability-zone,Values=$az Name=instance-type,Values=$itype | grep InstanceId | wc -l)

    echo "$az $itype $rcount $icount" | awk '{ if($4<$3){alert="YES"} print $1" "$2" "$3" "$4" "alert}'
  done
done | column -t


AvailabilityZone  InstanceType  Reserved  Actual  Underprovisioned
us-east-1b        m3.2xlarge    2         0       YES
us-east-1b        m3.medium     2         8
us-east-1b        m3.xlarge     11        4       YES
us-east-1b        m4.xlarge     2         46
us-east-1d        m3.medium     2         8
us-east-1d        m3.xlarge     6         0       YES
us-east-1d        m4.xlarge     1         17

use iozone to benchmark NFS

iozone -t 32 -s1g -r16k -i 0 -i 1

32 threads operating on a 1G file in 16k chunks doing test 0 (write/re-write) and test 1 (read/re-read)

NFSv3 mounted AWS c4.xlarge 500GB gp2 EBS vol formatted EXT4
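For context, an NFSv3 client mount like the one under test might be declared like this (server, export path, and mount point are placeholders):

```
# /etc/fstab entry forcing NFSv3 (illustrative)
nfsserver:/export  /mnt/nfs  nfs  vers=3,rw,hard  0 0
```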

        Iozone: Performance Test of File I/O
                Version $Revision: 3.397 $
                Compiled for 64 bit mode.
                Build: linux-AMD64 

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                     Ben England.  

        Run began: Fri May 13 15:46:09 2016

        File size set to 1048576 KB
        Record Size 16 KB
        Command line used: iozone -t 32 -s1g -r16k -i 0 -i 1
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 32 processes
        Each process writes a 1048576 Kbyte file in 16 Kbyte records

        Children see throughput for 32 initial writers  =   82456.18 KB/sec
        Parent sees throughput for 32 initial writers   =   73910.24 KB/sec
        Min throughput per process                      =    2501.01 KB/sec
        Max throughput per process                      =    2746.40 KB/sec
        Avg throughput per process                      =    2576.76 KB/sec
        Min xfer                                        =  954736.00 KB

        Children see throughput for 32 rewriters        =   81779.61 KB/sec
        Parent sees throughput for 32 rewriters         =   78548.79 KB/sec
        Min throughput per process                      =    2509.20 KB/sec
        Max throughput per process                      =    2674.01 KB/sec
        Avg throughput per process                      =    2555.61 KB/sec
        Min xfer                                        =  983856.00 KB

        Children see throughput for 32 readers          =   91412.23 KB/sec
        Parent sees throughput for 32 readers           =   90791.54 KB/sec
        Min throughput per process                      =    2761.06 KB/sec
        Max throughput per process                      =    2910.95 KB/sec
        Avg throughput per process                      =    2856.63 KB/sec
        Min xfer                                        =  998720.00 KB

        Children see throughput for 32 re-readers       =   91781.74 KB/sec
        Parent sees throughput for 32 re-readers        =   91620.13 KB/sec
        Min throughput per process                      =    2832.08 KB/sec
        Max throughput per process                      =    2885.37 KB/sec
        Avg throughput per process                      =    2868.18 KB/sec
        Min xfer                                        = 1029440.00 KB

iozone test complete.