use tmpfs for docker images

For I/O-intensive Docker builds, you may want to configure Docker to use memory-backed storage for images and containers. Ephemeral storage has several applications, but in this case our Docker engine is on a temporary EC2 spot instance and participates in a continuous delivery pipeline. In other words, it’s OK to lose the instance and all of the Docker images on it. This is for a systemd-based system, in this case Ubuntu 16.04.

Create the tmpfs, then reconfigure the Docker systemd unit to use it:

mkdir /mnt/docker-tmp
mount -t tmpfs -o size=25G tmpfs /mnt/docker-tmp
sed -i 's|/mnt/docker|/mnt/docker-tmp|' /etc/systemd/system/docker.service.d/docker-startup.conf
systemctl daemon-reload
systemctl restart docker

This could be part of a bootstrapping script for build instances, or more effectively translated into config management or rolled into an AMI.
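If these steps end up in a bootstrap script, it probably helps to make the mount step safe to run more than once. A minimal sketch, assuming the /mnt/docker-tmp path and 25G size used above; the function names mount_fstype and ensure_tmpfs, and the optional second argument that points at an alternate mounts table, are illustrative and not part of Docker or systemd:

```shell
#!/bin/bash
# Print the filesystem type mounted at a mount point, reading a
# /proc/mounts-style table. The second argument is a hypothetical override
# so the function can be exercised against a sample file.
mount_fstype() {
  local mountpoint="$1" table="${2:-/proc/mounts}"
  # Fields: device mountpoint fstype options dump pass.
  # If a path is mounted more than once, the last entry wins.
  awk -v m="$mountpoint" '$2 == m { t = $3 } END { if (t != "") print t }' "$table"
}

# Mount a tmpfs of the given size at the mount point unless one is already there.
# Needs root when it actually mounts.
ensure_tmpfs() {
  local mountpoint="$1" size="$2"
  if [ "$(mount_fstype "$mountpoint")" = "tmpfs" ]; then
    echo "$mountpoint is already on tmpfs"
    return 0
  fi
  mkdir -p "$mountpoint"
  mount -t tmpfs -o size="$size" tmpfs "$mountpoint"
}
```

Calling ensure_tmpfs /mnt/docker-tmp 25G before the sed and systemctl steps would then mount only on the first run. Note that the sed substitution itself is not repeat-safe and would need its own guard.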

percentile apache server request response times

I needed a hack to quickly find the 95th percentile of Apache request response times. For example, I needed to be able to say that “95% of our Apache requests are served in X milliseconds or less.” In the apache2 config, the LogFormat directive had %D (the time taken to serve the request, in microseconds) as the last field, so the last field of each log line is the time it took to serve the request. That makes it easy to pull out with $NF in awk:

# PCT=.95; LOG=/var/log/apache2/access.log; NR=$(wc -l < "$LOG"); awk '{print $NF}' "$LOG" | sort -rn | tail -n +$(echo "$NR-($NR*$PCT)" | bc | cut -d. -f1) | head -1
938247

In this case 95% of the Apache requests were served in 938 milliseconds or less (WTF?!). You can then run this on an aggregated group of logs, or restrict it to the logs for a particular day or for multiple time periods.

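Note: I couldn’t get scale to work here in bc for some reason, which is why cut is used to drop the fractional part. This is most likely because bc applies scale to division but does not truncate the result of a multiplication to it.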
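The same kind of number can be computed in sort and awk alone, which sidesteps bc and the fractional-part workaround. This is a sketch using the nearest-rank definition of a percentile, so on some inputs it may pick a neighbouring sample compared to the one-liner above; the function name percentile is made up for illustration:

```shell
#!/bin/bash
# percentile P: read one number per line on stdin and print the
# nearest-rank Pth percentile (P between 0 and 100).
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }                      # values arrive sorted ascending
    END {
      if (NR == 0) exit 1               # no input, nothing to print
      rank = p * NR / 100
      k = int(rank); if (k < rank) k++  # ceiling of the rank
      if (k < 1) k = 1
      print v[k]
    }'
}
```

Usage against the log from this post would look like: awk '{print $NF}' /var/log/apache2/access.log | percentile 95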

wget use gzip header to receive compressed output

This test endpoint returns Content-Type: application/json

Without gzip enabled header:

$ wget -qO test https://testendpoint
$ file test
test: ASCII text, with very long lines, with no line terminators
$ du -b test
7307    test

Setting the gzip enabled header:

$ wget --header="accept-encoding: gzip" -qO test.gz https://testendpoint
$ file test.gz
test.gz: gzip compressed data, from Unix
$ du -b test.gz
1694    test.gz

Telling the server that wget can accept gzip-compressed content results in a 77% reduction in bytes transferred.
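The 77% figure comes from comparing the two du -b sizes. As a quick sketch of the arithmetic (reduction_pct is a hypothetical helper name, not a standard tool):

```shell
#!/bin/bash
# Print the percentage saved when a payload shrinks from $1 bytes to $2 bytes,
# rounded to the nearest whole percent.
reduction_pct() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.0f\n", (1 - after / before) * 100 }'
}

reduction_pct 7307 1694   # sizes from the two du -b runs; prints 77
```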

list tables in mysql dump

$ zgrep -o '^CREATE.*' database_backup.sql.gz

CREATE TABLE `aa_migrations` (
CREATE TABLE `abcdefg_bar` (
CREATE TABLE `abcdefg_foo` (
CREATE TABLE `abcdefg_images` (
CREATE TABLE `abcdefg_table12` (
CREATE TABLE `abcdefg_table13` (
CREATE TABLE `abcdefg_table14` (
CREATE TABLE `abcdefg_table15` (
CREATE TABLE `abcdefg_users` (

You could also just count them:

$ zgrep -o '^CREATE.*' database_backup.sql.gz | wc -l
9
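If only the bare table names are wanted, for example to feed into another script, sed can strip the surrounding SQL. A small sketch, assuming the dump quotes table names in backticks as in the output above; list_tables is an illustrative name:

```shell
#!/bin/bash
# Read a SQL dump on stdin and print one table name per line,
# taken from lines of the form: CREATE TABLE `name` (
list_tables() {
  sed -n 's/^CREATE TABLE `\([^`]*\)`.*/\1/p'
}
```

With the compressed dump from this post that would be: zcat database_backup.sql.gz | list_tables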

AWS Lambda function to call bash

Use this Node.js snippet to have Lambda execute a bash command. In this case bash is just doing a curl to a URL. You could also write an entire bash script and call it.

var child_process = require('child_process');

exports.handler = function(event, context) {
  var child = child_process.spawn('/bin/bash', [ '-c', 'curl --silent --output - --max-time 1 https://fordodone.com/jobtrigger || true' ], { stdio: 'inherit' });

  child.on('close', function(code) {
    if(code !== 0) {
      return context.done(new Error("non zero process exit code"));
    }

    context.done(null);
  });
}

AWS find unused instance reservations

AWS Trusted Advisor has some good Cost Optimization metrics for reviewing your infrastructure portfolio and potential savings. However, it doesn’t offer a good way to see underutilized reservations. Using awscli, you can check your reservations against what is actually running to find reservations that are not fully used:

#!/bin/bash
# super quick hack to see difference between reserved and actual usage for RI

IFS="
"

first="true"
for az in us-east-1b us-east-1d
do
  if [ "$first" == "true" ]
  then
    echo "AvailabilityZone InstanceType Reserved Actual Underprovisioned"
    first="false"
  fi
  for itype in `aws ec2 describe-reserved-instances --filters Name=state,Values=active Name=availability-zone,Values=$az | grep InstanceType | sed -e 's/ //g' | cut -d'"' -f 4 | sort -u`
  do
    rcount=$(aws ec2 describe-reserved-instances --filters Name=state,Values=active Name=availability-zone,Values=$az Name=instance-type,Values=$itype | egrep 'InstanceCount' | awk '{print $NF}' | awk 'BEGIN{i=0}{i=i+$1}END{print i}')
    icount=$(aws ec2 describe-instances --filters Name=availability-zone,Values=$az Name=instance-type,Values=$itype Name=instance-state-name,Values=running | grep -c InstanceId)

    echo "$az $itype $rcount $icount" | awk '{ if($4<$3){alert="YES"} print $1" "$2" "$3" "$4" "alert}'

  done;

done | column -t

Output:

$ compare_reserved_instances_vs_actual.sh
AvailabilityZone  InstanceType  Reserved  Actual  Underprovisioned
us-east-1b        m3.2xlarge    2         0       YES
us-east-1b        m3.medium     2         8
us-east-1b        m3.xlarge     11        4       YES
us-east-1b        m4.xlarge     2         46
us-east-1d        m3.medium     2         8
us-east-1d        m3.xlarge     6         0       YES
us-east-1d        m4.xlarge     1         17
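To see how many reservations are going unused rather than just a YES flag, the same table can be post-processed. A sketch, assuming the column layout shown in the output above; unused_reservations is an illustrative name:

```shell
#!/bin/bash
# Read "AZ type reserved actual ..." rows on stdin. For rows where fewer
# instances are running than reserved, print the AZ, the instance type and
# the number of unused reservations. The header row is skipped because its
# third and fourth columns are not numeric.
unused_reservations() {
  awk '$3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $3 + 0 > $4 + 0 { print $1, $2, $3 - $4 }'
}
```

Usage would be along the lines of: compare_reserved_instances_vs_actual.sh | unused_reservations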

use iozone to benchmark NFS

iozone -t 32 -s1g -r16k -i 0 -i 1

32 threads, each operating on its own 1G file in 16k records, running test 0 (write/re-write) and test 1 (read/re-read).

NFSv3 mounted AWS c4.xlarge 500GB gp2 EBS vol formatted EXT4

        Iozone: Performance Test of File I/O
                Version $Revision: 3.397 $
                Compiled for 64 bit mode.
                Build: linux-AMD64 

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                     Ben England.  

        Run began: Fri May 13 15:46:09 2016

        File size set to 1048576 KB
        Record Size 16 KB
        Command line used: iozone -t 32 -s1g -r16k -i 0 -i 1
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 32 processes
        Each process writes a 1048576 Kbyte file in 16 Kbyte records



        Children see throughput for 32 initial writers  =   82456.18 KB/sec
        Parent sees throughput for 32 initial writers   =   73910.24 KB/sec
        Min throughput per process                      =    2501.01 KB/sec
        Max throughput per process                      =    2746.40 KB/sec
        Avg throughput per process                      =    2576.76 KB/sec
        Min xfer                                        =  954736.00 KB

        Children see throughput for 32 rewriters        =   81779.61 KB/sec
        Parent sees throughput for 32 rewriters         =   78548.79 KB/sec
        Min throughput per process                      =    2509.20 KB/sec
        Max throughput per process                      =    2674.01 KB/sec
        Avg throughput per process                      =    2555.61 KB/sec
        Min xfer                                        =  983856.00 KB

        Children see throughput for 32 readers          =   91412.23 KB/sec
        Parent sees throughput for 32 readers           =   90791.54 KB/sec
        Min throughput per process                      =    2761.06 KB/sec
        Max throughput per process                      =    2910.95 KB/sec
        Avg throughput per process                      =    2856.63 KB/sec
        Min xfer                                        =  998720.00 KB

        Children see throughput for 32 re-readers       =   91781.74 KB/sec
        Parent sees throughput for 32 re-readers        =   91620.13 KB/sec
        Min throughput per process                      =    2832.08 KB/sec
        Max throughput per process                      =    2885.37 KB/sec
        Avg throughput per process                      =    2868.18 KB/sec
        Min xfer                                        = 1029440.00 KB



iozone test complete.
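When comparing several runs, it can be handy to boil the report down to the aggregate "Children see throughput" lines. A rough sketch, assuming the report layout shown above and treating iozone's Kbytes as 1024 bytes; iozone_summary is a made-up helper name:

```shell
#!/bin/bash
# Read an iozone throughput report on stdin and print one line per test:
# the test label followed by the aggregate throughput converted from KB/sec
# to MB/sec (divided by 1024).
iozone_summary() {
  awk -F'=' '/Children see throughput for/ {
    label = $1
    sub(/^.*throughput for [0-9]+ /, "", label)   # drop prefix and process count
    sub(/ +$/, "", label)                          # trim trailing padding
    printf "%s %.2f\n", label, ($2 + 0) / 1024
  }'
}
```

Usage would look like: iozone -t 32 -s1g -r16k -i 0 -i 1 | iozone_summary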

mercurial hg clone turn off host key checking for bitbucket.org

If you clone a repository during an automated code deploy (for example in AWS CodeDeploy or Atlassian Bamboo), you probably need to turn off host key checking for the clone. This prevents ssh, which hg (or git) uses for transport, from stopping at an interactive prompt about the authenticity of the host key. Keep in mind that this also removes the protection the host key check provides.

$ echo -e "Host bitbucket.org\nStrictHostKeyChecking no\n" >> ~/.ssh/config
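Running that echo on every deploy appends a duplicate block each time. A sketch of a repeat-safe variant follows; the function name disable_host_key_check and its optional config-path argument are hypothetical, and it only checks for an existing Host line rather than inspecting what that block contains:

```shell
#!/bin/bash
# Append a "StrictHostKeyChecking no" block for a host to an ssh config file,
# unless a "Host <name>" line for it is already present.
disable_host_key_check() {
  local host="$1" config="${2:-$HOME/.ssh/config}"
  mkdir -p "$(dirname "$config")"
  touch "$config"
  # -x matches the whole line, -F treats the dots in the host name literally
  if ! grep -qxF "Host $host" "$config"; then
    printf 'Host %s\n  StrictHostKeyChecking no\n\n' "$host" >> "$config"
  fi
  chmod 600 "$config"
}
```

In a deploy hook this would be called as: disable_host_key_check bitbucket.org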

Docker Compose static IP address in docker-compose.yml

Usually, when launching Docker containers, we don’t really know or care what IP address a specific container will be given. If proper service discovery and registration is configured, we just launch containers as needed and they make it into the application ecosystem seamlessly. Recently, I was working on a very edge-case multi-container application where every container needed to know (or be able to predict) every other container’s IP address at run time. This was not a cascaded need where successor containers learn their predecessors’ IP addresses, but more like a full mesh.

In Docker Engine 1.10 the docker run command received a new flag, --ip, which lets you define a static IP address for a container at run time. Unfortunately, Docker Compose (1.6.2) did not support this option. I guess we can think of Engine as being upstream of Compose, so some new Engine features take a while to make it into Compose. Luckily, this has already made it into mainline dev for Compose and is earmarked for release with the 1.7.0 milestone (which should coincide with Engine 1.11). Find the commit we care about here.

Get the dev build for Compose 1.7.0:


# cd /usr/local/bin
# wget -q https://dl.bintray.com/docker-compose/master/docker-compose-Linux-x86_64
# chmod 755 docker-compose-Linux-x86_64
# mv docker-compose-Linux-x86_64 docker-compose$(./docker-compose-Linux-x86_64 --version | awk '{print "-"$3$5}' | sed -e 's/,/_/')
# mv docker-compose docker-compose$(./docker-compose --version | awk '{print "-"$3$5}' | sed -e 's/,/_/')
# ln -s docker-compose-1.7.0dev_85e2fb6 docker-compose
# ls -l
lrwxrwxrwx 1 root root      31 Mar 30 08:38 docker-compose -> docker-compose-1.7.0dev_85e2fb6
-rwxr-xr-x 1 root root 7929597 Mar 24 08:01 docker-compose-1.6.2_4d72027
-rwxr-xr-x 1 root root 7938771 Mar 29 09:14 docker-compose-1.7.0dev_85e2fb6
#

In this case I decided to keep the 1.6.2 docker-compose binary alongside the 1.7.0 docker-compose binary, then create a symlink to the one I wanted to use as the active docker-compose.
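The awk and sed pipeline in the mv commands is what builds the versioned file names. Pulled out on its own, and fed a sample --version string whose format is an assumption inferred from the file names in the listing above, it behaves like this (version_suffix is an illustrative name):

```shell
#!/bin/bash
# Read the output of "docker-compose --version" on stdin and turn it into a
# file name suffix: the third field (version, with trailing comma) is joined
# to the fifth field (build hash), and the comma becomes an underscore.
version_suffix() {
  awk '{print "-"$3$5}' | sed -e 's/,/_/'
}

# Sample input, assumed format of the --version output
echo "docker-compose version 1.7.0dev, build 85e2fb6" | version_suffix   # prints -1.7.0dev_85e2fb6
```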

Here’s a sample of how you might define a static IP address in docker-compose.yml that would work using docker-compose 1.7.0:


version: "2"
services:
  host1:
    networks:
      mynet:
        ipv4_address: 172.25.0.101
networks:
  mynet:
    driver: bridge
    ipam:
      config:
      - subnet: 172.25.0.0/24
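For the full-mesh case, where every container's address has to be predictable, the file can be generated rather than written by hand. A sketch under stated assumptions: gen_compose is a made-up helper, the busybox image is only a placeholder, and the hostN naming and the .101, .102, ... numbering simply extend the sample above:

```shell
#!/bin/bash
# Emit a version "2" docker-compose.yml with COUNT services on one bridge
# network, giving hostN the address PREFIX.(100+N). COUNT should stay at or
# below 154 so the addresses fit inside the /24.
gen_compose() {
  local count="$1" prefix="${2:-172.25.0}" i
  printf 'version: "2"\nservices:\n'
  for i in $(seq 1 "$count"); do
    printf '  host%d:\n' "$i"
    printf '    image: busybox\n'   # placeholder image, an assumption
    printf '    networks:\n      mynet:\n'
    printf '        ipv4_address: %s.%d\n' "$prefix" $((100 + i))
  done
  printf 'networks:\n  mynet:\n    driver: bridge\n    ipam:\n      config:\n'
  printf '      - subnet: %s.0/24\n' "$prefix"
}
```

Running gen_compose 3 > docker-compose.yml would produce host1 through host3 at 172.25.0.101 through 172.25.0.103, so each container can compute its peers' addresses from their names.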