awk average multiple columns

If you have numeric output lined up in columns, awk can average each column for you. Here’s some sample output (from a NetApp “toaster> stats show -p flexscale-access”):

    
# cat sample.txt
    73   5480      0   1040  84     0     0      0     0      0     0      0       541
    73   6038     39   1119  84     0     0      0     0      0     0      0       475
    73   5018     19    859  85     0     0      0     0      0     0      0       348
    73   5960     20   1480  80   120     0    320     0      0     0      0       427
    73   6098      0   1019  85     0     0      0     0      0     0      0       486
    73   5220      0   1220  81     0     0      0     0      0     0      0       288
    73   5758     79   1319  81    59    39    319     0      0     0      0       500
    73   4419      0   2039  68     0     0      0     0      0     0      0       279
    73   5400      0    840  86     0     0      0     0      0     0      0       382
    73   5238      0   1299  80     0     0      0     0      0     0      0       389
    73   5449      0   1696  76    59     0    199     0      0     0      0       340
    73   5478      0   1419  79     0     0      0     0      0     0      0       414
    73   5020     20   1000  83     0     0      0     0      0     0      0       405
    73   4359      0   1059  80     0     0      0     0      0     0      0       295
    73   5838     39   1139  83     0    19      0     0      0     0      0       494
    73   6100     40   1720  78     0     0      0     0      0     0      0       480
    73   5398     19   1239  81     0     0      0     0      0     0      0       398
    73   5089     79   1097  82     0     0      0     0      0     0      0       459
    73   6178     19   1159  84     0    39    159     0      0     0      0       487
    73   4999      0   1239  80     0     0      0     0      0     0      0       345
    73   4820      0    880  84     0     0      0     0      0     0      0       339
    73   5467      0   1177  82     0     0      0     0      0     0      0       413
    73   4700     60   1480  76     0     0      0     0      0     0      0       337
#

And the column averages:

# awk '{for (i=1;i<=NF;i++) a[i]+=$i} END {for (i=1;i<=NF;i++) printf "%.0f\t", a[i]/NR; printf "\n"}' sample.txt
73      5371    19      1241    81      10      4       43      0       0       0       0       405
#

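Here awk loops through each field in a row and adds the value to an array (a[i]) keyed by the field number. At the end, it divides each column's total by the number of rows (NR) and prints the result rounded to a whole number (%.0f). Each average is followed by a tab (\t), and after the last one it prints a newline (\n). Note that the END loop uses NF as left by the last row, so this assumes every row has the same number of fields, and that the file has no blank lines (they would still be counted in NR).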
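For readability, the same logic can be spread over several lines. This is only a sketch: the function name col_avg and the ncols variable are made up for illustration, and unlike the one-liner it tracks the widest row itself instead of relying on NF in the END block, and leaves off the trailing tab.

```shell
# col_avg: hypothetical helper name, not part of awk or any NetApp tool.
# Reads whitespace-separated numeric columns from the named files (or stdin)
# and prints one tab-separated line of column averages.
col_avg() {
    awk '
    {
        # Add each field of this row to the running total for its column.
        for (i = 1; i <= NF; i++) sum[i] += $i
        # Remember the widest row seen so far (assumption: rows may vary).
        if (NF > ncols) ncols = NF
    }
    END {
        if (NR == 0) exit    # no input, nothing to average
        # Print each average without decimals, tab between, newline at the end.
        for (i = 1; i <= ncols; i++)
            printf "%.0f%s", sum[i] / NR, (i < ncols ? "\t" : "\n")
    }' "$@"
}

# Tiny made-up input so the example runs without sample.txt.
printf '1 2 3\n3 4 5\n' | col_avg
```

With the two made-up rows above, this prints 2, 3 and 4 separated by tabs. Against the real data you would run it as col_avg sample.txt.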

You could extend this to print totals as well as averages. You could also have it print the original data, or a header row so you know what each column represents.
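As a rough sketch of two of those ideas, the version below echoes the original rows and then appends a totals line and an averages line. The function name col_stats and the "total:" and "average:" labels are my own choices for illustration, and it assumes the columns hold whole numbers like the sample does. It does not print column names, since what each column means depends on the command that produced the data.

```shell
# col_stats: hypothetical helper name used only for this example.
# Echoes the input rows, then prints a totals line and an averages line.
col_stats() {
    awk '
    {
        print                                   # echo the original row unchanged
        for (i = 1; i <= NF; i++) sum[i] += $i  # running total per column
        if (NF > ncols) ncols = NF              # widest row seen so far
    }
    END {
        if (NR == 0) exit                       # no input, nothing to report
        printf "total:"
        # %d assumes integer counters, as in the sample data.
        for (i = 1; i <= ncols; i++) printf "\t%d", sum[i]
        printf "\naverage:"
        for (i = 1; i <= ncols; i++) printf "\t%.0f", sum[i] / NR
        printf "\n"
    }' "$@"
}

# Tiny made-up input so the example runs without sample.txt.
printf '10 20\n30 40\n' | col_stats
```

For the two made-up rows this prints the rows themselves, then a total: line with 40 and 60, then an average: line with 20 and 30. The summary fields are tab-separated, so they will not necessarily line up with space-padded input.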
