ForDoDone

Tales from the Command Line…

printing large integers with awk

Posted on May 27, 2013 by ForDoDone

When awk prints a large number, it can fall back to scientific notation by default. Take this snippet from an example file. The first column is a count of how many times a file is present, the second column is the md5sum of that file, and the third is the size of that file in bytes.

# tail -3 md5sums
  14737 113136892f2137aa0116093a524ade0b        53
  19402 1c7b413c3fa39d0fed40556d2658ac73        44
  52818 b7f10e862d0e82f77a86b522159ce3c8        45
#

If I want to sum up the number of files counted in this file, and how much total space they are all taking up, I do this:

# awk '{i=i+$1;j=j+($3*$1);} END {print i; print j}' md5sums
22412000
1.45255e+13

So awk counted 22412000 files, totaling about 14.5 TB. Let’s make that a little more readable:

# awk '{i=i+$1;j=j+($3*$1);} END {printf ("%d\n", i); printf("%d\n", j)}' md5sums
22412000
2147483647

Um… that’s not right. But 2147483647 is a special number. You should recognize it as the maximum value of a 32-bit signed integer, or ((2^32)/2)-1, which is 2^31-1. In this case printf’s %d doesn’t handle large integers: anything bigger gets clamped to that maximum. Instead, use print, but tell awk what the output format should look like:

# awk 'BEGIN {OFMT = "%.0f"} {i=i+$1;j=j+($3*$1);} END {print i; print j}' md5sums
22412000
14525468874034
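
A variation that avoids touching OFMT is to keep printf but swap %d for %.0f, which formats the value as a floating point number with no decimals. This is only a minimal sketch, assuming a POSIX shell and any common awk: sum_md5sums is a hypothetical helper name, and the last input row is made up so that the byte total is larger than 2^31-1. awk stores numbers as doubles, so totals stay exact up to 2^53.

```shell
# Hypothetical helper: totals a "count md5sum size" listing read from stdin.
sum_md5sums() {
    awk '{ files += $1; bytes += $1 * $3 }
         END { printf "%.0f\n%.0f\n", files, bytes }'
}

# The first three rows come from the sample above; the fourth row is invented
# (not real data) so that the byte total no longer fits in a 32-bit integer.
totals=$(sum_md5sums <<'EOF'
  14737 113136892f2137aa0116093a524ade0b        53
  19402 1c7b413c3fa39d0fed40556d2658ac73        44
  52818 b7f10e862d0e82f77a86b522159ce3c8        45
3000000 00000000000000000000000000000000   5000000
EOF
)
echo "$totals"
```

With this input the two lines printed are 3086957 and 15000004011559. On the real file you would run it as sum_md5sums < md5sums.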

2 thoughts on “printing large integers with awk”

Phil on February 21, 2014 at 1:29 pm said:

I just wanted to thank you for this. I was struggling with achieving exactly this for quite some time until I found your handy tip. THANKS!

Marcelo Oliveira on October 24, 2015 at 2:20 pm said:

Thank you soooo much!
