MySQL change field separator for select dump

You can select from a table into an outfile and change the field delimiter. This is interactive, and the file is written by the mysqld process on the database server host, so you need the FILE privilege and mysqld needs write permission to the destination directory. The first SELECT in the UNION below supplies a header row of column names.

mysql> SELECT 'pid','email' UNION SELECT pid,email INTO OUTFILE '/tmp/test.csv' FIELDS TERMINATED BY ',' FROM table;
Query OK, 11837 rows affected (0.21 sec)

mysql> Bye
# head /tmp/test.csv
pid,email
1081603,user1@fordodone.com
888151,user2@fordodone.com
781,user3@fordodone.com
2307364,user4@fordodone.com
2286573,user5@fordodone.com
2212194,user6@fordodone.com
2137603,user7@fordodone.com
500492,user8@fordodone.com
1514582,user9@fordodone.com

This is non-interactive and can be done from a remote host:

# echo "select pid,email from table;" | mysql -h dbhost -udbuser -p -Ddbname | awk '{for (i=1;i<NF;i++) printf "%s,",$i; printf "%s\n",$NF}' > /tmp/test.csv
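In batch mode mysql separates columns with tabs, so the awk above splits on generic whitespace and would mangle any field containing a space. Splitting explicitly on tabs avoids that; a minimal sketch, with hypothetical sample lines standing in for the mysql output:

```shell
# Hypothetical sample standing in for mysql's tab-separated batch output
printf '1081603\tuser1@fordodone.com\n888151\tuser2@fordodone.com\n' \
  | awk -F'\t' -v OFS=',' '{$1=$1; print}'   # $1=$1 rebuilds the record with OFS
# 1081603,user1@fordodone.com
# 888151,user2@fordodone.com
```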

MySQL count all rows in database by table

This gets a list of tables in the database ‘interesting’ and how many rows are in each table. The second query just sums them up for a grand total. Note that for InnoDB tables, table_rows is only an estimate; an exact figure requires COUNT(*).

mysql> use mysql;
Database changed
mysql> SELECT table_name,table_rows FROM INFORMATION_SCHEMA.tables WHERE table_schema = 'interesting';
+------------+------------+
| table_name | table_rows |
+------------+------------+
| a          |         29 |
| b          |          4 |
| c          |       1104 |
| d          |   55378308 |
| e          |   56925066 |
| f          |   53479079 |
| g          |   55963435 |
| h          |   53966893 |
| i          |   55839182 |
| j          |   56854816 |
| k          |   54613112 |
| l          |   55454111 |
| m          |   56133247 |
| n          |          7 |
| o          |          4 |
| p          |         18 |
| q          |         30 |
| r          |          5 |
| s          |          5 |
| t          |         68 |
| u          |         22 |
| v          |          5 |
| w          |          7 |
| x          |      66046 |
| y          |      69439 |
| z          |      87361 |
| aa         |      89519 |
| ab         |      75892 |
| ac         |      61785 |
| ad         |      64964 |
| ae         |      69905 |
| af         |      65599 |
| ag         |      69710 |
| ah         |        381 |
| ai         |        181 |
| aj         |    2285397 |
| ak         |    2272764 |
| al         |         10 |
+------------+------------+
38 rows in set (0.04 sec)

mysql> SELECT sum(table_rows) FROM INFORMATION_SCHEMA.tables WHERE table_schema = 'interesting';
+-----------------+
| sum(table_rows) |
+-----------------+
|       559887795 |
+-----------------+
1 row in set (0.05 sec)

mysql> 
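Because table_rows is an estimate for InnoDB, exact numbers require a full COUNT(*) per table. A sketch, assuming the same ‘interesting’ schema (the generated statements can be pasted back into the client):

```sql
-- Exact row count for a single table (a full scan on InnoDB, so it can be slow):
SELECT COUNT(*) FROM interesting.d;

-- Generate one COUNT(*) statement per table in the schema:
SELECT CONCAT('SELECT ''', table_name, ''', COUNT(*) FROM interesting.`',
              table_name, '`;')
FROM INFORMATION_SCHEMA.tables
WHERE table_schema = 'interesting';
```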

mysql log slow queries

It’s really helpful to know which queries are choking a database. To enable the MySQL slow query log, add these lines to the [mysqld] section of your my.cnf (or other MySQL configuration file) and restart mysqld:

slow_query_log=1
slow_query_log_file=/var/log/mysql_slow_queries/dbserver1.log
long_query_time=10

This will log queries that take longer than 10 seconds to the specified file. After you collect some of the logs, work with the application developers to see if some of the queries can be optimized, using more efficient sql, adding indices, etc.
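A restart isn’t strictly required: these are dynamic server variables, so they can also be set at runtime from the client. Runtime changes are lost at the next restart unless they are also added to my.cnf:

```sql
SET GLOBAL slow_query_log = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql_slow_queries/dbserver1.log';
SET GLOBAL long_query_time = 10;
```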

mysql count and group

In mysql you can count the total number of rows, or you can count rows grouped by a condition. Here we take an inventory table and count the rows for each type of fruit:

mysql> select fruit,count(id) as cnt from inventory group by fruit;

+-------------+------------+
| fruit       | cnt        |
+-------------+------------+
| apple       |       3496 |
| orange      |       3783 |
| mango       |         11 |
+-------------+------------+

Add the ‘with rollup’ modifier to get the total at the bottom:

mysql> select fruit,count(id) as cnt from inventory group by fruit with rollup;

+-------------+------------+
| fruit       | cnt        |
+-------------+------------+
| apple       |       3496 |
| orange      |       3783 |
| mango       |         11 |
| NULL        |       7290 |
+-------------+------------+

You can add other conditional statements to refine what you need. Here we only want the fresh fruits that we have more than 3500 of:

mysql> select count(id) as cnt,fruit from inventory where status='fresh' group by fruit having cnt >3500;
+------+-----------+
| cnt  | fruit     |
+------+-----------+
| 3783 | orange    |
+------+-----------+

MySQL group by modulus

I have a series of scripts that act on a subset of entries in a large (2M row) MySQL table. There are 10 instances in the application, and each instance works on ids ending in 0-9 from a single table. The ending digit can also be thought of as the modulus. Instance 0 works on ids ending in 0, instance 1 works on ids ending in 1, etc.

mysql> select id from table limit 10;
+---------+
| id      |
+---------+
| 1099138 | 
| 1811048 | 
|  794297 | 
|  985812 | 
|  424120 | 
|  353809 | 
|  478882 | 
|  506101 | 
| 2184449 | 
| 1189511 | 
+---------+

To see what progress the application is making, or whether it is stuck, I use a query grouping the ids remaining to be worked on by modulus:

mysql> select substr(id, -1) as modulus,count(*) from table where field='criteria' group by modulus with rollup;
+---------+----------+
| modulus | count(*) |
+---------+----------+
| 0       |        9 | 
| 1       |       14 | 
| 2       |       13 | 
| 3       |     1985 | 
| 4       |       10 | 
| 5       |       16 | 
| 6       |       16 | 
| 7       |       11 | 
| 8       |       10 | 
| 9       |       20 | 
| NULL    |     2104 | 
+---------+----------+

The substr(id,-1) takes the id and returns its last digit (-1), aliased as modulus. The rows are then grouped by modulus, and with rollup sums them at the end. As you can see, the application still has 2104 ids to work on, and something is different about instance 3. Either there is an uneven distribution of ids ending in 3 (probably not the case, as id is an auto-increment primary key), or instance 3 of the application is borked.
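Since id is numeric, the same grouping can also be written with arithmetic instead of a string function; a sketch against the same (hypothetical) table and criteria:

```sql
SELECT id % 10 AS modulus, COUNT(*)
FROM table
WHERE field='criteria'
GROUP BY modulus WITH ROLLUP;
```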

time mysql select without output

To see how long a query takes without actually printing the results, change the pager in the mysql client to /dev/null instead of stdout:

mysql> pager;
Default pager wasn't set, using stdout.

mysql> pager >/dev/null
PAGER set to '>/dev/null'

mysql> select md5sum from table1;
5245618 rows in set (1 min 21.50 sec)

mysql> select md5sum from table1 where md5sum is not NULL and md5sum!='not calculated';
4832676 rows in set (1 min 11.52 sec)

mysql> select md5sum from table1 where md5sum not in ('NULL', '', 'not calculated');
4832078 rows in set (1 min 21.91 sec)
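The pager can be any command, not just /dev/null. One useful variation is md5sum: each query then prints only a checksum of its output, which makes it easy to check whether two queries return identical results without eyeballing millions of rows:

```
mysql> pager md5sum
PAGER set to 'md5sum'
```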

find duplicate entry in sql dump

Recently, I tried to import a SQL dump created by mysqldump that somehow had a duplicate entry for a primary key. Here’s a sample of the contents:

INSERT INTO `table1` VALUES ('B97bKm',71029594,3,NULL,NULL,'2013-01-22 09:25:39'),('dZfUHQ',804776,1,NULL,NULL,'2012-09-05 16:15:23'),('hWkGsz',70198487,0,NULL,NULL,'2013-01-05 10:55:36'),('n6366s',69480146,1,NULL,NULL,'2012-
12-18 03:27:45'),('tBP6Ug',65100805,1,NULL,NULL,'2012-08-29 21:32:39'),('yfpewZ',18724906,0,NULL,NULL,'2013-03-31 17:12:58'),('UNz5qp',8392940,2,NULL,NULL,'2012-11-28 02:00:00'),('9WVpVV',71181566,0,NULL,NULL,'2013-01-25 06:15:03'),('kEP
Qu5',64972980,9,NULL,NULL,'2012-09-01 06:00:36')

It goes on for another 270,000 entries. I was able to find the duplicate value like this:

# cat /tmp/table1.sql | grep INSERT | sed -e 's/),/\n/g' | sed -e 's/VALUES /\n/' | grep -v INSERT | awk -F, '{print $2}' | sort | uniq -c | awk '{if($1>1) print;}'
    2 64590015
#

The primary key value 64590015 had 2 entries. I removed the spurious entry, and subsequently the SQL imported fine.
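To see what the pipeline is doing, it can be run against a tiny hypothetical INSERT line in which id 101 appears twice in the second field of the VALUES tuples:

```shell
# Hypothetical one-line dump with a duplicated id (101) in the second field
printf "INSERT INTO t1 VALUES ('aa',101,0),('bb',102,1),('cc',101,2);\n" \
  | sed -e 's/),/\n/g' -e 's/VALUES /\n/' \
  | grep -v INSERT \
  | awk -F, '{print $2}' \
  | sort | uniq -c \
  | awk '$1 > 1'
# prints the duplicated id (101) with its count of 2
```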