MySQL load data infile – acceleration

indexingload-data-infileMySQLperformance

sometimes, I have to re-import data for a project, thus reading about 3.6 million rows into a MySQL table (currently InnoDB, but I am actually not really limited to this engine). "Load data infile…" has proved to be the fastest solution, however it has a tradeoff:
– when importing without keys, the import itself takes about 45 seconds, but the key creation takes ages (already running for 20 minutes…).
– doing import with keys on the table makes the import much slower

There are keys over 3 fields of the table, referencing numeric fields.
Is there any way to accelerate this?

Another issue is: when I terminate the process which has started a slow query, it continues running on the database. Is there any way to terminate the query without restarting mysqld?

Thanks a lot
DBa

Best Answer

if you're using innodb and bulk loading here are a few tips:

sort your csv file into the primary key order of the target table : remember innodb uses clustered primary keys so it will load faster if it's sorted !

typical load data infile i use:

truncate <table>;

set autocommit = 0;

load data infile <path> into table <table>...

commit;

other optimisations you can use to boost load times:

set unique_checks = 0;
set foreign_key_checks = 0;
set sql_log_bin=0;

split the csv file into smaller chunks

typical import stats i have observed during bulk loads:

3.5 - 6.5 million rows imported per min
210 - 400 million rows per hour

Related Solutions

Mysql – Which MySQL data type to use for storing boolean values

For MySQL 5.0.3 and higher, you can use BIT. The manual says:

As of MySQL 5.0.3, the BIT data type is used to store bit-field values. A type of BIT(M) enables storage of M-bit values. M can range from 1 to 64.

Otherwise, according to the MySQL manual you can use BOOL or BOOLEAN, which are at the moment aliases of tinyint(1):

Bool, Boolean: These types are synonyms for TINYINT(1). A value of zero is considered false. Non-zero values are considered true.

MySQL also states that:

We intend to implement full boolean type handling, in accordance with standard SQL, in a future MySQL release.

References: http://dev.mysql.com/doc/refman/5.5/en/numeric-type-overview.html

Mysql – How to output MySQL query results in CSV format

From Save MySQL query results into a text or CSV file:

SELECT order_id,product_name,qty
FROM orders
WHERE foo = 'bar'
INTO OUTFILE '/var/lib/mysql-files/orders.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';

Note: That syntax may need to be reordered to

SELECT order_id,product_name,qty
INTO OUTFILE '/var/lib/mysql-files/orders.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM orders
WHERE foo = 'bar';

in more recent versions of MySQL.

Using this command, columns names will not be exported.

Also note that /var/lib/mysql-files/orders.csv will be on the server that is running MySQL. The user that the MySQL process is running under must have permissions to write to the directory chosen, or the command will fail.

If you want to write output to your local machine from a remote server (especially a hosted or virtualize machine such as Heroku or Amazon RDS), this solution is not suitable.

Best Answer

Related Solutions

Mysql – Which MySQL data type to use for storing boolean values

Mysql – How to output MySQL query results in CSV format

Related Topic