Import a huge SQL file

I have a Wikipedia dump, enwiki-20161220-pagelinks.sql.gz, which is about 37 GB after extraction.

At first I tried importing it directly with mysql -u root -p dbname < filename, but it was very slow and gave no indication of progress, so I searched around and found some useful methods.
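
For reference, the direct approach looks like this (dbname here is just a placeholder for the target database):

    gunzip enwiki-20161220-pagelinks.sql.gz
    mysql -u root -p dbname < enwiki-20161220-pagelinks.sql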

  • For InnoDB

    Change the following settings in the MySQL configuration file (usually /etc/my.cnf or /etc/mysql/my.cnf on Linux):

    innodb_buffer_pool_size = 4G
    innodb_log_buffer_size = 256M
    innodb_log_file_size = 1G
    innodb_write_io_threads = 16
    innodb_flush_log_at_trx_commit = 0

    The reasoning behind these settings is explained in MySQL any way to import a huge (32 GB) sql dump faster?

    Restart MySQL with the doublewrite buffer disabled:

    service mysql restart --innodb-doublewrite=0
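
    To confirm the new settings took effect after the restart, you can check them from a mysql session, for example:

    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';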

    After the import finishes, restart MySQL normally:

    service mysql restart

    Before importing, disable autocommit and the uniqueness/foreign-key checks:

    SET autocommit=0;
    SET unique_checks=0;
    SET foreign_key_checks=0;

    These settings only apply to the current session; alternatively, you can set them globally:

    SET GLOBAL autocommit=0;
    SET GLOBAL unique_checks=0;
    SET GLOBAL foreign_key_checks=0;

    Note that SET GLOBAL xxx takes effect for new sessions, not the current one (so disconnect and reconnect), and the change does not persist across a MySQL server restart.

    After setting it, check with SHOW VARIABLES LIKE 'foreign_key_checks';, which should return OFF.
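
    Putting the pieces together, a full InnoDB import session might look like the sketch below; the database name wikidb and the file path are placeholders, and the final COMMIT is needed because autocommit is off:

    SET autocommit=0;
    SET unique_checks=0;
    SET foreign_key_checks=0;

    USE wikidb;
    source /path/to/enwiki-20161220-pagelinks.sql

    -- commit the imported rows and restore the defaults
    COMMIT;
    SET unique_checks=1;
    SET foreign_key_checks=1;
    SET autocommit=1;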

  • For MyISAM

    Disable the table's indexes (replace table_name with the real name of your table):

    ALTER TABLE table_name DISABLE KEYS;

    and re-enable them after the import:

    ALTER TABLE table_name ENABLE KEYS;

    How do you check whether the indexes are disabled? Run EXPLAIN on a SELECT statement that would normally use a key, for instance:

    EXPLAIN SELECT * FROM table_name LIMIT 1,1;

    If the keys are disabled, the possible_keys, key, key_len, and ref columns will all be NULL.
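
    As a rough sketch, the disable/enable statements can be wrapped around the import from the shell; pagelinks and wikidb below are assumed names for the table and database:

    {
        echo "ALTER TABLE pagelinks DISABLE KEYS;"
        cat enwiki-20161220-pagelinks.sql
        echo "ALTER TABLE pagelinks ENABLE KEYS;"
    } | mysql -u root -p wikidb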

  • Another solution

    mysql -u root -p

    -- set the network buffer length to a large byte number
    set global net_buffer_length=1000000;

    -- set the maximum allowed packet size to a large byte number
    set global max_allowed_packet=1000000000;

    -- disable foreign key checking to avoid delays, errors and unwanted behaviour
    SET foreign_key_checks = 0;

    -- import your sql dump file
    source file.sql

    -- remember to enable foreign key checks again when the procedure is complete!
    SET foreign_key_checks = 1;

    This answer comes from here; I haven't tried it myself.

  • Monitor the progress

    • If you use the source command to import the file, as in mysql> source filename;, it gives you some output as it runs, but I don't recommend this way for a huge file. Also, if you exit your mysql client while an import is running, the import process still exists on the server; type show processlist; in a mysql client to see the newest statement being imported and estimate the remaining time.

    • Recommended: use the tool pipe viewer (pv). It gives a visual indication of the progress of data through a pipeline:

      pv filename | mysql -u root -p dbname
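
      Since the dump here is gzipped, pv can also read the compressed file and decompress on the fly; the progress bar then tracks the compressed bytes:

      pv enwiki-20161220-pagelinks.sql.gz | gunzip | mysql -u root -p dbname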
  • Split the dump file by lines

    split -l 1000 dumpfilename segment

    The dump file dumpfilename will be split into segmentaa, segmentab, segmentac, ..., each piece containing 1000 lines.

    Note that segmentaa contains the header of the original file, including the table definitions, variable settings and so on, so import segmentaa first and then import the other files in order.
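
    A minimal sketch of importing the pieces in order (the shell expands segment* in sorted order, so segmentaa comes first; dbname is a placeholder):

    for f in segment*; do
        echo "importing $f"
        mysql -u root -p dbname < "$f"   # -p prompts for the password each time; a ~/.my.cnf can avoid that
    done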

By the way, improve your hardware if you can: old-fashioned spinning disks are hopeless here. Try to get more RAM, a faster processor, and an SSD for faster reads and writes.

Useful links:

Testing the Fastest Way to Import a Table into MySQL (and some interesting 5.7 performance results)