Diamond Notes

Just another WordPress weblog

MyIsam vs Innodb

I did some testing because we are going to convert a very large merge table (composed of five MyIsam tables) into Innodb. I knew that Innodb tables were larger than MyIsam but I didn’t have any real figures. In addition I used the time to test loading data into both MyIsam and Innodb tables to see which loaded faster. As you will see I ran into some interesting results at the end. If anyone can explain why it would be MUCH slower to load the Innodb data from four files instead of two files I would love to hear from you. The files are loaded sequentially.

Testing MyIsam and Innodb

One of the problems that we have on an older database is that we have one table arranged as a merged-MyIsam table of five MyIsam tables. Whenever a long-running select query is issued on the table it blocks writes from happening. Even though we used insert delays on this table, after a while we simply run out of connections because there is a limit of 400.Moving to the new version of MySQL gives us a good opportunity to do some upgrading. It has been proposed that we replace the Merge table with an Innodb table. One consideration is how much space that this is going to consume.

Setup

I am importing the table from the last backup we ran on on the database. On satest2 I will set up the same table structure and data with two different names and with an engine type of Innodb for one and MyIsam for another. Then we can compare the disk space used by the two tables to see which is better (in regards to space). In addition I will be testing the import times of innodb and myisam.

Tests

The first set of results will show the time taken to import the data. The dataset was the same with just the table type changed (one was the original myisam and one was innodb).Innodb

Start time End time Elapsed time
12:30:32 14:08:46 1:38:14

MyIsam

Start time End time Elapsed time
14:24:35 14:39:53 15:18

The second set of results will show the size of the table after import.

Innodb

5930745856 (5.6G) opens_5_innodb.ibd

5.6G

MyIsam

1240851825 (1.2G) opens_5_myisam.MYD

920984576 (879M) opens_5_myisam.MYI

2.1G

The MyIsam tables are roughly 62% smaller.

Trying to speed up the Innodb import I split the import file into two parts. I fed both parts - one after the other - into the Innodb table to see if this changed the load time significantly.

Innodb (data split into two files)

Start time End time Elapsed time
16:28:07 17:12:30 44:27

Seeing that this improved things quite a bit (44:27 vs 1:38:14) I then split the data into four files and once again fed them into the table one by one:

Start time End time Elapsed time
18:44:45 20:14:33 1:29:48

This did not do so well. It was only a few minutes faster than loading the data from one file. I don’t know why it was so much slower loading from four files than two files.

Conclusions

This is fascinating. The difference in load time between MyIsam and Innodb tables is so dramatic that I went back and did a count of the rows on each table to make sure that everything was actually loaded during the MyIsam import. The only major difference is that MyIsam is a non-transactional engine and Innodb is a transactional engine. I do not understand why the splitting of the input file for the Innodb table into two files decreased load time dramatically (compared to a single input file) but when I split the same input file into four files it made almost no affect.

5 Comments so far

  1. Kevin Burton July 26th, 2007 6:41 pm

    That’s no the only difference between MyISAM and INNOB…

    INNODB uses a write ahead log which is why you saw the significant performance boost.

  2. ryan July 26th, 2007 7:21 pm

    In my own tests, I’ve found that the difference between InnoDB and MyISAM loading is much less pronounced if you can use tab delimited files and mysqlimport. For tables with no varchar/text columns, a properly tuned InnoDB setup gets me about 10% slower than MyISAM. For text-heavier tables I found the difference to be about 25% slower.

    If using mysqldump extended insert syntax, I find similar results to what you posted above.

  3. Thilo Raufeisen July 27th, 2007 1:25 am

    You should use bulkinserts for innodb.
    So instead
    insert into foo (col1, col2) values (’foo’, ‘bar’);
    insert into foo (col1, col2) values (’foo2′, ‘bar2′);
    you should use
    insert into foo (col1, col2) values (’foo’, ‘bar’),(’foo2′, ‘bar2′);

    this will speed up a lot.

  4. Roland Bouman July 28th, 2007 11:14 am

    Hi there,

    did you have autocommit on or off for the InnoDB tests? For better performance, turn autocommit off, insert rows and commit (or use the bulk insert suggestion by Thilo)

  5. Isotopp July 28th, 2007 4:00 pm

    InnoDB likes primary keys and primary key order. If you load data in primary key order, you are triggering an insert buffer optimization. Also, the data set size is likely to be smaller, because the index is packed tighter. So maybe a “sort” before load will help.

    If you are loading data in non-primary key order, an alter table t engine=innodb on the innodb table will recreate the InnoDB table, costing additional time. But the data will be read in InnoDB native order, which is PK order, and the secondary InnoDB table will be tightly packed, and be up to 1/3 smaller.

    LOAD DATA INFILE is your friend with InnoDB, and so is extended insert syntax, as has already been said above. Note that InnoDB sucks at rollback, so make sure your loads are not creating too many undo log entries. Transaction sizes of 1000 to 10000 rows seem to be nice. If you have hard benchmark data for the sweet spot on your system, I’d like to see.

Leave a reply