Archive for July, 2007
Cool Schema Resource
Check out this site: http://www.databaseanswers.org/data%5Fmodels/
It’s a great resource of sample schemas for all kinds of stuff. I was looking at setting up a sample database for a library to examine sql query tuning and googled for sample schemas.
Hope that might prove useful.
XFS filesystem on Coraid
Just a follow-up on the two previous post I had about the Coraid SAN. Today I formatted a partition as XFS and ran the iozone tests against it. Here are the results:
Coraid w/xfs filesystem, switch and jumbo frames
| Jumbo Frames | Switch Connect |
| /data � xfs write (Kbytes/s) | /data � xfs read (Kbytes/s) |
| 87106 | 97790 |
| 82377 | 98685 |
| 81747 | 96517 |
| 86159 | 98289 |
| 87023 | 98716 |
| 83515 | 98660 |
| 78546 | 98878 |
| 81995 | 97479 |
| 82144 | 99673 |
| 82734 | 98480 |
| 83334.6 (AVG) | 98316.7(AVG) |
Comparing this to previous tests are quite enlightening. First, XFS is the clear leader in the write category. It is over 20MB/sec faster than either ext3 or reiserfs. The read rates are slightly slower than ext3 but it is a very small difference (around 2MB/sec). It is slightly faster than reiserfs.
With the clear write advantage we are very seriously considering using it for our Coraid which will be where we store data for our new cluster.
I have one question. I think I remember that there was a problem with XFS some time ago. If I recall correctly it would actually zero out files with some kind of bug. Another words I might have a very large innodb data file and it ends up as completely empty. Not a good situation. Anyone know about this? If I am correct..has this been resolved?
Thanks for the request to perform this test Stewart. Hope everyone finds it interesting.
2 commentsMyIsam vs Innodb
I did some testing because we are going to convert a very large merge table (composed of five MyIsam tables) into Innodb. I knew that Innodb tables were larger than MyIsam but I didn’t have any real figures. In addition I used the time to test loading data into both MyIsam and Innodb tables to see which loaded faster. As you will see I ran into some interesting results at the end. If anyone can explain why it would be MUCH slower to load the Innodb data from four files instead of two files I would love to hear from you. The files are loaded sequentially.
Testing MyIsam and Innodb
One of the problems that we have on an older database is that we have one table arranged as a merged-MyIsam table of five MyIsam tables. Whenever a long-running select query is issued on the table it blocks writes from happening. Even though we used insert delays on this table, after a while we simply run out of connections because there is a limit of 400.Moving to the new version of MySQL gives us a good opportunity to do some upgrading. It has been proposed that we replace the Merge table with an Innodb table. One consideration is how much space that this is going to consume.
Setup
I am importing the table from the last backup we ran on on the database. On satest2 I will set up the same table structure and data with two different names and with an engine type of Innodb for one and MyIsam for another. Then we can compare the disk space used by the two tables to see which is better (in regards to space). In addition I will be testing the import times of innodb and myisam.
Tests
The first set of results will show the time taken to import the data. The dataset was the same with just the table type changed (one was the original myisam and one was innodb).Innodb
| Start time | End time | Elapsed time |
| 12:30:32 | 14:08:46 | 1:38:14 |
MyIsam
| Start time | End time | Elapsed time |
| 14:24:35 | 14:39:53 | 15:18 |
The second set of results will show the size of the table after import.
Innodb
5930745856 (5.6G) opens_5_innodb.ibd
5.6G
MyIsam
1240851825 (1.2G) opens_5_myisam.MYD
920984576 (879M) opens_5_myisam.MYI
2.1G
The MyIsam tables are roughly 62% smaller.
Trying to speed up the Innodb import I split the import file into two parts. I fed both parts - one after the other - into the Innodb table to see if this changed the load time significantly.
Innodb (data split into two files)
| Start time | End time | Elapsed time |
| 16:28:07 | 17:12:30 | 44:27 |
Seeing that this improved things quite a bit (44:27 vs 1:38:14) I then split the data into four files and once again fed them into the table one by one:
| Start time | End time | Elapsed time |
| 18:44:45 | 20:14:33 | 1:29:48 |
This did not do so well. It was only a few minutes faster than loading the data from one file. I don’t know why it was so much slower loading from four files than two files.
Conclusions
This is fascinating. The difference in load time between MyIsam and Innodb tables is so dramatic that I went back and did a count of the rows on each table to make sure that everything was actually loaded during the MyIsam import. The only major difference is that MyIsam is a non-transactional engine and Innodb is a transactional engine. I do not understand why the splitting of the input file for the Innodb table into two files decreased load time dramatically (compared to a single input file) but when I split the same input file into four files it made almost no affect.
5 commentsTesting mysqldump and mysqlimport - part II
First, thanks for all the input everyone. It has helped immensely. I ran more test with the suggested ideas. Some of them helped a great deal. Some, not so much, but that could be the hardware I am running the test on. The server is running 32-bit Linux, but has 4 gigs of RAM. All of the RAM is not visible to the OS. I think that is why raising the buffer on Innodb actually slowed it down.
One other note. I didn’t take the time to average the test results of each group run.
All of the following tests were run with the Innodb buffer set to one gig.
importing icengine3_2 with mysqlimport -u root -p –use-threads=2 database3 /data2/database3/*.txt
| start time | stop time | elapsed time |
| 15:39:27 | 15:51:44 | 12:11 |
| 15:51:52 | 16:04:13 | 12:21 |
| 16:04:21 | 16:16:52 | 12:31 |
| 16:17:01 | 16:28:24 | 11:23 |
| 16:29:32 | 16:41:58 | 12:30 |
importing icengine3_2 with mysqlimport -u root -p –use-threads=4 database3 /data2/database3/*.txt
| start time | stop time | elapsed time |
| 16:44:57 | 16:56:50 | 11:53 |
| 16:56:57 | 17:08:46 | 11:49 |
| 17:08:54 | 17:20:43 | 11:49 |
| 17:20:51 | 17:32:37 | 11:46 |
| 17:32:45 | 17:44:47 | 12:02 |
importing icengine3_2 with mysqlimport -u root -p database2 /data2/database3/*.txt
The following tests were run with innodb_flush_log_at_trx_commit = 2 in the my.cnf file.
| start time | stop time | elapsed time | |
| 17:52:33 | 18:08:54: | 16:27 | |
| 18:20:15 | 18:36:40 | 16:25 | |
| 18:36:50 | 18:53:18 | 16:28 | |
| 18:53:32 | 19:09:56 | 16:28 | |
| 19:10:06 | 19:26:40 | 16:34 | |
| 19:26:51 | 19:43:23 | 16:32 | |
| 19:43:34 | 20:00:03 | 16:29 | |
| 20:00:13 | 20:16:38 | 16:25 | |
| 20:16:48 | 20:33:18 | 16:30 | |
| 20:33:29 | 20:50:08 | 16:37 |
importing icengine3_2 with mysqlimport -u root -p –use-threads=4 database3 /data2/database3/*.txt
The following tests were run with innodb_flush_log_at_trx_commit = 2 in the my.cnf file.
| start time | stop time | elapsed time |
| 20:54:07 | 21:06:00 | 11:53 |
| 21:06:07 | 21:17:57 | 11:50 |
| 21:18:05 | 21:29:57 | 11:52 |
| 21:30:05 | 21:41:52 | 11:47 |
| 21:42:00 | 21:53:59 | 11:59 |
| 21:54:07 | 22:06:02 | 11:55 |
| 22:06:10 | 22:18:02 | 11:52 |
| 22:18:09 | 22:30:06 | 11:57 |
| 22:30:14 | 22:42:08 | 11:54 |
| 22:42:16 | 22:54:03 | 11:47 |
Conclusions
Obviously the mysqlimport command (which really maps to �load data infile�) is the fastest. In its basic form it speeds up an import by around 29% average over the mysql -u root -p database < database.sql command. The innodb buffer variable change of 1 gigabyte to 2 gigabytes actually hurt the performance. I think this was because the 2 gig buffer size was simply to large for a machine with 3.75 gigs of RAM. Also, this was a 32-bit server (the server really has 4 gigs but doesn�t map it properly).
The �use-threads option in mysqlimport helped quit a bit. In fact, going from one (the default) to two threads decreased load time almost 25%. Increasing it from two threads to four threads doesn�t make much of a difference - about 30 seconds over a 1.3 gig import. One caveat, this option is only available on the mysqlimport binaries from the 5.1.x server series. I downloaded the 32-bit version and just grabbed the single binary for mysqlimport. I copied the old binary to mysqlimport.old in case there was any problems and copied over the new version. It worked as advertised.
Using the innodb_flush_log_at_trx_commit = 2 in the my.cnf options didnt’ really make any significant difference in my setup. Make sure that this is set to zero after done with the importing to make certain your data is safe.
One interesting note. Moving from 4.1 to 5.0 shrinks the size of the database. I expected this somewhat, but was supprised at how much. In the testing case it went from 1468M of data to 1315M. This is a 10.4% reduction in size just from the move. Please note that the 4.1 data which was replicated from had been imported into the server from production data. Because of the import before replication the database was at optimum size.
For our uses the optimal command would be: mysqimport -u root -p –use-threads=4 database3 /data2/database3/*.txt
No commentsTesting??? PLEASE!!!!
A few days ago I came across an article about a test comparing Oracle to PostgreSQL. First of all, where was MySQL? The only figure I saw in the article was that PostgreSQL was 12% slower than Oracle.
Then the article talks about how the hardware was “similar” but not the same.
HMMMMMMMMMMMMMMMMMMMMMMMM
Now I am not a testing expert. I didn’t stay in a Holiday Inn Express last night. However, this I know. This test is worthless. It doesn’t prove anything. If you can’t run the test on the same hardware than you can’t rely on the results. I won’t even get it into if one database was tuned and the other wasn’t. I don’t know.
Anyways, just my little rant for the day.
Testing mysqldump and mysqlimport
So we are doing a major project of migrating from version 4.1.22 to 5.0.45. I am testing various things in preparation for this migration. We have a large amount of data to dump and re-load so it is important that this is done in the most efficient way possible. I began doing some testing to see what was possible. I have heard that using a mysqldump command to dump everything in one file (and then importing with mysql -u < databases.sql) is slower than using mysqlimport but I wanted to test for myself. The results were not that surprising but I thought it might help others if I post the summary here.
I set things up on our test server. All database data was stored on an individual hard drive (sdb). The dump was written and read from our Coraid (which is still attached to the server from the previous tests). I did two sets of tests..one with the innodb buffer set to one gigabyte and one set of test with the innodb buffer set to two gigabytes.
Test Results
The first two series of test were run with the innodb buffer set to two gigs.
importing database3 with mysql -u root -p database3 < /data2/database3.sql
| start time | stop time | elapsed time |
| 18:03:00 | 18:27:19 | 24:19 |
| 18:27:34 | 18:51:51 | 24:17 |
| 18:52:03 | 19:16:20 | 24:17 |
| 19:16:33 | 19:40:47 | 24:14 |
| 19:40:59 | 20:05:25 | 24:24 |
| 20:05:37 | 20:29:56 | 24:19 |
| 20:30:09 | 20:54:29 | 24:20 |
| 20:54:41 | 21:18:57 | 24:22 |
| 21:19:10 | 21:43:31 | 24:21 |
| 21:43:43 | 22:07:56 | 24:21 |
importing database3 with mysqlimport -u root -p database3 /data2/database3/*.txt
| start time | stop time | elapsed time |
| 10:21:58 | 10:40:55 | 18:57 |
| 22:08:08 | 22:27:06 | 18:58 |
| 22:27:19 | 22:46:05 | 18:46 |
| 22:46:17 | 23:05:00 | 18:43 |
| 23:05:13 | 23:23:51 | 18:38 |
| 23:24:04 | 23:42:52 | 18:48 |
| 23:43:05 | 00:01:55 | 18:50 |
| 00:02:08 | 00:21:04 | 18:56 |
| 00:21:17 | 00:40:05 | 18:38 |
| 00:40:17 | 00:58:58 | 18:41 |
The next series of test were run with the innodb buffer set to one gig.
importing database3 with mysql -u root -p database3 < /data2/database3.sql
| start time | stop time | elapsed time |
| 04:35:09 | 04:58:28 | 23:19 |
| 04:58:37 | 05:22:01 | 23:24 |
| 05:22:11 | 05:45:55 | 23:44 |
| 05:46:05 | 06:09:42 | 23:37 |
| 06:09:52 | 06:33:31 | 23:39 |
| 06:33:40 | 06:57:22 | 23:42 |
| 06:57:32 | 07:21:17 | 23:45 |
| 07:21:26 | 07:45:07 | 23:41 |
| 07:45:16 | 08:08:56 | 23:40 |
| 08:09:06 | 08:32:43 | 23:37 |
importing database3 with mysqlimport -u root -p database3 /data2/database3/*.txt
| start time | stop time | elapsed time |
| 08:32:53 | 08:49:27 | 16:34 |
| 08:49:35 | 09:06:17 | 16:42 |
| 09:06:28 | 09:23:00 | 16:32 |
| 09:23:10 | 09:39:46 | 16:36 |
| 09:39:57 | 09:56:35 | 16:38 |
| 09:56:45 | 10:13:25 | 16:40 |
| 10:13:35 | 10:30:04 | 16:29 |
| 10:30:15 | 10:46:49 | 16:34 |
| 10:46:59 | 11:03:36 | 16:37 |
| 11:03:47 | 11:20:25 | 16:48 |
The results are interesting. It does prove that the mysqlimport works faster. Quite a bit faster actually. I thought that increasing the memory for the innodb buffer would speed things up also. This was not the case and I am not sure why. I didn’t change anything else. Running the top command didn’t show any apparent problems but I am guessing that a two gigabytes buffer was just to much for the setup.
5 commentsFantastic New Tool!!!!!
I just looked at the finished product of a co-worker. It solves a troubling problem that any DBA will come up against from time to time. The tool is called Querysniffer and it grabs MySQL queries from “the wire”. It does this in real time. You don’t have to stop/start your MySQL server or have to enable the general query log. You can start and stop the Querysniffer at any time.
Massively cool in my opinion!!
Querysniffer is available from http://iank.org/querysniffer. It is written out of perl and is available as both a perl script and a pre-made executable.
As an example I run the program on the server that MySQL is running:
kmurphy@db5-a:~$ sudo mysqlsniff.x64 > output
This must be run from sudo or as root. In this case we send the results to the file output. You can also specify an Ethernet port to grab queries. This is done by adding an eth0 or eth1 before specifying the output file:
kmurphy@db5-a:~$ sudo mysqlsniff.x64 eth0 > output
Why would you want to do this? If you are running Querysniffer on a server that is attached to a SAN then you should probably specify the interface that is not attached to the SAN.
Thanks for the hard work Ian. You have made the lives of a lot MySQL DBAs a little easier!!!
1 commentSupport Your Local MUG
And I don’t mean mugger…your MUG. That would be your MySQL User’s Group. I am fortunate in that when I moved from Pensacola, FL to here in the Raleigh/Durham area of North Carolina that we do have a MUG. I went to my first meeting last Wensday night. I am doubly fortunate in that the company that I work for, iContact, provides a conference room for the meeting. So, I have to walk about 300 feet from my desk to the conference room. Mike Kimsal does a fantastic job heading up our group.
We meet with the PHP Users Group (..a PUGMUG??) and during this meeting we had a gentlemen do a presentation on some tools that Yahoo provides for PHP developers. It is great to meet with people who care about MySQL. I wonder how many people there are like me…running around all day neck-deep in MySQL and whenever I want to talk to someone about the technology I care about the most..they get this blank look on their face like they are thinking “Who is this crazy guy?” Ah well, at least once a month I have an opportunity to talk with people who care about MySQL. If you have a group in your area — join it!! If you don’t have a group in your area..start one! You can search for one near you at http://mysql.meetup.com/.
Have fun at the MUG meeting!
Do You Need a SANdwich?? Part II
So, drumroll…here are the results of testing:
In the quantitative testing I tested against an internal hard drive (formatted with ext3 and then reiserfs), a partition on the Coraid formatted with ext3 and a partition on the Coraid formatted with reiserfs. I also tested with a failed hard drive in the Coraid (while it was rebuilding).
/dev/sda reiserfs (mirrored)
| /testing (sdb) reiserfs write (Kbytes/s) | /testing (sdb) reiserfs read (Kbytes/s) |
| 17408 | 64646 |
| 17118 | 64368 |
| 17140 | 64339 |
| 16347 | 64451 |
| 17080 | 64510 |
| 17055 | 64464 |
| 17024.67 (AVG) | 64463 (AVG) |
/dev/sdb1 ext3
| /testing (sdb) ext3 write (Kbytes/s) | /testing (sdb) ext3 read (Kbytes/s) |
| 41196 | 54154 |
| 40455 | 54078 |
| 40612 | 54246 |
| 40620 | 54229 |
| 41053 | 54160 |
| 40198 | 54217 |
| 40698 (AVG) | 54180.67 (AVG) |
/dev/sdb1 reiserfs
| /testing (sdb) reiserfs write (Kbytes/s) | /testing (sdb) reiserfs read (Kbytes/s) |
| 37041 | 34897 |
| 36559 | 34650 |
| 37201 | 34823 |
| 36867 | 34487 |
| 36420 | 34835 |
| 35064 | 34788 |
| 36525.33 (AVG) | 34746.67 (AVG) |
/dev/sdb1 reiserfs (noatime,notail)
| /testing (sdb) reiserfs write (Kbytes/s) | /testing (sdb) reiserfs read (Kbytes/s) |
| 37116 | 35243 |
| 36916 | 35246 |
| 36587 | 35051 |
| 36467 | 35058 |
| 37102 | 34999 |
| 36690 | 35221 |
| 36813 (AVG) | 35136.33(AVG) |
Coraid w/ext3 filesystem, switch and no jumbo frames
| /data – ext3 write (Kbytes/s) | /data – ext3 read (Kbytes/s) |
| 50266 | 59311 |
| 46901 | 77020 |
| 46478 | 76883 |
| 48248 | 76896 |
| 49829 | 76969 |
| 49381 | 74925 |
| 48517.17 (AVG) | 73667.33 (AVG) |
Coraid w/reiserfs filesystem, switch and no jumbo frames
| /data2 – reiserfs write (Kbytes/s) | /data2 – resierfs read (Kbytes/s) |
| 59311 | 73519 |
| 58776 | 74742 |
| 52935 | 73526 |
| 59458 | 75665 |
| 56600 | 75375 |
| 60726 | 75814 |
| 57967.67 (AVG) | 74773.5 (AVG) |
Coraid w/reiserfs filesystem, no switch and jumbo frames
| Jumbo Frames | Direct Connect |
| /data2 – reiserfs write (Kbytes/s) | /data2 – resierfs read (Kbytes/s) |
| 63894 | 99924 |
| 62547 | 96795 |
Coraid w/reiserfs filesystem, switch and jumbo frames
| Jumbo Frames | Switch Connect |
| /data2 – reiserfs write (Kbytes/s) | /data2 – resierfs read (Kbytes/s) |
| 62328 | 95355 |
| 64189 | 98888 |
| 60961 | 96272 |
| 63776 | 95773 |
| 64258 | 97866 |
| 62483 | 98469 |
| 62999.17 (AVG) | 97103.83 (AVG) |
Coraid w/ext3 filesystem, switch and jumbo frames
| Jumbo Frames | Switch Connect |
| /data - ext3 write (kbytes/s) | /data - ext3 read (Kbytes/s) |
| 58595 | 101549 |
| 58145 | 98725 |
| 58344 | 100478 |
| 59689 | 102201 |
| 59244 | 101434 |
| 63951 | 99432 |
| 59661.33 (AVG) | 100636.67 (AVG) |
Coraid w/ext3 filesystem, switch, jumbo frames and degraded raid
| Jumbo Frames | Switch Connect |
| /data - ext3 write (kbytes/s) | /data - ext3 read (Kbytes/s) |
| 12102 | 23846 |
| 27402 | 24224 |
| 3916 | 28248 |
| 15560 | 45066 |
| 38592 | 53528 |
| 30365 | 30616 |
| 21322.83 (AVG) | 34254.67 (AVG) |
Coraid w/reiserfs filesystem, switch, jumbo frames and degraded raid
| Jumbo Frames | Switch Connect |
| /data2 - reiserfs write (kbytes/s) | /data2 - reiserfs read (Kbytes/s) |
| 38317 | 42851 |
| 42463 | 43326 |
| 31546 | 51028 |
| 32177 | 24707 |
| 33981 | 26320 |
| 34643 | 27002 |
| 35521.17 (AVG) | 35872.33 (AVG) |
Coraid w/ocfs2 filesystem, switch and jumbo frames
| Jumbo Frames | Switch Connect |
| /data3 - ocfs2 write (kbytes/s) | /data3 - ocfs2 read (Kbytes/s) |
| 36849 | 83958 |
| 40973 | 84855 |
| 38983 | 83853 |
| 40748 | 85188 |
| 39143 | 83710 |
| 40535 | 83146 |
| 39538.5 (AVG) | 84118.33 (AVG) |
Qualitative Testing Results
For the qualitative testing I loaded up a stock binary install of MySQL 5.0. The database (icengine3_2) was loaded from a dump of a production server. I used mybench to execute a query against the database pulling random records from one of the tables. The number of connections utilized are included below with the summary data for each test run. To establish a baseline I ran the test against the MySQL server pulling data off the interal drives (mirrored reiserfs) and a single drive dedicated only to MySQL data. Then I ran the tests against data loaded onto ext3 and reiserfs Coraid partitions. Nothing else was happening to the the Coraid. This would be similar to a single server attached to the Coraid. Then I ran the test against both the ext3 and reiserfs partitions on the Coraid with a throughput test being performed on the partition not being used to hold the mysql data. As an example: if the ext3 partition was holding the data for the test than the reiserfs partition of the Coraid had a throughput (both read and write) test being performed at the same time. This would more closely simulate what it would be like with multiple servers attached to different partitions of the Coraid. Finally, these tests were repeated while one drive of the Coraid was being rebuilt.
/dev/sda drive w/reiserfs (mirrored)
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 2429 | 2862 | 2736 | 2636 | 2417 |
| 2997 | 2874 | 2747 | 2631 | 2416 |
| 3081 | 2851 | 2758 | 2634 | 2414 |
| 3069 | 2841 | 2759 | 2646 | 2413 |
| 3104 | 2869 | 2750 | 2640 | 2417 |
| 3085 | 2862 | 2773 | 2631 | 2415 |
| 3097 | 2853 | 2771 | 2632 | 2410 |
| 3068 | 2855 | 2755 | 2629 | 2414 |
| 3078 | 2858 | 2767 | 2621 | 2412 |
| 3113 | 2842 | 2753 | 2641 | 2420 |
| 3012 (AVG QPS) | 2856 (AVG QPS) | 2756 (AVG QPS) | 2634 (AVG QPS) | 2414 (AVG QPS) |
/dev/sdb1 drive w/reiserfs (noatime, notail)
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 2236 | 2876 | 2773 | 2641 | 2416 |
| 2944 | 2886 | 2776 | 2640 | 2417 |
| 3097 | 2872 | 2764 | 2644 | 2428 |
| 3085 | 2874 | 2772 | 2639 | 2420 |
| 3110 | 2862 | 2755 | 2636 | 2422 |
| 3117 | 2883 | 2780 | 2647 | 2425 |
| 3092 | 2859 | 2773 | 2639 | 2422 |
| 3101 | 2880 | 2783 | 2644 | 2418 |
| 3143 | 2837 | 2755 | 2641 | 2427 |
| 3147 | 2861 | 2762 | 2640 | 2418 |
| 3007 (AVG QPS) | 2869 (AVG QPS) | 2769 (AVG QPS) | 2641 (AVG QPS) | 2421 (AVG QPS) |
/dev/sdb1 drive w/ext3
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 2294 | 2840 | 2758 | 2636 | 2422 |
| 2986 | 2853 | 2750 | 2633 | 2413 |
| 3059 | 2857 | 2742 | 2634 | 2415 |
| 3089 | 2867 | 2769 | 2629 | 2418 |
| 3064 | 2852 | 2747 | 2629 | 2416 |
| 3080 | 2854 | 2765 | 2633 | 2416 |
| 3083 | 2848 | 2751 | 2626 | 2417 |
| 3058 | 2852 | 2740 | 2617 | 2409 |
| 3043 | 2841 | 2754 | 2640 | 2416 |
| 3096 | 2894 | 2747 | 2633 | 2418 |
| 2985 (AVG QPS) | 2855 (AVG QPS) | 2752 (AVG QPS) | 2631 (AVG QPS) | 2416 (AVG QPS) |
Coraid w/reiserfs filesystem, switch, jumbo frames
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 3170 | 2873 | 2754 | 2625 | 2410 |
| 3183 | 2862 | 2769 | 2630 | 2413 |
| 3169 | 2870 | 2759 | 2608 | 2410 |
| 3143 | 2867 | 2751 | 2621 | 2410 |
| 3165 | 2881 | 2739 | 2622 | 2411 |
| 3172 | 2900 | 2746 | 2621 | 2420 |
| 3195 | 2908 | 2758 | 2628 | 2436 |
| 3176 | 2923 | 2760 | 2631 | 2438 |
| 3223 | 2899 | 2761 | 2630 | 2439 |
| 3189 | 2920 | 2754 | 2625 | 2436 |
| 3178 (AVG QPS) | 2890 (AVG QPS) | 2755 (AVG QPS) | 2624 (AVG QPS) | 2422 (AVG QPS) |
Coraid w/ext3 filesystem, switch, jumbo frames
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 3061 | 2849 | 2714 | 2572 | 2361 |
| 3070 | 2812 | 2715 | 2573 | 2364 |
| 3051 | 2824 | 2711 | 2577 | 2363 |
| 3060 | 2831 | 2695 | 2576 | 2367 |
| 3026 | 2819 | 2713 | 2561 | 2364 |
| 3072 | 2822 | 2707 | 2567 | 2362 |
| 3088 | 2813 | 2706 | 2564 | 2363 |
| 3031 | 2815 | 2702 | 2566 | 2363 |
| 2966 | 2845 | 2704 | 2581 | 2365 |
| 3092 | 2832 | 2699 | 2568 | 2355 |
| 3051 (AVG QPS) | 2826 (AVG QPS) | 2706 (AVG QPS) | 2570 (AVG QPS) | 2362 (AVG QPS) |
Coraid w/ext3 filesystem, switch, jumbo frames
These tests were run while iozone was running against the reiserfs partition on the Coraid.
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 2989 | 2837 | 2749 | 2637 | 2418 |
| 3077 | 2850 | 2747 | 2633 | 2421 |
| 3093 | 2857 | 2750 | 2629 | 2420 |
| 3137 | 2870 | 2746 | 2635 | 2417 |
| 3083 | 2880 | 2743 | 2631 | 2422 |
| 3015 | 2853 | 2745 | 2640 | 2420 |
| 2938 | 2857 | 2749 | 2640 | 2424 |
| 3087 | 2849 | 2747 | 2635 | 2418 |
| 3063 | 2890 | 2749 | 2646 | 2421 |
| 2748 | 2841 | 2764 | 2639 | 2423 |
| 3023 (AVG QPS) | 2858 (AVG QPS) | 2748 (AVG QPS) | 2636 (AVG QPS) | 2420 (AVG QPS) |
Coraid w/ext3 filesystem, switch, jumbo frames
These tests were run while iozone was running against the reiserfs partition on the Coraid. In addition the Coraid had a degraded array.
| 10 connections | 20 connections | 40 connections | 100 connections | 400 connections |
| 3053 | 2844 | 2726 | 2621 | 2446 |
| 3060 | 2845 | 2728 | 2624 | 2421 |
| 3061 | 2837 | 2746 | 2636 | 2420 |
| 3117 | 2828 | 2734 | 2640 | 2419 |
| 3018 | 2855 | 2740 | 2625 | 2421 |
| 3001 | 2934 | 2731 | 2623 | 2420 |
| 3177 | 2792 | 2753 | 2626 | 2425 |
| 3006 | 2889 | 2744 | 2619 | 2428 |
| 3001 | 2844 | 2752 | 2622 | 2418 |
| 3052 | 2861 | 2733 | 2614 | 2412 |
| 3054 (AVG QPS) | 2852 (AVG QPS) | 2738 (AVG QPS) | 2625 (AVG QPS) | 2423 (AVG QPS) |
Conclusions
I think this conclusively proves that the Coraid is capable of handling multiple servers while maintaining reasonable throughput. Even with a degraded drive the actual (read) performance of the MySQL server does not suffer. Heavy write performance to a database would suffer as the write throughput during a rebuild of an array is roughly half of the normal throughput. As for the question about using reiserfs or ext3 as a filesystem the performance numbers are close enough that it would be wise to consider that reiserfs has better functionality under the LVM system that we use on our hard drives. Currently there are six drives (five active) in the Coraid. Increasing the drive count in the coraid will also improve the throughput. According to the numbers released by Coraid it should be a fairly dramatic increase (on the order of 100% faster write performance with a full complement of 14 drives vs the current complement of six drives). Of course increasing the drive count will increase the number of platters and spindles so it would be expected. I also tried bonding two Ethernet ports. This did not increase throughput.
With Ian’s help I did test the Oracle clustering filesystem. At this point it is really to fragile to consider using. In addition, the performance throughput testing that I did perform indicated that it was going to be significantly slower than both ext3 and the reiserfs. While we have to partition off the Coraid and dedicate each specific parition to a server - I think that this is certainly justified for more a more reliable filesystem that gives better performance.
Jumbo frames make a difference. I proved early in the testing that just configuring the switch and the ethernet card for jumbo frames increases raw throughput by around 20 Mbytes a second. NOT ALL ETHERNET CARDS SUPPORT MTUS ABOVE 1500!!! Check with the vendor before purchasing to see if this is supported.
Thanks to both Ian and Justin for their help with LVM, the ocfs and general system crap
2 commentsDo You Need a SANdwich??
I just wrapped up a test to determine if it would be possible to implement a SAN as a backend for our data storage. For those who do not know - SAN stands for Storage Area Network. It is a close relative to a NAS (Network Area Storage) unit. Basically it a fancy RAID unit that you plug into your network instead of directly into the server itself.
I tested a storage unit by Coraid (http://www.coraid.com). The model was an SR1521. It will take up to 14 drives. I actually did the testing with six drives installed. Before I get into some details about the testing itself let me mention a couple of things about Coraid itself. I think Coraid has built a great product. I am not completely certain, but if I recall Coraid is the inventor of the AoE (ATA over Ethernet) protocol. It is a protocol that allows very fast communication between the server and the SAN unit. The AoE protocol and driver are open and the source code for the driver is available for download from the Coraid website. I emailed the company a couple of times during testing with various questions. They responded quickly and were helpful with my problems. From a company perspective I have absolutely no qualms recommending them. When doing any type of benchmarking it is important that you change as little as possible between test. If I was trying to determine what various changes to a my.cnf file might make I would not make four, three, or even two changes to the my.cnf file and run my benchmarking program. I would make one change and run my benchmarking program. Otherwise, how will I know what is happening? Adding 512MB to the Innodb buffer pool might improve performance by 20%. But, adding the general query log option might drop performance 10%. I have a net gain of 10% changing both variables but I would not realize what was really going on under the hood.
I used two primary tools when doing my benchmarking. For basic throughput (both read and write) testing I used the iozone (http://www.iozone.org)program. I used this to run basic write and read tests against an individual hard drive, a software mirrored hard drive, and then partitions formatted with ext3, reiserfs and the oracle clustering file system (yes, I know oracle..laugh) mounted on the Coraid unit. I also used the mybench program (http://jeremy.zawodny.com/mysql/mybench/) of Jeremy Zawodny. This tool will let you take a query and run it against a database many times and with multiple clients. I used it to run test of 10 simultaneous connections through 400 simultaneous connections.
Next time I will talk about the actual test results.
No comments