Archive for November, 2007
MySQL Users Group Meeting - Raleigh, NC - December 12 - Jay Pipes Speaking
On December the 12th, our local MUG is having the honor of hosting Jay Pipes as the speaker for our meeting. I personally am looking forward to meeting him as he has obviously had a great impact on the MySQL community with his work. If you are in the area, or anywhere within driving distance, I would recommend that you come by and join the meeting!!
If you can make it and are not a normally Triangle MUG person you will want to let us know so we can plan on how many people are going to be here. Please email me at “bmurphy” at “paragon-cs.com” to let me know and I will pass it along to the powers-that-be
Meeting time is 7:00 pm and it will go until 8:30 pm.
QPP Update and New Tutorial
We made a minor update to the QPP toolkit today (”ver. 1.01″). I uploaded the new files to the website. They can be found at http://www.paragon-cs.com/queryprogs. In addition, I have beginnings of a tutorial located at http://www.paragon-cs.com/queryprogs/tutorial.html. It will make this a bit easier to understand.
A couple of people have asked how the QPP toolkit is different than other programs. Please give me a minute to try and explain. QPP can be used to capture queries being executed on a production server at any time. In addition, it will take the captured queries and insert the text of those queries plus timing and host information into a database which you can then use to determine where your servers are spending time executing queries. Finally, you can take that same data that you gathered from a production server and use to generate benchmarking and load tests against a test server. If set up properly these tests can be repeated as many times as needed.
Hope that helps!!
No commentsQuery Processing Programs — Version 1.0!!!
Hey everyone. I just finished uploaded the latest versions of the three QPP programs. They are all working well and I consider them “version 1.0″. I will be updating the qpp pages as soon as possible with a tutorial on how all these programs work together.
The basic idea is that you use the querysniffer program to gather production data from your servers as needed without stopping/starting the MySQL server. Queryparse is actually used to control querysniffer and push the gathered data into a database from which you can use simple php/html pages to view the statistics.
Querybench is used to benchmark/stress-test a test server with production data (gathered, once again, with querysniffer). You can even set it up so that the database can be “rolled back” to its initial state after a run.
All in all, I am very pleased. I have fulfilled all the goals I set out to achieve when I began this little journey a few months ago. It will be very useful in my work and I hope others find it useful as well.
I would like to publicly thank Ian Kilgore who helped far above and beyond anything required. He is one of the system administrators where I work and is the genius programmer behind this stuff. I just had the idea and kind of directed things along. The only “programming” I have done is working on the documentation a bit and developing the web pages for the display of the statistics.
Feedback can be sent to “bmurphy” at “paragon-cs.com”.
Now what do I with all this free time I have left over *cough*?
No commentsA Long Time Coming
So I recently came across a situation at work where I need to copy almost 150 million rows of data from a MyISAM table to an Innodb table. Here is the problem that I started thinking of approximately 130 million rows (and ten or eleven hours) into the transfer: what happens if the query gets aborted?
Having had large transfer die once before I know from experience it isn’t pretty. The way I ended up stopping the rollback the first time was to kill the mysqld daemon and rip all the data of the database out and restore the database after restarting the server. I could be so heavy handed during this recovery because it was a test server. Not something that you could do in production.
If I had to cancel the query I ran late last week it would have taken days for the transaction to rollback. So how could I keep a rollback like this from happening– or at least minimize it?
Here is what I ran:
INSERT INTO new_table SELECT * FROM old_table
Of course this selects everything from one table and copies it to a new table. What came to mind when I started thinking about this was:
INSERT INTO new_table SELECT * FROM old_table LIMIT X OFFSET Y
I posted a question to the mysql mailing list and got similar comments. I thought I might be on the right track.
So this seems to work. BUT…I am not convinced that it would. This is similar to the replication issue that Baron Schwartz has talked about this here:
http://www.xaprb.com/blog/2007/11/08/how-mysql-replication-got-out-of-sync/
Baron was having trouble because an ORDER BY….LIMIT statement was returning non-deterministic results. I think something similar could go on here.
First, let me say that I am in no way a “SQL expert”. I don’t pretend to be one. However, follow me for a minute. And feel free to comment :). When you execute a SELECT such as the one above against a database, the server is in no way constrained to give you the results in a deterministic order. When you run the above select statement multiple times with different offsets (to “chunk” the output so it’s easier to roll back) you aren’t guaranteed to get the proper results. At least that is what I understand.
I may be mistaken :). Be gentle if I am. Remember, I am not a SQL guru.
Another recommendation was to do something like the following:
INSERT INTO new_table SELECT * FROM old_table WHERE user_id mod [num_of_passes] = 0, 1, 2 . . . (num_of_passes - 1)
I believe that user_id would have to be a unique key. This query would return completely consistent results. I am not sure what you would do if you didn’t have a unique key to run the WHERE clause against.
Just some food for thought
****added…as Paul pointed out this is valid only as long as no inserts/updates/deletes are performed on the SELECTed table. I was operating under that assumption (the table I was reading from was not being written too) but forgot to mention it. Thanks Paul!!
7 commentsMySQL Magazine - Format/Policy Change??
Hey everyone. I am seriously thinking about changing somewhat the format of the magazine. So far I have been able to pull together enough articles each time to publish but I would like to look at an additional way to get material.
There are many high-quality blog entries written every month about MySQL. I would like to (with permission of the author) edit some of them for grammar and any clarifications needed and publish them as articles. I am already planning on doing this with two of Peter Zaitsev’s blog postings on Innodb tuning. Combining those two posts is going to make for a fantastic article. I have already discussed this with him and he agreed to a trial.
Will I still accept articles written strictly for the magazine? Absolutely!! This just means I don’t have to go around twisting peoples arms :).
Thoughts? Comments?
No commentsmysql-parallel-dump test
I just wrapped up a test of the new mysql-parallel-dump and mysql-parallel-restore that Baron Schwartz has released as part of his toolkit. The toolkit is available from http://mysqltoolkit.sourceforge.net
I tested on an 8-core (dual processor/quad-core) server with 8 gigabytes of RAM. I baselined with the normal mysqldump command (no options) and then benchmarked variations of both the mysqldump command and the mysqlimport command against the mysql-parallel-dump and mysql-parallel-restore command.
The database tested was under two gigs in size so it wasn’t an ideal test but I was just trying to get an idea of how I can expect it to operate.
The results:
Test One: Parallel Dump & Restore
rm -rf /home/kmurphy/pdump/default/*time mysql-parallel-dump –databases pdump –tab –basedir=/home/kmurphy/pdump mysql -e ‘drop database pdump’ mysql -e ‘create database pdump’
| run time | |
| run one | 0m22.166s |
| run two | 0m22.348s |
| run three | 0m21.625s |
| run four | 0m22.723s |
| run five | 0m16.744s |
| run six | 0m24.603s |
| run seven | 0m16.647s |
| run eight | 0m21.449s |
| run nine | 0m18.590s |
| run ten | 0m21.975s |
| average | |
| 20.887s |
time mysql-parallel-restore –tab /home/kmurphy/pdumpmysql -e ‘reset master’
| run time | |
| run one | 4m3.048s |
| run two | 4m24.352s |
| run three | 4m25.403s |
| run four | 4m8.525s |
| run five | 4m18.319s |
| run six | 4m7.668s |
| run seven | 4m32.558s |
| run eight | 4m29.318s |
| run nine | 4m10.424s |
| run ten | 4m53.700s |
| average | |
| 4m21.332s |
Test Two: Parallel Dump & Restore w/Chunk Size of 100,000
rm -rf /home/kmurphy/pdump/default/*time mysql-parallel-dump –databases pdump –tab –chunksize=100000 –basedir=/home/kmurphy/pdump mysql -e ‘drop database pdump’ mysql -e ‘create database pdump’
| run time | |
| run one | 0m16.207s |
| run two | 0m14.899s |
| run three | 0m14.988s |
| run four | 0m15.331s |
| run five | 0m15.112s |
| run six | 0m16.124s |
| run seven | 0m15.902s |
| run eight | 0m14.763s |
| run nine | 0m14.903s |
| run ten | 0m15.261s |
| average | |
| 15.394s |
time mysql-parallel-restore –tab /home/kmurphy/pdumpmysql -e ‘reset master’
| run time | |
| run one | 4m27.080s |
| run two | 4m43.456s |
| run three | 4m32.554s |
| run four | 4m27.986s |
| run five | 5m3.333s |
| run six | 4m48.599s |
| run seven | 4m14.181s |
| run eight | 5m7.383s |
| run nine | 4m44.915s |
| run ten | 5m3.911s |
| average | |
| 4 min 39.34s |
Test Three: Default Dump & Restore
time mysqldump –databases pdump > /data/backup/testing/normal/dumptest.sql
| run time | |
| run one | 0m38.788s |
| run two | 0m43.511s |
| run three | 0m39.406s |
| run four | 0m41.640s |
| run five | 0m40.168s |
| run six | 0m39.696s |
| run seven | 0m43.360s |
| run eight | 0m43.546s |
| run nine | 0m43.370s |
| run ten | 0m42.890s |
| average | |
| 41.638s |
time mysql < /data/backup/testing/normal/opt/dumptest.sqlmysql -e ‘reset master’
| run time | |
| run one | 9m33.572s |
| run two | 9m32.136s |
| run three | 9m28.766s |
| run four | 9m32.804s |
| run five | 9m31.367s |
| run six | 9m32.018s |
| run seven | 9m33.421s |
| run eight | 9m30.748s |
| run nine | 9m34.836s |
| run ten | 9m29.171s |
| average | |
| 9m31.884s |
Test Four: Optimized standard dump with grouped insert statements and restore
time mysqldump –opt -e –databases pdump > /home/kmurphy/dump/dumptest.sql
| run time | |
| run one | 0m43.553s |
| run two | 0m44.514s |
| run three | 0m39.177s |
| run four | 0m41.644s |
| run five | 0m41.356s |
| run six | 0m40.577s |
| run seven | 0m39.982s |
| run eight | 0m39.553s |
| run nine | 0m41.970s |
| run ten | 0m41.008s |
| average | |
| 41.333s |
time mysql < /home/kmurphy/dump/dumptest.sqlmysql -e ‘reset master’
| run time | |
| run one | 9m17.282s |
| run two | 9m28.965s |
| run three | 9m29.265s |
| run four | 9m35.659s |
| run five | 9m31.757s |
| run six | 9m32.542s |
| run seven | 9m27.294s |
| run eight | 9m28.672s |
| run nine | 9m32.159s |
| run ten | 9m27.542s |
| average | |
| 9m29.114s |
Test Five: Optimized mysqldump and restore w/tab delimited files
time mysqldump -Q -T /home/kmurphy/dump/pdump –opt –single-transaction pdumpcat /home/kmurphy/dump/pdump/*.sql > /home/kmurphy/dump/pdump/xyz.sql time mysql pdump < /home/kmurphy/dump/pdump/xyz.sql
| run time | |
| run one | 38.188s |
| run two | 33.321s |
| run three | 33.663s |
| run four | 33.096s |
| run five | 33.268s |
| run six | 33.203s |
| run seven | 33.352s |
| run eight | 33.073s |
| run nine | 37.682s |
| run ten | 33.196s |
| average | |
| 34.204s |
time mysqlimport pdump /home/kmurphy/dump/pdump/*.txtrm /home/kmurphy/dump/pdump/xyz.sql mysql -e 'reset master'
| run time | |
| run one | 5.348s + 6m10.926s |
| run two | 4.905s + 6m11+366s |
| run three | 6.770s + 6m11.556s |
| run four | 5.859s + 6m5.699s |
| run five | 6.066s + 6m10.076s |
| run six | 6.478s + 6m6.743s |
| run seven | 5.422s + 6m7.818s |
| run eight | 6.055s + 6m14.252s |
| run nine | 5.016s + 6m11.458s |
| run ten | 6.730s + 6m16.509s |
| average | |
| 6m16.505s |
Test Six: Optimized mysqldump and restore w/tab delimited files and extended inserts
time mysqldump -Q -T /home/kmurphy/dump/pdump --opt --single-transaction -e pdump
| run time | |
| run one | 33.164s |
| run two | 33.463s |
| run three | 33.239s |
| run four | 33.182s |
| run five | 33.140s |
| run six | 33.596s |
| run seven | 33.115s |
| run eight | 33.063s |
| average | |
| 33.245s |
cat /home/kmurphy/dump/pdump/*.sql > /home/kmurphy/dump/pdump/xyz.sqltime
time mysql pdump < /home/kmurphy/dump/pdump/xyz.sql
time mysqlimport pdump /home/kmurphy/dump/pdump/*.txtrm /home/kmurphy/dump/pdump/xyz.sql
mysql -e 'reset master'
| run time | |
| run one | 5.846s + 6m12.939s |
| run two | 5.195s + 6m10.240s |
| run three | 6.221s + 6m9.272s |
| run four | 5.431s + 6m10.620s |
| run five | 6.297s + 6m15.142s |
| run six | 5.740s + 6m12.571s |
| run seven | 6.551s + 6m15.873s |
| run eight | 6.293s + 6m11.707s |
| average | |
| 6m18.242s |
The results are quite impressive. The mysql-parallel-dump with chunksize option that I ran didn’t seem to make any difference. Don’t be fooled as this was on a smaller data set. If you have a multi-gigabyte table with millions of rows in innodb format this could make a huge difference. Don’t underestimate it.
The clear winner is the mysql-parallel-dump paired with mysql-parallel-restore. The parallel import/dump over a normal mysqldump w/various options is around twice as fast. Using mysqlimport with mysqldump brings the time difference down to around two minutes but it is still fairly significant. With a larger dataset I think the difference would become more pronounced.
In short, you should strongly consider using the parallel dump and restore tool for your backup needs if you are currently using mysqldump.
2 commentsflush hosts command
We had a little fun yesterday. We had a mysql server that stopped because it turns out that it ran out of disk space in the /var directory. Someone was uploading a large list and it exhausted the disk space writing bin logs before our monitoring tools picked it up.
The server process ended up being killed and then restarted. I then cleaned up some bin-logs so that the server had room. That seemed to solve the problem. Everything looked like it was running. However, after about 15 or 20 minutes we realized that the application itself wasn’t running correctly.
Turns out that the application had continued to try and make connections to the database server while it was experiencing problems. These attempted connections were being logged as connection errors and after the default 10 connection errors the server essentially locks down.
The reference manual lists information here:
http://dev.mysql.com/doc/refman/5.0/en/blocked-host.html
We are going to raise the connection error count on our servers to keep this from happening in the future. Consider this carefully if your servers are exposed to the Internet (ours aren’t).
If you run the flush hosts command it will resolve this problem. I am not sure if restarting the mysqld daemon would have resolved the problem (I think it would have). Even though we had restarted the daemon, it was before clearing the log files so failed connections continued to occur.
One final note…in our case the error described in the reference manual did not show up in the error log. So don’t expect it to be there if this problem occurs.
2 commentsRaleigh MySQL Users Group
Wednesday, Nov. 7, is the next meeting of the Raleigh/Triangle area MySQL users group. More information available here: http://mysql.meetup.com/144/
Come join the fun!! Meeting starts at 6:30 pm and ends around 8:15 pm.
No comments