Diamond Notes

Just another WordPress weblog

Archive for September, 2007

Log Buffer #64: a carnival of the Vanities for DBAs

Welcome to Log Buffer #64. For those who haven’t come across my little patch of the Internet - I’m Keith Murphy. I am the MySQL DBA at iContact, Inc and the editor of MySQL Magazine. I do my best to contribute to the MySQL community any way I can. Special thanks to the Pythian Group for giving me the honor of writing this week’s buffer. Thanks for stopping by!!

MySQL

The guys over at MySQL Performance Blog - the high performance MySQL gurus - reported that a important bug for was fixed (currently only for 5.1.22 -which isn’t released yet). The bug was an issue with the auto-increment field for an InnoDB table. This causes serious scalability issues with Innodb so is a welcome fix. No known information on if this will be back-ported.

One experienced Oracle DBA who is transitioning to MySQL blogs about his nine tips for new MySQL DBAs. They are quite enlightening. Along with that, the new DBA should take a look at George J. Trujillo Jr.’s blog entry for “Top Ten Things to do Before Installing MySQL”. For the more experienced DBA who is planning for the scaling of their systems — stop by and take a look at thoughts on the capacity planning of a MySQL DBA.

This week was the MySQL Developers conference in Heidelburg, Germany. A number of the developers blogged about it including Colin Charles, Paul McCullagh and Lukas Smith. They discuss upcoming features including my favorite - the coming online backup in MySQL 5.2 (or whatever the version after 5.1.x ends up being called).

Postgres

This week for the world of Postgres brings the introduction of a new tool called Bucardo. It brings asynchronous multi-master replication system for Postgres. Greg is the developer of the tool and blogs about it here. Magnus Hagander’s blog has a good introduction to the integrated security features of Postgres and how it can integrate with both Windows and Unix clients. Take a look here. And finally, for the Postgres fans, is a cool tip from </depesz> about a nice way to access Posgres documentation from Firefox quickly and easily.

Oracle

In Oracle news, Kris Buytaert points out that Oracle still takes a solid hour to install. It is ironic that I also remember when Oracle first started shipping a Linux product and how they distributed CDs (with Linux Magazine?) like AOL “back in the day”. It was either in 1998 or 1999 and I had just started using MySQL for database functions at the dial-up ISP I started.

Want to participate in some Oracle surveys? Doug Burns points to a couple that Oracle DBAs can participate in.

Microsoft SQL Server

The PASS conference in Denver, CO generated quite the buzz in the MS SQL Server blog-o-sphere. And even though the release of SQL Server 2008 is some time off (Q2 2008) the next version of SQL Server is building buzz. The PASS conference was covered in detail at http://www.mssqltips.com and pointed out by Andy Warren’s blog. More “views and opinions PASS” and maybe a prescription for the ailing DBA are available from the SQL Doctor here. One of the speakers at the conference — the Crazy DBA — even posted about their experience speaking. Worth reading by anyone who is looking to be a “first timer” speaking at a conference. More thoughts on the conference from Ken Henderson.

Who says that DBAs don’t have a sense of humor? With a name like “Patron Saint of Lost Yaks” there is at least one DBA with a sense of humor. In addition to the humor, they have a good introduction to horizontal partitioning in SQL Server on his/her blog here. Also, the PSoLY brings us a cool tip about inserting binary data . Wrapping things up for the company from Redmond is a blog entry by SimonS about how the SSIS date package isn’t as accurate as you might think (and a solution to the problem).

DB2

DB2 Magazine has a nice blog entry about effectively indexing DB2 databases. Susan Visser has an interesting blog entry about a new book Understanding DB2 9 Security. In addition she has an interesting statistic I will leave to the gentle reader to find. She also announces the chance for a DB9 DBA to win a free trip to a conference.

According to Daniel Krook there is a fairly substantiated rumor that IBM is porting DB2 to the Intel Mac.

And always remember - the views and opinions expressed here are my views and opinions and not necessarily the views or opinions of my employer.

1 comment

All MySQL’ers should use KDE!!

Ok…file this under “absurd” but I just couldn’t pass it up…

In the next version of KDE (4.0) due to be released on December 11 there is an new file manager:

http://en.wikipedia.org/wiki/Dolphin_%28software%29

Have a happy Wednesday!!

No comments

Adding a SQL node to a cluster

Just a quick post. This information is fairly readily available but more information is better right?

Adding a SQL node to a cluster is a fairly simple task:

Step 1: Add the new SQL node to the config.ini file on the management node.
Step 2: Make sure that the SQL node is configured identically (i.e. my.cnf) to the other cluster SQL nodes,
Step 3: Issue the ’stop’ command from the ndb_mgm console and then do a normal restart of the cluster: management node first, data nodes second and then SQL nodes.
Step 4: Issue the ’show’ command from the ndb_mgm console and you see the new SQL node (it takes about a minute to show up)

Hope that is useful for someone!!

No comments

Yeah MySQL AB!!

So Lukas Smith just wrote about the developers conference from last week. Some information came out of including the following:

  • Replication allows different schema on the master and slave. Data type mismatches cause automatic type casts and you can have additional column on the master or on the slave (but this might give you problems when you are using row based replication without a primary key - so just use a primary key :-)). Multi threaded writer support on the slave is likely to be added in December (not sure when it will get into a GA release)
  • Online backup is coming in MySQL 5.2 (or whatever is coming after 5.1)
  • MySQL Proxy allows you to extend SQL, by intercepting whatever string you are sending to the server and turning it into proper SQL. The proxy can operate on the data going to the server as well as the data coming back. You can load only one script at a time and only at start up, however there are solutions to load and/or chain scripts via LUA scripts

These are all really cool. The proxy information is not big news as it has been generating excitement for quite some time. I see it firmly entrenched into the server architecture very soon and I think for the most part this is a good thing. I think MySQL AB must read my blog..just a few days ago I wrote this blog entry. Although the mysqlpdump tool is better than the “stock” mysqldump I have been praying for an online backup tool integrated into the server. It looks like it might be coming. So, as I said, “Yeah MySQL AB!!”.

The replication information is exciting also. Stable, solid replication is a necessary part of any production environment. Speeding it up with multiple slave write threads will help with lag issues.

All in all, I have to say I am starting to look forward to the “5.2″ branch (or whatever it will be called).

No comments

Update

Sorry everyone  I have been down for a few days.  Last week my sight was messed up when some script kiddie compromised the shared hosting server my site was hosted on.  So, I have spent the weekend moving my domain to a virtual server where I have a lot more control over things.  The blog is back online and I think everything is up and running except my paragon-cs email.  I still have that to get that working.

Please let me know if you run into any problems.  I can be reached at “mysql.consulting” at “gmail.com”.

Special thanks to the reader who pointed out the compromise (javascript pop-ups inserted into webpages that displayed pop-ups - annoying but not serious).

Sorry for the troubles!!

1 comment

Beginning MySQL Cluster

So, at work, we are implementing a cluster. Actually, we are implementing several clusters. This is my first production experience with clustering. Trust me, it’s a whole different animal.

A couple of observations:

  • I find it much easier to split up the sql and data nodes. Originally I had put a SQL node and a data node on the same physical server. Because I didn’t go with the default locations for the data I kept having trouble. Not sure why, but once I separated things out it became much easier.
  • Having multiple SQL nodes creates some complexity. We want a single IP address for the application so, obviously, we have to have a load balancer of some type. This isn’t quite as easy as it might seem. We have already had one situation where the cluster thought a SQL node failed and shut it off. The SQL node itself was up and running and unless you actually tried to execute a SQL command on the node and you couldn’t tell the SQL node wasn’t working properly.

A typical load balancer might either just ping the IP or possibly try and log in to the SQL node. Both of those would have worked in this case. So, the load balancer would assume everything was cool and keep trying to serve connections to the SQL node. This is still being investigated and any feedback would be welcome!!

  • scripting is going to be your friend. It’s one thing to make a change on three or four servers. It’s another to make a change to twenty. That is one of the cool things about the MySQL Clustering book..quite a few scripts. I have found quite a few on the old ‘net also.

There will be more posts about this in the future I am sure.

3 comments

Mysqlpdump Test

I should be sleeping. But I am playing :). I have fairly serious dump of a couple of servers that will probably occur late this week. I have been tweaking various things and testing various configurations to get the best dump and restore speeds. Then, in the last couple of days, I ran across this by Monty Taylor:

http://mysql-ha.com/2007/09/12/mysqlpdump-for-parallel-dumps/

which links to this site:

http://www.fr3nd.net/projects/mysqlpdump/

It is a python wrapper script around mysqldump that executes the dumps using multiple threads — from what I see it each thread dumps an individual table.
I executed some tests on a slave that we run backups and a small DB that is right at 900M in size. It was fairly eye opening:

1:37 — 917M mysqlpdump.py -u root -ppassword -v -d /var/backup/directory -i database
2:23 — 915M mysqldump database -u root -ppassword > /var/backup/directory/database.sql

So, on this small database, it made a significant % speed increase. This was with the default 5 threads. Next I increased the thread count to 10, 15 and then 20. The results are:
1:50 — mysqlpdump.py -u root -ppassword -v -t 10 -d /var/backup/directory -i database
1:40 — mysqlpdump.py -u root -ppassword -v -t 15 -d /var/backup/directory -i database
1:59 — mysqlpdump.py -u root -ppassword -v -t 20 -d /var/backup/directory -i database

While these aren’t controlled test I think it is clear that the mysqlpdump utility script is faster than the mysqldump command. Speed was best with a thread count of five. I don’t have an explanation for this other than the possibility that my server (a dual-xeon) is really just simulating a higher thread count. I don’t know. Not as much a hardware person as I might like. Even so, just with default settings it is a nice little speed-up.

As for the restore - Brian Aker says in this post: http://krow.livejournal.com/519655.html that there is an option now to mysqlimport to specify a number of threads..the reverse of this script. Or, as an alternative, I could use the mysqlprestore script developed by Monty Taylor after he looked at mysqlpdump. Not sure which I will use, but it seems that both will do the job.
Cool stuff. Hoping something gets directly integrated into mysqldump soon. Or, even better, an online backup that didn’t disrupt things :)

4 comments

MySQL with SSD drives - Part Two

I have been pointed to a RAM-disk based device called the HyperDrive4 by a recent comment of a reader of the previous post on SSD drives. It is a “relatively” cheap device that looks to be solidly build- battery backup (with optional automatic hard drive copy in case of power failure). I did say relatively cheap. I think the basic 32GB unit runs about $3,500 (plus RAM). It seems to currently top out at 32GB.

Even so, it is would incredible to have one of these in your server to run the most used indexing files and tables. I know 32 gigs isn’t that much these days but if necessary you can RAID these things to build larger disks. The throughput and access times on these is just incredible. Take a look here:

http://www.hyperossystems.co.uk/

Click on the “more information” button underneath the hyperdrive4 unit to see the product line.

So do I get to test one now?

8 comments

Querysniffer updated

Just wanted to let everyone know that we haven’t quit working on querysniffer. We have added several new options to allow (among other things) logging the IP address of the initiating client and tracking sessions (including time-stamp information).

It is becoming much more useful for our other query processing tools..querybench and queryparser. I set up a page for all the tools in addition to Ian’s homepage at http://www.paragon-cs.com/queryprogs.

Reading the new Logbuffer #61 that was just released I came across this new feature of Oracle 11g — what amounts to application replay. Think DVR for your database.

http://www.oracle.com/technology/pub/articles/oracle-database-11g-top-features/11g-replay.html

Now sing to the tune of “I’m Dreaming of a White Christmas” –

I’m dreaming of an MySQL replay

Just like the other replays I know

Where the data is captured

and the testing is easy

and the DBA’s get to go (home early that is)

…..

So now you know why I work on databases and don’t write songs.

Maybe one day the QPP’s (query processing tools) will grow up and do something like that. If I can just figure out how they roll the database back after making changes to it. I mean..can you imagine capturing 400 megs of data and playing it back against a test database .. all the changes that are made to the DB and then being able to roll it back to a start checkpoint easily?

That would be awesome. Any ideas how you could set a checkpoint and the easily and quickly drop all changes made after that checkpoint when you run restore?

2 comments

Multiple hard drive arrangement testing

I am thinking about testing two layouts of hard drives in a db server. I have this small “disagreement” with members of the sys-admin team about which is best: using a striped RAID arrangement to spread reads over multiple drives or what I was always taught — separating your data among various drives. For example I would typically do something like this: 1 drive for OS, 1 drive for data, 1 drive for log files. This spreads the reads and writes somewhat.

Before I go and start testing all this…can someone point me to a test of MySQL on these arrangements?

Thanks!!

5 comments

Next Page »