Diamond Notes

Just another WordPress weblog

Upgrading from 4.1 to 5.0

December the 17th we upgraded the majority of our servers that had been running version 4.1.22 of MySQL to 5.0.45. This was not a small undertaking and is something that I had been planning for about five months. It had been initially slated to take place in September, but due to the release of a new product (and the subsequent lack of time on the part of the developers) it was delayed until December.

I wanted to lay out some thoughts for those who will be upgrading in the future. Hopefully you might miss a few of the potholes that we ran across in our little journey to 5.0.

It was imperative that our customers be affected as little as possible. We went to great lengths to minimize the downtime of the site. We also did everything in our power to make sure that bugs didn’t creep into the process and cause problems.

**space savings: one benefit to upgrading is that 5.0 uses a more compact format for innodb tables. We gained about 15 - 20% on each server in space which was a nice bonus.

**dump your data using mysqldump (or better yet — the Ma’atkit parallel dump). Don’t try and just upgrade the mysql binaries. Not only will you loose out on the innodb space savings there are enough subtle differences in the format of storage that you will end up with problems down the road. Invest the time to do it right. Dump your data.

** mysql replicates from 4.1 to 5.0 — but not the other way. Once we promoted our slaves to masters we were running without backups until we upgraded the slaves to 5.0 and re-synced the data from the new masters.

**go through the changelogs. Every time MySQL releases a new version of the server they also release a changelog of bug fixes/errata for the server. These are your friend. I copied each one of the changelogs between 4.1.22 and 5.0.45 into one gigantic log file and highlighted areas that might cause problems. It didn’t catch every problem we ran across..but most of them.

As a starting point here are the major problems we ran across:

1. Rounding is calculated differently. This will affect financial applications particularly. With the new version, ROUND may return different results depending on whether you’re rounding a DECIMAL or a DOUBLE type.
2. When doing joins, subqueries, etc., the new version complains about duplicate column names:

  • A subquery is not allowed to have a duplicate column name. The solution is to specify the columns you want for at least one of the tables.

  • When doing joins with the USING clause, duplicate columns will cause an error.

3. With DECIMAL columns, the new version will cut off numbers that were stored in the old version. An example I found was in the the table_name.column_name column. It’s specified as DECIMAL(6,2). The old version allowed numbers such as 12000.00, but the new version will cut those off and store 9999.99. The solution is to modify the column definition before data is migrated: “alter table table_name modify rate decimal(10,2) default NULL”

4. UPDATE works differently.

**make sure you have backups of every bit of data… including your binary logs.

**make sure your backups continue to run after you shuffle around servers. This is something I should have planned ahead a little. In the shuffle I didn’t think about how our backups run off our slaves — but those slaves got promoted to master. No backups ran off those masters (I shut the backups off) but I could have done it with a little more forethought and probably only miss one backup (while our servers were down during the actual upgrade).

**in any organization larger than a mom ‘n pop shop you need to coordinate with other people. In my case I had to work with our systems team (who brought the application down, made some necessary configuration changes and deployed new code for the app and then brought it back up) and also our developers (in resolving problems with the code and its interaction with the new databases). I couldn’t have done it without them. I despise meetings but they are necessary to coordinate things. Make sure everyone is on-board!!

Overall I would rate our experience a 9 out of 10. It went exactly as I planned up until the point where I discovered that XFS data partitions don’t create consistent snapshots with LVM2 for copying over to another server and creating a slave..but that is a topic for another post.

7 Comments so far

  1. mike December 27th, 2007 1:22 am

    Can you provide more information about the XFS/LVM2 issue?

    I have LVM2 partitions formatted as XFS and was planning on using snapshots for the most efficient/quickest backup possible (nightly mysqldumps can be a huge performance hit) - but that would suck if they’re not actually reliable or usable.

    Any additional information or URLs would be great to know.

  2. Mark Robson December 27th, 2007 5:09 am

    Of course there are other things which aren’t backwards compatible.

    5.0 has more reserved words - so if you’re using any of those in your schema, those queries will break (Except for DATE, TIME, DATETIME which are allowed in the schema).

    If you try to replicate from 4.1 to 5.0 and use those reserved words, the slave will break with a syntax error.

    Mark

  3. Joe Izenman December 27th, 2007 11:53 am

    Also of significant note is the change in precedence of the comma join as compared to JOIN statements (as detailed in my blog entry from our upgrade a couple months ago: http://www.sitecrafting.com/blog/adventures-in-database-migration-pt/ )

    It’s another one of those problems that is just about order. As long as you update the queries before you upgrade, you’re fine (changing over to INNER JOINS, done correctly, doesn’t break anything on 4.0 or 4.1).

  4. admin December 27th, 2007 7:38 pm

    Mark,

    Of course you are right. One of the things I should have mentioned is that for up to two weeks on some servers we ran two slaves — one running 4.1 and one running 5.0. We discovered the column referencing issue that I outlined because a slave kept breaking on 5.0 and not 4.1. We had no issues with reserved words, but those would have been uncovered as well (most likely).

    Keith

  5. admin December 27th, 2007 7:39 pm

    Mike,

    I am going to go ahead and do a entry on the XFS issue because it was pretty major and deserves it’s own coverage.

    Keith

  6. Mark Robson December 28th, 2007 9:30 am

    Unfortunately running a slave with 5.0 to check that your app is 5.0 compatible is not sufficient, as that only catches binlogged queries.

    Non-binlogged queries (i.e. SELECTs) won’t be checked by that method.

    The method that we envisaged was to enable the general SQL log, then write a tool to replay (a few days worth of) the general query log from production to ensure that no (more) errors occur. Of course it would have to start with the database in the same state, but that is just a matter of logistics.

    Sadly of course, this general query log is going to be very large on a busy server, and worse still, it can’t be turned on and off online prior to 5.1

    Mark

  7. admin December 28th, 2007 8:49 pm

    Mark,

    Of course you are correct. But running the slave does help find some issues. You just can’t depend on it finding everything as you rightly point out.

    There was no way we were enabling the general log as some of our servers are pretty near the limits in terms of hardware. The server that has the most data in our systems also happens to be more than four years old. In addition, we don’t shut down servers unless something is on fire. I had servers up to the point of the upgrade that had been online for probably four months.

    You might want to check into the query sniffer tool that we developed. It will allow you to do exactly what you want (capture the data in real-time.. save it to a file for later analysis..turn it off and on at will). It is at http://www.paragon-cs.com/queryprogs/

    We even have a querybench program to replay the captured data against a test server. Get in touch with me at bmurphy at paragon-cs.com if you need any help with the programs.

Leave a reply