Tricks with mysqldump
I'm going to cover a few of the tricks I have used over the years. Hopefully you will find them helpful. In some cases one of these might even save your bacon.
One of the annoying issues with mysqldump is its lack of flexibility. It's very easy to dump all databases (the --all-databases option), or one or more databases (--databases). Even dumping a single table isn't too difficult. But what about all the tables that start with "a"? You could specify each table on the command line, but I have ADD and I can't focus for more than a couple of tables :). So, here is one trick to emulate regex patterns to an extent:
mysql --skip-column-names information_schema -e "SELECT table_name FROM TABLES WHERE table_schema='schema_name' AND table_name LIKE 'a%'" > tables.txt
Play around with that a bit and you can probably get the results you want. There is even a REGEXP operator that can give you some nice results. If you need to run multiple selects, you can use the ">>" redirection operator instead of ">" so that the results are appended to the end of the text file instead of overwriting what is already there.
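The difference between the two redirection operators is easy to see with plain shell, no mysql needed (the file name here is just the one used above):

```shell
# '>' truncates the file and starts fresh
echo "accounts" > tables.txt

# '>>' appends to whatever is already there, so results from a
# second SELECT would accumulate instead of clobbering the first
echo "archive" >> tables.txt

cat tables.txt
```

Run two queries with ">" by mistake and only the second one's tables survive; with ">>" you get both lists.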
Once you have your list of tables you can then do the actual dump:
mysqldump schema_name `cat tables.txt` > tables.sql
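The whole flow can be sketched without a live server by echoing the command instead of running it (the schema name and table names below are illustrative stand-ins, not anything mysqldump requires):

```shell
# Stand-in for the output of the mysql query above
printf 'accounts\narchive\naudit_log\n' > tables.txt

tables=$(cat tables.txt)

# The unquoted expansion word-splits on the newlines, so every table
# becomes its own argument to mysqldump. Echo the command to inspect it:
echo mysqldump schema_name $tables

# In real use you would run:
#   mysqldump schema_name $tables > tables.sql
```

This word-splitting is exactly what the backtick form `cat tables.txt` relies on, so one table name per line in the file is all you need.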
With mysqldump there is no option to encrypt backups. Sometimes a client is required, due to regulations, to have backups encrypted. Sometimes they just want the backup encrypted. As with our previous tip, combining bash scripting and mysqldump makes this possible. With the method shown, the backup is never sitting on disk unencrypted.
First you need GPG (GNU Privacy Guard) installed and configured. This is how the encryption is done. You need a "public key" generated for the server where you are making the backup.
Step One - GPG Setup
Setting up GPG is fairly simple. You just run it with the --gen-key option:
gpg --gen-key
I would go with DSA and Elgamal. Once you choose your encryption method, it is going to ask a series of questions, including key size, how long the key will last until expiring, name, email address, and a comment. I would use 2048 for the key size and tell it to never expire (0). Please understand you should not use these values just because I say that I use them. Use what is best for you.
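If you would rather script this step than answer the prompts, GnuPG also supports unattended generation from a parameter file. A sketch matching the choices above (the name and email are placeholders; %no-protection needs GnuPG 2.1+):

```
# keyparams.txt -- unattended key generation parameters
Key-Type: DSA
Key-Length: 2048
Subkey-Type: ELG-E
Subkey-Length: 2048
Name-Real: Backup User
Name-Email: backup@example.com
Expire-Date: 0
%no-protection
%commit
```

Then run: gpg --batch --gen-key keyparams.txt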
The last question asks you to input a passphrase. You have two choices: either leave it empty for no passphrase, or generate a password and then supply it from a file stored somewhere, specified on the command line. Since keeping the passphrase around in a text file isn't really safe, the only recommendation I have is to not use a passphrase. That isn't necessarily safe either, but I am not aware of a better option.
Step Two - Backup
Now that you have generated your key, it's time to run the backup:
# mysqldump database_name | gpg --encrypt -r 'user_id' | gzip > dump.sql.gz
This is a three-step process -- the mysqldump output is piped to the gpg process, which is piped to gzip for compression. Because the results are piped, everything stays in memory (and off disk) until it lands in the compressed, encrypted file.
Of course, the "--encrypt" option is used to encrypt the data. The "-r" option specifies the recipient of the encrypted file -- the backup user in this case.
Remember that the backup was gzipped after it was encrypted, so decompress first and then decrypt with the --decrypt option, similar to this:
gunzip -c dump.sql.gz | gpg --output dump.sql --decrypt
Within the last few months an issue arose at a client where weekly backups of prod were being made and then restored to dev. We were trying to trim down the time this cycle was taking. These backups include what amounts to archival data that was not being used in dev -- something like 100 GB of data spread across 40 tables or so. I really don't remember whose idea it was (my colleague's or mine?), but a colleague implemented a bit of sed magic that would scan the dump file and replace the InnoDB table type for these archival tables with the BLACKHOLE table type. The inserts for the archival tables still happen, of course, but you are essentially sending those inserts to /dev/null. They happen MUCH faster than inserts into an InnoDB table. Massive time savings, with only one line of actual code added to the backup script.
We had to be careful. Most tables in this database were InnoDB, and we didn't want to convert all of them -- just the archival tables. What we did was scan for a certain index ("tablename_productid") that each of the archival tables contained and that was the last index in the archival table's definition. The next line was the table storage engine definition, so there we replaced InnoDB with BLACKHOLE. The sed tool is very powerful and can serve you well if you take the time to learn its use.
Here is the cleaned up code:
echo "Starting Blackhole replacement: `date`"
sed -i '/tablename_productid/ { n; s/ENGINE=InnoDB/ENGINE=BLACKHOLE/; }' dump.sql
echo "Singularity created: `date`"
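You can see the technique work on a toy dump fragment -- the table names, column names, and marker index below are illustrative, not the client's actual schema:

```shell
# A miniature mysqldump fragment: one archival table carrying the marker
# index, and one regular table that must stay InnoDB.
cat > dump.sql <<'EOF'
CREATE TABLE `orders_archive` (
  `id` int NOT NULL,
  `product_id` int NOT NULL,
  KEY `tablename_productid` (`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `customers` (
  `id` int NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
EOF

# When sed matches the marker index, 'n' loads the next line (the engine
# definition) and the substitution flips it to BLACKHOLE. Tables without
# the marker never match, so they keep their engine.
sed -i '/tablename_productid/ { n; s/ENGINE=InnoDB/ENGINE=BLACKHOLE/; }' dump.sql

grep ENGINE dump.sql
```

Only the archival table's engine line follows the marker index, so only it gets flipped; the customers table is untouched.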