Diamond Notes


Archive for April, 2008

Experience with Dell Powervault MD3000

Hey everyone.  Haven’t been writing since I got back from the conference.  Been busy with other things.  But I would love to get some input on something I don’t have any experience with.

I am evaluating possible solutions for expanding storage on our servers, all the way from a full-blown SAN to something as simple as ripping out the hard drives currently in the servers and replacing them with larger units.  One of the possibilities is using a Powervault MD3000, which is a direct-connect RAID unit that can attach to up to four servers.  It would resolve some space issues we have on a few servers without requiring the enormous expense of a couple of SANs.  In theory it should actually be faster than the internal RAID as it has a larger cache.

Things I am particularly interested in:

  • general impressions
  • reliability
  • is having four servers going to slow it down that much?
  • the Java software for management…I understand it can work from any client computer.  Does this mean you don’t have to install any additional software on the server itself?

Thanks!!


New Maatkit Tutorial for mk-table-checksum

Hey everyone, I just posted a tutorial I wrote for the Maatkit toolkit. This tutorial only covers the mk-table-checksum tool and the UDF that Baron coded, which speeds up the checksum process quite dramatically (a small benchmark test is included in the tutorial). I really don’t have it linked into my website yet, but the direct download is here.

Baron, thanks for the hard work on the toolkit. We really appreciate it.

Enjoy!!


MySQL Proxy

I have been following MySQL Proxy ever since it started.  It is really cool stuff.

Just recently at work we evaluated various possibilities for a high availability (HA) setup.  Load balancing would be nice, but that wasn’t the primary point.  Well, we evaluated probably half a dozen alternatives and I believe we have a winner … MySQL Proxy.

With proxy you have something that is fairly simple to set up, can be built with redundancy so you don’t have a SPOF and is very flexible.  In addition to HA we can use it for load balancing, storing all queries that run through the proxy (for auditing) and even modifying queries for various reasons.  There are other uses but I would like to keep this post short.
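To make those uses a bit more concrete, here is a rough sketch of the kind of decision a proxy script makes for each query. Real MySQL Proxy scripts are written in Lua against Proxy's own hooks; this is plain Python and does not use Proxy's API. The class name, host names and the "starts with SELECT" rule are all made up for illustration.

```python
import itertools


class QueryRouter:
    """Toy stand-in for routing logic a proxy script might hold (hypothetical)."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # round-robin over the read slaves
        self.audit_log = []                     # every query seen, kept for auditing

    def route(self, query):
        """Record the query, then pick the backend that should run it."""
        self.audit_log.append(query)
        # Naive classification: only statements starting with SELECT count as reads.
        is_read = query.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


# Hypothetical backends; reads alternate between slaves, writes go to the master.
router = QueryRouter("master:3306", ["slave1:3306", "slave2:3306"])
first_read = router.route("SELECT 1")
second_read = router.route("select 2")
write = router.route("UPDATE t SET a = 1")
```

A real setup would also need to handle transactions, replication lag and backend health checks, which this sketch ignores.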

Because it contains the Lua programming language you really have the ultimate flexibility.  Yesterday I was discussing proxy with our Junior DBA and he said that there were things that haven’t even been thought of yet that will be implemented down the road.  Well, here is one of those things.  Jan

If you are a professional MySQL DBA I would recommend you invest some time to understand what MySQL Proxy is all about.  I would also take the time to set it up in a “sandbox” environment so you can start working on it.  I believe that the proxy will be an important part of the MySQL landscape in the years to come.

MySQL Proxy homepage: http://forge.mysql.com/wiki/MySQL_Proxy


Thoughts on the Fuss

Well, as I often do, I am going to weigh in on the topic du jour after pretty much everyone is done yelling about it. After evaluating what was said I believe I am going to offer something that is actually original. For background, in case you missed it, MySQL (the company) announced that they were going to charge for compression and encryption features of the new online backup that is coming in version 6.0.

This has raised quite the furor. I am not going to take the time to track down all the posts about it, but there have been many. Just check out planetmysql.com if you want. Almost without exception the posts have been very negative.

Now that a little background has been laid out, let’s discuss. It is important to understand that this decision was made by MySQL AB management before the Sun acquisition. It is my understanding that this actually took Sun very much off guard. While Sun isn’t perfect, I do think that Jonathan Schwartz believes in the value of open source software and has every intention of keeping the MySQL server open source. It seems at this point that there is quite a difference in viewpoint between the upper management at MySQL and the people at Sun who are responsible for managing MySQL from here on out.

I think I know who is going to win in the long run in this little “battle” between Sun and MySQL. Sun paid a nice sum for MySQL and, if I understand some basic business, that means they are in charge. While change takes time with corporations of this size, I am sure that Sun will be shaping MySQL somewhat “in its image”.

I doubt that this little scheme of charging for these features ever actually takes place. It is pretty much diametrically opposed to what Sun says they want for MySQL. I think that by the time server version 6.0 is GA every feature will be fully available to anyone. And that is why I have not taken the time to sharpen a pitchfork and join the mob. Because in the end I don’t think this will ever happen.

And kudos to Monty W. for leading chants of “We don’t ship crippleware” in his session on the future of MySQL at the conference.

Just my two cents.


Best Swag of the Conference

Everyone has shirts, pens, flashy little do-dads, coffee cups, or whatever. But the best swag of the conference had to be:

“Freedom to Work Anywhere”

Not the computer, that’s mine.  The boxers.  MySQL was giving them away.  They say ‘MySQL Freedom to Work Anywhere’.  I thought it was hilarious.


MySQL Certification

One of the benefits of going to the conference is that you can take various certification tests (CMA, CMDBA, CMDEV and the CMCDBA) for a very low price ($25.00 a test).

I planned on taking the first CMDBA test (the cert requires two tests) and I was able to take it this morning.  I work daily in an environment where all I do is “DBA” functions (I don’t do any coding).  Plus, I have been studying the certification book for a while.  I was fairly confident that I could pass the first test.  Even so, I was nervous going in to take the test.  Not enough sleep this week, overloaded with session information.  Not really ideal.  However, I really breezed through it.

Then I started thinking I should take the second test.  I didn’t study for it and while we run 5.0 in production we don’t currently use any stored procedures/triggers/etc.  So, I don’t have a great deal of experience on something that was probably a third of the test.  Well, I did take it.  And while many people would probably think it was actually easier than the first test I was a little worried I might not pass.  However, about an hour ago, I did find out that I passed the second test so I am “unofficially” a CMDBA.  Yeah!!

Should you do it?  Well, for conference attendees I am afraid it is too late.  Testing stopped at 3:00 pm today or something like that.  However, MySQL offers the tests through Pearson VUE testing centers.  It is worth it, and does show you know a great deal about the MySQL server.  I wouldn’t recommend you try to take this test without production experience.  The certification guide says that you need six months of production experience if I am not mistaken.  I agree, although you had better study well also!


Mark Callaghan — Scaling InnoDB/MySQL @ UC

Goals

Make InnoDB scale on big servers

  • fix bottlenecks on big SMP servers
  • utilize servers with many disks
  • support thousands of database connections
  • handle corruption in memory and on disk
  • make query plans predictable
  • support thousands of tables and accounts

Mentioned a problem with SANs?

Desirable Features

  • throughput scales with number of CPU cores
  • efficient support for 128 GB buffer cache
  • performance for servers with many and/or remote disks
  • recover from corruption

CPU Problems

  • Mutex implementations favor portability over performance
  • Mutex hotspots
    • buffer cache
    • memory allocation
    • transaction log
    • adaptive hash latch

Symptoms of CPU problems

  • adaptive hash latch contention
    • SHOW INNODB STATUS displays the session that holds the adaptive hash latch
    • a background thread logs SHOW INNODB STATUS into the error log when there is a long lock wait
  • excessive mutex contention
  • server has many queries, is slow and is not IO bound
  • on linux, vmstat will report a lot of idle time

Workarounds for CPU problems

  • Upgrade
    • MySQL 5.0.30 has important improvements
    • MySQL 5.0.54 fixes a bug that causes some threads to miss wakeup signals
  • experiment with innodb_thread_concurrency to limit the number of threads that run concurrently
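The idea behind that last knob can be sketched as a simple admission gate: only so many threads are allowed to do work at once, and the rest wait their turn. This is a Python illustration of the general technique, not how InnoDB implements it; `MAX_INSIDE` and `run_query` are hypothetical names and the limit of 4 is arbitrary.

```python
import threading

MAX_INSIDE = 4                      # plays the role of innodb_thread_concurrency (arbitrary value)
gate = threading.BoundedSemaphore(MAX_INSIDE)
counter_lock = threading.Lock()
inside = 0                          # threads currently "inside the engine"
peak = 0                            # highest value of `inside` ever observed


def run_query():
    """Pretend to run a query, but only once the gate lets this thread in."""
    global inside, peak
    with gate:                      # blocks while MAX_INSIDE threads are already inside
        with counter_lock:
            inside += 1
            peak = max(peak, inside)
        # ... the actual query work would happen here ...
        with counter_lock:
            inside -= 1


threads = [threading.Thread(target=run_query) for _ in range(32)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However many sessions there are (32 here), `peak` never exceeds the limit, which is the trade being made: less mutex contention in exchange for some queueing.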

Making the InnoDB RW-mutex fast

  • Use atomic ops to change internal state (replacing the InnoDB spin lock mutex)
  • use separate events to wake readers and writers

Work in progress for CPU problems

  • change the InnoDB mutex to use atomic ops
  • change the InnoDB read-write mutex to use atomic ops

Performance Tests

Eight-CPU core server. Data is from the sbtest table generated by sysbench. Data is cached (key buffer, InnoDB buffer cache).

It was amazing the difference that these patches make. The scalability for these servers (number of sessions) was almost linear.

Memory

To support a 128GB buffer cache:

  • data structures must scale
    • walking a list with 8M page entries might be slow
  • Resources need to be split
    • more than one mutex might be needed for the buffer cache and LRU chain
  • Detection of corruption is more important
    • memory will be corrupted by software and hardware bugs
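The "8M page entries" figure falls out of simple arithmetic if you assume InnoDB's default 16 KB page size (the page size is an assumption here; the notes don't state it):

```python
GIB = 1024 ** 3
KIB = 1024

buffer_cache_bytes = 128 * GIB
page_size_bytes = 16 * KIB          # assumed: InnoDB's default page size

# Number of pages the buffer cache has to track.
pages = buffer_cache_bytes // page_size_bytes
```

That comes to 2^23 pages, a little over 8 million, so any code path that walks the whole page list touches millions of entries.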

Detect and correct corruption

Features in InnoDB

  • page checksums to detect corruption that occurs after a page has been written to disk
  • doublewrite buffer to correct partial page writes that occur during a server crash

InnoDB crashes when it reads a page with an invalid checksum

If this is a page for a secondary index, then the index can be rebuilt
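A minimal sketch of how a page checksum catches corruption, in Python. InnoDB has its own checksum algorithm and page layout; CRC32 from `zlib` and the "checksum in the first four bytes" layout below are stand-ins chosen for illustration, and `write_page` / `page_is_valid` are hypothetical helpers.

```python
import zlib

PAGE_SIZE = 16 * 1024       # assumed page size
CHECKSUM_BYTES = 4          # checksum stored at the front of the page (made-up layout)


def write_page(payload: bytes) -> bytes:
    """Pad the payload to a full page and prefix it with a CRC32 of the body."""
    body_size = PAGE_SIZE - CHECKSUM_BYTES
    if len(payload) > body_size:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(body_size, b"\x00")
    checksum = zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "big")
    return checksum + body


def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "big")
    return stored == zlib.crc32(page[CHECKSUM_BYTES:])


page = write_page(b"row data")

# Simulate corruption after the write: flip a single bit in the page body.
corrupted = bytearray(page)
corrupted[100] ^= 0x01
```

Here `page_is_valid(page)` is true and `page_is_valid(bytes(corrupted))` is false. Note that a checksum only detects the damage; repairing it needs another copy of the data, such as rebuilding a secondary index from the table.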

Disk performance

  • InnoDB uses one background thread to process prefetch requests.
  • InnoDB uses one background thread to flush dirty buffer cache pages to disk. This is fine as long as writes go to the OS buffer cache. Otherwise, writes may be slow.
  • The background IO threads assume a server with one disk and don’t run fast enough when there is work to be done.

Fixes

Google patch adds support for multiple background IO threads.

The Google patch will soon have an option to tune the maximum rate of background IO

Connections

  • need to support thousands of connections, but not willing to use one thread per connection
  • MySQL 6 separates threads from connections

Query Plans

  • InnoDB uses sampling to gather stats for the optimizer
  • stats are not stored on disk
  • stats are collected the first time a table is used after startup and after many rows have been modified

Stats can be off between slaves with identical datasets. This happens (sometimes) except if there is a unique key
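Why sampled stats can differ on identical data, and why a unique key is the exception, can be shown with a toy estimator. This is not InnoDB's sampling method (which samples index pages); it is a deliberately simple Python sketch, and `estimate_distinct` plus the seeds standing in for "two different slaves" are hypothetical.

```python
import random


def estimate_distinct(values, sample_size, seed):
    """Estimate the number of distinct values by scaling up a random sample."""
    rng = random.Random(seed)               # each "server" draws its own sample
    sample = rng.sample(values, sample_size)
    distinct_ratio = len(set(sample)) / sample_size
    return round(distinct_ratio * len(values))


# Same data on both "slaves"; only the random sample differs.
skewed = [i % 50 for i in range(10_000)]    # non-unique column: 50 distinct values
unique = list(range(10_000))                # unique column

skewed_on_slave_a = estimate_distinct(skewed, 20, seed=1)
skewed_on_slave_b = estimate_distinct(skewed, 20, seed=2)
unique_on_slave_a = estimate_distinct(unique, 100, seed=1)
unique_on_slave_b = estimate_distinct(unique, 100, seed=2)
```

For the non-unique column the two estimates depend on which rows each sample happened to hit, so they can disagree, and the optimizer may then pick different plans. For the unique column every sample is all-distinct, so both estimates come out as exactly the table size.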

Overall a good talk. Mark got off several times on things that, frankly, if you weren’t deeply involved in the InnoDB code you wouldn’t understand (code snippets). However, it sounds like they are solving some important problems. I am really looking forward to these being put into the “official” MySQL code.


Monty Widenius — The Future of MySQL @ UC

Talk about:

MySQL Server limitations: skeletons, “official secrets”, embarrassing things in the server

Why this Talk

  • MySQL and Sun need to become more transparent in what they are doing
  • when users know the limitations they can go around them
  • It’s easier to trust someone when they acknowledge a problem (hear hear!!)

Threads

Problems

  • one connection per thread doesn’t work in all cases
  • no way to give priority to thread
  • no way to ensure that we have X active threads running

Symptoms

  • too many context switches
  • no multi-core cpu scalability (efficiently)

Solutions (one of many)

  • --thread-handling=pool-of-threads (MySQL 6.0)
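The pool-of-threads idea, separating connections from threads, can be sketched in a few lines of Python. This only illustrates the concept with the standard library's `ThreadPoolExecutor`; it says nothing about how the MySQL 6.0 option is implemented, and `POOL_SIZE`, `CONNECTIONS` and `handle` are made-up names.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

POOL_SIZE = 4          # fixed number of worker threads (arbitrary)
CONNECTIONS = 100      # many more client connections than threads


def handle(conn_id):
    """Serve one request for a connection; report which worker thread ran it."""
    return conn_id, threading.current_thread().name


# Requests from all connections are queued and served by the small pool.
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(handle, range(CONNECTIONS)))

threads_used = {name for _, name in results}
```

All 100 connections get served, but `threads_used` never holds more than `POOL_SIZE` names, so context switching stays bounded no matter how many clients connect.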

Work to be done

  • all InnoDB concurrency patches
  • spawn more threads when threads are blocked
  • removing overall mutex contention in the server
  • give higher/lower priority to some threads
  • allow ’super’ user to login when all threads are in use

Memory as a resource

Single biggest problem is no single memory allocator (server/engine)

Privileges

Problems

not modular/pluggable (extendable to things such as LDAP)

no ROLES

Symptoms

  • hard to maintain lots of users
  • hard/impossible to use external authentication

Solution

get the community to implement an authentication module (four summer of code projects for LDAP)

Pluggable storage engines

Problems

storage engines depend on internal MySQL structures

Symptoms

storage engines can only be used with the exact MySQL server version they were compiled against

Parser

Problems

  • state machine too large
  • not pluggable
  • not cacheable
  • still Bison
  • Bad error messages

Symptoms

  • Parsing has a high overhead for simple queries (12% time spent in parser)
  • Parser takes a lot of code space

Modularity

Problems

  • Server is very monolithic
  • Few defined interfaces (not often stable)
  • Server and libraries not documented
  • Multiple Execution paths
  • no rewrite state for optimizer

Symptoms

  • hard to change code without introducing bugs
  • hard for newcomers to understand the server

Stored Procedures and Triggers

Problems

  • stored procedures are not cacheable across connections
  • we only support SQL (but this is changing)
  • Pre-locking of all tables (deadlock-free algorithm)
  • all cursors are materialized
  • trigger code is not shared across open tables
  • no constraint on resources
  • we don’t support stored procedures as tables

Replication

Problems

  • replication is not fail safe
  • no synchronous options
  • no checking consistency option
  • setup and resync of slave is complicated
  • single thread on the slave
  • no multi-master
  • only InnoDB synchronizes with the replication (binary) log

Symptoms

  • slaves can’t catch up with master
  • hard to do clean fail overs
  • we are dependent on InnoDB

Solutions

  • use backup to set up slave
  • replicate CHECKSUM TABLE and do consistency checking on slave (such as maatkit)
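The consistency-checking idea boils down to computing a checksum per table on both sides and flagging the tables where they differ. Below is a self-contained Python sketch of that comparison. It does not talk to a server or call maatkit; `table_checksum` is a CRC32 stand-in for whatever CHECKSUM TABLE would return, and the table contents are invented.

```python
import zlib


def table_checksum(rows):
    """Order-sensitive CRC32 over all rows (stand-in for a real table checksum)."""
    crc = 0
    for row in rows:
        crc = zlib.crc32(repr(row).encode("utf-8"), crc)
    return crc


def out_of_sync(master_tables, slave_tables):
    """Return the names of tables whose checksums differ between master and slave."""
    return sorted(
        name
        for name, rows in master_tables.items()
        if table_checksum(rows) != table_checksum(slave_tables.get(name, []))
    )


# Invented data: one row on the slave has silently drifted.
master = {"users": [(1, "a"), (2, "b")], "orders": [(1, 10)]}
slave = {"users": [(1, "a"), (2, "B")], "orders": [(1, 10)]}

drifted = out_of_sync(master, slave)
```

`drifted` comes back as `["users"]`. Doing the checksum through replication, as the bullet suggests, matters because it makes both servers checksum the table at the same logical point in the statement stream.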

Table names

Problems

  • tables are stored as files (name.frm)
  • file system may be case sensitive
  • Falcon has its own interpretation of how things should be done

Symptoms

  • SELECT * from TableName and SELECT * from ‘TABLEname’ may or may not refer to different tables depending on file system
  • hard to move apps between OS’s
  • doing alter table of all tables to Falcon may delete data from tables on Unix for table names that only differ in case

Solutions

  • use --lower-case-table-names when running Windows and Mac
  • add modes to be backward compatible, ANSI compatible and PostgreSQL compatible
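Before moving an app between operating systems (or converting tables to an engine that folds case), it is worth checking whether any table names differ only in case. A small Python sketch of that check; `case_collisions` is a hypothetical helper and the table list is invented.

```python
from collections import defaultdict


def case_collisions(table_names):
    """Group table names by their lowercased form and keep groups with clashes."""
    groups = defaultdict(list)
    for name in table_names:
        groups[name.lower()].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


# On a case-sensitive file system these are three separate tables.
clashes = case_collisions(["Orders", "orders", "users"])
```

`clashes` is `{"orders": ["Orders", "orders"]}`: two tables that would collapse into one on a case-insensitive system.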

Why Falcon and Maria

Problem

MySQL/Sun doesn’t have its own transactional storage engine

Solution

Falcon/Maria

Interesting. This is the first time I have heard MySQL officially say this although I suspected this for some time.

Open Source Project?

This part of the discussion was a very frank assessment of how things are. Monty would like to make things transparent. He would like to change the development model to attract outside developers. One major change is giving outside developers commit and decision rights to the MySQL server code base.

Release Policy

Problems

  • MySQL ships releases before they are ready
  • Benchmarks are given out which show “partial truths”

Symptoms

  • MySQL 5.1 was declared RC way too early
  • features are removed in RC releases
  • major code changes are done each month in RC code
  • Users are not happy with the releases until six months after GA (see 5.0)
  • Critical bugs are still open in 5.1 and not scheduled to be fixed before GA:
    • bug 989
    • bug 30414

Solution

  • wait to declare something GA until code stabilizes and critical bugs are fixed
  • create a release policy and an independent release policy board

The good news

Monty says Sun is much more open source friendly than MySQL AB has been lately and is driving MySQL in the right direction.

Overall, this was a very fair assessment of the current state of MySQL. It is good to know that people are thinking about these things. While I have always thought they worried about this “stuff”..sometimes it feels like MySQL doesn’t because of a lack of information coming from them. Can I just say that Monty is really cool? He has no need to continue working and yet he obviously enjoys immensely what he does.  And his honesty is very refreshing.


Tip for Future Conference Attendees!!

I have a confession to make.  I am struggling this afternoon.  For two or three weeks before the conference I had many things to do and managed to wear myself out.  Then, last Thursday, I jumped on a plane (for the second time in a week) and flew cross-country to come out here.  I then spent two days taking a look at Kickfire’s new technology.  Sunday night came around and the conference started (well, not officially, but it was a good community dinner).  Oh, and I was working hard to get the latest issue of MySQL Magazine out Tuesday.  The first two days for the most part I have been OK.  Then, because I was a bit run down, I went to bed fairly early last night (10-ish).

I woke up at 5:00 AM.  Couldn’t go back to sleep. Now I am paying for it.  I haven’t blogged in the last two sessions simply because I don’t have the energy.  And in the last session (Baron’s great session on the Query Cache) I started dozing off at one point. Shhh…don’t tell him.  Oh wait, he will probably read this.  I couldn’t help it and it certainly wasn’t his fault.  He did a great job.

I have woken up a bit since then (the caffeine helps sometimes!!).  Now I am listening to Ann Harrison talk about Falcon and explain how it is quite a bit different than InnoDB.  However, for the rest of the day I am going to skip blogging the sessions.  Too much concentration required.  My point to all this is that you shouldn’t go to the conference run down.  Get as much rest as you can.  We end up in information overload somewhat as it is, so don’t hobble yourself by not resting up beforehand.


Scaling MySQL - Up or Out? Panel @ UC

I would recommend that you download the video of this!! Sheeri posted it here.

The numbers in parentheses are Alexa rankings.

Moderator - Kaj Arno

(1317) Monty Taylor - MySQL

(905) Matt Ingenthron - Sun

(39) John Allspaw - Flickr

(13) Frank Mash - Fotolog

(9) Domas Mituzas - Wikipedia

(6) Jeff Rothschild - Facebook

(2) Paul Tuckfield - YouTube

Question One: Number of MySQL servers

MySQL one master/three slaves

Sun four servers

Flickr 166

Fotolog 37

Wikipedia

Facebook 1,800 (900m/900s)

YouTube


Question Two: Number of MySQL DBAs

MySQL 1/10th

Sun 1.5

Flickr 0 (normally 1)

Fotolog 1

Wikipedia Technical Team

Facebook 2

YouTube 3

Question Three: Number of Web Servers

MySQL 2

Sun 160

Flickr 244

Fotolog 70

Wikipedia

Facebook 10,000

YouTube

Question Four: Number of Memcached servers

MySQL 2

Sun 8

Flickr 14

Fotolog 40

Wikipedia 79

Facebook 805

YouTube

Question Five: Version of MySQL

MySQL 5.23-2rc

Sun 5.0.21

Flickr 5.0.51

Fotolog 4.11

Wikipedia 4.4

Facebook 5.0.44

YouTube 5.0.24

Question Six: Operating System on Server

MySQL Fedora

Sun OpenSolaris

Flickr Linux

Fotolog Solaris 10

Wikipedia Fedora/Ubuntu

Facebook Fedora/RHEL

YouTube SuSE 9

Question Seven: What happens if a server fails?

Flickr - Federated setup for failover. Can lose any one side of the shard.

Wikipedia - if a master fails they replace with slave

Facebook - archive binlogs, promote slave

Fotolog - mount snapshots?

YouTube - SAN; shards with a master and multiple slaves so they promote slaves

Question Eight: What is Their Crucial Scaling Technology

Facebook doesn’t use SAN - they do use RAID 10 with 2.5″ drives

Fotolog: UltraSparc T1 is an excellent master, UltraSparc T2 is an excellent slave; uses SAN

This was interesting to me. Frank (Fotolog) said they use a SAN to keep things manageable (only two DBAs, with the second one just hired). Facebook says they don’t use SAN because they didn’t want to limit themselves.

Next they got into a discussion about power. This varied quite a bit, with YouTube pretty much dismissing power concerns. Of course Frank from Fotolog then pointed out that when they (Fotolog) want to expand in a datacenter, the datacenter has to get Google’s approval…hmmm..no wonder Google isn’t worried about it. Fotolog and Facebook were very much in favor of power savings. I think there is more to it than just saving a little power: you also get cooling and space (if smaller, of course) savings.

