Diamond Notes

Maintenance Plan

Whenever you need to perform maintenance, it is critical that you take the time beforehand to write down what is going to happen, step by detailed step. Have it reviewed, critiqued and torn apart. Then, when it is time for maintenance, you have an easily executed plan with much less chance for error.

Let me elaborate.

I know people who just fly by the seat of their pants.  Impressive maybe, but really irresponsible.  It is way too easy to make mistakes.

Take the time to craft a skeleton plan to build upon. By that I mean a written plan that contains only the things that happen every time you do maintenance. Before maintenance, for example, you need to tell the proper people that maintenance is happening, turn off monitoring, and make changes to your backup plans if necessary. After maintenance, you will need to update the notices about the maintenance being performed. These can be added to your skeleton plan because they are going to happen every time. A checklist is very helpful. It ensures that things don’t get missed.
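To make the skeleton idea concrete, here is a minimal sketch of such a checklist in Python. It is only an illustration: the names `Step`, `MaintenancePlan` and `skeleton_plan` are made up for this example, and the pre-filled items are just the ones mentioned above. A text file or a wiki page would do the job equally well.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    description: str
    done: bool = False


@dataclass
class MaintenancePlan:
    title: str
    pre: List[Step] = field(default_factory=list)   # before maintenance
    work: List[Step] = field(default_factory=list)  # the maintenance itself
    post: List[Step] = field(default_factory=list)  # after maintenance

    def all_steps(self) -> List[Step]:
        return self.pre + self.work + self.post

    def remaining(self) -> List[str]:
        """Descriptions of every step not yet ticked off, in plan order."""
        return [s.description for s in self.all_steps() if not s.done]


def skeleton_plan(title: str) -> MaintenancePlan:
    """Return a plan pre-filled with the steps that happen every time."""
    return MaintenancePlan(
        title=title,
        pre=[
            Step("Tell the proper people that maintenance is happening"),
            Step("Turn off monitoring"),
            Step("Change backup plans if necessary"),
        ],
        post=[
            Step("Update notices about the maintenance being performed"),
        ],
    )


# Start from the skeleton, then add the steps specific to this maintenance.
plan = skeleton_plan("Example maintenance window")
plan.work.append(Step("Apply the change under review"))  # hypothetical task
plan.pre[0].done = True  # tick off a step once it is finished
for item in plan.remaining():
    print("[ ]", item)
```

The point of the skeleton is that nobody has to remember the recurring steps; each new plan starts with them already in place, and only the work in the middle changes.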

Take your time.  Do it right!!

3 Comments so far

  1. Bill Karwin February 15th, 2008 10:32 am

    Good point! Also see my blog post on checklists:

    http://karwin.blogspot.com/2007/12/how-to-save-100-million.html

  2. admin February 15th, 2008 11:42 am

    That is awesome. I wish I had seen this before the post. Not sure how I missed it.

    Thanks!!!

  3. Sheeri February 20th, 2008 5:53 am

    It’s especially important in so many ways. You can have someone review your checklist so you get a “second set of eyes”. You also get a bit of CYA as well — if something happens that was unexpected, having it be unexpected to more than one party is very useful.

    I tend to write things out step by step so I could hand it off to anyone with some knowledge. For instance, part of a plan to promote a slave to a master might be:

    8) STOP SLAVE on all the slaves when they reach the master’s current binlog position
    9) turn off mysql on the master
    10) RESET the slave settings on the new master
    11) RESET the master settings on the new master
    12) CHANGE MASTER TO on all the slaves to point them to the new master

    and then I’ll have rollback plans, like “until step 9, rollback consists of turning on the master mysql instance. For steps 10-12, rollback consists of that plus making sure all slaves read from the old master”.

    Also, putting time frames on things is helpful, e.g., “if backing up takes longer than 20 minutes, abort/notify customer service the downtime will be longer than you thought/get a cup of coffee.”
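The numbered steps, rollback notes and time frames described in this comment can be tied together in one structure. What follows is a rough sketch under stated assumptions, not anyone's actual tooling: `PlanStep`, `run_plan` and `TimeBudgetExceeded` are hypothetical names, the actions are placeholders that do nothing, and the time check only runs after a step finishes, so it reports an overrun instead of interrupting the step.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PlanStep:
    number: int
    description: str
    action: Callable[[], None]
    rollback_note: str                   # what to do if this step fails
    max_seconds: Optional[float] = None  # time frame; None means no limit


class TimeBudgetExceeded(Exception):
    pass


def run_plan(
    steps: List[PlanStep],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[int], Optional[str]]:
    """Run steps in order and stop at the first failure or overrun.

    Returns the numbers of the completed steps and the rollback note of
    the step that failed (None if every step succeeded).
    """
    completed: List[int] = []
    for step in steps:
        started = clock()
        try:
            step.action()
            elapsed = clock() - started
            if step.max_seconds is not None and elapsed > step.max_seconds:
                raise TimeBudgetExceeded(
                    f"took {elapsed:.0f}s, budget was {step.max_seconds:.0f}s"
                )
        except Exception as exc:
            print(f"Step {step.number} failed: {exc}")
            return completed, step.rollback_note
        completed.append(step.number)
    return completed, None


def placeholder() -> None:
    """Stand-in for the real work; a real plan would do something here."""


ROLLBACK_EARLY = "Turn the master mysql instance back on"
ROLLBACK_LATE = ROLLBACK_EARLY + " and make sure all slaves read from the old master"

steps = [
    PlanStep(8, "STOP SLAVE on all the slaves", placeholder, ROLLBACK_EARLY),
    PlanStep(9, "Turn off mysql on the master", placeholder, ROLLBACK_EARLY),
    PlanStep(10, "RESET the slave settings on the new master", placeholder, ROLLBACK_LATE),
    PlanStep(11, "RESET the master settings on the new master", placeholder, ROLLBACK_LATE),
    PlanStep(12, "CHANGE MASTER TO on all the slaves", placeholder, ROLLBACK_LATE),
]

completed, rollback = run_plan(steps)
print("completed:", completed, "rollback:", rollback)
```

Keeping the rollback note next to each step means whoever runs the plan never has to work out under pressure which rollback applies. The `clock` parameter is only there so the time budget can be checked without waiting in real time.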
