Benutzer-Werkzeuge

Webseiten-Werkzeuge


a_quick_note_on_mysql_troubleshooting_and_mysql_replication_recovery

Quelle: planetsuse.org

While caring for a remarkable amount of MySQL server instances, troubleshooting becomes a common task. It might of interest for you which

Recovering a crashed MySQL server

After a server crash (meaning the system itself or just the MySQL daemon) corrupted table files are quite common. You’ll see this when checking the /var/log/syslog, as the MySQL daemon checks tables during its startup.

Apr 17 13:54:44 live1 mysqld[2613]: 090417 13:54:44 [ERROR]

/usr/sbin/mysqld: Table './database1/table1' is marked as
crashed and should be repaired

The MySQL daemon just told you that it found a broken MyISAM table. Now it’s up to you fixing it. You might already know, that there is the “REPAIR” statement. So a lot of people enter their PhpMyAdmin afterwards, select database and table(s) and run the REPAIR statements. The problem with this is that in most cases your system is already in production - for instance is up again and the MySQL server already serves a bunch of requests. Therefore a REPAIR request gets slowed down dramatically. Consider taking your website down for the REPAIR - it will be faster and it’s definitely smarter not to deliver web pages based on corrupted tables.

The other disadvantage of the above method is, that you probably just shut down your web server and your PhpMyAdmin is down either or you have dozens of databases and tables and therefore it’s just a hard task to cycle through them. The better choice is the command line in this case.

If you only have a small number of corrupted tables, you can use the “mysql” client utility doing something like:

  $ mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.0.75-0ubuntu10 (Ubuntu)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

  mysql> REPAIR TABLE database1.table1;

+--------------------+--------+----------+----------+
| Table              | Op     | Msg_type | Msg_text |
+--------------------+--------+----------+----------+
| database1.table1   | repair | status   | OK       |
+--------------------+--------+----------+----------+
1 row in set (2.10 sec)

This works, but there is a better way: First, using OPTIMIZE in combination with REPAIR is suggested and there is a command line tool only for REPAIR jobs. Consider this call:

$ mysqlcheck -u root -p --auto-repair --check --optimize database1
Enter password:
database1.table1      OK
database1.table2      Table is already up to date

As you see, MySQL just checked the whole database and tried to repair and optimize it.

The great deal about using “mysqlcheck” is, that it can also be run against all databases in one run without the need of getting a list of them in advance:

$ mysqlcheck -u root -p --auto-repair --check --optimize \
  --all-databases

Of course you need to consider if an optimize of all your databases and tables might just take too long if you have huge tables. On the other hand a complete run prevents of thinking about a probably missed table.

Recovering a broken replication

MySQL replication is an easy method of load balancing database queries to multiple servers or just continuously backing up data. Though it is not hard to setup, troubleshooting it might be a hard task. A common reason for a broken replication is a server crash - the replication partner notices that there are broken queries - or even worse: the MySQL slave just guesses there is an error though there is none. I just ran into the latter one as a developer executed a “DROP VIEW” on a non-existing VIEW on the master. The master justs returns an error and ignores. But as this query got replicated to the MySQL SLAVE, the slave thinks it cannot enroll a query and immediately stopped replication. This is just an example of a possible error (and a hint on using “IF EXISTS” as often as possible).

Actually all you want to do now, is telling the slave to ignore just one query. All you need to do for this is stopping the slave, telling it to skip one query and starting the slave again:

$ mysql -u root -p
mysql> STOP SLAVE;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
mysql> START SLAVE;

That’s all about this.

Recreating databases and tables the right way

In the next topic you’ll recreate databases. A common mistake when dropping and recreating tables and databases is forgetting about all the settings it had - especially charsets which can run you into trouble later on (”Why do all these umlauts show up scrambled?”). The best way of recreating tables and databases or creating them on other systems therefore is using the “SHOW CREATE” statement. You can use “SHOW CREATE DATABASE database1″ or “SHOW CREATE TABLE database1.table1″ providing you with a CREATE statement with all current settings applied.

mysql> show create database database1;

+-----------+--------------------------------------------------------------------+
| Database  | Create Database                                                    |
+-----------+--------------------------------------------------------------------+
| database1 | CREATE DATABASE `database1` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+-----------+--------------------------------------------------------------------+
1 row in set (0.00 sec)

The important part in this case is the “comment” after the actual create statement. It is executed only on compatible MySQL server versions and makes sure, your are running utf8 on the database.

Keep this in mind and it might save you a lot of trouble.

Setting up replication from scratch

There are circumstances forcing you to start replication from scratch. For instance you have a server going live for the first and actually all those test imports don’t need to be replicated to the slave anymore as this might last hours. My quick note for this (consider backing up your master database before!)

slave: STOP SLAVE;
slave: RESET SLAVE;
slave: SHOW CREATE DATABASE datenbank;
slave: DROP DATABASE datenbank;
slave: CREATE DATABASE datenbank;

master: DROP DATABASE datenbak;
master: SHOW CREATE DATABASE datenbank;
master: CREATE DATABASE datenbank;
master: RESET MASTER

slave: CHANGE MASTER TO MASTER_USER="slave-user", \
MASTER_PASSWORD="slave-password", MASTER_HOST="master.host";
slave: START SLAVE

You just started replication from scratch, check “SHOW SLAVE STATUS” on the slave and “SHOW MASTER STATUS” on the master.

Deleting unneeded binlog files

Replication needs binlog files - a mysql file format for storing database changes in a binary format. Sometimes it is hard to decide how many of the binlog files you want to keep on the server possibly getting you into disk space trouble. Therefore deleting binlog files that have already been transferred to the client might be a smart idea when running low on space.

First you need to know which binlog files the slave already fetched. You can do this by having a look on “SHOW SLAVE STATUS;” on the slave. Now log into the MySQL master and run something like:

mysql> PURGE BINARY LOGS TO 'mysql-bin.010';

You can even do this on a date format level:

mysql> PURGE BINARY LOGS BEFORE '2008-04-02 22:46:26';

Conclusion

The above hints might save you same time when recovering or troubleshooting a MySQL server. Please note, that these are hints and you have - at any time - make sure, that your data has an up to date backup. Nothing will help you more.

a_quick_note_on_mysql_troubleshooting_and_mysql_replication_recovery.txt · Zuletzt geändert: 2009/04/18 20:10 (Externe Bearbeitung)