Notes on <High Performance MySQL> Ch8: Replication

Chapter 8: Replication

Replication is not just useful for high-performance applications—it is also handy for many other tasks, such as sharing data with a remote office, keeping a “hot spare”, or keeping a server with a copy of the data for testing or training purpose.

Replication Overview

You can replicate the entire server, replicate only certain database, or even choose which tables you want to replicate.

MySQL supports two kinds of replication: statement-based replication and row-based replication. Both kinds work by recording changes in the master’s binary log and replaying the log on the slave, and both are asynchronous.

MySQL’s replication is mostly backward compatible. That is, a newer server can usually be a slave of an older server without trouble. It’s a good idea to test your replication setup before upgrading from one major version to another.

Replication requires binary logging to be enabled on the master. Aside from binary logging, each attached slave also adds a little load (mostly network I/O) on the master during normal operation.

Replication is relatively good for scaling reads.

Problems Solved by Replication

- Data distribution

MySQL’s replication is usually not very bandwidth-intensive, and you can stop and start it at will.

- Load balancing

Distribute read queries across several servers, which works very well for read-intensive applications.

On a small scale, you can use simplistic approaches such as hardcoded hostnames or round-robin DNS.

Or use standard load-balancing solutions, such as network load-balancing products.

- Backups

- High availability and failover

- Testing MySQL upgrades

It’s common practice to set up a slave server with an upgraded MySQL version and use it to ensure that your queries work as expected, before upgrading every instance.

How Replication Works

At a high level, replication is a simple three-part process:

- The master records changes to its data in its binary log. These records are called binary log events.

- The slave copies the master’s binary log events to its relay log.

- The slave replays the events in the relay log, applying the changes to its own data.

When slave to copy the master’s binary log to its own hard drive, it starts a worker thread, called the I/O slave thread. The I/O thread opens an ordinary client connection to the master, then starts a special binlog dump process. The binlog dump process reads events from the master’s binary log.

The SQL slave thread handles the last part of the process.

Replication is serialized on the slave, which means updates that might have run in parallel (in different threads) on the master cannot be parallelized on the slave.

Setting Up Replication

Creating Replication Accounts

You must create a user account on the master and give it the proper privileges, so the I/O thread can connect as that user and read the master’s binary log.

Mysql-> GRANT REPLICATION SLAVE, REPLIACTION CLIENT ON *.*

- > TO repl@’192.168.0.%’ identified by ‘password’;

Configuring the Master and Slave

You need to enable binary logging and specify a server ID. Enter the following lines in the master’s my.cnf file:

Log_bin = mysql-bin

Server_id = 10

You must explicitly specify a unique server ID. If binary logging wasn’t already specified in the master’s configuration file, you’ll need to restart MySQL. To verify that the binary log file is created on the master, run SHOW MASTER STATUS.

The slave requires a configuration in its my.cnf file similar to the master, and you’ll also need to restart MySQL on the slave.

Starting the Slave

Tell the slave how to connect to the master and begin relaying its binary log.

Mysql-> CHANGE MASTER TO

- > MASTER_HOST=’server1’,

- > MASTER_USER=’repl’,

- > MASTER_PASSWORD=’password’,

- > MASTER_LOG_FILE = ‘mysql-bin.00001’,

- > MASTER_LOG_POS=0;

To start replication, running the following command:

Mysql -> START SLAVE;

Initializing a Slave from Another Server

You need three things to synchronize a slave with a master:

A snapshot of the master’s data at some point in time.
The master’s current log file, and the byte offset within that log at the exact point in time you took the snapshot. We refer to these two values as the log file coordinates, because together they identify a binary log position.
The master’s binary log files from that time to the present.

Here are some ways to clone a slave from another server:

With a cold copy
With a warm copy

If you use only MyISAM tables, you can use mysqlhotcopy to copy files while the server is still running.

Using mysqldump

If you use only InnoDB tables, you can use the following command to dump everything from the master, load it all into the slave, and change the slave’s coordinates to the corresponding position in the master’s binary log:

$mysqldump –single-transaction –all-databases –master-data=1 –host=server1 | mysql –host=server2

With an LVM snapshot or backup

As long as you know the corresponding binary log coordinates, you can use an LVM snapshot from the master or a backup to initialize the slave.

From another slave

The biggest disadvantage of cloning one slave from another is that if your slave has become out of sync with the master, you’ll be cloning bad data.

Recommended Replication Configuration

The most important setting for binary logging on the master is sync_binlog:

sync_binlog=1

This makes MySQL synchronize the binary log’s contents to disk each time it commits a transaction, so you don’t lose log events if there’s a crash.

If you use InnoDB, we strongly recommend setting the following options on the master:

Innodb_flush_logs_at_trx_commit=1 # Flush every log write

Innodb_support_xa=1 # MySQL 5.0 and newer only

Innodb_safe_binlog # MySQL 4.1 only, roughly equivalent to innodb_support_xa

On the slave, we recommend enabling the following configuration options:

Skip_slave_start

Read_only

Replication Under the Hood

Statement-Based Replication

MySQL 5.0 and earlier support only statement-based replication (also called logical replication). Statement-based replication works by recording the query that changed the data on the master.

Statement-based replication: the modifications must be serializable.

Row-Based Replication

MySQL 5.1 added support for row-based replication, which records the actual data changes in the binary log.

The biggest advantages are that MySQL can replicate every statement correctly, and some statements can be replicated much more efficiently.

The main drawbacks are that the binary log can become much larger and there’s less visibility into what statements updated the data, so you can’t use the binary log for auditing with mysqlbinlog.

Because neither format is perfect for every situation, MySQL 5.1 switches between statement-based and row-based replication dynamically. By default, it uses statement-based replication, but when it detects an event that cannot be replicated correctly with a statement, it switches to row-based replication. You can also control the format as needed by setting the binlog_format session variable.

Replication Files

mysql-bin.index

Each line in the file contains the filename of a binary log file. MySQL relies on this index file, and it will not recognize a binary log file unless it’s mentioned here.

mysql-relay-bin.index
master.info

This file contains the information a slave server needs to connect to its master.

relay-log.info

This file contains the slave’s current binary log and relay log coordinates.

By default, the binary logs are named after the server’s hostname with a numeric suffix, but it’s a good idea to name them explicitly in my.cnf, as in the following example:

Log_bin # Don’t do this, or files will be named after the hostname

Log_bin = mysql_bin # this is safe

You should also name the relay logs and the corresponding .index files explicitly.

Log_bin = mysql_bin

Log_bin_index = mysql_bin.index

Relay_log = mysql_relay_bin

Relay_log_index = mysql_relay_bin.index

The .inex files also interact with another setting, expire_logs_days, which specifies how MySQL should purge expired binary logs.

Sending Replication Events to Other Slaves

The log_slave_updates option lets you use a slave as a master of other slaves. It instructs MySQL to write the events the slave SQL thread executes into its own binary log, which its own slaves can then retrieve and execute.

Replication Filters

Replication filtering options let you replicate just part of a server’s data. There are two kinds of replication filters: those that filter events out of the binary log on the master, and those that filter events coming from the relay log on the slave.

The most important thing to understand about these options is that the *_do_db and *_ignore_db options, both on the master and on the slave, do not work as you might expect. You might think they filter on the object’s database name, but they actually filter on the current default database. That’s, if you execute the following statements on the master:

Mysql-> USE test;

Mysql -> DELETE FROM sakila.film;

The *_do_db and *_ignore_db parameters will filter the DELETE statement on the test, not on sakila. The *_do_db and *_ignore_db parameters have uses, but they’re limited and rare, and you should be very careful with them. If you use these parameters, it’s very easy to for replication to get out of sync.

Stopping GRANT and REVOKE statements from replicating to slave is a common use for replication filters.

replicate_ignore_table=mysql.columns_priv

replicate_ignore_table=mysql.db

replicate_ignore_table=mysql.host

replicate_ignore_table=mysql.procs_priv

replicate_ignore_table=mysql.tables_priv

replicate_ignore_table=mysql.user

Replication Topologies

A MySQL slave instance can have only one master
Every slave must have a unique server ID
A master can have many slaves (or, correspondingly, a slave can have many siblings)
A slave can propagate changes from its master, and be the master of other slaves, if you enable log_slave_updates

Master and Multiple Slaves

This configuration is most useful when you have few writes and many reads. You can spread reads across any number of slave servers, up to the point where the slaves put much load on the master or network bandwidth from the master to the slaves becomes a problem.

Master-Master in Active-Active Mode

Master-master replication involves two servers, each configured as both a master and a slave of the other—in other words, a pair of co-masters.

Master-master replication in active-active mode has uses, but they’re generally special-purpose. One possible use is for geographically separated offices, where each office needs its own locally writable copy of data.

The biggest problem with such a configuration is how to handle conflicting changes.

Problems usually show up when a query changes the same row simultaneously on both servers and inserts into a table with an AUTO_INCREMENT column at the same time on both servers.

In general, allowing writes on both servers can cause more trouble than it’s worth. However, an active-passive configuration is very useful indeed.

Master-Master in Active-Passive Mode

One of the servers is a read-only “passive” server.

This configuration lets you swap the active and passive server roles back and forth very easily, because the servers’ configurations are symmetrical.

The change gets written to active server’s binary log and flows through replication to the passive server’s relay log. The passive server executes the query and writes the event to its own binary log, because you enabled log_slave_updates. The active server then retrieves the same change via replication into its own relay log, but ignores it because the server ID in the event matches its own.

Master-Master with Slaves

A related configuration is to add one or more slaves to each co-master, as shown in the following figure—

The advantage of this configuration is extra redundancy. In a geographically distributed replication topology, it removes the single point of failure at each site. You can also offload read-intensive queries to the slaves, as usual.

Ring

A ring has three or more masters. Each server is a slave of the server before it in the ring, and a master of the server after it. This topology is also called circular replication.

In general, rings are brittle and best avoided.

You can mitigate some of the risk of a ring replication setup by adding slaves to provide redundancy at each site. This merely protects against the risk of a server failing, though. A loss of power or any other problem that affects any connection between sites will still break the entire ring.

Master, Distribution Master, and Slaves

A distribution master is a slave whose only purpose is to read and serve the binary logs from the master. Many slaves can connect to the distribution master, which insulates the original master from the load. To remove the work of actually executing the queries on the distribution master, you should change its tables to the Blackhole storage engine.

If you use Blackhole tables on the distribution master, it will be able to serve more slaves than it could otherwise. The distribution master will execute the queries, but the queries will be extremely cheap, because the Blackhole tables will not have any data.

Tree or Pyramid

Custom Replication Solutions

Selective replication

Separating functions

Data archiving

Using slaves for full-text searches

Read-only slaves

Emulating multimaster replication

Creating a log server

Replication and Capacity Planning

Why Replication Doesn’t Help Scale Writes

Replication scales reads, but it doesn’t scale writes.

Sharding (partitioning) your data is the only way you can scale writes.

Plan to Underutilize

Intentionally underutilizing your servers can be a smart and cost-effective way to build a large application, especially when you use replication. Servers that have spare capacity can tolerate surges better, have more power to handle slow queries and maintenance jobs (such as OPTIMIZE TABLE operations), and will be better able to keep up in replication.

Replication Administration and Maintenance

Monitoring Replication

On the master, you can use the SHOW MASTER STATUS command to see the master’s current binary log position and configuration.

You can also view replication events in the binary log with the “SHOW BINLOG EVENTS” command.

Measuring Slave Lag

One good solution is a heartbeat record, which is a timestamp that you update once per second on the master. To calculate the lag, you can simply subtract the heartbeat from the current timestamp on the slave.

Determining Whether Slaves Are Consistent with the Master

Resyncing a Slave from the Master

Changing Masters

Planned promotions

Unplanned promotions

Locating the desired log positions

Switching Roles in a Master-Master Configuration

Replication Problems and Solutions

Errors Caused by Data Corruption or Loss

Using Nontransactional Tables

If you’re using MyISAM tables, be sure to run STOP SLAVE before stopping the MySQL server, or the shutdown will kill any running queries (including any incomplete update statements). Transactional storage engine don’t have this problem. If you’re using transactional tables, the failed update will be rolled back on the master and not logged to the binary log.

Mixing Transactional and Nontransactional Tables

When you use a transactional storage engine, MySQL doesn’t log the statements you execute on the binary log until the transactions commit. Thus, if a transaction is rolled back, MySQL won’t log the statements, so they won’t get replayed on the slave.

Nondeterministic Statements

Row-based replication does not have these limitations.

Different Storage Engines on the Master and Slave

Data Changes on the Slave

Nonunique Server IDs

Undefined Server IDs

Dependencies on Nonreplicated Data

Missing Temporary Tables

Not Replicating All Updates

Lock Contention Caused by InnoDB Locking Selects

A workaround to alleviate the lock contention is to replace INSERT…SELECT with a combination of SELECT INTO OUTFILE followed by LOAD DATA INFILE on the master.