测定INNODB REDO LOGS的写入量

原文链接：http://www.mysqlperformanceblog.com/2012/10/08/measuring-the-amount-of-writes-in-innodb-redo-logs/

Choosing a good InnoDB log file size is key to InnoDB write performance. This can be done by measuring the amount of writes in the redo logs. You can find a detailed explanation in this post.

选择合适的INNODB日志文件大小对于INNODB的写入性能是至关重要的。这个可以通过测量REDO日志的写入量来达到这个目的。你可以在这篇文章里找到详细的解释

To sum up, here are the main points: 总的来说，要点如下：

The redo logs should be large enough to store at most an hour of logs at peak-time
REDO LOGS应该足够大，以至于在高峰期能够存下最多1小时的日志。
You can either use the LSN in the SHOW ENGINE INNODB STATUS OUTPUT or the Innodb_os_log_written global status variable (if you are a Percona Server user, the LSN is also given by the Innodb_lsn_current status variable)
你也可以通过看SHOW ENGINE INNODB STATUS输出的LSN值或者Innodb_os_log_written全局状态值（如果你用的是Percona服务，LSN也可以通过Innodb_lsn_current状态值得到）

While reviewing the recommendation I made for a customer, one of my colleagues told me I was wrong in my redo log size calculations. After each one double checked the calculations, it turned out that we experienced something not expected:

当我评估对一个客户的建议时，一个同事告诉我，我计算REDO LOG大小的方法错了。对每一个值都反复检查后，结果表明我们有些经验所得是有问题的。

Using Innodb_os_log_written, I found that around 7.15 GB of redo logs were written per hour
查看Innodb_os_log_written，我发现大约每小时写7.15G的日志
Using the LSN, my colleague found 2.70 GB/hour (almost a 3x difference!)
查看LSN，我同事发现每小时写2.7G的日志（大约差3倍）

Something was obviously wrong in our understanding of how to measure the amount of writes in the redo logs. Let’s first have a look at what the documentation says. It states that

很明显我们理解的如何计算REDO LOG的写入量的方法有些问题。我们先看看文档是怎么说的，它指出

Innodb_os_log_written is the number of bytes written to the log file
Innodb_os_log_written 写入到日志文件的字节数
The LSN is an arbitrary, ever-increasing value [that] represents a point in time corresponding to operations recorded in the redo log
LSN 一个任意的，不断增长的值，它实时记录在RODO LOG里的记录的操作点。

What is not obvious from the documentation is that while Innodb_os_log_written is incremented when the log file is written, the LSN is incremented when the log buffer is written.

文档里面不清晰的是当写日志文件时，Innodb_os_log_written会增长，当写LOG BUFFER时，LSN会增长？

This is interesting. It means that the durability setting can skew the results: if innodb_flush_log_at_trx_commit is set to 0, you can accidentally omit or add 1 second of write activity. Of course if you measure variations over 60s, this will not explain a 3x difference with the LSN. It also means that if the write workload is very non uniform, you can easily get very different numbers if you are not taking measures exactly at the same time for the 2 methods.

这很有意思。这意味着固定的设置会歪曲结果值：如果innodb_flush_log_at_trx_commit设置为0，你可能无意中忽略了或者新加入1秒的写入。当然如果你测量变量值超过60秒，这和LSN的差距不会有3倍了。这也意味着如果写负载很不均匀，如果你不在相同的时间里正确的测试这2个方法，你会很容易得到完全不同的结果。

However, the write workload had not so much variance in my case. I also ran a test with a constant write workload (a mono-threaded script that inserts one row at a time in a table, as fast as it can) and I ended up with the same result: numbers were very different between the 2 methods. Even stranger, the innodb_os_log_written method consistently gave higher numbers than the LSN method, when we would have expected the opposite.

然而，在我这个例子里面写入负载没那么不均衡。我也用一个稳定的写入负载（单线程脚本，尽可能快的每次写入一条记录到表里）得到了同样的结果：2个方法得到的值迥然不同。更奇怪的是，innodb_os_log_written方法得到的值始终比LSN方法要高，这和我们的设想相悖了。

It was time for digging into the source code. All the credits should actually be given to Alexey Kopytov, who not only took the time to read the code again and to make tests, but who also caught something we all missed: writing to the redo logs and increasing the LSN have completely different logics.

是时候深入源码了。实际上所有的功能应该属于Alexey Kopytov，他不仅仅花时间重新阅读代码和做测试，也得到了一些我们漏掉的东西：写REDO LOGO和LSN的增长是完全不同的逻辑。

The LSN simply shows the byte offset, so when you write 100 bytes to the log buffer, the LSN is increased by 100. Writing to the redo logs is a much more complicated process: every write is a 512-byte write and there can be overlapping writes. Not clear? Let’s look at an example when innodb_flush_log_at_trx_commit is set to 1 or 2 (again, thanks Alexey):

LSN简单的显示字节偏移量，当你写100字节到LOG BUFFER，LSN增加了100。写REDO LOG是一个非常复杂的过程：每次写512字节，并且它还能覆盖写。还没明白？把innodb_flush_log_at_trx_commit设置成1或2，看个例子。

Transaction 1 writes 100 bytes to the log buffer
事务1写100字节到LOG BUFFER
At commit, InnoDB writes a 512-byte block at offset xxx and increments Innodb_os_log_written by 512 bytes
提交，INNODB写一个512字节的块，偏移量XXX，Innodb_os_log_written增加512字节
Transaction 2 writes 200 bytes to the log buffer
事务2写200字节到LOG BUFFER
At commit, InnoDB appends those 200 bytes to the same log block and overwrites the same 512-byte file block at offset xxx, then increases Innodb_os_log_written by another 512 bytes
提交，INNODB把这200字节添加到同一个日志块，覆盖了相同的512字节的文件块，偏移量XXX，然后Innodb_os_log_written增长了另外512字节。

At this point, the LSN has increased by 300 and Innodb_os_log_written by 1024 (a 3x difference!). This means that the documentation is correct: Innodb_os_log_written is the number of bytes written to the redo logs. But it does not reflect the growth of the redo logs.

这时候，LSN增加了300，Innodb_os_log_written增加了1024（3倍差距！）。这表明文档是正确的：Innodb_os_log_written是写入REDO LOG的字节数，但是它不能体现REDO LOG的增长。

So when you are trying to size the redo logs, looking at the LSN variations is a much better approximation than looking at the Innodb_os_log_written variations, which can be significantly far from the reality. However keep in mind that even the LSN is an approximate metric: if your write workload is non uniform and your sampling interval too short, you may well underestimate or overestimate the growth of your redo logs.

所以如果你尝试计算REDO LOG的大小，看LSN变量比看Innodb_os_log_written值要更精确些，Innodb_os_log_written可能远远大于真实值。即便LSN是一个近似指标，你还是要注意：如果你的写入负载不均匀而且采样间隔太短，你也会过低或过高的估算你的REDO LOG的增长。