MySQL 5.6.10 Optimizer Limitations: Index Condition Pushdown

原文地址:http://www.mysqlperformanceblog.com/2013/03/14/mysql-5-6-10-optimizer-limitations-index-condition-pushdown/

While preparing the webinar I will deliver this Friday, I ran into a quite interesting (although not very impacting) optimizer issue: a “SELECT *” taking half the time to execute than the same “SELECT one_indexed_column” query in MySQL 5.6.10.

This turned into a really nice exercise for checking the performance and inner workings of one of the nicest features of the newer MySQL optimizer: the Index Condition Pushdown Optimization, or ICP, which we have previously discussed on our blog.

It was the following query in particular that had this surprising outcome:

当我在准备周五的这个活动时,我遇到了一个很有趣(尽管不是很严重)的优化器问题:在5.6.10里,一个select * 的查询执行时间比select one_indexed_column所花费的时间还要多一倍。

这演变成了一个非常好的测试,来检查mysql优化器的最新的一个特性的性能和内部操作:Index Condition Pushdown Optimization,ICP。在前面这篇博文讨论过.

下面这个查询返回出乎意料的结果:

mysql> SELECT * FROM cast_info WHERE role_id = 1 and note like '%Jaime%';

On a table like this:  表结构

CREATE TABLE `cast_info` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `person_id` int(11) NOT NULL,
  `movie_id` int(11) NOT NULL,
  `person_role_id` int(11) DEFAULT NULL,
  `note` varchar(250),
  `nr_order` int(11) DEFAULT NULL,
  `role_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `role_id_note` (`role_id`,`note`)
) ENGINE=InnoDB AUTO_INCREMENT=22187769 DEFAULT CHARSET=utf8;

The table had 22 million rows, with approximately 8 million of them having role_id = 1, and 266 have role_id = 1 and containing the word ‘Jaime’ somewhere in the field note.

The original query had a stable execution time of 1.09 sec, while the following one, which selects less amount of data (just one column) and can take advantage of the covering index technique, did actually take more time to execute:

这个表有2200W记录,role_id=1的记录有800W,role_id=1 并包含Jaime的记录有266条左右。

原始的查询的执行时间稳定在1.09S,然而下面这个查询,少查了很多数据(只查1列),并可以使用覆盖索引,却执行了更长的时间。

mysql> SELECT role_id FROM cast_info WHERE role_id = 1 and note like '%Jaime%'\G
266 rows in set (1.82 sec)

Please note that the times were very stable and the contents of the buffer pool did not affect the results.

What was happening? Well, in order to understand it I must provide you with more background information. My buffer pool was big enough to hold the whole database (data and indexes fit completely in memory). Also, I was testing, as I said before, index condition pushdown. Let’s have a look at the EXPLAIN output:

请注意这个执行时间是很稳定的,BP里的内容不会影响执行结果。

发生什么事了?为了能够理解这个现象,我必须提供更多后台信息。我的BP是很大,可以存下整个数据库(包括数据和索引都完全存在内存里)。另外,我前面说过的,我测试的是ICP。我们看看EXPLAIN的信息

mysql> EXPLAIN SELECT * FROM cast_info WHERE role_id = 1 and note like '%Jaime%'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cast_info
         type: ref
possible_keys: role_id_note
          key: role_id_note
      key_len: 4
          ref: const
         rows: 10259274
        Extra: Using index condition
1 row in set (0.00 sec)

With ICP, the actual number of rows read at SQL layer is actually very different from the “rows” value seen above. This is because the second part of the condition -note like '%Jaime%'- is actually tested at engine level, not at handler level.

在有ICP的情况下,在SQL层真实的读取行和上面看到的EXPLAIN显示的rows值是有很大不同的。这是因为第二部分-note like '%Jaime%'的比较是在引擎层完成的,而不是在SERVER层。

Condition pushdown is one of the new features of MySQL 5.6, and actually is a great improvement over MySQL 5.5. For example, in this case, the actual number of “Handler_read_next” calls was reduced from 8346769 (5.5) to just 266 (5.6), reducing the executing time by almost 5 times. Pro tip: make sure you always check the Handler status variables for post-execution analysis.

So why is the “SELECT note” actually slower? It seems that whenever the covering index technique is available, this is always preferred over the ICP optimization:

Condition pushdown是mysql5.6里的一个新特性,相对5.5来说确实有了极大的提升。举个栗子说,在刚才的这个场景下,Handler_read_next数值从8346769(5.5)减少到了266(5.6),执行时间降低了差不多5倍。

mysql> EXPLAIN SELECT role_id FROM cast_info WHERE role_id = 1 and note like '%Jaime%'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cast_info
         type: ref
possible_keys: role_id_note
          key: role_id_note
      key_len: 4
          ref: const
         rows: 10259274
        Extra: Using where; Using index
1 row in set (0.00 sec)

I reported this issue to Oracle and they confirmed that this is the intended/current status of the MySQL 5.6.10 optimizer. Other interesting things to notice:

我把这个问题报告给ORACLE了,他们也确认了5.6.10优化器里确实存在这个问题。另外一些有趣的小地方要注意:

  • ICP is a great new feature that already saved us a lot of execution time, probably its cost has to be tuned better in the feature. There are more ways to make a query faster, which means you need more manual care and tuning now.
  • MySQL is conservative about “Using index” -in most cases it will be the right solution because our SELECT will only be faster when the condition is very selective and the buffer pool is effective.
  • There is no workaround, using FORCE-like commands or optimizer_switch flags- we can disable ICP, but not “using index”.

So, I wouldn’t call this a bug, and I’m not especially concerned about this particular case — you may or may not consider it an edge case — but I would call it a limitation of the current query planner. However I would like to see better algorithms, statistics computation and monitoring variables about index usage in the future, now that we have more complex optimization strategies. Even row-level operation counters are sometimes not enough.

原文地址:https://www.cnblogs.com/zuoxingyu/p/3014837.html