MySQL Derived table Derived table实际上是一种特殊的subquery,它位于SQL语句中FROM子句里面,可以看做是一个单独的表。MySQL5.7之前的处理都是对Derived table进行Materialize,生成一个临时表保存Derived table的结果,然后利用临时表来协助完成其他父查询的操作,比如JOIN等操作。MySQL5.7中对Derived table做了一个新特性。该特性允许将符合条件的Derived table中的子表与父查询的表合并进行直接JOIN。 下面我们看一下DBT-3中的一条被新特性优化过的执行计划:
1 2 3 4 5 6 7 8 9 SELECT t2.o_clerk, t1.price - t2.o_totalpriceFROM (SELECT l_orderkey, SUM ( l_extendedprice * (1 - l_discount)) price FROM lineitem GROUP by l_orderkey) t1 JOIN (SELECT o_clerk, o_orderkey, o_totalprice FROM orders WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-12-31' ) t2 ON t1.l_orderkey = t2.o_orderkey WHERE t1.price > t2.o_totalprice;
MySQL5.6执行计划如下图所示(下图通过WorkBench的Visual Explain直观的对执行计划进行了展示): 对应的explain输出结果为:
1 2 3 4 5 ID SELECT_TYPE TABLE TYPE POSSIBLE_KEYS KEY KEY_LEN REF ROWS EXTRA 1 PRIMARY <derived3> ALL NULL NULL NULL NULL 4812318 NULL 1 PRIMARY <derived2> ref <auto_key0> <auto_key0> 4 t2.o_orderkey 599860 Using where; Using index 3 DERIVED orders ALL i_o_orderdate NULL NULL NULL 15000000 Using where 2 DERIVED lineitem index PRIMARY, i_l_shipdate, … PRIMARY 8 NULL 59986052 NULL
可以看到orders已经从Derived table的子表里面merge到了父查询中,尽而简化了执行计划,同时也提高了执行效率。看一下MySQL5.6与MySQL5.7对于上面的DBT-3中的这条Query执行性能的对比图: Merge Derived table有两种方式进行控制。第一种,通过开关optimizer_switch=’derived_merge=on|off’来进行控制。第二种,在CREATE VIEW的时候指定ALGORITHM=MERGE | TEMPTABLE, 默认是MERGE方式。如果指定是TEMPTABLE,将不会对VIEW进行Merge Derived table操作。只要Derived table里不包含如下条件就可以利用该特性进行优化:
UNION clause
GROUP BY
DISTINCT
Aggregation
LIMIT or OFFSET
Derived table里面包含用户变量的设置。
那么Merge Derived table在MySQL中是如何实现的呢?下面我们分析一下源码。 对于Derived table的merge过程是在MySQL的resolve阶段完成的,这意味着对于Merge操作是永久性的,经过resolve阶段之后就不会再对Derived table进行其他的变换。执行的简单流程如下:
1 2 3 4 5 6 7 8 9 SELECT_LEX::prepare | TABLE_LIST::resolve_derived | SELECT_LEX::merge_derived
下面我们重点研究一下merge_derived这个函数实现过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 bool SELECT_LEX::merge_derived(THD *thd, TABLE_LIST *derived_table) { DBUG_ENTER("SELECT_LEX::merge_derived" ); if (!derived_table->is_view_or_derived () || derived_table-> is_merged()) DBUG_RETURN(false ); SELECT_LEX_UNIT *const derived_unit= derived_table-> derived_unit(); DBUG_ASSERT(derived_unit -> is_prepared()); LEX *const lex= parent_lex; if ((master_unit() == lex-> unit && !lex -> can_use_merged()) || lex -> can_not_use_merged()) DBUG_RETURN(false ); if (!derived_unit-> is_mergeable() || derived_table -> algorithm == VIEW_ALGORITHM_TEMPTABLE || (!thd -> optimizer_switch_flag(OPTIMIZER_SWITCH_DERIVED_MERGE) && derived_table -> algorithm != VIEW_ALGORITHM_MERGE)) DBUG_RETURN(false ); SELECT_LEX *const derived_select= derived_unit-> first_select(); if ((active_options() & SELECT_STRAIGHT_JOIN) && derived_select-> has_sj_nests) DBUG_RETURN(false ); ... if (!(derived_table-> nested_join= (NESTED_JOIN *) thd -> mem_calloc(sizeof(NESTED_JOIN)))) DBUG_RETURN(true ); derived_table ->nested_join -> join_list.empty(); if (derived_table-> merge_underlying_tables(derived_select)) DBUG_RETURN(true ); for (TABLE_LIST **tl= &leaf_tables; *tl; tl= &(*tl)-> next_leaf) { if (*tl == derived_table) { for (TABLE_LIST *leaf= derived_select-> leaf_tables; leaf; leaf = leaf-> next_leaf) { if (leaf-> next_leaf == NULL) { leaf ->next_leaf = (*tl)-> next_leaf; break; } } *tl = derived_select-> leaf_tables; break; } } leaf_table_count += (derived_select-> leaf_table_count - 1 ); derived_table_count += derived_select-> derived_table_count; materialized_derived_table_count+= derived_select -> materialized_derived_table_count; has_sj_nests |= derived_select-> has_sj_nests; partitioned_table_count += derived_select-> partitioned_table_count; cond_count += derived_select-> cond_count; between_count += derived_select-> between_count; if (derived_select-> active_options() & OPTION_SCHEMA_TABLE) add_base_options(OPTION_SCHEMA_TABLE); if (derived_table-> is_inner_table_of_outer_join()) propagate_nullability (&derived_table->nested_join -> join_list, true ); select_n_having_items += derived_select-> select_n_having_items; if (derived_table-> merge_where(thd)) DBUG_RETURN(true ); derived_unit -> exclude_level(); derived_table -> set_derived_unit((SELECT_LEX_UNIT *)1 ); if (derived_table-> create_field_translation(thd)) DBUG_RETURN(true ); merge_contexts(derived_select); repoint_contexts_of_join_nests (derived_select-> top_join_list); remap_tables(thd); fix_tables_after_pullout(this, derived_select, derived_table, table_adjust); if (derived_select-> is_ordered()) { if ((lex-> sql_command == SQLCOM_SELECT || lex -> sql_command == SQLCOM_UPDATE || lex -> sql_command == SQLCOM_DELETE) && !(master_unit ()-> is_union() || is_grouped() || is_distinct() || is_ordered() || get_table_list ()-> next_local != NULL)) order_list .push_back(&derived_select-> order_list); } if (derived_select->ftfunc_list -> elements && add_ftfunc_list (derived_select-> ftfunc_list)) DBUG_RETURN(true ); DBUG_RETURN(false ); }
综上所述,本篇文章简要的分析了MySQL Merge Derived table的作用以及实现方式。Merge Derived table的引入可以有效的提升Subquery query的执行效率,更重要的是为以后应对复杂查询提供了新的优化手段。