SQL总结

1. sql的执行过程(相关的博文：SQL语句的解析过程)

关于sql的执行过成，参考：

Reference : Inside Microsoft® SQL Server™ 2005 T-SQL Querying
Publisher: Microsoft Press
Pub Date: March 07, 2006
Print ISBN-10: 0-7356-2313-9
Print ISBN-13: 978-0-7356-2313-2
Pages: 640

Inside Microsoft® SQL Server™ 2005 T-SQL Querying

(8)  SELECT (9) DISTINCT (11) TOP <top_specification> <select_list>
(1)  FROM <left_table>
(3)       <join_type> JOIN <right_table>
(2)       ON <join_condition>
(4)  WHERE <where_condition>
(5)  GROUP BY <group_by_list>
(6)  WITH {CUBE | ROLLUP}
(7)  HAVING <having_condition>
(10) ORDER BY <order_by_list>

The first noticeable aspect of SQL that is different than other programming languages is the order in which the code is processed. In most programming languages, the code is processed in the order in which it is written. In SQL, the first clause that is processed is the FROM clause, while the SELECT clause, which appears first, is processed almost last.

Each step generates a virtual table that is used as the input to the following step. These virtual tables are not available to the caller (client application or outer query). Only the table generated by the final step is returned to the caller. If a certain clause is not specified in a query, the corresponding step is simply skipped.

Brief Description of Logical Query Processing Phases Don't worry too much if the description of the steps doesn't seem to make much sense for now. These are provided as a reference. Sections that come after the scenario example will cover the steps in much more detail.

FROM: A Cartesian product (cross join) is performed between the first two tables in the FROM clause, and as a result, virtual table VT1 is generated.
ON: The ON filter is applied to VT1. Only rows for which the is TRUE are inserted to VT2.
OUTER (join): If an OUTER JOIN is specified (as opposed to a CROSS JOIN or an INNER JOIN), rows from the preserved table or tables for which a match was not found are added to the rows from VT2 as outer rows, generating VT3. If more than two tables appear in the FROM clause, steps 1 through 3 are applied repeatedly between the result of the last join and the next table in the FROM clause until all tables are processed.
WHERE: The WHERE filter is applied to VT3. Only rows for which the is TRUE are inserted to VT4.
GROUP BY: The rows from VT4 are arranged in groups based on the column list specified in the GROUP BY clause. VT5 is generated.
CUBE | ROLLUP: Supergroups (groups of groups) are added to the rows from VT5, generating VT6.
HAVING: The HAVING filter is applied to VT6. Only groups for which the is TRUE are inserted to VT7.
SELECT: The SELECT list is processed, generating VT8.
DISTINCT: Duplicate rows are removed from VT8. VT9 is generated.
ORDER BY: The rows from VT9 are sorted according to the column list specified in the ORDER BY clause. A cursor is generated (VC10).
TOP: The specified number or percentage of rows is selected from the beginning of VC10. Table VT11 is generated and returned to the caller.

Therefore, (INNER JOIN) ON will filter the data (The data count of VT will be reduced here itself) before applying WHERE clause. The subsequent join conditions will be executed with filtered data which makes better performance. After that only WHERE condition will apply filter conditions.

(Applying conditional statements in ON / WHERE will not make much difference in few cases. This depends how many tables you have joined and number of rows available in each join tables)

2. join, left join, right join, outer join

有图有真像：

上例子：

people, property 这两个表，用来做示例：

　　1. inner join, join和使用,表达式是一样的

1 -- 内连接
2 SELECT * FROM people a, property b WHERE a.id = b.`people_id`;
3 SELECT * FROM people a INNER JOIN property b ON a.`id` = b.`people_id`;
4 SELECT * FROM people a JOIN property b ON a.`id` = b.`people_id`;

这三种写法在 mysql 和 oracle 中都支持。查询结果：

　　2. 左外连接

1 -- 内连接2 SELECT * FROM people a left outer JOIN property b ON a.`id` = b.`people_id`;
3 SELECT * FROM people a left JOIN property b ON a.`id` = b.`people_id`

上面写法，在 mysql 和 Oracle 都支持，Oracle的特殊支持：

1 -- oracle 特有写法
2 select * from people a , property b where a.id = b.people_id(+);
3 select * from people a join property b on a.id = b.people_id(+);

查询结果：

　　3. 右外连接

1 -- 右外连接 mysql oracle 都支持
2 SELECT * FROM people a RIGHT OUTER JOIN property b ON a.`id` = b.`people_id`;
3 SELECT * FROM people a RIGHT JOIN property b ON a.`id` = b.`people_id`;

oracle 中的特殊写法：

1 select * from people a , property b where a.id(+) = b.people_id;
2 select * from people a join property b on a.id(+) = b.people_id;

查询结果：

　　4. 全外连接

在MySQL中不支持全外连接，Oracle支持全连接

1 select * from people a full join property b on a.`id` = b.`people_id`;
2 select * from people a full outer join property b on a.id = b.people_id;

在上面的写法，我们也可以使用在 where 子句中 a.id = null 或 b.people_id = null 又可以添加一次过滤。

3.子查询（subquery）

　　3.1 子查询和内连接比较

　　子查询就是位于SELECT、UPDATE、或DELETE语句中内部的查询。例如：

　　当表关系是1对1的情况下，使用内连接和使用子查询，查出来的数据是一样的。但是对于表关系是1：n的情况下，使用子查询是有问题的。例如：

1 -- 子查询
2 SELECT * FROM people a WHERE a.id IN ( SELECT b.`people_id` FROM property b);
3 -- 内连接
4 SELECT * FROM people a INNER JOIN property b ON a.`id` = b.`people_id`;

子查询的结果：

内连接的结果：

连接查询和子查询的区别，如图看上面的查询结果可以看到，使用子查询只是查询出，a表中的数据，并且，数据不全。所以要慎用子查询。或者说子查询这样的用法就是有问题的。

exist,not exist一般都是与子查询一起使用. In可以与子查询一起使用,也可以直接in (a,b.....)

关于 exist ，in 的用法参考： http://www.cnblogs.com/highriver/archive/2011/05/30/2063461.html

注意使用 in ， exist 的目的是过滤数据。

4. having 和 where

having 是用在 group by 之后进行数据过滤，而where 是用在分组（group by）之前进行条件筛选的。

5.总结思考

1 select a.name, b.name, u.name
2 
3   from  a
4   left join b
5     on a.reportid = b.id
6   left join  u
7     on a.proposerid = u.id

上面的查询语句其实在执行时，是这样的 a 4 left join b on a.reportid = b.id 这个语句将生成一张临时表 c ,然后

在 c left join u on a.proposerid = u.id , 这样多个表的连接查询就可以理解了