A brief introduction to parallel query in PostgreSQL 9.6

OS: CentOS 6.8
DB: PostgreSQL 9.6.8

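PostgreSQL introduced parallel query in version 9.6. For queries that scan large tables, spreading the work across several worker processes can substantially improve the database's processing capacity.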

select version()

PostgreSQL 9.6.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit

select *
from pg_settings ps
where 1=1
and ps.name like '%process%'
;

select *
from pg_settings ps
where 1=1
and ps.name in (
'force_parallel_mode',
'max_worker_processes',
'max_parallel_workers_per_gather',
'min_parallel_relation_size',
'parallel_tuple_cost',
'parallel_setup_cost'
)
;

              name               | setting | unit |                category                |                                             short_desc                                             |                                   extra_desc                                   |  context   | vartype |       source       | min_val |   max_val    |     enumvals     | boot_val | reset_val |                  sourcefile                  | sourceline | pending_restart 
---------------------------------+---------+------+----------------------------------------+----------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------+------------+---------+--------------------+---------+--------------+------------------+----------+-----------+----------------------------------------------+------------+-----------------
 force_parallel_mode             | off     |      | Query Tuning / Other Planner Options   | Forces use of parallel query facilities.                                                           | If possible, run query using a parallel worker and with parallel restrictions. | user       | enum    | configuration file |         |              | {off,on,regress} | off      | on        | /var/lib/pgsql/9.6/data/postgresql.auto.conf |         25 | f
 max_parallel_workers_per_gather | 2       |      | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per executor node.                                   |                                                                                | user       | integer | configuration file | 0       | 1024         |                  | 0        | 2         | /var/lib/pgsql/9.6/data/postgresql.auto.conf |         26 | f
 max_worker_processes            | 8       |      | Resource Usage / Asynchronous Behavior | Maximum number of concurrent worker processes.                                                     |                                                                                | postmaster | integer | default            | 0       | 262143       |                  | 8        | 8         |                                              |            | f
 min_parallel_relation_size      | 1024    | 8kB  | Query Tuning / Planner Cost Constants  | Sets the minimum size of relations to be considered for parallel scan.                             |                                                                                | user       | integer | default            | 0       | 715827882    |                  | 1024     | 1024      |                                              |            | f
 parallel_setup_cost             | 1000    |      | Query Tuning / Planner Cost Constants  | Sets the planner's estimate of the cost of starting up worker processes for parallel query.        |                                                                                | user       | real    | default            | 0       | 1.79769e+308 |                  | 1000     | 1000      |                                              |            | f
 parallel_tuple_cost             | 0.1     |      | Query Tuning / Planner Cost Constants  | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend. |                                                                                | user       | real    | default            | 0       | 1.79769e+308 |                  | 0.1      | 0.1       |                                              |            | f
(6 rows)

force_parallel_mode
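Allows the use of parallel queries for testing purposes even in cases where no performance benefit is expected. The allowed values of force_parallel_mode are:
off (use parallel mode only when it is expected to improve performance),
on (force parallel query for all queries for which it is thought to be safe), and regress (like on, but with additional behavior changes as explained below).
More specifically, setting this value to on adds a Gather node to the top of any query plan for which this appears to be safe, so that the query runs inside a parallel worker.
Even when a parallel worker is not available or cannot be used, operations that would be prohibited in a parallel query context, such as starting a subtransaction, are still prohibited unless the planner believes that this will cause the query to fail.
If failures or unexpected results occur when this option is set, some functions used by the query may need to be marked PARALLEL UNSAFE (or, possibly, PARALLEL RESTRICTED).
Setting this value to regress has all of the same effects as setting it to on, plus some additional effects intended to facilitate automated regression testing.
Normally, messages from a parallel worker include a context line indicating that, but a setting of regress suppresses this line so that the output is the same as in non-parallel execution.
Also, the Gather nodes added to plans by this setting are hidden in EXPLAIN output, so that the output matches what would be obtained if this setting were off.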
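
The behavior described above can be tried in a psql session. This is a minimal sketch, assuming max_parallel_workers_per_gather must be greater than zero for the forced Gather to appear; my_func(integer) is a hypothetical function name used only for illustration and is left commented out.

```sql
-- Session-level settings only; nothing is written to postgresql.conf.
SET max_parallel_workers_per_gather = 2;
SET force_parallel_mode = on;

-- With the setting on, a parallel-safe query should show a Gather node
-- at the top of its plan, even for a table as small as pg_class.
EXPLAIN SELECT count(*) FROM pg_class;

-- If a query fails or misbehaves only when forced into a worker, a function
-- it calls may carry the wrong parallel-safety label.
-- my_func(integer) is a hypothetical function; uncomment and adapt as needed.
-- ALTER FUNCTION my_func(integer) PARALLEL UNSAFE;

-- Return to the configured values.
RESET force_parallel_mode;
RESET max_parallel_workers_per_gather;
```

Because force_parallel_mode is meant for testing, it seems safer to set it per session like this than to leave it on server-wide.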

max_worker_processes
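Sets the maximum number of background processes that the system can support. This parameter can only be set at server start. The default is 8.
When running a standby server, you must set this parameter to a value equal to or higher than the value on the master server. Otherwise, queries will not be allowed on the standby server.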
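
A short sketch of checking and changing this parameter. The value 16 is an arbitrary example, and ALTER SYSTEM is only one way to do it (editing postgresql.conf directly works as well).

```sql
-- Current value, plus the context column that tells when a change applies.
SELECT name, setting, context, pending_restart
FROM pg_settings
WHERE name = 'max_worker_processes';

-- Writes the new value to postgresql.auto.conf (requires superuser).
-- 16 is an arbitrary example value, not a recommendation.
ALTER SYSTEM SET max_worker_processes = 16;

-- context = 'postmaster' means a configuration reload is not enough:
-- the server has to be restarted before the new value takes effect.
```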

max_parallel_workers_per_gather
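Sets the maximum number of workers that a single Gather node can start.
Parallel workers are taken from the pool of processes established by max_worker_processes.
Note that the requested number of workers may not actually be available at run time.
If this occurs, the plan runs with fewer workers than expected, which may be less efficient.
Setting this value to 0 (the default in 9.6) disables parallel query execution, so it must be set to a value greater than zero for parallel query to be used at all. This is a special case of the more general principle that no more workers are used than the number configured via max_parallel_workers_per_gather.
Note that parallel queries may consume considerably more resources than non-parallel queries, because each worker process is a completely separate process that has roughly the same impact on the system as an additional user session.
This should be taken into account when choosing a value for this setting, and when configuring other settings that control resource utilization, such as work_mem. Resource limits such as work_mem are applied individually to each worker, which means the total utilization across all processes may be much higher than it would be for a single process.
For example, a parallel query using 4 workers may use up to 5 times as much CPU time, memory, and I/O bandwidth as a query that uses no workers at all.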
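
To see the effect of this setting, plans can be compared with parallel query disabled and enabled. This is a sketch only: t_parallel_demo is a hypothetical demo table created here for the purpose, and the plans you get depend on your data and cost settings.

```sql
-- Hypothetical demo table, a few hundred MB, so that it comfortably
-- exceeds min_parallel_relation_size.
CREATE TABLE IF NOT EXISTS t_parallel_demo AS
SELECT g AS id, md5(g::text) AS payload
FROM generate_series(1, 5000000) AS g;
ANALYZE t_parallel_demo;

-- 0 disables parallel query: the plan should contain no Gather node.
SET max_parallel_workers_per_gather = 0;
EXPLAIN SELECT count(*) FROM t_parallel_demo WHERE payload LIKE 'a%';

-- Allow up to 4 workers per Gather node. "Workers Planned" in the plan
-- is the number the planner asked for.
SET max_parallel_workers_per_gather = 4;
EXPLAIN SELECT count(*) FROM t_parallel_demo WHERE payload LIKE 'a%';

-- EXPLAIN ANALYZE actually runs the query and also reports
-- "Workers Launched", which can be lower than "Workers Planned" when the
-- max_worker_processes pool has no free slots.
EXPLAIN (ANALYZE) SELECT count(*) FROM t_parallel_demo WHERE payload LIKE 'a%';

RESET max_parallel_workers_per_gather;
```

When sizing work_mem, it may help to multiply the per-process limit by the number of workers plus one (the leader), in line with the 4-workers, 5-times example above.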

min_parallel_relation_size
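The minimum size a table must reach before the planner considers a parallel scan of it. The default is 1024 blocks of 8kB, that is, 8MB.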
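
A rough way to see which tables clear this threshold is to compare each table's heap size with the setting. The query below is a sketch; it converts the setting from blocks to bytes using block_size and checks size only, so a table flagged here can still end up with a non-parallel plan for cost reasons.

```sql
-- The ten largest ordinary tables and whether each one is at least as
-- large as min_parallel_relation_size.
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
       pg_relation_size(c.oid) >= s.threshold_bytes AS large_enough
FROM pg_class c
CROSS JOIN (
    -- The setting is stored in blocks; multiply by the block size in bytes.
    SELECT setting::bigint * current_setting('block_size')::bigint
           AS threshold_bytes
    FROM pg_settings
    WHERE name = 'min_parallel_relation_size'
) s
WHERE c.relkind = 'r'
ORDER BY pg_relation_size(c.oid) DESC
LIMIT 10;

-- The threshold can be lowered for the current session, for example to
-- experiment with parallel scans on smaller tables.
SET min_parallel_relation_size = '1MB';
RESET min_parallel_relation_size;
```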

parallel_tuple_cost
parallel_setup_cost
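These two parameters feed the planner's cost estimate for a parallel plan: parallel_setup_cost is the estimated cost of starting the worker processes, and parallel_tuple_cost is the estimated cost of passing each row from a worker to the master backend. If the parallel plan is estimated to cost more than the non-parallel plan, the planner does not use parallel execution.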
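
Roughly speaking, a Gather node adds parallel_setup_cost once, plus parallel_tuple_cost for every row it receives from the workers. The sketch below shows how changing the two values can tip the planner's choice. It assumes max_parallel_workers_per_gather can be raised in the session and uses a hypothetical demo table, t_parallel_demo, which is created only if it does not already exist.

```sql
-- Hypothetical demo table, large enough to be considered for parallel scan.
CREATE TABLE IF NOT EXISTS t_parallel_demo AS
SELECT g AS id, md5(g::text) AS payload
FROM generate_series(1, 5000000) AS g;
ANALYZE t_parallel_demo;

SET max_parallel_workers_per_gather = 2;

-- Baseline plan with the default cost settings (1000 and 0.1).
EXPLAIN SELECT * FROM t_parallel_demo WHERE id % 1000 = 0;

-- Making worker startup and row transfer free biases the planner toward
-- parallel plans. Useful for experiments, not for production.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
EXPLAIN SELECT * FROM t_parallel_demo WHERE id % 1000 = 0;

-- Making startup very expensive pushes the planner back to a
-- non-parallel plan.
SET parallel_setup_cost = 100000000;
EXPLAIN SELECT * FROM t_parallel_demo WHERE id % 1000 = 0;

RESET parallel_setup_cost;
RESET parallel_tuple_cost;
RESET max_parallel_workers_per_gather;
```

Comparing the top-level cost figures of the three plans shows how much of the estimate comes from these two parameters.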