postgresql 10 ltree 使用说明

官方文档 <a href='https://www.postgresql.org/docs/10/static/ltree.html' target=_blank'>https://www.postgresql.org/docs/10/static/ltree.html

ltree是俄罗斯Teodor Sigaev和Oleg Bartunov共同开发的PostgreSQL contrib扩展模块，它包含数据类型的实现、为树状结构组织的数据提供索引和查询。 http://www.sai.msu.su/~megera/postgres/gist/

关于GIST索引的一些介绍 https://yq.aliyun.com/articles/68244

一、安装ltree

CREATE EXTENSION IF NOT EXISTS ltree;

检查是否安装成功

select * from pg_extension where extname = 'ltree';

二、ltree的定义

标签是由一组字符数据（A-Za-z0-9_）组成的，每个标签最大256字节
标签的路径是由0个或多个点号分割(只能是.号)，如 a.b.c.d
标签的总长度必须小于65kb，但最好维持在2kb以下

ltree提供两种数据类型

ltree：存储标签路径
lquery：表示用于匹配ltree值的类似正则表达式的模式。一个简单的单词与路径中的标签匹配。星号（*）匹配零个或多个标签。例如：

foo 匹配确切的标签路径foo
*.foo.* 匹配包含标签foo的任何标签路径
*.foo 匹配最后一个标签为foo的任何标签路径

*标识可以量化，类似正则表达式

* {n}完全匹配n个标签
* {n，}匹配至少n个标签
* {n，m}匹配至少n个但不超过m个标签
* {，m}匹配最多m个标签 - 与* {0，m}相同

有几个修饰符可以放在lquery中非星形标签的末尾，以使其匹配的不仅仅是完全匹配：

@ 匹配大小写不敏感，例如@匹配A.
* 匹配任何带有此前缀的标签，例如foo *匹配foobar
％匹配初始下划线分隔的单词

三、创建表&测试数据

CREATE TABLE test (path ltree);

INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');

四、安装GIST和BTREE索引

CREATE INDEX path_gist_idx ON test USING gist(path);
CREATE INDEX path_idx ON test USING btree(path);

五、操作符

操作符	返回类型	描述
tree @> ltree	boolean	左参数是右边的祖先（或平等）？
tree <@ ltree	boolean	左参数是右（或相等）的后代？
tree ~ lquery	boolean	ltree是否匹配lquery？
query ~ ltree	boolean	ltree是否匹配lquery？
tree ? lquery[]	boolean	ltree是否匹配数组中的任何lquery？
query[] ? ltree	boolean	ltree是否匹配数组中的任何lquery？
tree @ ltxtquery	boolean	ltree是否匹配ltxtquery？
txtquery @ ltree	boolean	ltree是否匹配ltxtquery？
tree		ltree
tree		text
ext		ltree
tree[] @> ltree	boolean	数组是否包含ltree的祖先？
tree <@ ltree[]	boolean	数组是否包含ltree的祖先？
tree[] <@ ltree	boolean	数组是否包含ltree的后代？
tree @> ltree[]	boolean	数组是否包含ltree的后代？
tree[] ~ lquery	boolean	数组包含任何匹配lquery的路径？
query ~ ltree[]	boolean	数组包含任何匹配lquery的路径？
tree[] ? lquery[]	boolean	ltree数组是否包含任何匹配任何lquery的路径？
query[] ? ltree[]	boolean	ltree数组是否包含任何匹配任何lquery的路径？
tree[] @ ltxtquery	boolean	数组包含任何匹配ltxtquery的路径？
txtquery @ ltree[]	boolean	数组包含任何匹配ltxtquery的路径？
tree[] ?@> ltree	ltree	第一个数组条目，是ltree的祖先;如果没有则为NULL
tree[] ?<@ ltree	ltree	第一个数组条目，是ltree的后代;如果没有则为NULL
tree[] ?~ lquery	ltree	第一个匹配lquery的数组条目;如果没有则为NULL
tree[] ?@ ltxtquery	ltree	第一个匹配ltxtquery的数组条目;如果没有则为NULL

六、ltree函数库及示例

执行结果部分全部都亲测过，可以参考下。

函数名	返回类型	描述	示例	执行结果
subltree(ltree, int start, int end)	ltree	ltree寻找子路径，下标从start到end的（从0开始计数）,下标越界会出错	select subltree('a.b.c.d', -1,0); select subltree('a.b.c.d', 0,1); select subltree('a.b.c.d', 2,3); select subltree('a.b.c.d', 4,5); select subltree('a.b.c.d', 2,1)	invalid positions a c invalid positions invalid positions
subpath(ltree, int offset, int len)	ltree	ltree根据偏移量offset和len寻找子路径，如果offset为负则从路径末端开始。如果len为负数，则将许多标签留在路径的末尾， ps:len为负数的最好少用,有点奇怪	select subpath('a.b.c.d', 0,2); select subpath('a.b.c.d', 0,9); select subpath('a.b.c.d', 1,2); select subpath('a.b.c.d', 4,1); select subpath('a.b.c.d', -1,1); select subpath('a.b.c.d', -1,2); select subpath('a.b.c.d', -2,1); select subpath('a.b.c.d', -2,2); select subpath('a.b.c.d', -2,9); select subpath('a.b.c.d', -3,2);	a,b a,b,c,d b,c invalid positions d d c c,d c,d b,c
subpath(ltree, int offset)	ltree	ltree根据偏移量offset寻的子路径，延伸到路径的末尾，如果offset为负，则子路径从路径末端开始。	select subpath('a.b.c.d', 0); select subpath('a.b.c.d', 1); select subpath('a.b.c.d', 2); select subpath('a.b.c.d', 3); select subpath('a.b.c.d', 4); select subpath('a.b.c.d', -1); select subpath('a.b.c.d', -2); select subpath('a.b.c.d', -3); select subpath('a.b.c.d', -4); select subpath('a.b.c.d', -5);	a,b,c,d b,c,d c,d d invalid positions d c,d b,c,d a,b,c,d d
nlevel(ltree)	integer	计算路径中标签数量	select nlevel(''); select nlevel('a'); select nlevel('a.b'); select nlevel('a.b.c.d');	0 1 2 4
index(ltree a, ltree b)	integer	计算第一次在a中出现b的位置，没找到返回-1	select index('a.b.c.d','c.d'); select index('a.b.c.d','a'); select index('a.b.c.d','a,b'); select index('a.b.c.d','e');	2 0 0 -1
index(ltree a, ltree b, int offset)	integer	计算在a中第一次出现b的位置，从偏移处开始搜索; 负偏移意味着从路径末端开始	select index('a.b.c.d.e','c.d',1); select index('a.b.c.d.e','c.d',2); select index('a.b.c.d.e','c.d',3); select index('a.b.c.d.e','c.d',-1); select index('a.b.c.d.e','c.d',-2); select index('a.b.c.d.e','c.d',-3);	2 2 -1 -1 -1 2
text2ltree(text)	ltree	text 转 ltree
ltree2text(ltree)	text	ltree 转 text
lca(ltree, ltree, ...)	ltree	路径最长的公共前缀部分（最多支持8个ltree参数）	select lca('a.b.c.d','a.b.c'); select lca('a.b.c.d','a.b.c','a.b'); select lca('a.b.c.d','c.d');	a.b a ''
lca(ltree[])	ltree	路径最长的公共前缀部分	lca(array['1.2.2.3'::ltree,'1.2.3'])	1.2

七、查询实例

1.查找Top.Science开始的path

SELECT path FROM test WHERE path <@ 'Top.Science';
                path
------------------------------------
 Top.Science
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(4 rows)

2.查找包含Astronomy的path

SELECT path FROM test WHERE path ~ '*.Astronomy.*';
                     path
-----------------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Collections.Pictures.Astronomy
 Top.Collections.Pictures.Astronomy.Stars
 Top.Collections.Pictures.Astronomy.Galaxies
 Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)

3.查找包含Astronomy且前面节点不含pictures的path

SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

4.查找第3个节点是'Astronomy'的，
注意此处最好加上nlevel(path) >2，因为可能有数据没有这么多节点
且where条件nlevel(path) >2要放在前面，postgresql的where条件是从前往后执行的

select  path from test where  nlevel(path) >2 and subpath(path,2,1) ='Astronomy';
                path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)

错误示范：

select  path from test where subpath(path,2,1) ='Science' and nlevel(path) >1;
select  path from test where subpath(path,2,1) ='Science';
------------------------------------
ERROR:  invalid positions
SQL state: 22023

5.查到第二个节点是Science，注意看nlevel的变量

select  path from test where nlevel(path) >1 and subpath(path,1,1) ='Science' ;
------------------------------------
Top.Science
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(4 rows)

select  path from test where nlevel(path) >2 and subpath(path,1,1) ='Science' ;
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)

基本的使用案例就到此，其他的自己可以多去试试。ltree还是非常适合省市县级联这种关系