Using Elasticsearch

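A good habit when writing Python: do not use names such as str or type for variables. They shadow the built-ins of the same name and easily cause errors: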

https://blog.csdn.net/lifelegendc/article/details/55051374
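The shadowing problem can be reproduced in a few lines. This is only a minimal sketch; the function names shadowing_demo and safe_demo are made up for illustration.

```python
def shadowing_demo():
    """Show what happens when a variable is named after the built-in str."""
    str = "hello"  # from here on, str is a string object inside this function
    try:
        return str(42)  # tries to call the string, not the built-in type
    except TypeError as exc:
        return f"TypeError: {exc}"


def safe_demo():
    """Same logic with a harmless variable name."""
    text = "hello"
    return text + str(42)  # str is still the built-in type here
```

Calling shadowing_demo() returns the text of a TypeError instead of a converted number, while safe_demo() works as expected.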

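Each running Elasticsearch instance is a node; several nodes can form a cluster.

Each node holds a number of indices (an index is roughly equivalent to a database in an RDBMS).

Creating an index

PUT /indexname

Listing all indices

GET /_cat/indices?v

Each index holds a number of documents. A document is roughly equivalent to one record, except that its structure is not fixed: any valid JSON object will do.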

http://www.ruanyifeng.com/blog/2017/08/elasticsearch.html

Write a script that sends GET/POST/PUT/DELETE requests to Elasticsearch (or use Postman, a Chrome extension, directly).

List indices:

GET host/_cat/indices?v

Create an index:

PUT host/indexname

On success: {"acknowledged":true,"shards_acknowledged":true}. Note that the key is acknowledged, not acknowledge; do not drop the final d.

Delete an index:

DELETE host/indexname

Get mappings:

GET host/_mapping

GET host/indexname/_mapping

GET host/indexname/typename/_mapping
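The index requests above can be collected in a small helper. The sketch below makes several assumptions: the host http://localhost:9200 (a local node on the default port), the index name accounts used in the usage note, and all function names are hypothetical. The typename URL applies only to Elasticsearch versions that still support mapping types. requests is a third-party package and is imported only when a request is actually sent.

```python
HOST = "http://localhost:9200"  # assumption: local node on the default port


def index_requests(index_name, type_name=None, host=HOST):
    """Return the (method, url) pairs for the index operations listed above."""
    ops = {
        "list": ("GET", f"{host}/_cat/indices?v"),
        "create": ("PUT", f"{host}/{index_name}"),
        "delete": ("DELETE", f"{host}/{index_name}"),
        "mapping": ("GET", f"{host}/{index_name}/_mapping"),
    }
    if type_name is not None:
        # Only meaningful on versions that still have mapping types.
        ops["type_mapping"] = ("GET", f"{host}/{index_name}/{type_name}/_mapping")
    return ops


def is_acknowledged(response_body):
    """Check the success flag; the key is 'acknowledged', with a final d."""
    return response_body.get("acknowledged") is True


def send(method, url, **kwargs):
    """Send one request with the third-party requests package."""
    import requests  # imported lazily so the helpers above work without it

    return requests.request(method, url, **kwargs)
```

With a node running, something like send(*index_requests("accounts")["create"]).json() could be passed to is_acknowledged to confirm that the index was created.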

Get documents:

GET host/indexname/typename/document_id (a single document)

GET host/indexname/typename/_search (all documents; by default only the first 10 hits are returned)
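A _search response wraps the stored documents in hits.hits, each with the original JSON under _source. The sketch below shows one way to build the two URLs and unpack such a response. The function names and the sample response are made up for illustration; a real response contains more fields.

```python
def document_url(host, index_name, type_name, doc_id):
    """URL of a single document."""
    return f"{host}/{index_name}/{type_name}/{doc_id}"


def search_url(host, index_name, type_name):
    """URL that searches all documents of the given index and type."""
    return f"{host}/{index_name}/{type_name}/_search"


def extract_sources(search_response):
    """Return the stored JSON (_source) of every hit in a _search response."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits]


# Illustrative, heavily shortened response body (not real output).
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"title": "222"}},
            {"_id": "2", "_source": {"title": "333", "desc": "other fields are allowed"}},
        ]
    }
}
```

The two sample hits have different fields on purpose: documents in the same index do not need to share a fixed structure.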

Create or update a document:

PUT host/indexname/typename/document_id

json=data

headers=headers

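Notes:

1. When sending JSON with the put function of the Python 3 requests package, pass the Python object (for example a dict) to the json parameter of requests.put.

(I first passed the JSON text as a str to the data parameter. That worked for English text, but Chinese text caused encoding errors that were hard to resolve.)

(I also tried passing the dict itself to the data parameter. requests then form-encodes it, and Elasticsearch rejected the request with an error saying it only accepts a byte stream, as the screenshot in the original notes showed.)

For the difference between the json and data parameters, see help(requests.put).

2. The Content-Type field in headers must be set to application/json.

requests.put(url,json=...,headers=...)

3. A PUT with only url, json and headers set does store the document, but Elasticsearch also reports an error. It has not affected use so far and can be ignored.

4. If Elasticsearch returns a failtoparse error, the path is usually wrong.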
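The difference between json= and data= in notes 1 and 2 can be seen without a server by building the request bodies by hand. The sketch below only approximates what requests does internally, using the standard library; the sample document and the function name put_document are hypothetical.

```python
import json
from urllib.parse import urlencode

doc = {"title": "中文标题", "desc": "hello"}  # hypothetical sample document

# Roughly what json=doc sends: JSON text encoded as UTF-8 bytes.
json_body = json.dumps(doc).encode("utf-8")

# Roughly what data=doc sends: a form-encoded string, which is not JSON.
form_body = urlencode(doc)

# If a pre-serialized string must go through data=, encode it explicitly
# so that non-ASCII characters are sent as UTF-8 bytes.
raw_utf8_body = json.dumps(doc, ensure_ascii=False).encode("utf-8")


def put_document(url, document):
    """PUT a document as JSON with the third-party requests package."""
    import requests  # imported lazily; the lines above need only the stdlib

    headers = {"Content-Type": "application/json"}
    return requests.put(url, json=document, headers=headers)
```

json_body and raw_utf8_body both decode back to the same document, while form_body looks like title=...&desc=hello and cannot be parsed as JSON, which is consistent with the parse errors described above.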

Delete a document:

DELETE host/indexname/typename/document_id

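Chinese support:

Install a Chinese-language plugin in Elasticsearch.

The request script can send Chinese text directly. When the Chinese text is read from a file, save the file in a known encoding and open it with the same encoding in the script:

# Read the query text from a UTF-8 encoded file.
with open("querystr.txt", encoding="utf-8") as file:
    string = file.read()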

Search an index:

POST host/indexname/typename/_search

json=data

headers=headers

Without headers, Elasticsearch reports a Content-Type header error.

Passing the dict to the data parameter instead of the json parameter produces a parse error.

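User-defined parts (marked in blue in the original notes) are names such as indexname and typename and the values being searched for.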

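Finding documents whose title field matches 222, in two ways:

Method 1:

POST host/indexname/typename/_search?pretty=true

{"query":{"match":{"title":"222"}}}

Method 2:

GET host/indexname/typename/_search?pretty=true&q=222

(q=222 is not restricted to title; q=title:222 limits the search to that field.)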
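Both methods can be generated from the same inputs. The following is a minimal sketch; the function names and the host, index and type values used in the usage note are hypothetical, and nothing is sent to a server.

```python
from urllib.parse import urlencode


def match_query(field, text):
    """Request body for method 1: a match query on one field."""
    return {"query": {"match": {field: text}}}


def uri_search_url(host, index_name, type_name, text, field=None):
    """URL for method 2: a query-string search, optionally limited to one field."""
    q = f"{field}:{text}" if field else text
    params = urlencode({"pretty": "true", "q": q})
    return f"{host}/{index_name}/{type_name}/_search?{params}"
```

match_query("title", "222") produces exactly the body shown in method 1, and uri_search_url("http://localhost:9200", "idx", "doc", "222") produces the URL of method 2. Passing field="title" yields a q=title:222 search, with the colon percent-encoded in the URL.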

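Default Elasticsearch search rules:

1. By default, only the values under the first-level keys of a document can be searched.

2. Tokenization: English text is split into words (in effect at spaces), so part of a word cannot be matched by default.

3. Tokenization: Chinese text is split into single characters, so part of a Chinese word can be matched.

4. Matching rules:

match: for English, the query and the document must share at least one identical word. For Chinese, they must share at least one identical character. Punctuation does not count: if the only thing in common is a punctuation mark (for example ，), the match fails. match is therefore suited to matching Chinese characters and English words.

match_phrase: for English, the document must contain the same phrase as the query. For Chinese, the document must contain the same characters in the same order. match_phrase is therefore suited to matching Chinese words or phrases and English phrases.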
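The rules above can be imitated with a toy model in plain Python. This is not Elasticsearch's analyzer or scoring, only a simplified sketch of the behavior described in these notes; the function names and the regular expression are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Toy tokenizer: English words stay whole, each Chinese character is
    its own token, and punctuation is dropped."""
    return re.findall(r"[A-Za-z0-9]+|[\u4e00-\u9fff]", text.lower())


def toy_match(query, doc):
    """Toy 'match': succeeds if query and document share at least one token."""
    return bool(set(tokenize(query)) & set(tokenize(doc)))


def toy_match_phrase(query, doc):
    """Toy 'match_phrase': the query tokens must appear in the document
    consecutively and in the same order."""
    q, d = tokenize(query), tokenize(doc)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))
```

In this model toy_match("qui", "the quick fox") fails because qui is only part of a word, toy_match("库", "数据库") succeeds because single characters are tokens, and toy_match_phrase("据数", "数据库") fails because the order differs even though toy_match on the same input succeeds.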