Exploring term and match in Elasticsearch

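I. Create test data

1. Create an index (the ik_max_word analyzer used below requires the IK analysis plugin to be installed)

curl -X PUT "http://127.0.0.1:9200/student?pretty" -H "Content-Type: application/json" -d '{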
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "_source": {
            "enabled": true
        },
        "properties": {
            "id": {
                "type": "integer"
            },
            "name": {
                "type": "text"
            },
            "age": {
                "type": "integer"
            },
            "class": {
                "type": "text",
                "analyzer": "ik_max_word"
            },
            "introduce": {
                "type": "text",
                "analyzer": "ik_max_word"
            }
        }
    }
}'

2. Verify that the index was created

curl -XGET "http://127.0.0.1:9200/student?pretty"

3. Insert test data

curl -X PUT "http://127.0.0.1:9200/student/_doc/1?pretty" -H "Content-Type: application/json" -d '{
    "id":1,
    "name":"关云长",
    "age":30,
    "class":"蜀国一班"
}'

curl -X PUT "http://127.0.0.1:9200/student/_doc/2?pretty" -H "Content-Type: application/json" -d '{
    "id":2,
    "name":"吕蒙",
    "age":25,
    "class":"吴国一班"
}'

curl -X PUT "http://127.0.0.1:9200/student/_doc/3?pretty" -H "Content-Type: application/json" -d '{
    "id":3,
    "name":"吕布",
    "age":40,
    "class":"三姓一班"
}'

curl -X PUT "http://127.0.0.1:9200/student/_doc/4?pretty" -H "Content-Type: application/json" -d '{
    "id":4,
    "name":"张翼德",
    "age":30,
    "class":"蜀国二班"
}'

4. Query all documents to verify the data

curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '
{
    "query": {
        "match_all": {}
    }
}'

II. Verification

# Of the two queries below, the term query returns no results while the match query does. Why?
curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
    "query": {
           "term": {"name":"吕蒙"}
    }
}'


curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
    "query": {
           "match": {"name":"吕蒙"}
    }
}'

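Think of a search as matching A (the search text) against B (the stored field value). Either side can be analyzed, that is, split into terms. A term query does not analyze A; a match query analyzes A with the search analyzer of the target field. On the storage side, a keyword field does not analyze B, while a text field does.

As shown above, the term query finds nothing, while the match query finds two results: "吕蒙" and "吕布".

A term query looks up the search text without analyzing (splitting) it, so it searches for the exact term "吕蒙". The name field, however, is stored as text, and text fields are analyzed by default. The mapping specifies no analyzer for name, so the standard analyzer applies, and it splits Chinese text into single characters. Elasticsearch therefore indexes this name as the two terms "吕" and "蒙". Looking up the whole string "吕蒙" among the indexed terms "吕" and "蒙" cannot succeed.

A match query analyzes the search text before searching: "吕蒙" is split into "吕" and "蒙", and with the default OR operator a document matches if its name contains either term. The index holds "吕" and "蒙" for "吕蒙", and "吕" and "布" for "吕布", so the single term "吕" is enough to match both documents.

The key is to keep the two kinds of analysis apart: whether the search text A is analyzed depends on the query (term or match), and whether the stored value B is analyzed depends on the field type.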
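
The behaviour described above can be mimicked with a toy inverted index. This is a minimal sketch in plain Python, not Elasticsearch code: char_analyzer, build_index, term_query and match_query are hypothetical helpers, and splitting into single characters only approximates what the standard analyzer does to Chinese text.

```python
# Toy model of a text field: values are split into terms before indexing.
# Assumption: one term per character, approximating the standard analyzer on Chinese text.

def char_analyzer(text):
    """Split text into single-character terms."""
    return list(text)

def build_index(docs, analyzer):
    """Map each term to the set of document ids whose value produced it."""
    index = {}
    for doc_id, value in docs.items():
        for term in analyzer(value):
            index.setdefault(term, set()).add(doc_id)
    return index

def term_query(index, text):
    """Look up the search text as one unanalyzed term."""
    return sorted(index.get(text, set()))

def match_query(index, text, analyzer):
    """Analyze the search text, then return documents containing any resulting term (OR)."""
    hits = set()
    for term in analyzer(text):
        hits |= index.get(term, set())
    return sorted(hits)

names = {1: "关云长", 2: "吕蒙", 3: "吕布", 4: "张翼德"}
text_index = build_index(names, char_analyzer)

print(term_query(text_index, "吕蒙"))                  # → []
print(match_query(text_index, "吕蒙", char_analyzer))  # → [2, 3]
```

The term lookup fails because the index only contains single characters, while the match lookup succeeds because the search text goes through the same analyzer as the stored values.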

Since name is analyzed when it is stored, can it be stored without analysis instead? Yes: change the field type from text to keyword.

# Delete all documents (optional, since the index itself is deleted next)
curl -XPOST "http://127.0.0.1:9200/student/_delete_by_query?pretty" -v -H "Content-Type: application/json" -d '
{
    "query": {
        "match_all": {}
    }
}'

# Delete the index
curl -XDELETE "http://127.0.0.1:9200/student?pretty"

# Recreate the index with the type of the name field changed to keyword
curl -X PUT "http://127.0.0.1:9200/student?pretty" -H "Content-Type: application/json" -d '{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "_source": {
            "enabled": true
        },
        "properties": {
            "id": {
                "type": "integer"
            },
            "name": {
                "type": "keyword"
            },
            "age": {
                "type": "integer"
            },
            "class": {
                "type": "text",
                "analyzer": "ik_max_word"
            },
            "introduce": {
                "type": "text",
                "analyzer": "ik_max_word"
            }
        }
    }
}'

# Re-insert the four documents above
# (copy and run the PUT commands from step 3)

# The query below returns the student "吕蒙"
curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{
    "query": {
           "term": {"name":"吕蒙"}
    }
}'


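# The query below returns 0 results: a keyword field is not analyzed, so the stored terms are
# "吕蒙" and "吕布" as whole values, and the search text "吕" equals neither of them
curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{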
    "query": {
           "term": {"name":"吕"}
    }
}'

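# The query below returns only the student "吕蒙". A match query analyzes the search text with the
# analyzer of the target field, and for a keyword field that analyzer leaves the text whole. The
# search text therefore stays the single term "吕蒙", which equals the stored "吕蒙" but not "吕布"
curl -XGET "http://127.0.0.1:9200/student/_search?pretty" -H "Content-Type: application/json" -d '{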
    "query": {
           "match": {"name":"吕蒙"}
    }
}'
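
The keyword case can be sketched the same way. The snippet below is a toy model in plain Python, not Elasticsearch code; keyword_analyzer and search are hypothetical helpers. It assumes that on a keyword field both term and match end up comparing the whole search text with the whole stored value, so a single search function stands in for both.

```python
# Toy model of a keyword field: the whole value is indexed as a single term,
# and the search text is likewise left whole, for term and match alike.

def keyword_analyzer(text):
    """Return the text unchanged as its only term."""
    return [text]

def search(stored, query_text, analyzer):
    """Return ids of documents whose terms share at least one term with the analyzed query."""
    query_terms = set(analyzer(query_text))
    return sorted(
        doc_id
        for doc_id, value in stored.items()
        if query_terms & set(analyzer(value))
    )

names = {1: "关云长", 2: "吕蒙", 3: "吕布", 4: "张翼德"}

print(search(names, "吕蒙", keyword_analyzer))  # → [2]
print(search(names, "吕", keyword_analyzer))    # → []
```

Only an exact, whole-value match succeeds here, which is why "吕" alone finds nothing once name is a keyword field.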