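Searching in Elasticsearch

Bulk-importing data

ES provides an API called bulk for performing batch operations.

Create a new file in the ES installation directory. The file name is arbitrary; here it is player.

The file contents are as follows: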

{"index":{"_index":"nba","_type":"_doc","_id":"1"}}
{"countryEn":"United States","teamName":"老鹰","birthDay":831182400000,"country":"美国","teamCityEn":"Atlanta","code":"jaylen_adams","displayAffiliation":"United States","displayName":"杰伦 亚当斯","schoolType":"College","teamConference":"东部","teamConferenceEn":"Eastern","weight":"86.2 公斤","teamCity":"亚特兰大","playYear":1,"jerseyNo":"10","teamNameEn":"Hawks","draft":2018,"displayNameEn":"Jaylen Adams","heightValue":1.88,"birthDayStr":"1996-05-04","position":"后卫","age":23,"playerId":"1629121"}
{"index":{"_index":"nba","_type":"_doc","_id":"2"}}
{"countryEn":"New Zealand","teamName":"雷霆","birthDay":743140800000,"country":"新西兰","teamCityEn":"Oklahoma City","code":"steven_adams","displayAffiliation":"Pittsburgh/New Zealand","displayName":"斯蒂文 亚当斯","schoolType":"College","teamConference":"西部","teamConferenceEn":"Western","weight":"120.2 公斤","teamCity":"俄克拉荷马城","playYear":6,"jerseyNo":"12","teamNameEn":"Thunder","draft":2013,"displayNameEn":"Steven Adams","heightValue":2.13,"birthDayStr":"1993-07-20","position":"中锋","age":26,"playerId":"203500"}

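Note: the file must end with a blank line, because the bulk API requires the last line to be terminated by a newline.

Run the following command to bulk-import the data in the file: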

curl -X POST "localhost:9200/_bulk" -H "Content-Type: application/json" --data-binary @player
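
The file format above (one action line followed by one source line per document, ending with a newline) can also be generated programmatically. The following is a minimal sketch, not part of the original workflow: the helper name build_bulk_body and the shortened player documents are assumptions made for illustration, and the _type field is left out of the action line on the assumption that a recent ES version is in use (add it back to match the file above exactly).

```python
import json

def build_bulk_body(index_name, docs):
    """Build an NDJSON body for the bulk API from (id, source) pairs.

    Hypothetical helper for illustration: each document becomes an
    action line followed by a source line.
    """
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index_name, "_id": str(doc_id)}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The body must end with a newline, which is why the file needs a blank last line.
    return "\n".join(lines) + "\n"

# Shortened sample documents (only a few of the fields from the player file).
players = [
    (1, {"displayNameEn": "Jaylen Adams", "teamNameEn": "Hawks", "age": 23}),
    (2, {"displayNameEn": "Steven Adams", "teamNameEn": "Thunder", "age": 26}),
]

body = build_bulk_body("nba", players)
print(body, end="")
```

Writing body to a file gives something that can be passed to the curl command with --data-binary.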

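ES term-level queries

Term-level queries: these queries are typically used on structured data such as number, date, and keyword fields, rather than on text fields. A full-text query analyzes (tokenizes) the query text before searching, whereas a term-level query skips analysis and looks up the exact term in the inverted index of the field. Term-level queries are therefore generally used on numeric, date, and keyword fields.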
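
To make the difference concrete, here is a toy model of an inverted index. It is only a sketch: the analyze function is a crude stand-in for a real analyzer (lowercase, split on whitespace), and the two sample names are made up for illustration.

```python
from collections import defaultdict

def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (an assumption, not ES's real analyzer)."""
    return text.lower().split()

def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index

names = {1: "LeBron James", 2: "James Harden"}
text_index = build_index(names, analyzed=True)      # behaves like a text field
keyword_index = build_index(names, analyzed=False)  # behaves like a keyword field

# A term-level lookup uses the query value as-is, without analysis.
print(text_index.get("james", set()))            # both documents
print(text_index.get("James", set()))            # empty: the indexed terms were lowercased
print(keyword_index.get("James Harden", set()))  # exact match on the whole value
```

This is why a term query for "James" against a text field can come back empty even though the name is in the document.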

Preparation

  • Delete the nba index
  • Create the nba index
  • POST:localhost:9200/nba/_mapping
    
    {
        "properties":{
            "birthDay":{
                "type":"date"
            },
            "birthDayStr":{
                "type":"keyword"
            },
            "age":{
                "type":"integer"
            },
            "code":{
                "type":"text"
            },
            "country":{
                "type":"text"
            },
            "countryEn":{
                "type":"text"
            },
            "displayAffiliation":{
                "type":"text"
            },
            "displayName":{
                "type":"text"
            },
            "displayNameEn":{
                "type":"text"
            },
            "draft":{
                "type":"long"
            },
            "heightValue":{
                "type":"float"
            },
            "jerseyNo":{
                "type":"text"
            },
            "playYear":{
                "type":"long"
            },
            "playerId":{
                "type":"keyword"
            },
            "position":{
                "type":"text"
            },
            "schoolType":{
                "type":"text"
            },
            "teamCity":{
                "type":"text"
            },
            "teamCityEn":{
                "type":"text"
            },
            "teamConference":{
                "type":"keyword"
            },
            "teamConferenceEn":{
                "type":"keyword"
            },
            "teamName":{
                "type":"keyword"
            },
            "teamNameEn":{
                "type":"keyword"
            },
            "weight":{
                "type":"text"
            }
        }
    }
  • Bulk-import the data (the player file)

Term query

Exact-match query (find players whose jersey number is 23)

POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "jerseyNo": "23"
        }
    }
}

Exists Query

Finds documents that have a non-null value in the given field (find players whose team name is not empty)

POST:localhost:9200/nba/_search
{
    "query": {
        "exists": {
            "field": "teamNameEn"
        }
    }
}

Prefix Query

Finds documents containing a term with the given prefix (find players whose team name starts with Rock)

POST:localhost:9200/nba/_search
{
    "query": {
        "prefix": {
            "teamNameEn": "Rock"
        }
    }
}

Wildcard Query

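Supports wildcard patterns: * matches any sequence of characters (including none) and ? matches any single character (find Rockets players)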

POST:localhost:9200/nba/_search
{
    "query": {
        "wildcard": {
            "teamNameEn": "Ro*s"
        }
    }
}
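
The two wildcard characters behave the same way in Python's fnmatch module, so the pattern can be tried out locally. This is only an illustration of the * and ? semantics: fnmatch also supports bracket sets that are not part of the wildcard query, and the list of names below is made up.

```python
from fnmatch import fnmatchcase

# Hypothetical team names used to try out the pattern.
teams = ["Rockets", "Raptors", "Rockies", "Ros"]

pattern = "Ro*s"
matches = [team for team in teams if fnmatchcase(team, pattern)]
print(matches)

# ? stands for exactly one character, so it cannot match an empty gap.
print(fnmatchcase("Rocks", "Ro??s"))
print(fnmatchcase("Ros", "Ro?s"))
```

Note that Ro*s is not specific to the Rockets: any term that starts with Ro and ends with s matches.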

Regexp Query

Regular expression query (find Rockets players)

POST:localhost:9200/nba/_search
{
    "query": {
        "regexp": {
            "teamNameEn": "Ro.*s"
        }
    }
}
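
One thing worth knowing about the regexp query is that the pattern has to match the whole term, not just part of it. The sketch below imitates that anchoring with re.fullmatch. Python's regex syntax is not the same as the Lucene syntax ES uses, so this only illustrates the anchoring; the helper name is an assumption.

```python
import re

def regexp_matches(term, pattern):
    """Imitate an anchored regexp match: the pattern must cover the entire term."""
    return re.fullmatch(pattern, term) is not None

print(regexp_matches("Rockets", "Ro.*s"))  # whole term is covered
print(regexp_matches("Rockets", "Ro"))     # only a prefix, so no match
```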

Ids Query

Query by id (find the players with ids 1 and 2)

POST:localhost:9200/nba/_search
{
    "query": {
        "ids": {
            "values": [1,2]
        }
    }
}

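Range queries in ES

Finds documents whose value in the given field (a date, number, or string) falls within the given range.

Find players who have played in the NBA for 2 to 10 years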
POST:localhost:9200/nba/_search
{
    "query": {
        "range": {
            "playYear": {
                "gte": 2,
                "lte": 10
            }
        }
    }
}


Find players born between 1980 and 1999
POST:localhost:9200/nba/_search
{
    "query": {
        "range": {
            "birthDay": {
                "gte": "01/01/1980",
                "lte": "31/12/1999",
                "format": "dd/MM/yyyy||yyyy"
            }
        }
    }
}
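
The birthDay values in the player file are stored as epoch milliseconds, which is easy to lose track of when writing date ranges. The sketch below converts the two sample values to calendar dates and applies a year range on the client side. It assumes UTC and uses helper names invented for illustration; it is a sanity check, not a replacement for the range query.

```python
from datetime import datetime, timezone

def millis_to_utc(millis):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

def born_between(millis, first_year, last_year):
    """Return True if the birth year lies in [first_year, last_year] (UTC assumed)."""
    return first_year <= millis_to_utc(millis).year <= last_year

jaylen_adams = 831182400000   # birthDay of document 1 in the player file
steven_adams = 743140800000   # birthDay of document 2 in the player file

print(millis_to_utc(jaylen_adams).date())
print(millis_to_utc(steven_adams).date())
print(born_between(jaylen_adams, 1980, 1999))
```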

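Boolean queries in ES

  • must: the clause must match, and it contributes to the score
  • filter: the clause must match, but it does not contribute to the score
  • must_not: the clause must not match
  • should: the clause should match; a match raises the score

must

Find players named James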

POST:localhost:9200/nba/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "displayNameEn": "james"
                    }
                }
            ]
        }
    }
}

filter

Same effect as must, but without scoring (find players named James)
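
No request body is given for filter, so here is a sketch of what it could look like: the must example with the clause moved under filter. The function name is an assumption; the printed JSON is what would be sent to POST:localhost:9200/nba/_search.

```python
import json

def james_filter_query():
    """Build a bool query that filters on displayNameEn without scoring (illustrative sketch)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"match": {"displayNameEn": "james"}}
                ]
            }
        }
    }

print(json.dumps(james_filter_query(), indent=4))
```

The same documents should come back as with must; the difference is that the filter clause does not take part in relevance scoring.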

must_not

Find players named James in the Western Conference (by excluding the Eastern Conference)

POST:localhost:9200/nba/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "displayNameEn": "james"
                    }
                }
            ],
            "must_not": [
                {
                    "term": {
                        "teamConferenceEn": {
                            "value": "Eastern"
                        }
                    }
                }
            ]
        }
    }
}

should

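When the bool query also has a must or filter clause, documents that do not match the should clause are still returned; a match only raises the score.

Find Western Conference players named James who should have played for 11 to 20 years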

POST:localhost:9200/nba/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "displayNameEn": "james"
                    }
                }
            ],
            "must_not": [
                {
                    "term": {
                        "teamConferenceEn": {
                            "value": "Eastern"
                        }
                    }
                }
            ],
            "should": [
                {
                    "range": {
                        "playYear": {
                            "gte": 11,
                            "lte": 20
                        }
                    }
                }
            ]
        }
    }
}

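With minimum_should_match=1, the should clause becomes mandatory: the query returns only Western Conference players named James who have played for 11 to 20 years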

POST:localhost:9200/nba/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "displayNameEn": "james"
                    }
                }
            ],
            "must_not": [
                {
                    "term": {
                        "teamConferenceEn": {
                            "value": "Eastern"
                        }
                    }
                }
            ],
            "should": [
                {
                    "range": {
                        "playYear": {
                            "gte": 11,
                            "lte": 20
                        }
                    }
                }
            ],
            "minimum_should_match": 1
        }
    }
}
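
The effect of minimum_should_match is easiest to see on a small in-memory model. The sketch below is a toy evaluator, not how ES works internally: clauses are plain Python predicates, scoring is ignored, and the three players are hypothetical.

```python
def bool_match(doc, must=(), must_not=(), should=(), minimum_should_match=0):
    """Toy model of a bool query; every clause is a predicate that takes a document dict."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    should_hits = sum(1 for clause in should if clause(doc))
    return should_hits >= minimum_should_match

def is_named_james(doc):
    return "james" in doc["displayNameEn"].lower().split()

def is_eastern(doc):
    return doc["teamConferenceEn"] == "Eastern"

def played_11_to_20_years(doc):
    return 11 <= doc["playYear"] <= 20

# Hypothetical players, for illustration only.
players = [
    {"displayNameEn": "James A", "teamConferenceEn": "Western", "playYear": 16},
    {"displayNameEn": "James B", "teamConferenceEn": "Western", "playYear": 3},
    {"displayNameEn": "James C", "teamConferenceEn": "Eastern", "playYear": 12},
]

clauses = dict(must=[is_named_james], must_not=[is_eastern], should=[played_11_to_20_years])

optional = [p["displayNameEn"] for p in players if bool_match(p, **clauses)]
required = [p["displayNameEn"] for p in players
            if bool_match(p, minimum_should_match=1, **clauses)]
print(optional)  # should is optional: both Western players named James
print(required)  # should is required: only the one with 11 to 20 years
```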

Sorting in ES

Rockets players sorted by years played, from most to fewest

POST:localhost:9200/nba/_search
{
    "query": {
        "match": {
            "teamNameEn": "Rockets"
        }
    },
    "sort": [
        {
            "playYear": {
                "order": "desc"
            }
        }
    ]
}

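Rockets players sorted by years played, from most to fewest; players with the same number of years are sorted by height, from tallest to shortest

POST:localhost:9200/nba/_search
{
    "query": {
        "match": {
            "teamNameEn": "Rockets"
        }
    },
    "sort": [
        {
            "playYear": {
                "order": "desc"
            }
        },
        {
            "heightValue": {
                "order": "desc"
            }
        }
    ]
}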
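
A multi-field sort applies the second field only to break ties in the first. The same ordering can be sketched in plain Python with a tuple key: years played descending, then height descending (tallest first). The three players are made up for illustration.

```python
# Hypothetical players: A and B have the same number of years played.
players = [
    {"name": "A", "playYear": 10, "heightValue": 1.96},
    {"name": "B", "playYear": 10, "heightValue": 2.06},
    {"name": "C", "playYear": 14, "heightValue": 1.83},
]

# Negating both numeric fields gives descending order on each.
ordered = sorted(players, key=lambda p: (-p["playYear"], -p["heightValue"]))
print([p["name"] for p in ordered])
```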

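ES aggregations: metric aggregations

What is aggregation analysis in ES

Aggregation analysis is an important database feature: it computes aggregate values over the data set returned by a query, such as the maximum, minimum, sum, or average of a field (or of a computed expression). As both a search engine and a data store, ES also provides powerful aggregation capabilities.

Computing metrics such as the maximum, minimum, sum, and average over a data set is called a metric aggregation in ES. Relational databases, besides aggregate functions, can also group the query results with group by and then compute metrics per group; in ES this grouping is called a bucket aggregation.

max/min/sum/avg

Compute the average age of Rockets players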
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "avgAge": {
            "avg": {
                "field": "age"
            }
        }
    },
    "size": 0
}

value_count

Counts the documents in which the field has a value

Count the Rockets players whose years played is not empty
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "countPlayerYear": {
            "value_count": {
                "field": "playYear"
            }
        }
    },
    "size": 0
}

Find how many players the Rockets have (the count is reported as hits.total in the response)
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    }
}

Cardinality

Counts distinct values (the result is approximate)

Find the number of distinct ages among Rockets players
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "countAge": {
            "cardinality": {
                "field": "age"
            }
        }
    },
    "size": 0
}
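
The three counts seen so far (total documents, value_count, cardinality) are easy to mix up. The sketch below computes each of them over a small in-memory list. The data is hypothetical, and the distinct count here is exact, whereas the cardinality aggregation is an approximation.

```python
# Hypothetical Rockets players; one of them has no playYear value.
players = [
    {"age": 23, "playYear": 1},
    {"age": 26, "playYear": 6},
    {"age": 26, "playYear": None},
    {"age": 30, "playYear": 6},
]

doc_count = len(players)                                             # like hits.total
value_count = sum(1 for p in players if p["playYear"] is not None)  # like value_count on playYear
cardinality = len({p["age"] for p in players})                      # like cardinality on age

print(doc_count, value_count, cardinality)
```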

stats

Computes five values: count, max, min, avg, and sum

Get the age stats of Rockets players
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "statsAge": {
            "stats": {
                "field": "age"
            }
        }
    },
    "size": 0
}

Extended stats

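Returns four more statistics than stats: sum of squares, variance, standard deviation, and the interval of the mean plus or minus two standard deviations

Get the age extended stats of Rockets players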
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "extendStatsAge": {
            "extended_stats": {
                "field": "age"
            }
        }
    },
    "size": 0
}
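
The extra statistics follow directly from the basic ones. The sketch below computes them for a small list of ages. It assumes population variance (dividing by the count), and the key names in the returned dict are chosen for readability rather than copied from a real response, so compare against actual ES output before relying on them.

```python
import math

def extended_stats(values):
    """Compute stats plus sum of squares, variance, std deviation and +/- 2 sigma bounds."""
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    # Population variance (an assumption); clamp tiny negative rounding errors to zero.
    variance = max(0.0, sum_of_squares / count - avg * avg)
    std_deviation = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std_deviation,
        "bounds": {
            "lower": avg - 2 * std_deviation,
            "upper": avg + 2 * std_deviation,
        },
    }

ages = [22, 24, 26, 28, 30]  # hypothetical ages
stats = extended_stats(ages)
print(stats)
```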

Percentiles

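Computes the values at given percentiles; by default it returns the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles

Find the age percentiles of Rockets players
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "percentAge": {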
            "percentiles": {
                "field": "age"
            }
        }
    },
    "size": 0
}

Find the age percentiles of Rockets players (with specific percentiles)
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "percentAge": {
            "percentiles": {
                "field": "age",
                "percents": [
                    20,
                    50,
                    75
                ]
            }
        }
    },
    "size": 0
}
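
To get a feel for what a percentile value means, the sketch below computes percentiles exactly with linear interpolation between neighbouring sorted values. This is one common definition picked for illustration; ES computes percentiles with an approximation algorithm, so its numbers can differ, especially on large data sets. The ages are hypothetical.

```python
def percentile(values, percent):
    """Exact percentile using linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (percent / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

ages = [20, 22, 24, 26, 28]  # hypothetical ages
for p in (20, 50, 75):
    print(p, percentile(ages, p))
```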

ES aggregations: bucket aggregations

Terms Aggregation

Groups documents by the distinct values of a field

Group Rockets players by age
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "aggsAge": {
            "terms": {
                "field": "age",
                "size": 10
            }
        }
    },
    "size": 0
}

order

Sorting the buckets

Group Rockets players by age, with buckets sorted by age from oldest to youngest (by bucket key)
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "aggsAge": {
            "terms": {
                "field": "age",
                "size": 10,
                "order": {
                    "_key": "desc"
                }
            }
        }
    },
    "size": 0
}


Group Rockets players by age, with buckets sorted by document count from largest to smallest (by document count)
POST:localhost:9200/nba/_search
{
    "query": {
        "term": {
            "teamNameEn": {
                "value": "Rockets"
            }
        }
    },
    "aggs": {
        "aggsAge": {
            "terms": {
                "field": "age",
                "size": 10,
                "order": {
                    "_count": "desc"
                }
            }
        }
    },
    "size": 0
}

Group by team and sort the teams by the average age of their players (by a sub-aggregation metric)
POST:localhost:9200/nba/_search
{
    "aggs": {
        "aggsTeamName": {
            "terms": {
                "field": "teamNameEn",
                "size": 30,
                "order": {
                    "avgAge": "desc"
                }
            },
            "aggs": {
                "avgAge": {
                    "avg": {
                        "field": "age"
                    }
                }
            }
        }
    },
    "size": 0
}
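
Ordering buckets by a sub-aggregation corresponds to a group by followed by a sort on the per-group metric. The sketch below does the same thing in memory. The player data is hypothetical, and the bucket dict only loosely imitates the shape of a terms bucket.

```python
from collections import defaultdict

def avg_age_by_team(players):
    """Group players by team, compute the average age per team, sort by it descending."""
    ages_by_team = defaultdict(list)
    for player in players:
        ages_by_team[player["teamNameEn"]].append(player["age"])
    buckets = [
        {"key": team, "doc_count": len(ages), "avgAge": sum(ages) / len(ages)}
        for team, ages in ages_by_team.items()
    ]
    buckets.sort(key=lambda bucket: bucket["avgAge"], reverse=True)
    return buckets

# Hypothetical players, for illustration only.
players = [
    {"teamNameEn": "Rockets", "age": 30},
    {"teamNameEn": "Rockets", "age": 28},
    {"teamNameEn": "Lakers", "age": 34},
    {"teamNameEn": "Lakers", "age": 24},
    {"teamNameEn": "Lakers", "age": 26},
    {"teamNameEn": "Hawks", "age": 23},
]

buckets = avg_age_by_team(players)
for bucket in buckets:
    print(bucket)
```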

Filtering the buckets

Group the Lakers and the Rockets by team and sort by average age (using explicit value lists)
POST:localhost:9200/nba/_search
{
    "aggs": {
        "aggsTeamName": {
            "terms": {
                "field": "teamNameEn",
                "include": [
                    "Lakers",
                    "Rockets",
                    "Warriors"
                ],
                "exclude": [
                    "Warriors"
                ],
                "size": 30,
                "order": {
                    "avgAge": "desc"
                }
            },
            "aggs": {
                "avgAge": {
                    "avg": {
                        "field": "age"
                    }
                }
            }
        }
    },
    "size": 0
}

Group the Lakers and the Rockets by team and sort by average age (using regular expressions)
POST:localhost:9200/nba/_search
{
    "aggs": {
        "aggsTeamName": {
            "terms": {
                "field": "teamNameEn",
                "include": "Lakers|Ro.*|Warriors.*",
                "exclude": "Warriors",
                "size": 30,
                "order": {
                    "avgAge": "desc"
                }
            },
            "aggs": {
                "avgAge": {
                    "avg": {
                        "field": "age"
                    }
                }
            }
        }
    },
    "size": 0
}

Range Aggregation

Groups documents into ranges

Group NBA players by age: under 20, 20 to 35, and 35 or older
POST:localhost:9200/nba/_search
{
    "aggs": {
        "ageRange": {
            "range": {
                "field": "age",
                "ranges": [
                    {
                        "to": 20
                    },
                    {
                        "from": 20,
                        "to": 35
                    },
                    {
                        "from": 35
                    }
                ]
            }
        }
    },
    "size": 0
}

Group NBA players by age: under 20, 20 to 35, and 35 or older (with custom bucket keys)
POST:localhost:9200/nba/_search
{
    "aggs": {
        "ageRange": {
            "range": {
                "field": "age",
                "ranges": [
                    {
                        "to": 20,
                        "key": "A"
                    },
                    {
                        "from": 20,
                        "to": 35,
                        "key": "B"
                    },
                    {
                        "from": 35,
                        "key": "C"
                    }
                ]
            }
        }
    },
    "size": 0
}
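
A detail that matters at the boundaries: in a range aggregation the from value is included in the bucket and the to value is excluded, so a 20-year-old lands in B and a 35-year-old in C. The sketch below assigns ages to the same three keyed ranges under that rule; the helper name is an assumption.

```python
def range_bucket(age, ranges):
    """Return the key of the first range containing age (from inclusive, to exclusive)."""
    for r in ranges:
        low = r.get("from", float("-inf"))
        high = r.get("to", float("inf"))
        if low <= age < high:
            return r["key"]
    return None

# Same ranges as in the request above.
ranges = [
    {"to": 20, "key": "A"},
    {"from": 20, "to": 35, "key": "B"},
    {"from": 35, "key": "C"},
]

for age in (19, 20, 34, 35):
    print(age, range_bucket(age, ranges))
```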

Date Range Aggregation

Groups documents into date ranges

Group NBA players by month and year of birth
POST:localhost:9200/nba/_search
{
    "aggs": {
        "birthDayRange": {
            "date_range": {
                "field": "birthDay",
                "format": "MM-yyyy",
                "ranges": [
                    {
                        "to": "01-1989"
                    },
                    {
                        "from": "01-1989",
                        "to": "01-1999"
                    },
                    {
                        "from": "01-1999",
                        "to": "01-2009"
                    },
                    {
                        "from": "01-2009"
                    }
                ]
            }
        }
    },
    "size": 0
}

Date Histogram Aggregation

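Date histogram aggregation: groups documents by day, month, year, and so on. Supported intervals are year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), minute (1m), and second (1s). From ES 7.2 onward, the interval parameter used below is deprecated in favor of calendar_interval and fixed_interval.

Group NBA players by year of birth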
POST:localhost:9200/nba/_search
{
    "aggs": {
        "birthday_aggs": {
            "date_histogram": {
                "field": "birthDay",
                "format": "yyyy",
                "interval": "year"
            }
        }
    },
    "size": 0
}
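
A yearly date histogram amounts to truncating each timestamp to its year and counting. The sketch below does that for epoch-millisecond birthdays. It assumes UTC, and unlike a real date histogram it lists only the years that actually have players rather than filling in the empty years in between.

```python
from collections import Counter
from datetime import datetime, timezone

def birth_year_histogram(birthdays_millis):
    """Count birthdays per calendar year (UTC assumed), sorted by year."""
    years = Counter(
        datetime.fromtimestamp(millis / 1000, tz=timezone.utc).year
        for millis in birthdays_millis
    )
    return dict(sorted(years.items()))

# The two birthDay values from the player file, one of them repeated for illustration.
birthdays = [831182400000, 743140800000, 831182400000]
histogram = birth_year_histogram(birthdays)
print(histogram)
```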

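The query_string query in ES

With the query_string query, anyone familiar with Lucene query syntax can write the query directly as a Lucene query string. When ES receives the request, its query parser parses the string and builds the corresponding query.

Querying a single field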

POST:localhost:9200/nba/_search
{
    "query": {
        "query_string": {
            "default_field": "displayNameEn",
            "query": "james OR curry"
        }
    },
    "size": 100
}

POST:localhost:9200/nba/_search
{
    "query": {
        "query_string": {
            "default_field": "displayNameEn",
            "query": "james AND harden"
        }
    },
    "size": 100
}

Querying multiple fields

POST:localhost:9200/nba/_search
{
    "query": {
        "query_string": {
            "fields": [
                "displayNameEn",
                "teamNameEn"
            ],
            "query": "James AND Rockets"
        }
    },
    "size": 100
}
Original article: https://www.cnblogs.com/jwen1994/p/12639827.html