es-03: Basic usage of the DSL

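The following operations are run in Kibana. From a Linux shell, use the form

curl -XGET 'http://node1:9200/index/type/id' -H 'Content-Type: application/json' -d '{ ... }'

instead, where -d passes the request body.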
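The same kind of request can also be assembled from a script. The following is a minimal sketch using only the Python standard library; build_request is a hypothetical helper, the host node1:9200 is taken from the curl line above, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request

def build_request(method, path, body=None, host="http://node1:9200"):
    """Build (but do not send) an HTTP request for an Elasticsearch endpoint."""
    # The body is serialized to JSON, like the argument of curl's -d option.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(
        f"{host}/{path.lstrip('/')}",
        data=data,
        headers={"Content-Type": "application/json"},
        method=method.upper(),
    )

req = build_request("get", "/bank/_search", {"query": {"match_all": {}}})
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen should send it, assuming a cluster is reachable at that address.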

1, Getting cluster status

1), Check cluster health:

GET /_cat/health?v

2), List the nodes:

GET /_cat/nodes?v

2, Index operations (an index is similar to a database)

1, Index operations

1), Create an index

PUT lag
{
    "settings": {
        "index": {
            "number_of_shards": 5,
            "number_of_replicas": 1
        }
    }
}

2), Modify settings

The number of primary shards cannot be changed after the index is created; the number of replicas can.

PUT lag/_settings
{
    "number_of_replicas": 2
}

3), Get all indices

GET _all

Get index settings

GET lag/_settings
GET _all/_settings
GET .kibana,lagou/_settings
GET _settings

4), List all indices

GET /_cat/indices?v

5), Create a document

put customer/_doc/1?pretty
{
  "name": "vini"
}

6), Query the document

get customer/_doc/1?pretty

7), Delete the index

delete customer?pretty

GET /_cat/indices?v

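2, Document operations (a document is similar to a record)

1), Save a document

The path has the form index/type/id. If no id is given, one is generated automatically; in that case use POST instead of PUT.

PUT lag/job/1
{
    "title": "python crawler",
    "salary": 15000,
    "city": "bj",
    "company": {
        "name": "Baidu",
        "company_addr": "bj"
    },
    "publish_date": "2018"
}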

2), Get a document

GET lag/job/1

Alternatively, use a search request; see the query section below.

3), Modify data

PUT /customer/_doc/1?pretty
{
  "name": "wenbronk"
}

This replaces the whole document, so the original name is changed.

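4), Update with POST, modifying only some fields

This only works for an id that already exists. The update is incremental: fields that are missing are added, and fields that already exist are overwritten.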

POST customer/_doc/1/_update?pretty
{
  "doc": {
    "name": "vini",
    "age": 28
  }
}
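The incremental behaviour of a partial update can be modelled in plain Python. This is only a sketch of the merge semantics (new fields added, existing fields overwritten, nested objects merged recursively), not how Elasticsearch implements it; merge_doc is a hypothetical helper and the starting document is assumed.

```python
def merge_doc(source, partial):
    """Return a copy of source with the fields of partial merged into it."""
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_doc(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = value
    return merged

existing = {"name": "wenbronk"}  # assumed state of the document before the update
updated = merge_doc(existing, {"name": "vini", "age": 28})
print(updated)
```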

5), Run a simple script computation

post customer/_doc/1/_update?pretty
{
  "script": "ctx._source.age += 5"
  
}

6), Delete a document

DELETE /customer/_doc/1?pretty

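3, Batch processing

Several operations, such as index, delete, and update, can be combined in one request. Data can also be imported from one index into another.

1), Bulk insert

Each operation consists of two lines, except delete: the first line is the metadata line and the second is the data line. update is a special case: its data line may contain upsert, doc, or script.

The metadata must be on a single line!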

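POST /customer/_doc/_bulk?pretty
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }

In each pair, the first line is the metadata line, which says which action to run and on which document. The second line is the data line; it must also stay on a single line and cannot be pretty-printed.

If the index or type is not given in the URL, it must be specified in the metadata line.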

2), Update the first document and delete the second

A delete operation has only one line: the metadata line, with no data line.

POST /customer/_doc/_bulk?pretty
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}

A failure in one operation does not affect the others; the remaining operations are still executed.
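The line layout of a bulk body is easy to get wrong by hand, so it can help to generate it. The sketch below builds the newline-delimited body with the Python standard library; build_bulk_body is a hypothetical helper, and the document with id 3 is made up for illustration. It only produces the text and does not contact a cluster.

```python
import json

def build_bulk_body(actions):
    """Build a newline-delimited bulk body from (action, metadata, data) tuples."""
    lines = []
    for action, meta, data in actions:
        # Metadata line: json.dumps never emits line breaks by default.
        lines.append(json.dumps({action: meta}))
        if data is not None:
            # Data line; delete operations pass None and get no data line.
            lines.append(json.dumps(data))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("update", {"_id": "1"}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ("delete", {"_id": "3"}, None),
])
print(body, end="")
```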

3), Bulk update

POST _bulk?pretty
{"update": {"_index": "lag", "_type": "job", "_id": "1"}}
{"doc": {"field": "value"}}

4), Bulk get

GET _mget
{
    "docs": [
        {
            "_index": "tested",
            "_type": "job",
            "_id": 1
        },
        {
            "_index": "lag",
            "_type": "job2",
            "_id": 2
        }
    ]
}

Or, when all documents are in the same index or the same type, put it in the URL:

GET lagou/job1/_mget
{
    "docs": [
        {"_id": 1},
        {"_id": 2}
    ]
}

Or, in shorthand:

GET lagou/job1/_mget
{
    "ids": [1, 2]
}

4, Queries

Basic queries, compound queries, and filter queries

1), Import the sample data

https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json

curl -H "Content-Type: application/json" -XPOST "10.110.122.172:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"

GET /_cat/indices?v

2), Query with the q parameter

GET /bank/_search?q=*&sort=account_number:asc&pretty

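Fetch only some of the fields

GET lag/job/1?_source=title,salary

3) Query with a request body

from: the offset to start from; size: how many documents to return; sort: the sort order

Use a wildcard query for * wildcard matching

4), match: analyzed matching, partial matching

a. match_all matches all documents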

get /bank/_search
{
  "query": {"match_all": {}}, 
  "from": 10,
  "size": 10,
  "sort": [
    {"account_number": "asc"}
    ]
}
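The from and size values in the request above select one page of results. As a small sketch, a page number can be translated into these two parameters as follows; page_params is a hypothetical helper and pages are assumed to be numbered from 1.

```python
def page_params(page, page_size=10):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}

# Page 2 with 10 hits per page gives the same from/size as the request above.
body = {
    "query": {"match_all": {}},
    "sort": [{"account_number": "asc"}],
    **page_params(2),
}
print(body)
```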

_source: selects which fields are returned

get bank/_search
{
  "query": {"match": {
    "age": 37
  }}, 
  "_source": [
    "account_number", "age", "address"
    ]
  
}

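5), match_phrase: phrase matching

The query string is analyzed into terms, and a document is returned only if it contains all of the terms as a phrase.

term: exact match, the query is not analyzed

slop sets how far apart the terms may be: a document matches only if the distance between its terms does not exceed slop.

GET bank/_search
{
  "query": {"match_phrase": {
    "address": {
      "query": "mill lane",
      "slop": 6
    }
  }},
  "_source": [
    "account_number", "age", "address"
    ]
}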

6) term query, exact match

If the field's type is not explicitly defined in the mapping, the query may fail to match.

get /ban/_search
{
    "query" : {
        "term" : {
            "abc": "1234"
        }
    }
}

terms query

Several values can be passed in; a document matches if any one of them matches.

GET /ban/_search
{
    "query" : {
        "terms" : {
            "abc": ["1234", "568", "23"]
        }
    }
}

7), range query

get /ban/_search
{
    "query": {
        "range": {
            "price": {
                "gte": 10,
                "lte": 99
            }
        }
    }
}

8) multi_match: matching on multiple fields

GET /bank/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "operator": "and",
          "fields": [ "name", "author^3"],
          "query": "Guide"
        }
      },
      "filter": {
        "terms": {
          "price": [ 35.99, 188.99]
        }
      }
    }
  }
}

author^3 raises the weight of the author field, so matches in that field score higher.

5, bool queries

1) must

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"address": "mill"}},
        {"match": {"address": "lane"}}
      ]
    }
  }
}

2) OR matching with should

GET /bank/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

3) must_not matching

GET /bank/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

4) Combining clauses

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}

GET lag/testjob/_search
{
    "query": {
        "bool": {
            "should": [
                {"term": {"title": "python"}},
                {"bool": {
                    "must": [
                        {"term": {"title": "es"}},
                        {"term": {"salary": 30}}
                    ]
                }}
            ]
        }
    }
}

This is equivalent to the SQL:

select * from testjob where title = 'python' or
(title = 'es' and salary = 30)
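To make the boolean logic concrete, here is a toy evaluator for a small subset of the DSL. It is a simplified model and not how Elasticsearch works: term is treated as plain equality on an in-memory dict, scoring is ignored, clauses are assumed to be given as lists, and matches is a hypothetical function. The sample documents are made up.

```python
def matches(query, doc):
    """Evaluate a tiny subset of the query DSL (term and bool) against a dict."""
    if "term" in query:
        field, value = next(iter(query["term"].items()))
        return doc.get(field) == value
    if "bool" in query:
        clauses = query["bool"]
        required = clauses.get("must", []) + clauses.get("filter", [])
        if not all(matches(q, doc) for q in required):
            return False
        if any(matches(q, doc) for q in clauses.get("must_not", [])):
            return False
        should = clauses.get("should", [])
        # Without must/filter clauses, at least one should clause has to match.
        if should and not required:
            return any(matches(q, doc) for q in should)
        return True
    raise ValueError("unsupported query type")

query = {"bool": {"should": [
    {"term": {"title": "python"}},
    {"bool": {"must": [
        {"term": {"title": "es"}},
        {"term": {"salary": 30}},
    ]}},
]}}

docs = [
    {"title": "python", "salary": 10},
    {"title": "es", "salary": 30},
    {"title": "es", "salary": 20},
]
print([matches(query, d) for d in docs])
```

Only the first two sample documents satisfy the condition, the same rows the SQL statement would select.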

5) filter queries: since ES 5.x the filtered query has been replaced by bool, and the filter is wrapped inside a bool query

1), Use filter to implement gte / lte

GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000, 
            "boost": 2.0

          }
        }
      }
    }
  }
}

GET /bank/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "abc": "123"
        }
      }
    }
  }
}

Filtering on multiple values

GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "terms": {
          "abc": ["adb", "12"]
        }
      }
    }
  }
}

To check whether a field exists, use the exists query.

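7, Aggregations

By default, a terms aggregation returns the top 10 buckets.

size: 0 suppresses the search hits so that only the aggregation results are shown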

GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}

This is equivalent to

SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC LIMIT 10;
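The same grouping can be pictured in plain Python. This is a sketch of what the aggregation computes, not of how Elasticsearch computes it; group_by_count is a hypothetical helper and the sample accounts are made up.

```python
from collections import Counter

def group_by_count(docs, field, size=10):
    """Count documents per value of field, largest groups first, top `size` only."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return counts.most_common(size)

accounts = [{"state": "TX"}, {"state": "ID"}, {"state": "TX"}]  # made-up sample data
print(group_by_count(accounts, "state"))
```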

2), Add an avg aggregation

GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
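As a rough picture of what the request above computes, the sketch below groups documents by one field, averages another field per group, and orders the groups by that average in descending order. average_by is a hypothetical helper and the sample accounts are made up.

```python
from collections import defaultdict

def average_by(docs, group_field, value_field):
    """Average value_field per group_field, highest average first."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    averages = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    # Mirrors "order": {"average_balance": "desc"} in the request.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

accounts = [  # made-up sample data
    {"state": "TX", "balance": 100},
    {"state": "TX", "balance": 300},
    {"state": "ID", "balance": 250},
]
print(average_by(accounts, "state", "balance"))
```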

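3), from / to in a range aggregation

from: the lower bound of a bucket (inclusive)

to: the upper bound of a bucket (exclusive)

These bounds define aggregation buckets; they are separate from the from and size paging parameters of a search request.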

GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}
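The bucketing done by a range aggregation can be sketched as follows, assuming three buckets 20-30, 30-40, and 40-50 in which the lower bound is included and the upper bound is excluded. range_buckets is a hypothetical helper, the ages are made up, and the bucket keys are simplified labels.

```python
def range_buckets(values, ranges):
    """Count values per [low, high) range: low is inclusive, high is exclusive."""
    buckets = {}
    for low, high in ranges:
        buckets[f"{low}-{high}"] = sum(1 for v in values if low <= v < high)
    return buckets

ages = [20, 29, 30, 39, 40, 55]  # made-up sample data
print(range_buckets(ages, [(20, 30), (30, 40), (40, 50)]))
```

A value of exactly 30 falls into the second bucket, and 55 falls outside every bucket.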

4), sort

GET lagou/_search
{
    "query": {
        "match_all": {}
    },
    "sort": [{
        "comments": {
            "order": "asc"
        }
    }]
}

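Note: the field being sorted on must be sortable, that is, backed by doc values (the default for keyword, numeric, and date fields). An analyzed text field can only be sorted on if fielddata is enabled; setting store: true is not what makes a field sortable.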