Basic Operations on Elasticsearch Indices and Documents

Querying Elasticsearch Information

GET /

This has the same effect as opening localhost:9200 in a browser. The response looks like this:

{
  "name" : "7c235ebfc86d",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "7o5pkQRlQHi4162eYSqq8Q",
  "version" : {
    "number" : "7.10.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "1c34507e66d7db1211f66f3513706fdf548736aa",
    "build_date" : "2020-12-05T01:00:33.671820Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

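Creating Indices and Documents

Creating an index

Create an index named twitter and specify the type of each field. The index has 2 primary shards, and each primary shard has 1 replica.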

PUT twitter
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }, 
  "mappings": {
    "properties": {
      "user":{
        "type": "text"
      },
      "city":{
        "type": "keyword"
      },
      "province":{
        "type": "keyword"
      },
      "country":{
        "type": "keyword"
      }
    }
  }
}
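
The shard settings above translate into a simple count of shard copies. The sketch below is plain arithmetic in Python; the function name total_shard_copies is made up for illustration and is not part of any Elasticsearch client.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    # Every primary shard exists once, plus one extra copy per replica.
    return number_of_shards * (1 + number_of_replicas)

# Settings used for the twitter index: 2 primaries, 1 replica each.
print(total_shard_copies(2, 1))
```

For a single document write, only the copies of one shard are involved (one primary plus one replica here). On a single-node cluster the replica usually cannot be allocated, which is likely why the write responses in this article report "total" : 2 but "successful" : 1.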

The following response indicates that the index was created successfully:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "twitter"
}

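Difference between text and keyword

  • text values are analyzed: an analyzer splits them into tokens before indexing.
  • keyword values are not analyzed: the whole value is indexed as a single term.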
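
To make the difference concrete, here is a minimal Python sketch of what ends up in the index in each case. Both helper functions are hypothetical, and the tokenizer is only a rough stand-in for Elasticsearch's default standard analyzer (which follows Unicode text segmentation rules), so the output illustrates the idea rather than the exact terms Elasticsearch would produce.

```python
import re

def analyze_text(value):
    # Rough approximation of analysis on a text field:
    # split on non-word characters, then lowercase each token.
    return [token.lower() for token in re.findall(r"\w+", value)]

def analyze_keyword(value):
    # A keyword field is not analyzed: the whole value is one term.
    return [value]

print(analyze_text("Hello ShangHai"))
print(analyze_keyword("Hello ShangHai"))
```

This is why an exact-match lookup on a keyword field needs the full original value, while a text field can be matched on its individual lowercased words.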

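Creating documents

A document can be created with either a PUT request or a POST request. The differences are:

  • Specifying the document ID: PUT index_name/_doc/document_id. If the document does not exist, a new document is indexed. Otherwise the existing document is replaced by the new one, and _version is incremented by 1.
  • Letting the system generate the document ID: POST index_name/_doc. With a POST request, Elasticsearch generates the document ID automatically.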
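
The choice between the two request forms can be sketched as a small Python helper. The name index_request is hypothetical; it only builds the method and path and does not send anything to Elasticsearch.

```python
def index_request(index, doc_id=None):
    # Returns the (method, path) pair for indexing one document.
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one.
        return ("POST", f"/{index}/_doc")
    # ID supplied: PUT creates the document or replaces an existing one.
    return ("PUT", f"/{index}/_doc/{doc_id}")

print(index_request("twitter", 1))
print(index_request("twitter"))
```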

Use a PUT request to create a document in the twitter index:

PUT twitter/_doc/1
{
  "user": "TL",
  "city": "ShangHai",
  "province": "ShangHai",
  "country": "China"
}

The following response indicates that the document was created successfully:

{
  "_index" : "twitter",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

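Response fields

  • _index: the index name
  • _type: the type name
  • _version: the document version. Each time a POST or PUT request is executed against a document that already exists, the version is incremented by 1 and the previous version is discarded.

Use a POST request to create a document in the twitter index: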

POST twitter/_doc
{
  "user": "TL",
  "city": "ShangHai",
  "province": "ShangHai",
  "country": "China"
}

The following response indicates that the document was created successfully:

{
  "_index" : "twitter",
  "_type" : "_doc",
  // ID generated by the system
  "_id" : "UJyJhHkBrelCLU6_H44s",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Querying Indices and Documents

Retrieving an index

GET twitter

Retrieving all documents in an index

GET twitter/_search

Retrieving a document by ID

Here 1 is the document ID.

GET twitter/_doc/1

Retrieving only the _source of a document

# Before version 7.0
GET twitter/_doc/1/_source
# Version 7.0 and later
GET twitter/_source/1

Retrieving selected fields of the _source

user and city are the names of the fields to return.

GET twitter/_source/1?_source=user,city
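
The effect of the _source parameter can be pictured as a projection of the stored document onto the requested fields. The Python sketch below uses a hypothetical helper, filter_source, and handles only top-level field names; real source filtering also accepts wildcards and nested paths, which this sketch ignores.

```python
def filter_source(source, fields):
    # Keep only the requested top-level fields that are present.
    return {name: source[name] for name in fields if name in source}

doc = {"user": "TL", "city": "ShangHai", "province": "ShangHai", "country": "China"}
print(filter_source(doc, ["user", "city"]))
```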

Retrieving multiple documents

  • _index: the index name
  • _id: the document ID

GET _mget
{
  "docs":[
      {
        "_index":"twitter",
        "_id":"1"
      },
      {
        "_index":"twitter",
        "_id":"2"
      }
    ]
}

Retrieving multiple documents by ID

GET twitter/_mget
{
  "ids":[1,2,3]
}

Retrieving multiple documents with selected fields

List the fields to return in _source.

GET _mget
{
  "docs":[
      {
        "_index":"twitter",
        "_id":"1",
        "_source":["user","city"]
      },
      {
        "_index":"twitter",
        "_id":"2"
      }
    ]
}
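
When the list of IDs comes from a program, the _mget body can be assembled rather than written by hand. This is a minimal Python sketch under that assumption; mget_body and its source_fields argument are hypothetical names, and the function only builds the JSON body without contacting Elasticsearch.

```python
import json

def mget_body(index, ids, source_fields=None):
    # source_fields maps a document ID to the list of fields to return for it.
    source_fields = source_fields or {}
    docs = []
    for doc_id in ids:
        entry = {"_index": index, "_id": str(doc_id)}
        if doc_id in source_fields:
            # Per-document source filtering, as in the request above.
            entry["_source"] = source_fields[doc_id]
        docs.append(entry)
    return {"docs": docs}

body = mget_body("twitter", [1, 2], {1: ["user", "city"]})
print(json.dumps(body, indent=2))
```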

Updating and Deleting Indices and Documents

Checking whether an index exists

HEAD twitter

If the index exists, the response is:

200 - OK

If it does not exist, the response is:

{"statusCode":404,"error":"Not Found","message":"404 - Not Found"}

Deleting an index

DELETE twitter

The following response indicates that the index was deleted. All documents in twitter are deleted along with it.

{
  "acknowledged" : true
}

Checking whether a document exists

HEAD twitter/_doc/1

The following response indicates that the document exists:

200 - OK

Updating a document with a PUT request

PUT twitter/_doc/1
{
  "user": "TL-NEW",
  "city": "ShangHai",
  "province": "ShangHai",
  "country": "China"
}

After the update, _version is incremented by 1. Use the GET request shown earlier to retrieve the updated document.

Updating documents by query

POST twitter/_update_by_query
{
  "query": {
    "match": {
      "_id": "1"
    }
  },
  "script": {
    "source": "ctx._source.city = params.city;ctx._source.province = params.province",
    "lang": "painless",
    "params": {
      "city":"BJ",
      "province":"BJ"
    }
  }
}
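
For readers unfamiliar with Painless, the script in this request does the equivalent of the Python sketch below for every document the query matches. The function apply_script is a hypothetical stand-in, not an Elasticsearch API; it works on a copy so the input dictionary is left unchanged.

```python
def apply_script(source, params):
    # Python equivalent of the Painless script in the request:
    # ctx._source.city = params.city; ctx._source.province = params.province
    updated = dict(source)
    updated["city"] = params["city"]
    updated["province"] = params["province"]
    return updated

doc = {"user": "TL", "city": "ShangHai", "province": "ShangHai", "country": "China"}
result = apply_script(doc, {"city": "BJ", "province": "BJ"})
print(result)
```

Fields that the script does not touch (user and country here) keep their previous values.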

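Upserting a document

"Upsert" means update or insert: update the document if it exists, otherwise insert a new document.

With doc_as_upsert, Elasticsearch checks whether a document with the given ID already exists and, if it does, merges the supplied doc into the existing document. If no document with the given ID exists, a new document with the supplied content is inserted.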

POST twitter/_update/3
{
  "doc": {
    "user": "TL-3",
    "city": "ShangHai",
    "province": "ShangHai",
    "country": "China"
  },
  "doc_as_upsert": true
}
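
The merge-or-insert behaviour described above can be sketched with an in-memory dictionary standing in for the index. The names merge and update_doc_as_upsert are hypothetical, and the sketch leaves out details such as versioning and no-op detection.

```python
def merge(existing, partial):
    # Merge partial into a copy of existing: nested objects are merged
    # recursively, any other value replaces the old one.
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def update_doc_as_upsert(store, doc_id, partial):
    # store maps a document ID to its source dict.
    if doc_id in store:
        store[doc_id] = merge(store[doc_id], partial)
        return "updated"
    # No document with this ID yet: the partial doc becomes the document.
    store[doc_id] = dict(partial)
    return "created"

store = {}
first = update_doc_as_upsert(store, "3", {"user": "TL-3", "city": "ShangHai"})
second = update_doc_as_upsert(store, "3", {"city": "BJ"})
print(first, second, store["3"])
```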

Deleting a document by ID

DELETE twitter/_doc/1

Deleting documents by query

POST twitter/_delete_by_query
{
  "query":{
    "match":{
      "user":"TL"
    }
  }
}

This deletes every document whose user field matches TL.

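Batch operations: the Bulk API

  • A single API call can operate on several different indices
  • Four operation types are supported
    • Index
    • Create
    • Update
    • Delete
  • The index can be specified in the URI or in the request payload
  • A failure of one operation does not affect the other operations
  • The response includes the result of each individual operation

Bulk index: inserting documents in bulk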

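// Keep every action line and every document on a single line; JSON that is pretty-printed across several lines causes an error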
POST _bulk
{"index" : {"_index" :"student" ,"_id":"1"}}
{"name":"山西太原-张三","age":"23","address":{"city":"太原","province":"山西"}}
{"index" : {"_index" :"student" ,"_id":"2"}}
{"name":"山西长治-李四","age":"24","address":{"city":"长治","province":"山西"}}
{"index" : {"_index" :"student" ,"_id":"3"}}
{"name":"山西吕梁-王五","age":"25","address":{"city":"吕梁","province":"山西"}}
{"index" : {"_index" :"student" ,"_id":"4"}}
{"name":"广东广州-赵六","age":"26","address":{"city":"广州","province":"广东"}}
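
The body of a bulk request is newline-delimited JSON: one action line, then one document line, repeated. A minimal Python sketch for producing that format is shown below; bulk_index_payload is a hypothetical helper that only builds the string and does not send it.

```python
import json

def bulk_index_payload(index, docs):
    # docs maps a document ID to its source dict.
    lines = []
    for doc_id, source in docs.items():
        # Action line, followed by the document on its own line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

payload = bulk_index_payload("student", {
    "1": {"name": "山西太原-张三", "age": "23"},
    "2": {"name": "山西长治-李四", "age": "24"},
})
print(payload, end="")
```

Because json.dumps without an indent argument never emits a line break, each object stays on a single line, which is what the format requires.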

Use GET student/_count to check how many documents were just inserted:

{
  "count" : 4,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

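Bulk create: inserting documents in bulk

The documents above were created with the index action. With those documents already present in the student index, now try the create action: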

POST _bulk
{"create" : {"_index" :"student" ,"_id":"1"}}
{"name":"山西太原-张三","age":"23","address":{"city":"太原","province":"山西"}}
{"create" : {"_index" :"student" ,"_id":"2"}}
{"name":"山西长治-李四","age":"24","address":{"city":"长治","province":"山西"}}
{"create" : {"_index" :"student" ,"_id":"3"}}
{"name":"山西吕梁-王五","age":"25","address":{"city":"吕梁","province":"山西"}}
{"create" : {"_index" :"student" ,"_id":"4"}}
{"name":"广东广州-赵六","age":"26","address":{"city":"广州","province":"广东"}}

Because the documents already exist, each create item fails with a response entry like this:

{
    "create" : {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "status" : 409,
        "error" : {
            // the insert failed
            "type" : "version_conflict_engine_exception",
            "reason" : "[1]: version conflict, document already exists (current version [20])",
            "index_uuid" : "hX1ecN9KQ6-dAkbCLLma-g",
            "shard" : "0",
            "index" : "student"
        }
    }
}

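As the results show, index always succeeds, because it overwrites a document that already exists. create does not: if a document with the given ID already exists, the operation fails.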
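
The contrast between the two actions can be reproduced with a small in-memory model. In the Python sketch below, bulk_write is a hypothetical function and the dictionary stands in for the index; the returned numbers mirror the per-item status values (201 for a new document, 200 for an overwrite, 409 for a conflict).

```python
def bulk_write(store, action, doc_id, source):
    # store maps a document ID to a (version, source) tuple.
    if doc_id in store:
        if action == "create":
            # create refuses to overwrite an existing document.
            return 409
        # index overwrites the document and bumps its version.
        version = store[doc_id][0] + 1
        store[doc_id] = (version, source)
        return 200
    # The ID is new, so both actions create the document at version 1.
    store[doc_id] = (1, source)
    return 201

store = {}
print(bulk_write(store, "index", "1", {"name": "a"}))
print(bulk_write(store, "index", "1", {"name": "b"}))
print(bulk_write(store, "create", "1", {"name": "c"}))
print(store["1"])
```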

Bulk update: updating documents in bulk

POST _bulk
{"update" : {"_index" :"student" ,"_id":"1"}}
{"doc":{"name":"zs-n","age":"233"}}
{"update" : {"_index" :"student" ,"_id":"2"}}
{"doc":{"name":"ls-n","age":"244","address":{"city":"太原","province":"长治"}}}

Retrieve document 1 with GET student/_doc/1 to confirm that the update succeeded:

{
  "_index" : "student",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 21,
  "_seq_no" : 80,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "zs-n",
    "age" : "233",
    "address" : {
      "city" : "太原",
      "province" : "山西"
    }
  }
}

Bulk delete: deleting documents in bulk

POST _bulk
{"delete" : {"_index" :"student" ,"_id":"1"}}
{"delete" : {"_index" :"student" ,"_id":"2"}}
{"delete" : {"_index" :"student" ,"_id":"3"}}
{"delete" : {"_index" :"student" ,"_id":"4"}}

The response shows that the documents were deleted:

{
  "took" : 11,
  "errors" : false,
  "items" : [
    {
      "delete" : {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 22,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 81,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "delete" : {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 21,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 82,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "delete" : {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 21,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 83,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "delete" : {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 21,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 84,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}
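
Since one failed operation does not fail the whole bulk request, a client usually has to scan the items array for entries that carry an error. The Python sketch below shows one way to do that; failed_items is a hypothetical helper, and the sample response is hand-written for illustration, shortened from the shapes shown in this article.

```python
def failed_items(bulk_response):
    # Collect (action, id, status, error type) for every item with an error.
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action that was executed.
        for action, result in item.items():
            if "error" in result:
                failures.append(
                    (action, result["_id"], result["status"], result["error"]["type"])
                )
    return failures

# Hand-written sample: one successful delete and one conflicting create.
response = {
    "errors": True,
    "items": [
        {"delete": {"_index": "student", "_id": "1", "status": 200, "result": "deleted"}},
        {"create": {"_index": "student", "_id": "2", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}
print(failed_items(response))
```

The top-level errors flag tells whether any item failed at all, so a client can skip the scan when it is false.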
Original article: https://www.cnblogs.com/leizzige/p/14787315.html