(55) Elasticsearch: Using scroll to search large data sets

  1. A brief overview of scroll and its steps

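  Fetching, say, 100,000 documents in a single request performs very poorly. In that case a scroll query is generally used instead: the results are retrieved batch by batch until all of the data has been read.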

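  1) On the first search, scroll saves a snapshot of the index as it was at that moment. Every later batch is served from that old snapshot, so any changes made to the data in the meantime are not visible to the user.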

  2) Sorting by _doc (rather than by _score) gives better performance. (By default, results are sorted by relevance, from the highest _score to the lowest.)

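  3) Every scroll request must also carry a scroll parameter, which specifies how long Elasticsearch keeps the search context alive. The window only has to be long enough to fetch the next batch, not to read all of the data, because each scroll request renews it.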
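
  The three points above can be sketched as a small Python loop. This is only an illustration under stated assumptions: `send` is a hypothetical transport callable (not part of any Elasticsearch client) that takes a method, a path, and a JSON body and returns the parsed response, and `scroll_hits` is a made-up helper name. POST is used because Elasticsearch accepts both GET and POST for these endpoints.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: (method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def scroll_hits(
    send: Send, target: str, size: int = 1000, keep_alive: str = "1m"
) -> Iterator[Dict[str, Any]]:
    """Yield every hit of a match_all query, one batch at a time."""
    # First request: opens the search context (the snapshot) and
    # returns the first batch, sorted by _doc for speed.
    response = send(
        "POST",
        f"/{target}/_search?scroll={keep_alive}",
        {"query": {"match_all": {}}, "sort": ["_doc"], "size": size},
    )
    while True:
        hits = response["hits"]["hits"]
        if not hits:
            # An empty batch means the snapshot has been read completely.
            break
        yield from hits
        # Follow-up request: pass the latest _scroll_id and renew the window.
        response = send(
            "POST",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )
```

  With the index used later in this article, the call would look like `scroll_hits(send, "lib/user", size=3)`. Injecting the transport keeps the sketch independent of any particular HTTP library.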

  2. Demonstration

  Prepare the data

PUT /lib
{
    "settings":{
        "number_of_shards":3,
        "number_of_replicas":0
    },
    "mappings":{
        "user":{
            "properties":{
                "name":{"type":"text"},
                "address":{"type":"text"},
                "age":{"type":"integer"},
                "interests":{"type":"text"},
                "birthday":{"type":"date"}
            }
        }
    }
}

PUT /lib/user/1
{
    "name":"zhaoliu",
    "address":"hei long jiang sheng tie ling shi",
    "age":50,
    "birthday":"1970-12-12",
    "interests":"xi huang hejiu,duanlian,lvyou"
}

PUT /lib/user/2
{
    "name":"zhaoming",
    "address":"bei jing hai dian qu qing he zhen",
    "age":20,
    "birthday":"1998-10-12",
    "interests":"xi huan hejiu,duanlian,changge"
}

PUT /lib/user/3
{
    "name":"lisi",
    "address":"bei jing hai dian qu qing he zhen",
    "age":23,
    "birthday":"1998-10-12",
    "interests":"xi huan hejiu,duanlian,changge"
}

PUT /lib/user/4
{
    "name":"wangwu",
    "address":"bei jing hai dian qu qing he zhen",
    "age":26,
    "birthday":"1998-10-12",
    "interests":"xi huan biancheng,tingyinyue,lvyou"
}

PUT /lib/user/5
{
    "name":"zhangsan",
    "address":"bei jing chao yang qu",
    "age":29,
    "birthday":"1988-10-12",
    "interests":"xi huan tingyinyue,changge,tiaowu"
}

  Run the following request: it returns 3 documents per batch and keeps the search context alive for 1 minute.

GET /lib/user/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort":["_doc"],
  "size":3
}

  The request returned the documents with ids 1, 2, and 4, along with a _scroll_id:

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAAfFkFKM3g2eWM4VGZLajZfeng2VlJtMGcAAAAAAAAAIBZBSjN4NnljOFRmS2o2X3p4NlZSbTBnAAAAAAAAACEWQUozeDZ5YzhUZktqNl96eDZWUm0wZw==",
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "lib",
        "_type": "user",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "zhaoming",
          "address": "bei jing hai dian qu qing he zhen",
          "age": 20,
          "birthday": "1998-10-12",
          "interests": "xi huan hejiu,duanlian,changge"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "lib",
        "_type": "user",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "zhaoliu",
          "address": "hei long jiang sheng tie ling shi",
          "age": 50,
          "birthday": "1970-12-12",
          "interests": "xi huang hejiu,duanlian,lvyou"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "lib",
        "_type": "user",
        "_id": "4",
        "_score": null,
        "_source": {
          "name": "wangwu",
          "address": "bei jing hai dian qu qing he zhen",
          "age": 26,
          "birthday": "1998-10-12",
          "interests": "xi huan biancheng,tingyinyue,lvyou"
        },
        "sort": [
          1
        ]
      }
    ]
  }
}
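
  To make the link between this response and the next request concrete, here is a minimal Python sketch that reads the ids of the current batch and prepares the body of the follow-up request. `read_batch` is a hypothetical helper, and `first_response` is a trimmed copy of the response above in which the long `_scroll_id` is replaced by a placeholder string.

```python
from typing import Any, Dict, List, Tuple


def read_batch(
    response: Dict[str, Any], keep_alive: str = "1m"
) -> Tuple[List[str], Dict[str, str]]:
    """Return the ids in this batch and the body of the next scroll request."""
    ids = [hit["_id"] for hit in response["hits"]["hits"]]
    next_body = {"scroll": keep_alive, "scroll_id": response["_scroll_id"]}
    return ids, next_body


# Trimmed copy of the response above; the _scroll_id is a placeholder.
first_response = {
    "_scroll_id": "example-scroll-id",
    "hits": {
        "total": 5,
        "hits": [
            {"_id": "2", "sort": [0]},
            {"_id": "1", "sort": [0]},
            {"_id": "4", "sort": [1]},
        ],
    },
}

ids, next_body = read_batch(first_response)
print(ids)        # ['2', '1', '4']
print(next_body)  # {'scroll': '1m', 'scroll_id': 'example-scroll-id'}
```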

  Continue the query. The scroll_id is the _scroll_id value returned by the previous response:

GET /_search/scroll
{
  "scroll":"1m",
  "scroll_id":"DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAAfFkFKM3g2eWM4VGZLajZfeng2VlJtMGcAAAAAAAAAIBZBSjN4NnljOFRmS2o2X3p4NlZSbTBnAAAAAAAAACEWQUozeDZ5YzhUZktqNl96eDZWUm0wZw=="
}

  The result contains the documents with ids 3 and 5. If more documents remain, send the request again with the most recently returned _scroll_id.

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAAfFkFKM3g2eWM4VGZLajZfeng2VlJtMGcAAAAAAAAAIBZBSjN4NnljOFRmS2o2X3p4NlZSbTBnAAAAAAAAACEWQUozeDZ5YzhUZktqNl96eDZWUm0wZw==",
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "lib",
        "_type": "user",
        "_id": "3",
        "_score": null,
        "_source": {
          "name": "lisi",
          "address": "bei jing hai dian qu qing he zhen",
          "age": 23,
          "birthday": "1998-10-12",
          "interests": "xi huan hejiu,duanlian,changge"
        },
        "sort": [
          1
        ]
      },
      {
        "_index": "lib",
        "_type": "user",
        "_id": "5",
        "_score": null,
        "_source": {
          "name": "zhangsan",
          "address": "bei jing chao yang qu",
          "age": 29,
          "birthday": "1988-10-12",
          "interests": "xi huan tingyinyue,changge,tiaowu"
        },
        "sort": [
          2
        ]
      }
    ]
  }
}
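
  Outside the console, the two requests in this walkthrough could be built with Python's standard library roughly as follows. This is a sketch, not a definitive client: the base URL `http://localhost:9200` assumes a local, unsecured test node, `build_request` is a hypothetical helper, and the scroll id is a placeholder to be replaced with a real value. POST is used instead of GET because Elasticsearch accepts both for these endpoints.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:9200"  # assumption: local test node, no auth


def build_request(
    path: str, body: Dict[str, Any], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given path."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Initial request: 3 documents per batch, context kept alive for 1 minute.
first = build_request(
    "/lib/user/_search?scroll=1m",
    {"query": {"match_all": {}}, "sort": ["_doc"], "size": 3},
)

# Follow-up request: the scroll_id below is a placeholder.
follow_up = build_request(
    "/_search/scroll",
    {"scroll": "1m", "scroll_id": "<_scroll_id from the previous response>"},
)

# Sending is left out so the sketch runs without a cluster, e.g.:
#   with urllib.request.urlopen(first) as resp:
#       result = json.load(resp)
```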
Original article: https://www.cnblogs.com/javasl/p/12662713.html