Elasticsearch 7.x: deduplicated queries that also return the deduplicated total

Elasticsearch version: 7.8

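The requirement is to page through the data in an index with duplicates removed, similar to DISTINCT in MySQL. The collapse parameter in Elasticsearch covers this requirement.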

From the official collapse documentation:

You can use the collapse parameter to collapse search results based on field values. The
collapsing is done by selecting only the top sorted document per collapse key.

In other words, for each distinct value of the collapse field, only the document that ranks first under the request's sort order is returned.
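To make the "top sorted document per collapse key" rule concrete, here is a minimal in-memory sketch. It only mimics the observable result of collapse and is not how Elasticsearch implements it; the `CollapseSketch` class and its `Doc` type are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration only; NOT Elasticsearch's actual implementation.
final class CollapseSketch {

    // Hypothetical document shape: a collapse key plus a sortable timestamp.
    static final class Doc {
        final String appId;
        final long createTime;

        Doc(String appId, long createTime) {
            this.appId = appId;
            this.createTime = createTime;
        }
    }

    // Sort by createTime descending, then keep the first document seen per appId.
    static List<Doc> collapse(List<Doc> hits) {
        List<Doc> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong((Doc d) -> d.createTime).reversed());
        Map<String, Doc> topPerKey = new LinkedHashMap<>();
        for (Doc d : sorted) {
            topPerKey.putIfAbsent(d.appId, d); // older documents with the same key are dropped
        }
        return new ArrayList<>(topPerKey.values());
    }

    public static void main(String[] args) {
        List<Doc> hits = Arrays.asList(
                new Doc("a", 10), new Doc("b", 30), new Doc("a", 20), new Doc("b", 5));
        for (Doc d : collapse(hits)) {
            System.out.println(d.appId + " " + d.createTime);
        }
    }
}
```

With the four sample documents in `main`, two documents survive: the newest one for `b` followed by the newest one for `a`.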

Note:

The total number of hits in the response indicates the number of matching documents without collapsing. The total number of distinct group is unknown.

That is, the total in the response counts the matching documents before collapsing; the total after deduplication is not reported.

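So how do we get the total after deduplication? The cardinality aggregation can provide it. Keep in mind that cardinality returns an approximate distinct count, so the value may deviate slightly on fields with very many distinct values.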

cardinality official documentation
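The gap between the hit total and the number of distinct groups can be shown with a small in-memory sketch. It computes an exact distinct count with a `HashSet`, whereas the cardinality aggregation estimates the same quantity (its accuracy can be tuned with `precision_threshold`). The class and method names here are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// In-memory illustration only; Elasticsearch estimates this value instead of counting exactly.
final class DistinctCountSketch {

    // Exact number of distinct collapse keys among the matching documents.
    static long distinctCount(List<String> appIds) {
        return new HashSet<>(appIds).size();
    }

    public static void main(String[] args) {
        // One entry per matching document: five hits, but only three distinct app_id values.
        List<String> appIds = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println("hits total     = " + appIds.size());
        System.out.println("distinct total = " + distinctCount(appIds));
    }
}
```

For paging over collapsed results, the distinct total is the number to use, not the hit total.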

DSL example:

{
  "from": 0,
  "size": 5,
  "sort": [
    {
      "createTime": {
        "order": "desc"
      }
    }
  ],
  "collapse": {
    "field": "app_id"
  },
  "aggs": {
    "total_size": {
      "cardinality": {
        "field": "app_id"
      }
    }
  }
}
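Since the request above pages with from and size while the total_size aggregation supplies the deduplicated total, the paging arithmetic might look like the following sketch. `DistinctPaging` and its methods are hypothetical helpers, and the 1-based page number is an assumption rather than something the original states.

```java
// Hypothetical paging helper; assumes 1-based page numbers.
final class DistinctPaging {

    // Value for the "from" parameter of the given page.
    static int from(int page, int size) {
        return (page - 1) * size;
    }

    // Number of pages needed to show distinctTotal collapsed results (ceiling division).
    static long totalPages(long distinctTotal, int size) {
        return (distinctTotal + size - 1) / size;
    }

    public static void main(String[] args) {
        int size = 5;
        long distinctTotal = 12; // e.g. the value returned by the total_size aggregation
        System.out.println("from for page 3 = " + from(3, size));
        System.out.println("total pages     = " + totalPages(distinctTotal, size));
    }
}
```

With 12 distinct groups and a page size of 5, page 3 starts at offset 10 and is the last page.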

Java API example:

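// CREATE_TIME, APP_ID and TOTAL_COUNT_KEY are String constants defined elsewhere.
SortBuilder<?> sortBuilder = SortBuilders.fieldSort(CREATE_TIME).order(SortOrder.DESC);
CollapseBuilder collapseBuilder = new CollapseBuilder(APP_ID);
AggregationBuilder aggregation = AggregationBuilders.cardinality(TOTAL_COUNT_KEY).field(APP_ID);
SearchSourceBuilder source = new SearchSourceBuilder().from(0).size(5)
        .sort(sortBuilder).collapse(collapseBuilder).aggregation(aggregation);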

Original post: https://www.cnblogs.com/xxoome/p/14155316.html