ElasticSearch Search

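1 DSL Search

DSL (Domain Specific Language) is the JSON-based query language provided by ES. A search is expressed by sending a JSON body in a specific format, and different bodies cover different search requirements.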

1.1. Search All Records with Pagination

    @Test
    public void testSearchAll() throws Exception {
        // Search request object
        SearchRequest searchRequest = new SearchRequest("xc_course");
        searchRequest.types("doc");
        // Search source builder
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Pagination: page number (1-based) and page size
        int page = 1;
        int size = 2;
        // Index of the first record to return
        int from = (page - 1) * size;
        searchSourceBuilder.from(from); // index of the first record, starting at 0
        searchSourceBuilder.size(size); // number of records per page
        // Query type
        searchSourceBuilder.query(QueryBuilders.matchAllQuery()); // match all documents
        // Source filtering: the first argument lists the fields to include, the second the fields to exclude
        searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});
        // Attach the search source to the request
        searchRequest.source(searchSourceBuilder);
        // Execute the search (sends an HTTP request to ES)
        SearchResponse search = client.search(searchRequest);
        // Search results
        SearchHits hits = search.getHits();
        // Total number of matching records
        long totalHits = hits.getTotalHits();
        // Documents on the current page, ordered by relevance
        SearchHit[] hits1 = hits.getHits();
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        for (SearchHit hit : hits1){
             // Document ID
             String id = hit.getId();
             // Source document content
             Map<String, Object> sourceAsMap = hit.getSourceAsMap();
             System.out.println(sourceAsMap);
             // Date
//            Date timestamp = dateFormat.parse((String) sourceAsMap.get("timestamp"));
//            System.out.println(timestamp);
        }
    }
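The offset arithmetic used above can be isolated into a small helper. This is a minimal sketch that uses only the JDK; the class name `Pagination` and the method `fromOffset` are hypothetical and not part of the Elasticsearch client.

```java
// Hypothetical helper (not part of the Elasticsearch client): converts a
// 1-based page number into the 0-based "from" offset expected by the search API.
final class Pagination {
    private Pagination() {}

    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        // multiplyExact throws instead of silently overflowing for very large pages
        return Math.multiplyExact(page - 1, size);
    }

    public static void main(String[] args) {
        int size = 2;
        for (int page = 1; page <= 3; page++) {
            System.out.println("page " + page + " -> from=" + fromOffset(page, size) + ", size=" + size);
        }
    }
}
```

The result would be passed to `searchSourceBuilder.from(...)` in place of the inline calculation.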

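1.2. Term Query

Term Query is an exact-match query: the keyword is matched as a whole against the terms in the index and is not analyzed (tokenized) first.

Set the query in 1.1 to:

QueryBuilders.termQuery("name","spring") // matches documents whose name field contains the term "spring"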

1.3. Exact Match by ID

Change the query in 1.1 to:

QueryBuilders.termsQuery("_id", "1", "2") // matches documents by ID; "1" and "2" are example IDs

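1.4. match Query

1. Basic usage

match Query performs full-text search: the search string is first analyzed into terms, and each term is then looked up in the index.

The difference from Term Query is that Term Query looks up the keyword as a single term without analyzing it, whereas match Query analyzes the keyword first and searches with each resulting term.

Send: POST http://localhost:9200/xc_course/doc/_search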

{ "query": 
 { "match" :
  { "description" :
   { "query" : "spring开发", "operator" : "or" }
  }
 }
}

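operator: or means a document matches if at least one of the terms appears in it; and means a document matches only if every term appears in it.

query: the search keywords. Multiple English words are separated by whitespace or half-width commas; Chinese keywords may be written with or without a separator, because the analyzer splits them into terms.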
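The difference between term and match, and between the or and and operators, can be illustrated without a running cluster. The following is a toy model under stated assumptions: `MatchOperatorDemo` and its methods are hypothetical, and the "analyzer" simply lower-cases and splits on whitespace and commas, which is far simpler than the analyzers ES actually uses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of term vs. match lookups against the terms indexed for one field.
final class MatchOperatorDemo {
    private MatchOperatorDemo() {}

    // Assumption: a trivial analyzer that lower-cases and splits on whitespace or commas.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("[\\s,]+"));
    }

    // term: the keyword is looked up as one unanalyzed term.
    static boolean termMatches(Set<String> indexedTerms, String keyword) {
        return indexedTerms.contains(keyword);
    }

    // match: the query string is analyzed first, then the terms are combined with "or" or "and".
    static boolean matchMatches(Set<String> indexedTerms, String query, String operator) {
        List<String> terms = analyze(query);
        if ("and".equalsIgnoreCase(operator)) {
            return indexedTerms.containsAll(terms);
        }
        for (String term : terms) {
            if (indexedTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = new HashSet<>(analyze("Spring development basics"));
        System.out.println("term 'spring java': " + termMatches(indexed, "spring java"));
        System.out.println("match or:  " + matchMatches(indexed, "spring java", "or"));
        System.out.println("match and: " + matchMatches(indexed, "spring java", "and"));
    }
}
```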

2. minimum_should_match

Specifies the proportion of query terms that a document must match:

{ "query": 
 { "match" :
  { "description" :
   { 
       "query" : "spring开发框架", "minimum_should_match": "80%"
   }
  } 
 } 
}

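Setting "minimum_should_match": "80%" means that 80% of the three query terms must match: 3 * 0.8 = 2.4, which is rounded down to 2, so at least two of the terms must match in a document.

Java implementation:

// Query type
searchSourceBuilder.query(QueryBuilders.matchQuery("description","spring开发框架")
          .minimumShouldMatch("80%"));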
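The rounding in the calculation above can be checked with a few lines of Java. This is a simplified sketch for non-negative percentages only; `MinimumShouldMatch.requiredTerms` is a hypothetical name, and the real setting also accepts other forms (absolute numbers, negative values, combinations) that are not modelled here.

```java
// Simplified model of a percentage-style minimum_should_match:
// required = floor(termCount * percent / 100).
final class MinimumShouldMatch {
    private MinimumShouldMatch() {}

    static int requiredTerms(int termCount, int percent) {
        if (termCount < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("termCount must be >= 0 and percent in [0, 100]");
        }
        // Integer division truncates, which equals rounding down for non-negative values.
        return termCount * percent / 100;
    }

    public static void main(String[] args) {
        // Three terms at 80%: 2.4 is rounded down to 2.
        System.out.println(requiredTerms(3, 80));
    }
}
```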

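1.5. multi_match Query

termQuery and matchQuery match against only one field at a time. This section covers the multi_match query (multiMatchQuery in the Java client), which matches against several fields at once.

1. Basic usage

A single-field match looks for the keywords in one field; a multi-field match looks for them in several fields.

Example: match the keywords "spring css" against the name and description fields.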

{ 
    "query": { 
        "multi_match" : {
            "query" : "spring css",
            "minimum_should_match": "50%",
            "fields": [ "name", "description" ]
        }
    }
}

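2. Boosting

When matching several fields, you can raise the boost (weight) of a field to increase its contribution to the score.

If name is given a higher weight than description, documents that match the keyword in name score higher and are returned first.

In the Java client:

        // Query type
        searchSourceBuilder.query(QueryBuilders.multiMatchQuery("spring css","name","description")
                .minimumShouldMatch("80%")  // required match percentage
                .field("name",10)); // boost the name field by a factor of 10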
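In the JSON form of the DSL, a boost is written as `field^boost` inside the `fields` array (for example `name^10`). As a rough sketch of how such a request body could be assembled by hand, the hypothetical helper below builds the string with plain JDK classes; it performs no JSON escaping, so it assumes the inputs contain no quotes or backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that renders a multi_match request body as a JSON string.
// Assumption: inputs contain no characters that need JSON escaping.
final class MultiMatchBody {
    private MultiMatchBody() {}

    static String build(String query, String minimumShouldMatch, Map<String, Integer> fieldBoosts) {
        StringJoiner fields = new StringJoiner(", ", "[", "]");
        for (Map.Entry<String, Integer> entry : fieldBoosts.entrySet()) {
            int boost = entry.getValue();
            // A boost of 1 is the default, so the "^boost" suffix is omitted.
            String field = boost == 1 ? entry.getKey() : entry.getKey() + "^" + boost;
            fields.add("\"" + field + "\"");
        }
        return "{\"query\":{\"multi_match\":{"
                + "\"query\":\"" + query + "\","
                + "\"minimum_should_match\":\"" + minimumShouldMatch + "\","
                + "\"fields\":" + fields + "}}}";
    }

    public static void main(String[] args) {
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("name", 10);
        boosts.put("description", 1);
        System.out.println(build("spring css", "50%", boosts));
    }
}
```

In real code a JSON library or the client's query builders are the safer choice; the sketch only shows how the boost maps onto the `fields` syntax.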

1.6. Bool Query

Example: run a match query on name and description and, in addition, an exact (term) query on studymodel.

{ "_source" :["name", "studymodel", "description"],
 "from" : 0,
 "size" : 1,
 "query":
 { "bool" :
  { "must":
   [
       { "multi_match" :
        { "query" : "spring框架", 
         "minimum_should_match": "50%",
         "fields": [ "name^10", "description" ] 
        }
       },
       {
           "term":{
               "studymodel" : "201001"
           } 
       } 
   ]
  }
 } 
}

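must: every clause must match. (This is the one most commonly used.)

should: or; at least one of the clauses must match. When the bool query also contains must or filter clauses, should clauses are optional by default and only affect the score.

must_not: not; none of the clauses may match.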

    /**
     * BoolQuery
     */
    @Test
    public void testBoolQuery() throws Exception {
        // Search request object
        SearchRequest searchRequest = new SearchRequest("xc_course");
        searchRequest.types("doc");
        // Search source builder
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Build the query with a BoolQuery
        // 1. multiMatchQueryBuilder
        MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("spring css", "name", "description")
                .minimumShouldMatch("80%")  // required match percentage
                .field("name", 10);
        // 2. Define a termQuery as well
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("studymodel", "201001");
        // BoolQuery
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        boolQueryBuilder.must(multiMatchQueryBuilder);
        boolQueryBuilder.must(termQueryBuilder);  // both conditions must be satisfied

        searchSourceBuilder.query(boolQueryBuilder);
        // Source filtering: the first argument lists the fields to include, the second the fields to exclude
        searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});
        // Attach the search source to the request
        searchRequest.source(searchSourceBuilder);
        // Execute the search (sends an HTTP request to ES)
        SearchResponse search = client.search(searchRequest);
        // Search results
        SearchHits hits = search.getHits();
        // Total number of matching records
        long totalHits = hits.getTotalHits();
        // Documents on the current page, ordered by relevance
        SearchHit[] hits1 = hits.getHits();
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        for (SearchHit hit : hits1){
            // Document ID
            String id = hit.getId();
            // Source document content
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            System.out.println(sourceAsMap);
        }
    }
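The way must, should and must_not combine can be modelled with plain `java.util.function.Predicate` objects. This is only an in-memory analogy that ignores scoring; `BoolPredicate` is a hypothetical class, and it assumes the default behaviour in which should clauses are required only when there are no must clauses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// In-memory analogy of bool query matching (no scoring).
final class BoolPredicate<T> {
    private final List<Predicate<T>> must = new ArrayList<>();
    private final List<Predicate<T>> should = new ArrayList<>();
    private final List<Predicate<T>> mustNot = new ArrayList<>();

    BoolPredicate<T> must(Predicate<T> clause) { must.add(clause); return this; }
    BoolPredicate<T> should(Predicate<T> clause) { should.add(clause); return this; }
    BoolPredicate<T> mustNot(Predicate<T> clause) { mustNot.add(clause); return this; }

    boolean matches(T doc) {
        for (Predicate<T> clause : must) {
            if (!clause.test(doc)) return false;   // every must clause has to match
        }
        for (Predicate<T> clause : mustNot) {
            if (clause.test(doc)) return false;    // no must_not clause may match
        }
        if (must.isEmpty() && !should.isEmpty()) { // should is required only without must
            for (Predicate<T> clause : should) {
                if (clause.test(doc)) return true;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("name", "spring css basics");
        doc.put("studymodel", "201001");
        BoolPredicate<Map<String, String>> query = new BoolPredicate<Map<String, String>>()
                .must(d -> d.get("name").contains("spring"))
                .must(d -> "201001".equals(d.get("studymodel")));
        System.out.println(query.matches(doc));
    }
}
```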

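1.7. Filters

A filter narrows the search results. It only decides whether a document matches and does not compute a relevance score, so filters are faster than queries and easy to cache. Prefer filters where possible, either on their own or combined with a query.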

{
    "_source" : [ "name", "studymodel", "description","price"],
    "query": {
        "bool" : {
            "must":[{
                "multi_match" : { 
                    "query" : "spring框架", 
                    "minimum_should_match": "50%",
                    "fields": [ "name^10", "description" ] 
                    }} ],
            "filter": [ { 
                "term": { "studymodel": "201001" }},
                        { "range": { 
                            "price": {
                                "gte": 60 ,"lte" : 100
                            }}} ] } } 
}

range: range filter; keeps the records whose price is greater than or equal to 60 and less than or equal to 100.

    /**
     * filter
     */
    @Test
    public void testBoolQueryByFilter() throws Exception {
        // Search request object
        SearchRequest searchRequest = new SearchRequest("xc_course");
        searchRequest.types("doc");
        // Search source builder
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Build the query with a BoolQuery
        // 1. multiMatchQueryBuilder
        MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("spring css", "name", "description")
                .minimumShouldMatch("80%")  // required match percentage
                .field("name", 10);
        // BoolQuery
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        boolQueryBuilder.must(multiMatchQueryBuilder);
        // 2. Define the filters
        boolQueryBuilder.filter(QueryBuilders.termQuery("studymodel","201001"));
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(60).lte(100));

        searchSourceBuilder.query(boolQueryBuilder);

        // Source filtering: the first argument lists the fields to include, the second the fields to exclude
        searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});
        // Attach the search source to the request
        searchRequest.source(searchSourceBuilder);
        // Execute the search (sends an HTTP request to ES)
        SearchResponse search = client.search(searchRequest);
        // Search results
        SearchHits hits = search.getHits();
        // Total number of matching records
        long totalHits = hits.getTotalHits();
        // Documents on the current page, ordered by relevance
        SearchHit[] hits1 = hits.getHits();
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        for (SearchHit hit : hits1){
            // Document ID
            String id = hit.getId();
            // Source document content
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            System.out.println(sourceAsMap);
        }
    }
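The split between scoring and filtering can be pictured with a small in-memory example. It is a toy model under stated assumptions: `FilterVsQueryDemo` and `Course` are hypothetical, and the "score" is just the number of query terms found in the name, which is not how ES computes relevance. The point is only that the price filter answers yes or no and never changes the ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Toy model: the query contributes a score, the filter only keeps or drops documents.
final class FilterVsQueryDemo {
    static final class Course {
        final String name;
        final double price;
        Course(String name, double price) { this.name = name; this.price = price; }
    }

    // Assumption: a naive score that counts how many query terms occur in the name.
    static int score(Course course, String... terms) {
        String name = course.name.toLowerCase(Locale.ROOT);
        int score = 0;
        for (String term : terms) {
            if (name.contains(term.toLowerCase(Locale.ROOT))) score++;
        }
        return score;
    }

    // Range filter: a pure yes/no decision, no score involved.
    static boolean priceInRange(Course course, double gte, double lte) {
        return course.price >= gte && course.price <= lte;
    }

    static List<String> search(List<Course> courses, double gte, double lte, String... terms) {
        List<Course> hits = new ArrayList<>();
        for (Course course : courses) {
            if (priceInRange(course, gte, lte) && score(course, terms) > 0) hits.add(course);
        }
        // Highest score first; the filter has no influence on the order.
        hits.sort(Comparator.comparingInt((Course c) -> score(c, terms)).reversed());
        List<String> names = new ArrayList<>();
        for (Course course : hits) names.add(course.name);
        return names;
    }

    public static void main(String[] args) {
        List<Course> courses = Arrays.asList(
                new Course("Spring basics", 80),
                new Course("Spring framework development", 90),
                new Course("Spring framework advanced", 150),
                new Course("Bootstrap", 70));
        System.out.println(search(courses, 60, 100, "spring", "framework"));
    }
}
```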

1.8. Sorting (sort)

{
    "_source" : [ "name", "studymodel", "description","price"], 
    "query": { 
        "bool" : { 
            "filter": [ { 
                "range": { 
                    "price": {
                        "gte": 0 ,"lte" : 100}}}
                      ] } }, 
    "sort" : [ {
        "studymodel" : "desc" }, 
         { "price" : "asc" } 
             ]
}

Java client:

        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // BoolQuery
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        // Define a filter
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(60).lte(100));
        searchSourceBuilder.query(boolQueryBuilder);
        // Add the sort criteria: studymodel descending, then price ascending
        searchSourceBuilder.sort("studymodel", SortOrder.DESC);
        searchSourceBuilder.sort("price", SortOrder.ASC);
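The two sort criteria behave like a chained comparator: the second key is only consulted when the first is equal. A minimal in-memory sketch of the same ordering follows; `SortDemo` and `Course` are hypothetical and have nothing to do with the ES client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// In-memory analogy of sort: studymodel descending, then price ascending.
final class SortDemo {
    static final class Course {
        final String name;
        final String studymodel;
        final double price;
        Course(String name, String studymodel, double price) {
            this.name = name;
            this.studymodel = studymodel;
            this.price = price;
        }
    }

    static final Comparator<Course> ORDER =
            Comparator.comparing((Course c) -> c.studymodel).reversed()
                    .thenComparingDouble(c -> c.price);

    static List<String> sortedNames(List<Course> courses) {
        List<Course> copy = new ArrayList<>(courses); // leave the input untouched
        copy.sort(ORDER);
        List<String> names = new ArrayList<>();
        for (Course course : copy) names.add(course.name);
        return names;
    }

    public static void main(String[] args) {
        List<Course> courses = Arrays.asList(
                new Course("A", "201001", 50),
                new Course("B", "201002", 90),
                new Course("C", "201002", 30));
        System.out.println(sortedNames(courses));
    }
}
```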

1.9. Highlighting

{
    "_source" : [ "name", "studymodel", "description","price"],
    "query": {
        "bool" : {
            "must":[{
                "multi_match" : { 
                    "query" : "开发框架", 
                    "minimum_should_match": "50%",
                    "fields": [ "name^10", "description" ] 
                    }} ],
            "filter": [ { 
                "term": { "studymodel": "201001" }},
                        { "range": { 
                            "price": {
                                "gte": 60 ,"lte" : 100
                            }}} ] } },
     "sort" : [ {
        "studymodel" : "desc" }, 
         { "price" : "asc" } 
             ],
    "highlight": {
        "pre_tags": ["<tag1>"],
        "post_tags": ["</tag1>"],
        "fields": { "name": {}, "description":{} } }                            
}

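When "开发框架" matches in name or description, the matched text is highlighted by wrapping it in the tags given in pre_tags (prefix) and post_tags (suffix).

Java client: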

    /**
     * Highlighting
     */
    @Test
    public void testHighlight() throws Exception {
        // Search request object
        SearchRequest searchRequest = new SearchRequest("xc_course");
        searchRequest.types("doc");
        // Search source builder
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Build the query with a BoolQuery
        // 1. multiMatchQueryBuilder
        MultiMatchQueryBuilder multiMatchQueryBuilder = QueryBuilders.multiMatchQuery("开发框架", "name", "description")
                .minimumShouldMatch("80%")  // required match percentage
                .field("name", 10);
        // BoolQuery
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        boolQueryBuilder.must(multiMatchQueryBuilder);
        // Define a filter
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("price").gte(0).lte(100));
        searchSourceBuilder.query(boolQueryBuilder);

        // Source filtering: the first argument lists the fields to include, the second the fields to exclude
        searchSourceBuilder.fetchSource(new String[]{"name","studymodel","price","timestamp"},new String[]{});

        // Configure highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.preTags("<Tag>");
        highlightBuilder.postTags("</Tag>");
        highlightBuilder.fields().add(new HighlightBuilder.Field("name"));
        searchSourceBuilder.highlighter(highlightBuilder);
        // Attach the search source to the request
        searchRequest.source(searchSourceBuilder);
        // Execute the search (sends an HTTP request to ES)
        SearchResponse search = client.search(searchRequest);
        // Search results
        SearchHits hits = search.getHits();
        // Total number of matching records
        long totalHits = hits.getTotalHits();
        // Documents on the current page, ordered by relevance
        SearchHit[] hits1 = hits.getHits();
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        for (SearchHit hit : hits1){
            // Document ID
            String id = hit.getId();
            // Source document content
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            // Extract the highlighted name field; it stays null if name was not highlighted
            String name = null;
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            if (highlightFields != null){
                HighlightField nameHighlightField = highlightFields.get("name");
                if (nameHighlightField != null){
                    // Join all highlighted fragments into one string
                    Text[] fragments = nameHighlightField.getFragments();
                    StringBuilder builder = new StringBuilder();
                    for (Text fra : fragments){
                        builder.append(fra);
                    }
                    name = builder.toString();
                }
            }
            System.out.println(name);
        }
    }
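To make the effect of the pre and post tags concrete, here is a toy version of highlighting that wraps every case-insensitive occurrence of one term in the given tags. `HighlightDemo.highlight` is a hypothetical helper; the actual highlighter in ES works on analyzed terms and returns fragments, which this sketch does not attempt to reproduce.

```java
// Toy highlighter: wraps each case-insensitive occurrence of a term in pre/post tags.
final class HighlightDemo {
    private HighlightDemo() {}

    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            // regionMatches returns false when the term would run past the end of the text
            if (text.regionMatches(true, i, term, 0, term.length())) {
                out.append(preTag).append(text, i, i + term.length()).append(postTag);
                i += term.length();
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("Spring development framework", "spring", "<tag1>", "</tag1>"));
    }
}
```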
Original article: https://www.cnblogs.com/cqyp/p/13651279.html