Full-Text Search with Elasticsearch

Version and download links

  ES 7.6.1

  • ES: https://mirrors.huaweicloud.com/elasticsearch/7.6.1/?C=N&O=D
  • Logstash: https://mirrors.huaweicloud.com/logstash/?C=N&O=D
  • Kibana: https://mirrors.huaweicloud.com/kibana/?C=N&O=D

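Directory layout

bin        startup scripts
config     configuration files
    log4j2.properties   logging configuration
    jvm.options         JVM settings
    elasticsearch.yml   Elasticsearch configuration (default HTTP port 9200)
lib        bundled JAR files
logs       log files
modules    built-in modules
plugins    plugins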

ES cluster visualization tool: elasticsearch-head

  Download: https://codeload.github.com/mobz/elasticsearch-head/zip/master

  Start it from the extracted directory:

cnpm install
npm run start

  Fixing the cross-origin error (add these lines to elasticsearch.yml to allow CORS requests, then restart ES)

http.cors.enabled: true
http.cors.allow-origin: "*"

Kibana

To switch the UI language to Chinese, set the following in kibana.yml: i18n.locale: "zh-CN"

Core ES concepts

  • Index
  • Field types (mapping)
  • Documents

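IK analyzer (Chinese word segmentation)

Download the IK analyzer release that matches your ES version, unpack it into a folder under ES's plugins directory, and restart ES.

Run elasticsearch-plugin list to check which plugins are loaded.

  • ik_smart: coarsest segmentation (fewest tokens)
  • ik_max_word: finest-grained segmentation, emitting every word the dictionary can match

GET _analyze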
{
  "analyzer": "ik_smart",
  "text": "林允儿爱吃苹果"
}

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "林允儿爱吃苹果"
}

Adding custom dictionary entries (IKAnalyzer.cfg.xml in the IK plugin's config directory)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionaries here -->
    <entry key="ext_dict">sgrslim.dic</entry>
    <!-- Configure your own extension stopword dictionaries here -->
    <entry key="ext_stopwords"></entry>
    <!-- Configure remote extension dictionaries here -->
    <!-- <entry key="remote_ext_dict">words_location</entry> -->
    <!-- Configure remote extension stopword dictionaries here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
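
The ext_dict entry above points at sgrslim.dic. As far as I know, IK reads such a file as UTF-8 plain text with one term per line, resolved relative to the plugin's config directory, and picks up local dictionaries when ES starts. A minimal Java sketch for generating one might look like this; the class name, the helper names and the output path are hypothetical and only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: writes and reads an IK extension dictionary
// (assumed format: UTF-8 text, one term per line).
final class IkDictionaryWriter {

    // Writes each term on its own line, encoded as UTF-8.
    static void writeDictionary(Path target, List<String> terms) {
        try {
            Files.write(target, terms, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the dictionary back as a list of terms.
    static List<String> readDictionary(Path source) {
        try {
            return Files.readAllLines(source, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed location: move the file next to IKAnalyzer.cfg.xml in your installation.
        Path dict = Path.of("sgrslim.dic");
        writeDictionary(dict, List.of("林允儿"));
        System.out.println(readDictionary(dict));
    }
}
```

After restarting ES, re-running the GET _analyze requests should show whether the added term is kept as a single token.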

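REST API conventions

Method   URL                                           Description
PUT      localhost:9200/{index}/{type}/{id}            Create a document with the given id
POST     localhost:9200/{index}/{type}                 Create a document with a generated id
POST     localhost:9200/{index}/{type}/{id}/_update    Update a document
DELETE   localhost:9200/{index}/{type}/{id}            Delete a document
GET      localhost:9200/{index}/{type}/{id}            Get a document by id
POST     localhost:9200/{index}/{type}/_search         Search all documents

Note: mapping types are deprecated in ES 7.x. Use _doc in place of the type name; the typeless forms of the last two kinds of call are {index}/_update/{id} and {index}/_search.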
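
Since ES 7.x deprecates the type segment in these paths, it may help to see the typeless form of the create-by-id call. The sketch below builds (but does not send) such a request with the JDK's java.net.http client, available from Java 11. The class name, the helper name, the index test3 and the host are assumptions chosen for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for building typeless document requests.
final class TypelessRequests {

    // Builds PUT http://<host>/<index>/_doc/<id> with a JSON body.
    static HttpRequest putDocument(String host, String index, String id, String json) {
        return HttpRequest.newBuilder(URI.create("http://" + host + "/" + index + "/_doc/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = putDocument("localhost:9200", "test3", "1",
                "{\"name\":\"sgrslim\",\"age\":18}");
        // Print the request line only; nothing is sent to the cluster here.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) with a cluster running on the assumed host.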

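Index operations

Creating an index

Indexing a document into an index that does not exist yet creates the index automatically:

PUT /{index}/{type}/{id}   (the type segment is deprecated in ES 7.x)
{request body}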

PUT /test1/type1/1
{
  "name":"sgrslim",
  "age":"18"
}

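Data types

  • String types: text, keyword
  • Numeric types: long, integer, short, byte, double, float, scaled_float, half_float
  • Date type: date
  • Boolean type: boolean
  • Binary type: binary
  • ...

Specifying field types

Create an index with an explicit mapping: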

PUT /test2
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "age":{
        "type": "integer"
      }
    }
  }
}

Getting index information

GET test2

Creating an index (ES 7 and later)

PUT /{index}/_doc/{id}
{request body}

PUT /test3/_doc/1
{
  "name":"sgrslim",
  "age":"18"
}

Tip: the GET _cat/ endpoints report a lot of information about the current state of ES.

Updating a document in an index

POST /test3/_update/1
{
  "doc":{
    "name":"法外狂徒"
  }
}

Deleting an index

DELETE test1

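Document operations

Updating with PUT vs. POST _update

## POST _update changes only the fields supplied under doc (here desc); PUT replaces the whole document, so any field left out of the request body is lost.
POST /test4/_update/1
{
  "doc": {
    "desc": "111"
  }
}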

Simple queries, source filtering, sorting, and pagination

The _source array restricts which fields are returned for each hit.

{
  "query": {
    "match": {
      "name": "sgrslim2"
    }
  },
  "_source": ["name", "age"]
}
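
Sorting and pagination are controlled by the sort, from and size keys of the search request body, where from is a document offset rather than a page number. The following Java sketch assembles such a body as a plain string. The class and method names, the choice of age as the sort field and the 1-based page convention are assumptions made for illustration.

```java
// Hypothetical helper that assembles a search request body by hand.
final class SearchBodySketch {

    // Converts a 1-based page number into the zero-based offset expected by "from".
    static int offset(int pageNo, int pageSize) {
        return Math.max(0, (pageNo - 1) * pageSize);
    }

    // Builds a match query with source filtering, a descending sort and paging.
    // Values are inserted verbatim, so they must not contain quotes or backslashes.
    static String searchBody(String field, String value, String sortField, int pageNo, int pageSize) {
        return "{"
                + "\"query\":{\"match\":{\"" + field + "\":\"" + value + "\"}},"
                + "\"_source\":[\"name\",\"age\"],"
                + "\"sort\":[{\"" + sortField + "\":{\"order\":\"desc\"}}],"
                + "\"from\":" + offset(pageNo, pageSize) + ","
                + "\"size\":" + pageSize
                + "}";
    }

    public static void main(String[] args) {
        // Second page of ten hits, sorted by age in descending order.
        System.out.println(searchBody("name", "sgrslim2", "age", 2, 10));
    }
}
```

The sort field should be a numeric, date or keyword field; sorting on an analyzed text field is rejected unless fielddata is enabled for it.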

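Bool query (combining multiple conditions)

must (AND): every clause has to match.

should (OR): a document matches if any one of the clauses matches (when no must or filter clause is present).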

GET /user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "sgrslim"
          }
        },
        {
          "match": {
            "age": 19
          }
        }
      ]
    }
  }
}

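Term query

  • A term query does not analyze the search term; it looks up the exact value in the inverted index.
  • A keyword field does not analyze the stored value; it is indexed as a single token.

GET /user/_search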
{
  "query": {
    "term": {
      "name": "sgrslim"
    }
  }
}

Spring Boot integration

Add the dependency

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.6.1</version>
</dependency>

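ES configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchClientConfig {

    // Exposes a single client bean that talks to the local node over HTTP
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http")));
    }
}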

Index API

Creating and getting an index

// 1. Build the create-index request
CreateIndexRequest sgr_index = new CreateIndexRequest("sgr_index");
RequestOptions requestOptions = RequestOptions.DEFAULT;
// 2. Execute it through the IndicesClient and receive the response
CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(sgr_index, requestOptions);

Document API

/**
 * Add a document.
 */
@Test
void testAddDocument() throws IOException {
    // 1. Create the object to index
    User sgrslim = new User("sgrslim", 18);
    // 2. Create the request
    IndexRequest indexRequest = new IndexRequest("sgr_index");

    // 3. Configure the request: PUT /sgr_index/_doc/1
    indexRequest.id("1");
    // The next two calls are equivalent ways to set a 1 second timeout; either one alone is enough
    indexRequest.timeout(TimeValue.timeValueSeconds(1));
    indexRequest.timeout("1s");

    // 4. Attach the data as JSON
    indexRequest.source(JSON.toJSONString(sgrslim), XContentType.JSON);

    // 5. Send the request through the client
    IndexResponse index = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
    System.out.println(index.toString());
    System.out.println(index.status());
}

/**
 * Check whether a document exists.
 * @throws IOException
 */
@Test
void testIsExist() throws IOException {
    GetRequest sgr_index = new GetRequest("sgr_index", "1");
    // Skip fetching _source and stored fields; only existence matters here
    sgr_index.fetchSourceContext(new FetchSourceContext(false));
    sgr_index.storedFields("_none_");

    boolean exists = restHighLevelClient.exists(sgr_index, RequestOptions.DEFAULT);
    System.out.println(exists);
}

/**
 * Get a document's content by id.
 * @throws IOException
 */
@Test
void testGetDocument() throws IOException {
    GetRequest sgr_index = new GetRequest("sgr_index", "1");
    GetResponse documentFields = restHighLevelClient.get(sgr_index, RequestOptions.DEFAULT);
    // Print the document's _source as a JSON string
    System.out.println(documentFields.getSourceAsString());
}

/**
 * Update a document's content.
 * @throws IOException
 */
@Test
void testUpdateDocument() throws IOException {
    // Only the fields set here are sent as the partial document
    User user = new User();
    user.setAge(29);
    UpdateRequest sgr_index = new UpdateRequest("sgr_index", "1");
    sgr_index.doc(JSON.toJSONString(user), XContentType.JSON);
    UpdateResponse update = restHighLevelClient.update(sgr_index, RequestOptions.DEFAULT);
}

/**
 * Insert documents in bulk.
 * @throws IOException
 */
@Test
void testBulkCreateDocument() throws IOException {
    BulkRequest sgr_index = new BulkRequest();
    ArrayList<User> userList = new ArrayList<>();
    userList.add(new User("sgr", 3));
    userList.add(new User("sgrs", 4));
    userList.add(new User("sgrsl", 5));

    // Add one index request per user, with ids starting at 2
    for (int i = 0; i < userList.size(); i++) {
        IndexRequest sgr_index1 = new IndexRequest("sgr_index")
                .id((i + 2) + "")
                .source(JSON.toJSONString(userList.get(i)), XContentType.JSON);
        sgr_index.add(sgr_index1);
    }
    BulkResponse bulk = restHighLevelClient.bulk(sgr_index, RequestOptions.DEFAULT);
    // false means every item in the bulk request succeeded
    System.out.println(bulk.hasFailures());
}

/**
 * Run a search.
 * @throws IOException
 */
@Test
void testSearchDocument() throws IOException {
    SearchRequest searchRequest = new SearchRequest("sgr_index");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Exact-term lookup on the name field
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "sgr");
    searchSourceBuilder.query(termQueryBuilder);
    //searchSourceBuilder.highlighter();
    searchRequest.source(searchSourceBuilder);
    SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    for (int i = 0; i < search.getHits().getHits().length; i++) {
        System.out.println(search.getHits().getHits()[i].toString());
    }
}

Integration example

@Autowired
private RestHighLevelClient restHighLevelClient;

// Bulk insert
public Boolean parseContent(String keyword) throws IOException {
    // Use the keyword argument (the original hard-coded "java" here)
    List<Content> contentList = new HtmlParseUtil().getList(keyword);
    BulkRequest bulkRequest = new BulkRequest();
    for (Content content : contentList) {
        IndexRequest goods_index = new IndexRequest("goods_index");
        goods_index.source(JSON.toJSONString(content), XContentType.JSON);
        bulkRequest.add(goods_index);
    }
    BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
    return !bulk.hasFailures();
}

// Search
public List<Map<String, Object>> searchList(String keyword, int pageNo, int pageSize) throws IOException {
    SearchRequest goods_index = new SearchRequest("goods_index");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchQueryBuilder title = QueryBuilders.matchQuery("title", keyword);
    searchSourceBuilder.query(title);
    // from is a document offset, not a page number; this assumes pageNo is 1-based
    searchSourceBuilder.from(Math.max(0, (pageNo - 1) * pageSize));
    searchSourceBuilder.size(pageSize);
    goods_index.source(searchSourceBuilder);
    SearchResponse search = restHighLevelClient.search(goods_index, RequestOptions.DEFAULT);
    List<Map<String, Object>> maps = new ArrayList<>();
    for (SearchHit hit : search.getHits().getHits()) {
        maps.add(hit.getSourceAsMap());
    }
    return maps;
}
Original article: https://www.cnblogs.com/sgrslimJ/p/13740757.html