Common Elasticsearch Problems

　　1. Aggregation fails with "Fielddata is disabled on text fields by default"

GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": {"field": "interests" }
    }
  }
}


{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "megacorp",
        "node": "sNvWT__lQl6p0dMTRaAOAg",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    }
  },
  "status": 400
}

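For aggregations, sorting, and scripting, text fields rely on a query-time, in-memory data structure called fielddata. It is built on demand the first time the field is used in an aggregation, a sort, or a script.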

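It is built by reading the entire inverted index of every segment from disk, inverting the term-to-document relationship (uninverting), and storing the result on the Java heap (in memory). This can consume a lot of heap space, especially when loading text fields with very high cardinality. Once fielddata has been loaded into the heap, it stays there for the lifetime of the segment.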
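To make "uninverting" concrete, here is a toy Python illustration. It is only a sketch of the idea, not how Lucene or Elasticsearch implement fielddata; the function name uninvert and the sample index are made up for this example.

```python
from collections import defaultdict


def uninvert(inverted_index):
    """Turn a term -> doc ids mapping into a doc id -> terms mapping."""
    doc_to_terms = defaultdict(list)
    for term, doc_ids in inverted_index.items():
        for doc_id in doc_ids:
            doc_to_terms[doc_id].append(term)
    # Sort the terms so the result does not depend on iteration order
    return {doc_id: sorted(terms) for doc_id, terms in doc_to_terms.items()}


# Hypothetical inverted index for the "interests" field of three documents
inverted = {
    "music": [1, 2],
    "sports": [1],
    "forestry": [3],
}

fielddata = uninvert(inverted)
print(fielddata)  # {1: ['music', 'sports'], 2: ['music'], 3: ['forestry']}
```

Every term of every document ends up in the result, which is why the real structure grows with both the number of documents and the number of distinct terms.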

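Loading fielddata is therefore an expensive operation that can cause noticeable latency for users, which is why fielddata is disabled by default.

To aggregate on the field anyway, enable fielddata in its mapping: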

PUT megacorp/_mapping/employee/
{
  "properties": {
    "interests": { 
      "type":     "text",
      "fielddata": true
    }
  }
}
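The error message itself suggests using a keyword field instead of enabling fielddata. A minimal Python sketch of that alternative follows. It assumes the interests field has a keyword sub-field named interests.keyword (dynamic mapping usually creates one for strings, but an explicit mapping may not); the helper name terms_agg_body is made up for illustration.

```python
import json


def terms_agg_body(field, agg_name="all_interests", use_keyword_subfield=True):
    """Build a terms aggregation body, optionally targeting <field>.keyword."""
    target = f"{field}.keyword" if use_keyword_subfield else field
    # "size": 0 skips the search hits so only aggregation buckets are returned
    return {"size": 0, "aggs": {agg_name: {"terms": {"field": target}}}}


# Assumption: "interests.keyword" exists in the index mapping
body = terms_agg_body("interests")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same _search request shown at the top of this section. Keyword fields aggregate on the whole, unanalyzed value, so the buckets may differ from those produced with fielddata on the analyzed text.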

　　2. Too many dynamic script compilations within, max: [75/5m]

Raise the maximum script compilation rate. This is a cluster-wide setting, not an index setting:

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'{ "transient": { "script.max_compilations_rate": "100000/1m"}}'

　　3. max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Run the following as the root user.

　　Temporary change:

# Takes effect immediately, no reload needed; the value reverts after a reboot
sysctl -w vm.max_map_count=262144

　　Permanent change:

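# Append (>>) so the existing entries in /etc/sysctl.conf are preserved
echo "vm.max_map_count=262144" >> /etc/sysctl.conf
# Reload /etc/sysctl.conf to apply the setting now
sysctl -p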
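To confirm the value the kernel is actually using, the current setting can be read back. The following Python sketch assumes a Linux host where /proc/sys/vm/max_map_count exists; the function names are illustrative only.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # minimum named in the error message


def parse_max_map_count(raw):
    """Parse the contents of /proc/sys/vm/max_map_count."""
    return int(raw.strip())


def is_too_low(current, required=REQUIRED_MAX_MAP_COUNT):
    """Return True when the current value is below the required minimum."""
    return current < required


if __name__ == "__main__":
    proc_file = Path("/proc/sys/vm/max_map_count")  # Linux only
    if proc_file.exists():
        current = parse_max_map_count(proc_file.read_text())
        print(f"vm.max_map_count = {current}, too low: {is_too_low(current)}")
```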

　　4. the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured

　　Solution: set node.name and the discovery settings in elasticsearch.yml, for example:

cluster.name: "docker-cluster"
network.host: 0.0.0.0


# custom config
node.name: "node-1"
discovery.seed_hosts: ["127.0.0.1", "[::1]"]
cluster.initial_master_nodes: ["node-1"]
# Enable CORS support (default: false)
http.cors.enabled: true
# Origins allowed for CORS; the regular expression below allows every origin
http.cors.allow-origin: /.*/


Key point: node.name must be set, and cluster.initial_master_nodes must list that same name.
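The two conditions can be expressed as a small check. This Python sketch works on a plain dict of settings (loading elasticsearch.yml is left out) and assumes the single-node setup shown above, where the node itself must appear in cluster.initial_master_nodes; the function name bootstrap_problems is hypothetical.

```python
def bootstrap_problems(settings):
    """Return a list of problems found in the discovery-related settings."""
    problems = []
    discovery_keys = (
        "discovery.seed_hosts",
        "discovery.seed_providers",
        "cluster.initial_master_nodes",
    )
    # The error message requires at least one of these to be configured
    if not any(settings.get(key) for key in discovery_keys):
        problems.append("none of the discovery settings is configured")

    # Assumption: single-node setup, so this node must be an initial master
    initial_masters = settings.get("cluster.initial_master_nodes") or []
    node_name = settings.get("node.name")
    if initial_masters and node_name not in initial_masters:
        problems.append(
            f"node.name {node_name!r} is not in cluster.initial_master_nodes"
        )
    return problems


good = {
    "node.name": "node-1",
    "discovery.seed_hosts": ["127.0.0.1", "[::1]"],
    "cluster.initial_master_nodes": ["node-1"],
}
bad = {"node.name": "node-1", "cluster.initial_master_nodes": ["node-2"]}

print(bootstrap_problems(good))  # []
print(bootstrap_problems(bad))
```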

　　5. A newly started ES node sometimes cannot join the cluster and only runs standalone

　　Solution:

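　　　　Delete the node's old data and restart the service. This typically happens because the data directory still holds the state from an earlier standalone run. Note that the command below removes every index stored in that directory.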

rm -rf /elasticsearch/data/*

　　6. Adding a second type to the same index fails with: Rejecting mapping update to [website] as the final mapping would have more than 1 type: [blog2, blog]

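Since Elasticsearch 6.0 an index can contain only one mapping type, and from 7.0 onward types are deprecated, so index documents without specifying a type.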

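　　7. Index becomes read-only when the disk is nearly full

Cause: insufficient disk space. Once disk usage exceeds 95%, ES puts the index into read-only mode, in which data can still be deleted (check usage with df -h). After freeing space, remove the block by sending the following body to the index settings API (PUT <index>/_settings):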

{
  "index": {
    "blocks": {
      "read_only_allow_delete": "false"
    }
  }
}
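The 95% check and the unblock request can be sketched in Python as follows. This is a rough illustration only: the path "/" and the index name megacorp are placeholders, the helper names are made up, and the used/total ratio computed here may differ slightly from the percentage df reports.

```python
import shutil

FLOOD_STAGE = 0.95  # the 95% threshold described above


def over_flood_stage(total, used, threshold=FLOOD_STAGE):
    """Return True when the used fraction of the disk exceeds the threshold."""
    return used / total > threshold


def unblock_request(index):
    """Return (method, path, body) for clearing the read-only block."""
    body = {"index": {"blocks": {"read_only_allow_delete": "false"}}}
    return "PUT", f"/{index}/_settings", body


# Assumption: "/" stands in for the filesystem that holds the ES data path
usage = shutil.disk_usage("/")
print("over flood stage:", over_flood_stage(usage.total, usage.used))

# Assumption: "megacorp" is only a placeholder index name
print(unblock_request("megacorp"))
```

Clearing the block while the disk is still over the threshold is unlikely to help for long, so free space first.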

　　8. Can't update non dynamic settings [[index.analysis.filter

Scenario: the error occurs when adding an analyzer to an existing index.

Solution: close the index, apply the settings, then reopen the index.

# Close the index
POST mp_account2/_close

# Configure the analyzer
PUT mp_account2/_settings
{
  "index": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": "pinyin_filter"
        }
      },
      "filter": {
        "pinyin_filter": {
          "type": "pinyin",
          "keep_first_letter": false
        }
      }
    }
  }
}

# Reopen the index
POST mp_account2/_open
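The same close, update, reopen sequence can be scripted. The Python sketch below uses only the standard library; the helper names and the base URL http://localhost:9200 are assumptions, and the network call is left commented out so the snippet runs without a cluster.

```python
import json
import urllib.request


def analysis_update_steps(index, analysis):
    """Return the ordered (method, path, body) requests: close, update, reopen."""
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", {"index": {"analysis": analysis}}),
        ("POST", f"/{index}/_open", None),
    ]


def send(base_url, method, path, body=None):
    """Send one request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same analysis settings as in the console example above
analysis = {
    "analyzer": {
        "ik_pinyin_analyzer": {
            "type": "custom",
            "tokenizer": "ik_smart",
            "filter": "pinyin_filter",
        }
    },
    "filter": {"pinyin_filter": {"type": "pinyin", "keep_first_letter": False}},
}

steps = analysis_update_steps("mp_account2", analysis)
for method, path, _ in steps:
    print(method, path)

# To run against a cluster (the base URL is an assumption):
# for method, path, body in steps:
#     send("http://localhost:9200", method, path, body)
```

If the settings update fails, the index stays closed until the final _open request is sent, so a real script should make sure that last step always runs.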