Data migration with esm

1. References

esm source code

2. Installation and running

2.1 Download

Releases download page

2.2 Install and run

On macOS, download darwin64.tar.gz, then unpack it and run the binary:


mkdir -p /Users/yz/work/github/esm/

cd /Users/yz/work/github/esm/

tar -zxvf darwin64.tar.gz

cd /Users/yz/work/github/esm/bin/darwin64

./esm --help

3. Usage scenarios

3.1 Command-line options

Usage:
  esm [OPTIONS]

Application Options:
  -s, --source=                    source elasticsearch instance, ie: http://localhost:9200
  -q, --query=                     query against source elasticsearch instance, filter data before migrate, ie: name:medcl
  -d, --dest=                      destination elasticsearch instance, ie: http://localhost:9201
  -m, --source_auth=               basic auth of source elasticsearch instance, ie: user:pass
  -n, --dest_auth=                 basic auth of target elasticsearch instance, ie: user:pass
  -c, --count=                     number of documents at a time: ie "size" in the scroll request (10000)
      --buffer_count=              number of buffered documents in memory (1000000)
  -w, --workers=                   concurrency number for bulk workers (1)
  -b, --bulk_size=                 bulk size in MB (5)
  -t, --time=                      scroll time (10m)
      --sliced_scroll_size=        size of sliced scroll, to make it work, the size should be > 1 (1)
  -f, --force                      delete destination index before copying
  -a, --all                        copy indexes starting with . and _
      --copy_settings              copy index settings from source
      --copy_mappings              copy index mappings from source
      --shards=                    set a number of shards on newly created indexes
  -x, --src_indexes=               indexes name to copy,support regex and comma separated list (_all)
  -y, --dest_index=                indexes name to save, allow only one indexname, original indexname will be used if not
                                   specified
  -u, --type_override=             override type name
      --green                      wait for both hosts cluster status to be green before dump. otherwise yellow is okay
  -v, --log=                       setting log level,options:trace,debug,info,warn,error (INFO)
  -o, --output_file=               output documents of source index into local file
  -i, --input_file=                indexing from local dump file
      --input_file_type=           the data type of input file, options: dump, json_line, json_array, log_line (dump)
      --source_proxy=              set proxy to source http connections, ie: http://127.0.0.1:8080
      --dest_proxy=                set proxy to target http connections, ie: http://127.0.0.1:8080
      --refresh                    refresh after migration finished
      --fields=                    filter source fields, comma separated, ie: col1,col2,col3,...
      --rename=                    rename source fields, comma separated, ie: _type:type, name:myname
  -l, --logstash_endpoint=         target logstash tcp endpoint, ie: 127.0.0.1:5055
      --secured_logstash_endpoint  target logstash tcp endpoint was secured by TLS
      --repeat_times=              repeat the data from source N times to dest output, use align with parameter regenerate_id
                                   to amplify the data size
  -r, --regenerate_id              regenerate id for documents, this will override the exist document id in data source
      --compress                   use gzip to compress traffic
  -p, --sleep=                     sleep N seconds after each bulk request (-1)

Help Options:
  -h, --help                       Show this help message
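
To show how the options above combine, here is a minimal sketch of a direct cluster-to-cluster copy. The hosts, index name, and tuning values are placeholders rather than values from a real deployment, and the helper name build_esm_cmd is made up for illustration. Every flag is taken from the help output above. The script only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch: assemble an esm command that copies one index between two clusters.
# All values below are placeholders (assumptions); adjust them to your setup.
SRC="http://localhost:9200"    # -s: source cluster
DEST="http://localhost:9201"   # -d: destination cluster
SRC_INDEX="yz_test"            # -x: index to copy (original name is kept, since -y is omitted)
WORKERS=4                      # -w: number of concurrent bulk workers
BULK_MB=5                      # -b: bulk size in MB
SCROLL_SIZE=5000               # -c: documents per scroll request

build_esm_cmd() {
  printf './esm -s %s -d %s -x "%s" -w %s -b %s -c %s --copy_settings --copy_mappings --refresh\n' \
    "$SRC" "$DEST" "$SRC_INDEX" "$WORKERS" "$BULK_MB" "$SCROLL_SIZE"
}

# Print the command for review instead of executing it.
build_esm_cmd
```

If the clusters require basic auth, -m user:pass and -n user:pass can be appended for the source and the destination respectively.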

3.2 Exporting index data to a local file

./esm -s http://127.0.0.1:9200 -x "yz_test" -m elastic:password -c 5000 --refresh -o=dump.bin --copy_mappings

./esm -s http://127.0.0.1:9200 -x "logging" -q "date:[1610467200000 TO 1610553600000]" -m elastic:password -c 5000 --refresh -o=2021-01-13-log.bin

./esm -s http://127.0.0.1:9200 -x "logging" -q "date:[1610467200000 TO 1610553600000]" -m elastic:password -c 5000 --refresh -o=2021-01-13-log.bin --copy_settings --copy_mappings
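
The bounds in the -q range are epoch milliseconds: 1610467200000 and 1610553600000 are midnight on 2021-01-13 and 2021-01-14 in UTC+8. As a rough sketch, the helper below derives such bounds from a calendar date. The function name day_to_epoch_ms and its time zone argument are illustrative assumptions. It tries the GNU date syntax first and falls back to the BSD syntax used on macOS.

```shell
#!/bin/sh
# Print midnight of a day (YYYY-MM-DD) in the given time zone as epoch milliseconds.
day_to_epoch_ms() {
  tz_name="$1"
  day="$2"
  # GNU date (Linux) understands -d; BSD date (macOS) needs -j -f instead.
  if ! secs=$(TZ="$tz_name" date -d "$day 00:00:00" +%s 2>/dev/null); then
    secs=$(TZ="$tz_name" date -j -f "%Y-%m-%d %H:%M:%S" "$day 00:00:00" +%s)
  fi
  echo "${secs}000"
}

# Bounds for 2021-01-13 in UTC+8.
# "CST-8" is a POSIX TZ string: zone name CST, 8 hours east of UTC.
from=$(day_to_epoch_ms "CST-8" 2021-01-13)
to=$(day_to_epoch_ms "CST-8" 2021-01-14)

# Print the export command instead of running it.
echo "./esm -s http://127.0.0.1:9200 -x \"logging\" -q \"date:[$from TO $to]\" -m elastic:password -c 5000 --refresh -o=2021-01-13-log.bin"
```

Looping this over several dates would give one dump file per day, which keeps each export small enough to retry on its own.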

3.3 Restoring data from a local file into an index


./esm -d http://127.0.0.1:9200 -y "yz_test_recovery" -n elastic:password -c 5000 -b 5 --refresh -i=dump.bin

./esm -d http://127.0.0.1:9200 -y "log_recovery" -n elastic:password -c 5000 -b 5 --refresh -i=2021-01-13-log.bin

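Limitations:

-y, --dest_index accepts only a single index, and that index must already exist.

--copy_mappings: the mappings and settings of the newly created index have to be specified again.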
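
Since the destination index has to exist before a restore, one option is to create it first through the Elasticsearch create-index API. The following is only a sketch: the shard and replica counts are placeholder values, the helper name index_body is made up for illustration, and both commands are printed rather than executed so they can be checked first.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust host, credentials, and index name to your setup.
ES="http://127.0.0.1:9200"
AUTH="elastic:password"
INDEX="yz_test_recovery"

# Build a minimal create-index request body with explicit settings.
# A "mappings" object can be added here to re-specify the source index mapping.
index_body() {
  shards="$1"
  replicas="$2"
  printf '{"settings":{"number_of_shards":%s,"number_of_replicas":%s}}\n' "$shards" "$replicas"
}

# Print the request that creates the index (PUT /<index>), then the restore command.
echo "curl -u $AUTH -X PUT \"$ES/$INDEX\" -H 'Content-Type: application/json' -d '$(index_body 1 1)'"
echo "./esm -d $ES -y \"$INDEX\" -n $AUTH -c 5000 -b 5 --refresh -i=dump.bin"
```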

Original article: https://www.cnblogs.com/thewindyz/p/14308900.html