Benchmarking Elasticsearch Performance with ESRally; Installing Python 3.7 on CentOS 7.5

1. Installing Python 3.7 on CentOS 7.5

1. Install the development tools

yum -y groupinstall "Development Tools"
2. Install the Python build dependencies

yum -y install openssl-devel zlib-devel bzip2-devel sqlite-devel readline-devel libffi-devel systemtap-sdt-devel
3. Download the source tarball

wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tgz
4. Extract and build

tar zvxf Python-3.7.0.tgz
cd Python-3.7.0
./configure --prefix=/usr/local/python3.7 --enable-optimizations
make && make install

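# After the build completes, symlink the executables into a directory on PATH
# (the paths must match the --prefix passed to ./configure above):
ln -s /usr/local/python3.7/bin/python3 /usr/bin/python3
ln -s /usr/local/python3.7/bin/pip3 /usr/bin/pip3
# Optionally remove the build artifacts and all generated configuration files:
make clean && make distclean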


5. Configure the environment variables

File: /etc/profile.d/python37.sh

if [ -z "${PYTHON37_HOME}" ]; then
export PYTHON37_HOME=/usr/local/python3.7
export PATH=${PYTHON37_HOME}/bin:${PATH}
fi
6. Load the environment variables

source /etc/profile.d/python37.sh
7. Test

python3 -c "import sys; print(sys.version)"

Bug: the pip command fails
2.1 Error message
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Collecting virtualenv
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
Could not fetch URL https://pypi.org/simple/virtualenv/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/virtualenv/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
Could not find a version that satisfies the requirement virtualenv (from versions: )
No matching distribution found for virtualenv
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping

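2.2 Cause
The system is CentOS 6.5, whose OpenSSL version is OpenSSL 1.0.1e-fips 11 Feb 2013, whereas Python 3.7 requires OpenSSL 1.0.2 or 1.1.x. OpenSSL therefore has to be upgraded and Python 3.7.0 rebuilt against it. The OpenSSL versions that yum installs are all too old.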
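
One way to confirm this diagnosis is to ask the interpreter which OpenSSL it was linked against. The snippet below is a minimal sketch and not part of the original procedure: the helper name openssl_meets_minimum is made up for illustration, and the (1, 0, 2) threshold is simply the requirement stated above.

```python
import re


def openssl_meets_minimum(version_text, minimum=(1, 0, 2)):
    """Return True if an OpenSSL version string is at least `minimum`.

    Hypothetical helper: only the numeric major.minor.patch part is
    compared; letter suffixes such as the "e" in 1.0.1e are ignored.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no version number found in %r" % version_text)
    return tuple(int(part) for part in match.groups()) >= minimum


if __name__ == "__main__":
    try:
        import ssl  # the import fails if Python was built without OpenSSL
    except ImportError:
        print("ssl module missing: rebuild Python against a newer OpenSSL")
    else:
        print(ssl.OPENSSL_VERSION, openssl_meets_minimum(ssl.OPENSSL_VERSION))
```

If the import fails or the check returns False, pip cannot reach HTTPS indexes, which matches the error shown in 2.1.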

2.3 Upgrading OpenSSL
# 1. Download OpenSSL
wget https://www.openssl.org/source/openssl-1.1.1a.tar.gz
tar -zxvf openssl-1.1.1a.tar.gz
cd openssl-1.1.1a
# 2. Build and install
./config --prefix=/usr/local/openssl no-zlib  # zlib is not needed
make
make install
# 3. Back up the original files
mv /usr/bin/openssl /usr/bin/openssl.bak
mv /usr/include/openssl/ /usr/include/openssl.bak
# 4. Link in the new version
ln -s /usr/local/openssl/include/openssl /usr/include/openssl
ln -s /usr/local/openssl/lib/libssl.so.1.1 /usr/local/lib64/libssl.so
ln -s /usr/local/openssl/bin/openssl /usr/bin/openssl
# 5. Update the system configuration
## Add the OpenSSL library directory to the linker search path
echo "/usr/local/openssl/lib" >> /etc/ld.so.conf
## Apply the change made to /etc/ld.so.conf
ldconfig -v
# 6. Check the OpenSSL version
openssl version

If openssl version reports:

 /usr/local/openssl/bin/openssl: error while loading shared libraries: libssl.so.1.1: cannot open shared object file: No such file or directory

and your libssl.so.1.1 is located in /usr/local/openssl/lib/, create these symlinks:

ln -s /usr/local/openssl/lib/libssl.so.1.1 /usr/lib64/libssl.so.1.1

ln -s /usr/local/openssl/lib/libcrypto.so.1.1 /usr/lib64/libcrypto.so.1.1

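Then rebuild Python 3.7:
./configure --prefix=/usr/local/python3.7 --with-openssl=/usr/local/openssl
make && make install

If the build fails with:
Fatal Python error: initfsencoding: Unable to get the locale encoding
LookupError: unknown encoding: GB18030

set the locale:
export LANG=zh_CN.UTF-8
export LANGUAGE=zh_CN.UTF-8
That resolved the problem. Once the installation has finished, unset the two variables again.
If pip then fails to find the package (No matching distribution found for esrally), use the Douban mirror in China. Note that -i takes the index URL, while --trusted-host takes only the host name:
 pip3 install -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com esrally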

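 2. Installing Git 2

The default Git on CentOS 7 is version 1.8, which turned out to be too old for building the project, so it is upgraded here by compiling from source.

Installation steps

1. Uninstall the existing Git.

yum remove git

2. Install the dependencies

yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel asciidoc
yum install  gcc perl-ExtUtils-MakeMaker

3. Install Git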

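# The GitHub archive (https://github.com/git/git/archive/v2.10.5.tar.gz) ships without a configure script,
# so it cannot be pointed at the upgraded OpenSSL 1.1. Use the kernel.org tarball instead:
wget https://www.kernel.org/pub/software/scm/git/git-2.11.1.tar.gz

tar -xzvf git-2.11.1.tar.gz
cd git-2.11.1
Build and install Git (if OpenSSL was upgraded to 1.1, pass its location with --with-openssl=/usr/local/openssl)

./configure --prefix=/usr/local/git --with-openssl=/usr/local/openssl

make && sudo make install

Configure the environment variable (single quotes keep $PATH from being expanded when the line is written)

echo 'export PATH=$PATH:/usr/local/git/bin' >> /etc/profile && source /etc/profile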

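Check the Git version

git --version

Installation is complete.
Generate an SSH key:
ssh-keygen -t rsa -C "xxx@gmail.com"
Log in to GitHub, go to Edit your profile -> SSH keys, and add the contents of ~/.ssh/id_rsa.pub

Troubleshooting

Normally the steps above are all that is needed. The following summarizes a few problems encountered during installation.
1. Running make prefix=/usr/local/git all fails with the following error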

 LINK git-credential-store
libgit.a(utf8.o): In function `reencode_string_iconv':
/usr/src/git-2.8.3/utf8.c:463: undefined reference to `libiconv'
libgit.a(utf8.o): In function `reencode_string_len':
/usr/src/git-2.8.3/utf8.c:502: undefined reference to `libiconv_open'
/usr/src/git-2.8.3/utf8.c:521: undefined reference to `libiconv_close'
/usr/src/git-2.8.3/utf8.c:515: undefined reference to `libiconv_open'
collect2: ld returned 1 exit status
make: *** [git-credential-store] Error 1

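This happens mainly because the system lacks the libiconv library. Download and install libiconv as follows.

cd /usr/local/src
wget http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.14.tar.gz
tar -zxvf libiconv-1.14.tar.gz
cd libiconv-1.14
Configure
./configure --prefix=/usr/local/libiconv
Build
make
Install
make install
Create the symlinks (the library lands under the --prefix chosen above)
ln -s /usr/local/libiconv/lib/libiconv.so /usr/lib
ln -s /usr/local/libiconv/lib/libiconv.so.2 /usr/lib

libiconv is now installed. Go back to the Git source directory and build as follows

make configure
./configure --prefix=/usr/local --with-iconv=/usr/local/libiconv
Build
make
Install
make install
Add the install directory to PATH (PATH entries are directories, not the git binary itself)
export PATH=$PATH:/usr/local/bin
Check the version
git --version

2. Building libiconv may fail with ./stdio.h:1010:1: error: ‘gets’ undeclared here (not in a function). The following steps fix it.

Go to the directory containing the offending file
cd libiconv-1.14/srclib
Edit stdio.in.h and find the line, around line 698, that reads _GL_WARN_ON_USE (gets, "gets is a security hole - use fgets instead");
Comment that line out (be sure to use /* */ comment syntax) and replace it with the following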
#if defined(__GLIBC__) && !defined(__UCLIBC__) && !__GLIBC_PREREQ(2, 16)
_GL_WARN_ON_USE (gets, "gets is a security hole - use fgets instead");
#endif

3. Compiling Git fails with the following error:

[root@localhost git-2.4.5]# make
SUBDIR perl
/usr/bin/perl Makefile.PL PREFIX='/usr/local/git' INSTALL_BASE='' --localedir='/usr/local/git/share/locale'
Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at Makefile.PL line 3.
BEGIN failed--compilation aborted at Makefile.PL line 3.
make[1]: *** [perl.mak] Error 2
make: *** [perl/perl.mak] Error 2

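Solution: install the missing Perl module (ExtUtils::MakeMaker is provided by perl-ExtUtils-MakeMaker):
yum install -y perl-ExtUtils-MakeMaker perl-ExtUtils-Embed
Then run the build again.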



Author: 一介布衣q
Link: https://www.imooc.com/article/275738
Source: 慕课网 (imooc)
Originally published on imooc. Please credit the source when reposting.

——————————————————————————————————————————

4. Benchmarking Elasticsearch performance with ESRally

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

| Metric | Task | Value | Unit |
|-------------------------------:|---------------------:|----------:|-------:|
| Total indexing time | | 28.0997 | min |
| Total merge time | | 6.84378 | min |
| Total refresh time | | 3.06045 | min |
| Total flush time | | 0.106517 | min |
| Total merge throttle time | | 1.28193 | min |
| Median CPU usage | | 471.6 | % |
| Total Young Gen GC | | 16.237 | s |
| Total Old Gen GC | | 1.796 | s |
| Index size | | 2.60124 | GB |
| Total written | | 11.8144 | GB |
| Heap used for segments | | 14.7326 | MB |
| Heap used for doc values | | 0.115917 | MB |
| Heap used for terms | | 13.3203 | MB |
| Heap used for norms | | 0.0734253 | MB |
| Heap used for points | | 0.5793 | MB |
| Heap used for stored fields | | 0.643608 | MB |
| Segment count | | 97 | |
| Min Throughput | index-append | 31925.2 | docs/s |
| Median Throughput | index-append | 39137.5 | docs/s |
| Max Throughput | index-append | 39633.6 | docs/s |
| 50.0th percentile latency | index-append | 872.513 | ms |
| 90.0th percentile latency | index-append | 1457.13 | ms |
| 99.0th percentile latency | index-append | 1874.89 | ms |
| 100th percentile latency | index-append | 2711.71 | ms |
| 50.0th percentile service time | index-append | 872.513 | ms |
| 90.0th percentile service time | index-append | 1457.13 | ms |
| 99.0th percentile service time | index-append | 1874.89 | ms |
| 100th percentile service time | index-append | 2711.71 | ms |
| ... | ... | ... | ... |
| ... | ... | ... | ... |
| Min Throughput | painless_dynamic | 2.53292 | ops/s |
| Median Throughput | painless_dynamic | 2.53813 | ops/s |
| Max Throughput | painless_dynamic | 2.54401 | ops/s |
| 50.0th percentile latency | painless_dynamic | 172208 | ms |
| 90.0th percentile latency | painless_dynamic | 310401 | ms |
| 99.0th percentile latency | painless_dynamic | 341341 | ms |
| 99.9th percentile latency | painless_dynamic | 344404 | ms |
| 100th percentile latency | painless_dynamic | 344754 | ms |
| 50.0th percentile service time | painless_dynamic | 393.02 | ms |
| 90.0th percentile service time | painless_dynamic | 407.579 | ms |
| 99.0th percentile service time | painless_dynamic | 430.806 | ms |
| 99.9th percentile service time | painless_dynamic | 457.352 | ms |
| 100th percentile service time | painless_dynamic | 459.474 | ms |

----------------------------------
[INFO] SUCCESS (took 2634 seconds)
----------------------------------
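
To compare runs it helps to turn a report like the one above into structured records. The following is a minimal sketch that assumes the four-column pipe-delimited layout shown here; parse_report and lookup are hypothetical helper names, not part of Rally, and SAMPLE copies a few rows from the report above.

```python
SAMPLE = """\
| Metric | Task | Value | Unit |
|-------:|-----:|------:|-----:|
| Segment count | | 97 | |
| Median Throughput | index-append | 39137.5 | docs/s |
| ... | ... | ... | ... |
| 99.0th percentile latency | painless_dynamic | 341341 | ms |
| 99.0th percentile service time | painless_dynamic | 430.806 | ms |
"""


def parse_report(text):
    """Parse a pipe-delimited report into a list of metric records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # banner, blank, or status lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4:
            continue
        metric, task, value, unit = cells
        try:
            number = float(value)
        except ValueError:
            continue  # header, separator, or "..." filler rows
        records.append(
            {"metric": metric, "task": task, "value": number, "unit": unit}
        )
    return records


def lookup(records, metric, task=""):
    """Return the value of one metric, optionally for a specific task."""
    for record in records:
        if record["metric"] == metric and record["task"] == task:
            return record["value"]
    raise KeyError((metric, task))


if __name__ == "__main__":
    records = parse_report(SAMPLE)
    latency = lookup(records, "99.0th percentile latency", "painless_dynamic")
    service = lookup(records, "99.0th percentile service time", "painless_dynamic")
    print("p99 latency / p99 service time = %.1f" % (latency / service))
```

In this report the painless_dynamic latency is far larger than its service time, which is worth checking first when reading results, since latency also includes the time a request waits before it is sent.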

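After deploying an Elasticsearch cluster, we naturally want to know how well it performs, whether it can support future business growth, and whether it has any performance bottlenecks. Answering these questions on solid evidence requires load-test results.

Introduction

Rally is the benchmarking framework for Elasticsearch, provided and maintained by Elastic.

Installation

  1. Install Python 3.5 or later. The system default may be 2.x; if you need to upgrade, see "Installing Python 3 on CentOS 7".
  2. Install Git 1.9 or later.
  3. Install esrally: pip3 install esrally
  4. Configure esrally: esrally configure. This command creates a .rally directory in the current user's home directory, which you can confirm with ll ~/.rally.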
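
The first two prerequisites can be checked before running pip3 install esrally. This is a minimal sketch under the version requirements listed above; parse_git_version and check_prerequisites are hypothetical helper names.

```python
import re
import shutil
import subprocess
import sys


def parse_git_version(output):
    """Extract (major, minor) from the output of `git --version`."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unexpected git output: %r" % output)
    return int(match.group(1)), int(match.group(2))


def check_prerequisites(min_python=(3, 5), min_git=(1, 9)):
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d is too old" % sys.version_info[:2])
    if shutil.which("git") is None:
        problems.append("git is not on PATH")
    else:
        output = subprocess.check_output(["git", "--version"]).decode()
        if parse_git_version(output) < min_git:
            problems.append("git is too old: " + output.strip())
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

On a stock CentOS 7 machine this would be expected to flag Git, since the default package is version 1.8.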

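Usage

Quick start

To benchmark a single-node Elasticsearch of a particular version on the current machine, run:

esrally --distribution-version=6.5.3

# Likewise, to benchmark another version
esrally --distribution-version=6.8.1 --car="4gheap"

After you run the command above, Rally downloads the corresponding Elasticsearch release, starts it locally, and then runs the benchmark. Rally calls this process a race; the track used here is the default one, geonames.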

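Benchmarking a remote cluster

The example above cannot benchmark an existing Elasticsearch cluster. To do that:

  1. Specify the track and the cluster addresses, then run the benchmark. The --challenge and --include-tasks options below restrict the run to indexing (writes only).

esrally --pipeline=benchmark-only \
  --track=http_logs \
  --target-hosts=192.168.1.100:9200,192.168.1.101:9200,192.168.1.102:9200 \
  --report-file=/tmp/report_http_logs.md \
  --track-params="bulk_indexing_clients:96" \
  --include-tasks="index-append" \
  --challenge=append-fast-with-no-conflicts \
  --report-format=csv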
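
When the same benchmark is run against several clusters, assembling the command programmatically avoids copy-and-paste mistakes in the host list. The sketch below only builds the argument list from the options used above; build_benchmark_command is a hypothetical helper, and it deliberately does not start esrally itself.

```python
import shlex


def build_benchmark_command(track, hosts, report_file, extra_options=None):
    """Build the esrally argument list for a benchmark-only run.

    `extra_options` maps option names (without the leading dashes) to
    values, for example {"challenge": "append-fast-with-no-conflicts"}.
    """
    args = [
        "esrally",
        "--pipeline=benchmark-only",
        "--track=" + track,
        "--target-hosts=" + ",".join(hosts),
        "--report-file=" + report_file,
    ]
    for name, value in sorted((extra_options or {}).items()):
        args.append("--%s=%s" % (name, value))
    return args


if __name__ == "__main__":
    command = build_benchmark_command(
        "http_logs",
        ["192.168.1.100:9200", "192.168.1.101:9200", "192.168.1.102:9200"],
        "/tmp/report_http_logs.md",
        {"challenge": "append-fast-with-no-conflicts",
         "include-tasks": "index-append"},
    )
    # Print a copy-pasteable shell command; pass `command` to
    # subprocess.run(...) instead if you want to launch it from Python.
    print(" ".join(shlex.quote(arg) for arg in command))
```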


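Note: the esrally list tracks command lists the tracks available for --track.

Use --report-file=/path/to/your/report.md to also save the report to a file, and --report-format=csv to save it as CSV.

Changing the parameters of a default track

Changes made directly to a default track get reverted, so the only option is to add a track of your own.

  1. Create a new track directory under .rally/benchmarks/tracks, for example custom

  2. Adjust the track configuration in custom/http_logs/challenges/default.json and save it. The example below changes the operation named default so that it is issued by 10 clients, each performing 100 operations.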

    {
    "operation": "default",
    "clients": 10,
    "warmup-iterations": 500,
    "iterations": 100,
    "target-throughput": 100
    }
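  3. Select it at startup. Because custom holds its own copy of http_logs, it acts as a track repository, so pass --track-repository=custom together with --track=http_logs, for example:

    esrally --pipeline=benchmark-only \
      --track-repository=custom \
      --track=http_logs \
      --target-hosts=192.168.1.100:9200,192.168.1.101:9200,192.168.1.102:9200 \
      --report-file=/tmp/report_http_logs.md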
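
The numbers in the task definition from step 2 also bound how long the task runs. The arithmetic below is a rough sketch under two assumptions that should be checked against the Rally documentation for your version: warmup-iterations and iterations are counted per client (as described above), and target-throughput is the combined rate of all clients in operations per second. The function name is hypothetical.

```python
def minimum_task_duration(task):
    """Return (total operations, lower bound on duration in seconds).

    Assumes per-client iteration counts and a target throughput that is
    shared by all clients; both are assumptions, see the text above.
    """
    per_client = task.get("warmup-iterations", 0) + task.get("iterations", 1)
    total_operations = per_client * task.get("clients", 1)
    return total_operations, total_operations / task["target-throughput"]


if __name__ == "__main__":
    task = {
        "operation": "default",
        "clients": 10,
        "warmup-iterations": 500,
        "iterations": 100,
        "target-throughput": 100,
    }
    operations, seconds = minimum_task_duration(task)
    print(operations, seconds)  # 6000 operations, at least 60.0 seconds
```

With these values, 10 clients each run 500 + 100 iterations, giving 6000 operations, which at 100 operations per second cannot finish in under 60 seconds.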

 

References:

  • https://esrally.readthedocs.io/en/stable/index.html
  • https://github.com/elastic/rally
  • A fairly complete introduction to benchmarking: https://www.jianshu.com/p/c89975b50447
  • Running esrally with Docker (including offline data sets): https://www.jianshu.com/p/3a019c135e2a
  • Explanation of the benchmark parameters: https://www.jianshu.com/p/979f548c233e
  • Discussion of why Elasticsearch was not fully loaded: https://discuss.elastic.co/t/es-benchmark-using-rally-to-stress-a-2-node-setup/150020/6
  • Comparison of benchmark results: https://www.jianshu.com/p/e7de3b24f505

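Data sets:
Downloads are slow from within China, so first run the following once:
esrally --distribution-version=6.5.3 --track=geonames
Even if downloading the test data fails, the directory structure will have been created. You can then download the bz2 file yourself from

http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents.json.bz2
and save it under the file names the track expects by default:
documents-2.json.bz2
documents-2.json.offset
Then copy the files into ~/.rally/benchmarks/data/geonames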

esrally list tracks

This will show the following list:

Name        Description                                          Documents  Compressed Size    Uncompressed Size    Default Challenge        All Challenges
----------  -------------------------------------------------  -----------  -----------------  -------------------  -----------------------  ---------------------------
geonames    POIs from Geonames                                    11396505  252.4 MB           3.3 GB               append-no-conflicts      append-no-conflicts,appe...
geopoint    Point coordinates from PlanetOSM                      60844404  481.9 MB           2.3 GB               append-no-conflicts      append-no-conflicts,appe...
http_logs   HTTP server log data                                 247249096  1.2 GB             31.1 GB              append-no-conflicts      append-no-conflicts,appe...
nested      StackOverflow Q&A stored as nested docs               11203029  663.1 MB           3.4 GB               nested-search-challenge  nested-search-challenge,...
noaa        Global daily weather measurements from NOAA           33659481  947.3 MB           9.0 GB               append-no-conflicts      append-no-conflicts,appe...
nyc_taxis   Taxi rides in New York in 2015                       165346692  4.5 GB             74.3 GB              append-no-conflicts      append-no-conflicts,appe...
percolator  Percolator benchmark based on AOL queries              2000000  102.7 kB           104.9 MB             append-no-conflicts      append-no-conflicts,appe...
pmc         Full text benchmark with academic papers from PMC       574199  5.5 GB             21.7 GB      

The subdirectories of https://github.com/elastic/rally-tracks/tree/master/ document the configurable parameters of each data set.

For example, http_logs:

Parameters (e.g. --track-params="bulk_indexing_clients:96")

This track allows to overwrite the following parameters with Rally 0.8.0+ using --track-params:

  • bulk_size (default: 5000)
  • bulk_indexing_clients (default: 8): Number of clients that issue bulk indexing requests.
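
Several parameters are passed to --track-params as comma-separated key:value pairs. The helper below is a small sketch for producing that string from a dictionary; format_track_params is a hypothetical name, and the separator check is a precaution added here rather than something Rally requires.

```python
def format_track_params(params):
    """Format a dict as the key:value,key:value string for --track-params."""
    pairs = []
    for key in sorted(params):
        value = str(params[key])
        if any(sep in key or sep in value for sep in ",:"):
            raise ValueError("separator character in %r or %r" % (key, value))
        pairs.append("%s:%s" % (key, value))
    return ",".join(pairs)


if __name__ == "__main__":
    option = format_track_params({"bulk_size": 5000, "bulk_indexing_clients": 96})
    print('--track-params="%s"' % option)
```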

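5. Defining your own benchmark

Reference: https://esrally.readthedocs.io/en/latest/adding_tracks.html

Main steps:

1. Create the directory ~/rally-tracks/tutorial

2. Manually download the geonames data from http://download.geonames.org/export/dump/allCountries.zip. The archive contains a tab-separated text file, which has to be converted to JSON.

Convert it with this Python code: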

import json

cols = (("geonameid", "int", True),
        ("name", "string", True),
        ("asciiname", "string", False),
        ("alternatenames", "string", False),
        ("latitude", "double", True),
        ("longitude", "double", True),
        ("feature_class", "string", False),
        ("feature_code", "string", False),
        ("country_code", "string", True),
        ("cc2", "string", False),
        ("admin1_code", "string", False),
        ("admin2_code", "string", False),
        ("admin3_code", "string", False),
        ("admin4_code", "string", False),
        ("population", "long", True),
        ("elevation", "int", False),
        ("dem", "string", False),
        ("timezone", "string", False))


def main():
    with open("allCountries.txt", "rt", encoding="UTF-8") as f:
        for line in f:
            tup = line.strip().split("\t")
            record = {}
            for i in range(len(cols)):
                name, type, include = cols[i]
                if tup[i] != "" and include:
                    if type in ("int", "long"):
                        record[name] = int(tup[i])
                    elif type == "double":
                        record[name] = float(tup[i])
                    elif type == "string":
                        record[name] = tup[i]
            print(json.dumps(record, ensure_ascii=False))


if __name__ == "__main__":
    main()

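Save the script as toJSON.py in the ~/rally-tracks/tutorial directory created earlier, put allCountries.txt next to it, and run: python3 toJSON.py > documents.json

3. For Elasticsearch versions below 7.0, save the following as index.json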

{
  "settings": {
    "index.number_of_replicas": 0
  },
  "mappings": {
    "docs": {
      "dynamic": "strict",
      "properties": {
        "geonameid": {
          "type": "long"
        },
        "name": {
          "type": "text"
        },
        "latitude": {
          "type": "double"
        },
        "longitude": {
          "type": "double"
        },
        "country_code": {
          "type": "text"
        },
        "population": {
          "type": "long"
        }
      }
    }
  }
}

5. Then save the following as track.json

{
  "version": 2,
  "description": "Tutorial benchmark for Rally",
  "indices": [
    {
      "name": "geonames",
      "body": "index.json",
      "types": [ "docs" ]
    }
  ],
  "corpora": [
    {
      "name": "rally-tutorial",
      "documents": [
        {
          "source-file": "documents.json",
          "document-count": 11658903,
          "uncompressed-bytes": 1544799789
        }
      ]
    }
  ],
  "schedule": [
    {
      "operation": {
        "operation-type": "delete-index"
      }
    },
    {
      "operation": {
        "operation-type": "create-index"
      }
    },
    {
      "operation": {
        "operation-type": "cluster-health",
        "request-params": {
          "wait_for_status": "green"
        }
      }
    },
    {
      "operation": {
        "operation-type": "bulk",
        "bulk-size": 5000
      },
      "warmup-time-period": 120,
      "clients": 8
    },
    {
      "operation": {
        "operation-type": "force-merge"
      }
    },
    {
      "operation": {
        "name": "query-match-all",
        "operation-type": "search",
        "body": {
          "query": {
            "match_all": {}
          }
        }
      },
      "clients": 8,
      "warmup-iterations": 1000,
      "iterations": 1000,
      "target-throughput": 100
    }
  ]
}

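The values of the fields under documents are obtained as follows:

wc -l documents.json  (number of JSON documents, one per line)

stat -c "%s" documents.json  (file size in bytes on Linux; on macOS use stat -f "%z" documents.json)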
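
The same two numbers can be computed in Python, which avoids the difference between the Linux and macOS stat options. This is a minimal sketch; corpus_stats is a hypothetical helper whose keys follow the documents entry of track.json above.

```python
import json
import os


def corpus_stats(path):
    """Return the track.json "documents" entry for a JSON-lines file."""
    with open(path, "rb") as handle:
        # One document per line; a final line without a trailing newline
        # is still counted, unlike with `wc -l`.
        document_count = sum(1 for _ in handle)
    return {
        "source-file": os.path.basename(path),
        "document-count": document_count,
        "uncompressed-bytes": os.path.getsize(path),
    }


if __name__ == "__main__":
    if os.path.exists("documents.json"):
        print(json.dumps(corpus_stats("documents.json"), indent=2))
```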

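For Elasticsearch 7.0 and later, remove types from track.json and drop the docs type level from the mappings in index.json.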
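
The adjustment for 7.0 and later can be applied mechanically to the two files shown earlier. The function below is a sketch under the assumption that the mappings contain exactly one type wrapper (docs in this tutorial); to_typeless is a hypothetical name, and the inputs are the parsed contents of index.json and track.json.

```python
import copy


def to_typeless(index_body, track):
    """Return copies of index.json and track.json without mapping types."""
    index_body = copy.deepcopy(index_body)
    track = copy.deepcopy(track)
    mappings = index_body.get("mappings", {})
    if len(mappings) == 1 and "properties" not in mappings:
        # A single wrapper key such as "docs": lift its content one level up.
        index_body["mappings"] = next(iter(mappings.values()))
    for index in track.get("indices", []):
        index.pop("types", None)
    return index_body, track


if __name__ == "__main__":
    old_index = {
        "settings": {"index.number_of_replicas": 0},
        "mappings": {"docs": {"dynamic": "strict",
                              "properties": {"name": {"type": "text"}}}},
    }
    old_track = {"indices": [{"name": "geonames", "body": "index.json",
                              "types": ["docs"]}]}
    new_index, new_track = to_typeless(old_index, old_track)
    print(new_index["mappings"])
    print(new_track["indices"])
```

The inputs are left unmodified, so the 6.x files can be kept alongside the converted ones.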

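6. Check that the track was set up correctly: esrally list tracks --track-path=~/rally-tracks/tutorial

7. Run your own track: esrally --distribution-version=6.0.0 --track-path=~/rally-tracks/tutorial

Add --test-mode to check that the configuration files are correct.

Test mode reads a 1,000-document file, which can be generated with: head -n 1000 documents.json > documents-1k.json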


Original article: https://www.cnblogs.com/bigben0123/p/11188461.html