[Original] Big Data Basics: Doris (1) Building, Installing, and Starting

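I. Build

Doris can be built in two ways: inside Docker or directly on bare metal. The Docker build is recommended because it avoids a large number of environment dependency problems.

Building with Docker

1 Install Docker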

yum install docker
systemctl start docker
systemctl enable docker
docker pull apachedoris/doris-dev:build-env-1.2

2 Download the source

wget https://mirrors.bfsu.edu.cn/apache/incubator/doris/0.13.0-incubating/apache-doris-0.13.0-incubating-src.tar.gz
tar xvf apache-doris-0.13.0-incubating-src.tar.gz

3 Start the container

docker run -it -v /root/.m2:/root/.m2 -v /path/to/apache-doris-0.13.0-incubating-src:/root/apache-doris-0.13.0-incubating-src apachedoris/doris-dev:build-env-1.2

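Two directories need to be mapped into the container: the Maven repository directory and the Doris source directory. That way, previously downloaded dependencies and build results are not lost if the container dies.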
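
The two mappings can be wrapped in a small helper so that the path inside the container always mirrors the name of the source directory on the host. This is only a sketch: the function name doris_build_cmd is made up for illustration, the Maven repository default of $HOME/.m2 is an assumption, and the helper prints the command instead of running it so that it can be reviewed first.

```shell
# Hypothetical helper: print the docker run command for a source directory.
# $1 = host path of the Doris source tree
# $2 = host Maven repository (assumed default: $HOME/.m2)
doris_build_cmd() (
  src_dir=$1
  m2_dir=${2:-$HOME/.m2}
  # Mount the source under /root using the same directory name as on the host.
  name=$(basename "$src_dir")
  printf 'docker run -it -v %s:/root/.m2 -v %s:/root/%s %s\n' \
    "$m2_dir" "$src_dir" "$name" "apachedoris/doris-dev:build-env-1.2"
)
```

Calling doris_build_cmd /path/to/apache-doris-0.13.0-incubating-src /root/.m2 prints the same command as in step 3.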

4 Build Doris

cd /root/apache-doris-0.13.0-incubating-src
sh -x build.sh

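The build writes its results to the output directory, which contains three subdirectories: be, fe, and udf. To deploy, simply copy the output directory to the other servers.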
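
Before copying, it may be worth confirming that all three subdirectories are really there. The check below is a minimal sketch; the function name check_doris_output is hypothetical.

```shell
# Hypothetical helper: succeed only if the build output has the expected layout.
# $1 = path to the output directory produced by build.sh
check_doris_output() (
  out_dir=$1
  for sub in be fe udf; do
    if [ ! -d "$out_dir/$sub" ]; then
      echo "missing: $out_dir/$sub" >&2
      exit 1
    fi
  done
  echo "ok: $out_dir"
)
```

One possible use is check_doris_output output && scp -r output user@host:/path/to/doris, where user@host and the target path are placeholders for your own servers.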

Error during the build

[ERROR] Plugin net.sourceforge.czt.dev:cup-maven-plugin:1.6-cdh or one of its dependencies could not be resolved: Failed to read artifact descriptor for net.sourceforge.czt.dev:cup-maven-plugin:jar:1.6-cdh: Could not transfer artifact net.sourceforge.czt.dev:cup-maven-plugin:pom:1.6-cdh from/to spring-plugins (https://repo.spring.io/plugins-release/): Authentication failed for https://repo.spring.io/plugins-release/net/sourceforge/czt/dev/cup-maven-plugin/1.6-cdh/cup-maven-plugin-1.6-cdh.pom 401 Unauthorized -> [Help 1

Fix it by changing the Maven repository settings as follows:

                 <!-- for java-cup -->
                 <repository>
                 <!--
                     <id>cloudera-thirdparty</id>
                     <url>https://repository.cloudera.com/content/repositories/third-party/</url>
                     -->
                     <id>cloudera-public</id>
                     <url>https://repository.cloudera.com/artifactory/public/</url>
                 </repository>


                 <!-- for cup-maven-plugin -->
                 <pluginRepository>
                 <!--
                     <id>spring-plugins</id>
                     <url>https://repo.spring.io/plugins-release/</url>
                     -->
                     <id>cloudera-public</id>
                     <url>https://repository.cloudera.com/artifactory/public/</url>
                 </pluginRepository>

5 Build the broker

cd fs_brokers/apache_hdfs_broker
sh -x build.sh

6 Build the Spark connector

cd extension/spark-doris-connector
sh -x build.sh

Building on bare metal

1 Prerequisites

jdk8+
maven

sudo yum groupinstall 'Development Tools' && sudo yum install maven cmake byacc flex automake libtool bison binutils-devel zip unzip ncurses-devel curl git wget python2 glibc-static libstdc++-static java-1.8.0-openjdk

Note: on CentOS 7 the default GCC is 4.8.5 and the default CMake is 2.8.12, so the next two steps upgrade both.
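
A quick version comparison shows whether an upgrade is needed at all. This is a sketch that relies on sort -V from GNU coreutils; the function names are hypothetical, and the target versions used here (7.3.0 and 3.6.2) are simply the ones this post installs, not official minimums.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical report helper; the wanted version passed in is an assumption.
# $1 = tool name, $2 = installed version, $3 = wanted version
report_version() {
  if version_ge "$2" "$3"; then
    echo "$1 $2: ok"
  else
    echo "$1 $2: upgrade to $3 or later"
  fi
}
```

For example, report_version gcc 4.8.5 7.3.0 reports that an upgrade is needed, and the installed CMake can be checked with report_version cmake "$(cmake --version | awk 'NR==1 {print $3}')" 3.6.2.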

2 Upgrade GCC

yum install gcc-c++
wget http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-7.3.0/gcc-7.3.0.tar.gz
tar zxvf gcc-7.3.0.tar.gz
cd gcc-7.3.0
yum install lbzip2
./contrib/download_prerequisites
mkdir build
cd build/
../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
make
make install

3 Upgrade CMake

wget https://cmake.org/files/v3.6/cmake-3.6.2.tar.gz
tar xvf cmake-3.6.2.tar.gz && cd cmake-3.6.2/
./bootstrap
gmake
gmake install
mv /usr/bin/cmake /usr/bin/cmake.bak
ln -s /usr/local/bin/cmake /usr/bin/

4 Build

sh -x build.sh

Troubleshooting

Error 1

Downloading libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip from https://doris-incubating-repo.bj.bcebos.com/thirdparty/libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip to /usr/local/app/doris/thirdparty/src
--2021-01-11 09:32:59-- https://doris-incubating-repo.bj.bcebos.com/thirdparty/libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip
Resolving doris-incubating-repo.bj.bcebos.com (doris-incubating-repo.bj.bcebos.com)... 2409:8c00:6c21:10ad:0:ff:b00e:67d, 220.181.33.44, 220.181.33.43
Connecting to doris-incubating-repo.bj.bcebos.com (doris-incubating-repo.bj.bcebos.com)|2409:8c00:6c21:10ad:0:ff:b00e:67d|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2021-01-11 09:32:59 ERROR 404: Not Found.

The download fails because the URL has changed. See the latest version of the file on GitHub:
https://github.com/apache/incubator-doris/blob/master/thirdparty/vars.sh

Fix: change the URL to

LIBEVENT_DOWNLOAD="https://doris-thirdparty-repo.bj.bcebos.com/thirdparty/libevent-20180622-24236aed01798303745470e6c498bf606e88724a.zip"

Reference: https://github.com/apache/incubator-doris/issues/5519
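
The edit can also be scripted. The sketch below rewrites only the LIBEVENT_DOWNLOAD line and keeps a .bak copy of the original file; the function name fix_libevent_url is hypothetical, and it assumes the variable is defined at the start of a line in thirdparty/vars.sh.

```shell
# Hypothetical helper: point LIBEVENT_DOWNLOAD at the new repository host.
# $1 = path to thirdparty/vars.sh
fix_libevent_url() (
  vars_file=$1
  tmp_file=$(mktemp)
  # Only the LIBEVENT_DOWNLOAD line is rewritten; other lines pass through unchanged.
  sed '/^LIBEVENT_DOWNLOAD=/s#doris-incubating-repo\.bj\.bcebos\.com#doris-thirdparty-repo.bj.bcebos.com#' \
    "$vars_file" > "$tmp_file" &&
    cp "$vars_file" "$vars_file.bak" &&
    cat "$tmp_file" > "$vars_file" &&
    rm -f "$tmp_file"
)
```

Run it as fix_libevent_url thirdparty/vars.sh from the source root, then compare vars.sh with vars.sh.bak before rebuilding.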

Error 2

Downloading the third-party DataTables package fails with a 403. Opening the URL directly
https://datatables.net/download/builder?bs-3.3.7/jq-3.3.1/dt-1.10.22
shows

Error: Only libraries of the current release version can be used by the. package builder. The library DataTables's current release version is 1.10.24. Please reload the download builder page to have it use the latest libraries.

Fix: edit thirdparty/vars.sh, change the version to the latest one (1.10.24), and update the md5sum to match.

# datatables, bootstrap 3 and jQuery 3
#DATATABLES_DOWNLOAD="https://datatables.net/download/builder?bs-3.3.7/jq-3.3.1/dt-1.10.22"
#DATATABLES_NAME="DataTables.zip"
#DATATABLES_SOURCE="DataTables-1.10.22"
#DATATABLES_MD5SUM="62558846fc6a6db1428e7816a2a351f7"
DATATABLES_DOWNLOAD="https://datatables.net/download/builder?bs-3.3.7/jq-3.3.1/dt-1.10.24"
DATATABLES_NAME="DataTables.zip"
DATATABLES_SOURCE="DataTables-1.10.24"
DATATABLES_MD5SUM="22404292d02cf3c5f4cd9f5a02d4b42c"
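
The new checksum has to come from the archive that is actually downloaded. A minimal sketch for computing it, assuming md5sum is available; the function name file_md5 is hypothetical.

```shell
# Hypothetical helper: print only the MD5 digest of a file.
# $1 = path to the downloaded archive, e.g. DataTables.zip
file_md5() {
  md5sum "$1" | awk '{print $1}'
}
```

Running file_md5 DataTables.zip on the downloaded archive gives the value to paste into DATATABLES_MD5SUM. If the download builder produces a different archive later, the value above will no longer match and has to be recomputed.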

Error 3

checking how to run the C preprocessor... /usr/lib64/ccache/../bin/cpp
configure: error: in `/data/app/apache-doris-0.13.0-incubating-src/thirdparty/src/unixODBC-2.3.7':
configure: error: C preprocessor "/usr/lib64/ccache/../bin/cpp" fails sanity check

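Fix: /usr/lib64/ccache/../bin/cpp resolves to /usr/lib64/bin/cpp, so create a symlink there that points to the newly installed cpp: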

ln -s /usr/local/bin/cpp /usr/lib64/bin/cpp

Error 4

./comp_err: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory

Fix:

yum -y install libatomic

Error 5

cp: cannot stat ‘./zstd_ep-install/lib/libzstd.a’: No such file or directory

Fix:

cd thirdparty/src/zstd-1.3.7/tests
make zstd-staticLib
mkdir -p /path/to/apache-doris-0.12.0-incubating-src/thirdparty/src/arrow-apache-arrow-0.15.1/cpp/release/zstd_ep-install/lib64
cp ../lib/libzstd.a /path/to/apache-doris-0.12.0-incubating-src/thirdparty/src/arrow-apache-arrow-0.15.1/cpp/release/zstd_ep-install/lib64

II. Start

1 Start the FE

cd output/fe
mkdir doris-meta
bin/start_fe.sh --daemon

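Logs are written to the log directory.

Notes:

  • The default port 8030 may conflict with YARN's ResourceManager.
  • After startup, check that the FE bound its ports to the correct IP. If it picked the wrong one (for example, the Docker interface's IP on a host where Docker is installed), the BE cannot connect to the FE. In that case set priority_networks in fe.conf to the correct subnet: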

priority_networks=192.168.1.0/24
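
To see which address would fall inside a given subnet, an IPv4 address can be tested against the CIDR by hand. This is an illustrative sketch in plain shell arithmetic, not how Doris itself does the matching; the function names are hypothetical and only IPv4 is handled.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
  IFS=.
  # Split the address on dots into the positional parameters.
  set -- $1
  echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
)

# Succeed if IPv4 address $1 lies inside CIDR block $2 (e.g. 192.168.1.0/24).
ip_in_cidr() (
  ip=$1
  net=${2%/*}
  bits=${2#*/}
  # Build the netmask from the prefix length.
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)
```

With the subnet above, ip_in_cidr 192.168.1.15 192.168.1.0/24 succeeds, while a typical Docker bridge address such as 172.17.0.1 does not match.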

2 Start the BE

cd output/be
mkdir /path/to/storage
vim conf/be.conf
bin/start_be.sh --daemon

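Logs are written to the log directory.

Notes:

  • The default port 8040 may conflict with YARN's NodeManager.
  • If the BE fails to start, there are usually two possible causes: a port is already in use, or a resource limit is too low. Check the log to tell which. For example, the error
    Doris Be http service did not start correctly. exiting.
    means the web service failed to start because its port is occupied.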
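
A port conflict can be confirmed before starting the BE. The sketch below reads the output of ss -lnt on standard input and checks the local-address column; the function name port_in_use is hypothetical.

```shell
# Hypothetical helper: read `ss -lnt` style output on stdin and succeed if
# some socket is listening on the given port.
# $1 = port number
port_in_use() {
  awk -v port="$1" '
    # Column 4 is the local address, e.g. 0.0.0.0:8040 or [::]:8040
    $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

For example, ss -lnt | port_in_use 8040 && echo "8040 is already taken" flags the conflict described above.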

Changing the open-file limit

Temporary change

ulimit -n 65535

Permanent change

vim /etc/security/limits.conf
*               hard    nofile             65535
*               soft    nofile             65535
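
Whether the change took effect can be checked from the shell that will start the BE. A minimal sketch; the function name nofile_sufficient is hypothetical, and the default target of 65535 is simply the value used above.

```shell
# Hypothetical helper: succeed if an open-file limit meets the target.
# $1 = current limit as printed by `ulimit -n`
# $2 = required limit (default 65535, the value used in this post)
nofile_sufficient() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "${2:-65535}" ]
}
```

For example, nofile_sufficient "$(ulimit -n)" || echo "raise the limit before starting the BE". Entries in limits.conf normally apply only to new login sessions, so log in again before checking.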

3 Start the Broker

cd fs_brokers/apache_hdfs_broker/output/apache_hdfs_broker
bin/start_broker.sh --daemon

III. Usage

1 FE web UI

http://fe_server:8030

Notes:

  • The port is http_port.
  • The default account is root with an empty password.

2 FE command-line access

mysql -h fe_server -P9030 -uroot
show proc '/frontends';

Notes:

  • The port is query_port.

3 Add or remove an FE

alter system add follower '$host:$port';
show proc '/frontends';

4 Add or remove a BE

alter system add backend '$host:$port';
alter system drop backend '$host:$port';
show proc '/backends';

Notes:

  • The port defaults to 9050, i.e. heartbeat_service_port.
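
When several backends are added at once, the statements can be generated rather than typed. This is a sketch only: the function name gen_add_backends and the host names in the example are hypothetical.

```shell
# Hypothetical helper: print one ALTER SYSTEM statement per backend host.
# $1 = heartbeat port, remaining arguments = backend host names
gen_add_backends() (
  port=$1
  shift
  for host in "$@"; do
    printf "alter system add backend '%s:%s';\n" "$host" "$port"
  done
)
```

The output can be reviewed and then piped to the client, for example gen_add_backends 9050 be1 be2 be3 | mysql -h fe_server -P9030 -uroot, where be1 to be3 stand for your own BE hosts.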

5 Add or remove a Broker

alter system add broker $broker_name '$host:$port';
show proc '/brokers';

Notes:

  • The port defaults to 8000, i.e. the broker ipc_port.

IV. Data import

Importing Hive data

create database test;
CREATE TABLE test.test_user_doris
(
  `id` varchar(128) , 
  `name` varchar(128) , 
  `country` varchar(128) , 
  `province` varchar(128) , 
  `city` varchar(128) ,  
  `order_count` int SUM
)
AGGREGATE KEY(id, name, country, province, city)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES("replication_num" = "3");

LOAD LABEL load_test_user_doris
(
    DATA INFILE("hdfs://nameservice1/user/hive/warehouse/test.db/test_user/*")
    INTO TABLE `test_user_doris`
    FORMAT as "parquet"
)
WITH BROKER $broker_name 
(
    "dfs.nameservices" = "nameservice1",
    "dfs.ha.namenodes.nameservice1" = "namenode1,namenode2",
    "dfs.namenode.rpc-address.nameservice1.namenode1" = "nn1:8020",
    "dfs.namenode.rpc-address.nameservice1.namenode2" = "nn2:8020"
)
PROPERTIES
(
    "timeout"="36000",
    "max_filter_ratio"="0.1"
);

show load;

V. Parameter configuration

show variables;

This works the same way as in MySQL.

Original article: https://www.cnblogs.com/barneywill/p/14808288.html