Cloudera notes

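1: Setting up the Cloudera platform

Services started after launch

Services running on the main host once all three machines are up

After startup, first install Kafka, then test HDFS: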

hadoop dfs -mkdir /test
hadoop dfs -put words /test
hadoop jar /opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/hadoop-mapreduce-examples-2.6.0-cdh5.14.2.jar wordcount /test/words /test/output
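To sanity-check the job, the same counts can be computed locally and compared with the job output (for example via `hadoop fs -cat /test/output/part-r-*`). This is only a minimal sketch: it assumes `words` is a small whitespace-separated text file, the helper name `local_wordcount` is made up for illustration, and the line order may not match the job's output exactly.

```shell
# Hypothetical helper: count whitespace-separated tokens in a file and
# print "word<TAB>count" lines, the layout the wordcount example writes.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" |
        awk 'NF { count[$0]++ } END { for (w in count) printf "%s\t%d\n", w, count[w] }' |
        LC_ALL=C sort
}

# Small demo input that stands in for the real `words` file.
printf 'hello world\nhello hadoop\n' > /tmp/words_demo
local_wordcount /tmp/words_demo
```

Running `local_wordcount words` on the real input file gives a list that can be diffed against the HDFS output.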

However, the memory overhead is huge.
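One rough way to see where the memory goes is to sum resident memory per command name. The sketch below assumes a Linux `ps` that accepts `-eo rss=,comm=`; the helper name `sum_rss_mb` is hypothetical, and it relies on the assumption that most CDH daemons (HDFS, YARN, ZooKeeper, Kafka) appear as `java` processes.

```shell
# Hypothetical helper: read "RSS_KB COMMAND" lines on stdin and print the
# total resident memory, in whole MiB, of processes whose command equals $1.
sum_rss_mb() {
    awk -v name="$1" '$2 == name { kb += $1 } END { printf "%d\n", kb / 1024 }'
}

# Total resident memory of all JVM processes on this host.
ps -eo rss=,comm= | sum_rss_mb java
```

Running this on each of the three nodes gives a quick per-host figure to compare against the installed RAM.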

 

 

Processes are still running after the services are stopped

Sometimes it is not possible to log in to the web UI

Tried tcpdump

2: Testing

Trying Kafka:

kafka-topics --create --zookeeper node03:2181/kafka --replication-factor 1 --partitions 1 --topic wordcount

(node03 is used because the ZooKeeper leader is on node03.) The command fails with:

Error while executing topic command : Replication factor: 1 larger than available brokers: 0.
19/11/26 00:50:34 ERROR admin.TopicCommand$: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 1 larger than available brokers: 0.

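The cause turned out to be an error in the command: the `/kafka` suffix on the `--zookeeper` address was wrong. Without it, topic creation works: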

kafka-topics --create --zookeeper node03:2181 --replication-factor 1 --partitions 1 --topic wordcount
# reports success
INFO admin.AdminUtils$: Topic creation {"version":1,"partitions":{"0":[68]}} Created topic "wordcount".
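The "available brokers: 0" error is consistent with the brokers being registered under a different ZooKeeper path than the one given in `--zookeeper`. Kafka keeps broker registrations under `<chroot>/brokers/ids`, so listing that znode shows which connect string is right. The following is a sketch under assumptions: the helper name `zk_broker_path` is made up, and `zookeeper-client` is assumed to be the ZooKeeper CLI wrapper available on the CDH hosts.

```shell
# Hypothetical helper: given a --zookeeper connect string such as
# "node03:2181/kafka", print the znode where Kafka brokers would register.
zk_broker_path() {
    case "$1" in
        */*) zk_root="/${1#*/}" ;;  # everything after the first "/" is the chroot
        *)   zk_root="" ;;          # no chroot: brokers register under the root
    esac
    printf '%s/brokers/ids\n' "${zk_root%/}"
}

zk_broker_path node03:2181/kafka   # prints /kafka/brokers/ids
zk_broker_path node03:2181         # prints /brokers/ids
```

The printed path can then be inspected with something like `zookeeper-client -server node03:2181 ls /brokers/ids`. Broker ids under one path and nothing under the other indicates which form of the address matches the broker configuration.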

List all topics:

kafka-topics --zookeeper node03:2181 --list

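After powering off all the machines at this point and bringing them back up, the topic still exists. The topic metadata is persisted in ZooKeeper and survives a full restart, which is a good sign for the cluster's stability.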

A simple Kafka test (producer in one terminal, consumer in another):

kafka-console-producer --broker-list node02:9092,node03:9092 --topic wordcount

kafka-console-consumer --zookeeper node01:2181 --topic wordcount --from-beginning

3: Installing Kafka Manager

https://www.jianshu.com/p/f65e76efe895

https://github.com/yahoo/kafka-manager

Test:

kafka-topics --zookeeper node01:2181 --list (this also works on host node03)

 

Installation succeeded.

4: Installing Redis and Spark 2:

 https://blog.csdn.net/silentwolfyh/article/details/83818525

 

Launch: spark2-shell, pyspark2

5: Flume

https://blog.51cto.com/douya/1860390

Original article: https://www.cnblogs.com/cschen588/p/11921506.html