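Brief notes on setting up a local big data environment on macOS

Brief notes on setting up a local big data environment on macOS: the key configuration and the commands used to start each service.

More tools will be added over time...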

1. Installing software with brew

Use brew info to see where a formula is installed, e.g. brew info hadoop. Use brew -h for command help.
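The relative paths used later in these notes (./libexec/...) only resolve from inside a formula's install directory. A minimal sketch of deriving the Hadoop config directory, assuming brew --prefix hadoop points at that install directory; the helper name hadoop_conf_dir is hypothetical and not part of brew:

```shell
# Hypothetical helper: print the Hadoop config directory under an install prefix.
hadoop_conf_dir() {
  # ${1%/} drops one trailing slash so the joined path has no "//".
  printf '%s/libexec/etc/hadoop\n' "${1%/}"
}

# Usage, assuming Homebrew and the hadoop formula are installed:
#   cd "$(brew --prefix hadoop)"    # later ./libexec/... paths resolve from here
#   ls "$(hadoop_conf_dir "$(brew --prefix hadoop)")"
```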

  • brew install hadoop

  • brew install hive

  • brew install apache-flink

  • brew install kafka

  • brew install zookeeper
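
The installs above can be made repeatable by installing only what is missing. This is a sketch under the assumption that brew list --formula prints one installed formula per line; missing_formulae is a hypothetical helper name:

```shell
# Hypothetical helper: read installed formula names on stdin (one per line)
# and print each wanted formula (given as arguments) that is not among them.
missing_formulae() {
  local installed f
  installed="$(cat)"
  for f in "$@"; do
    # -x matches whole lines, -F treats the name as a fixed string.
    if ! printf '%s\n' "$installed" | grep -qxF -- "$f"; then
      printf '%s\n' "$f"
    fi
  done
}

# Usage, assuming Homebrew is installed:
#   for f in $(brew list --formula | missing_formulae hadoop hive apache-flink kafka zookeeper); do
#     brew install "$f"
#   done
```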

2. hadoop

  • Run vim ./libexec/etc/hadoop/hadoop-env.sh and set JAVA_HOME:

    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home
    
  • Run vim ./libexec/etc/hadoop/core-site.xml and set the NameNode host name and port:

    <configuration>
        <property>
            <name>hadoop.proxyuser.hadoop.hosts</name>
            <value>*</value>
        </property>
        <property>
            <name>hadoop.proxyuser.hadoop.groups</name>
            <value>*</value>
        </property>

        <!-- HDFS temp directory and address -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
            <description>A base for other temporary directories</description>
        </property>
        <!-- fs.default.name is the deprecated name of fs.defaultFS -->
        <property>
            <name>fs.default.name</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>
    
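  • Run vim ./libexec/etc/hadoop/hdfs-site.xml. The dfs.replication variable sets how many replicas HDFS keeps of each data block. The usual value is 3; since we have a single host running one DataNode in pseudo-distributed mode, change it to 1: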

    <configuration>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>

        <!-- newly added -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
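  • Run hdfs namenode -format to format HDFS for the first time. Do this only once: formatting again wipes all existing data.

  • Run ./sbin/start-dfs.sh to start the NameNode and DataNode; web UI: [http://localhost:9870]

  • Run ./sbin/start-yarn.sh to start the YARN services; web UI: [http://localhost:8088/cluster]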

Alternatively, start or stop everything at once:

./sbin/start-all.sh
./sbin/stop-all.sh
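
Because re-running the format step destroys data, it can help to guard it. The sketch below assumes dfs.namenode.name.dir is left at its default of ${hadoop.tmp.dir}/dfs/name, where a formatted NameNode keeps a current/VERSION file; both function names are hypothetical:

```shell
# Hypothetical helper: succeed if a NameNode under the given hadoop.tmp.dir
# has already been formatted (assumes the default dfs.namenode.name.dir).
namenode_formatted() {
  [ -f "$1/dfs/name/current/VERSION" ]
}

# Format only when no existing NameNode metadata is found.
format_if_needed() {
  if namenode_formatted "$1"; then
    echo "already formatted, skipping"
  else
    hdfs namenode -format && echo "formatted"
  fi
}

# Usage, with the hadoop.tmp.dir value from core-site.xml:
#   format_if_needed /usr/local/Cellar/hadoop/hdfs/tmp && ./sbin/start-dfs.sh
```

If dfs.namenode.name.dir is set explicitly in hdfs-site.xml, the check has to look in that directory instead.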

3. hive

  • Run vim libexec/conf/hive-site.xml and configure MySQL as the store for Hive metadata:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
</configuration>
  • Copy the MySQL JDBC driver JAR into $HIVE_HOME/lib

  • Start Hive by typing hive in a terminal
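
The driver copy step can be scripted so that it fails loudly when a path is wrong. This is only a sketch: install_jdbc_driver is a hypothetical helper, the JAR file name in the usage comment is a placeholder, and the schematool step is an assumption about Hive 2.x and later, where the metastore schema has to be initialized once before first use:

```shell
# Hypothetical helper: copy a JDBC driver JAR into a lib directory
# unless a file with the same name is already there.
install_jdbc_driver() {
  local jar lib_dir dest
  jar="$1"
  lib_dir="$2"
  [ -f "$jar" ] || { echo "driver JAR not found: $jar" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "lib directory not found: $lib_dir" >&2; return 1; }
  dest="$lib_dir/$(basename "$jar")"
  if [ -f "$dest" ]; then
    echo "already installed: $dest"
  else
    cp "$jar" "$dest" && echo "copied to $dest"
  fi
}

# Usage (replace the placeholder with the real JAR path):
#   install_jdbc_driver /path/to/mysql-connector.jar "$HIVE_HOME/lib"
#   schematool -dbType mysql -initSchema   # assumed one-time step on Hive 2.x+
#   hive
```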

4. kafka

  • Configure ZooKeeper; its configuration files are in /usr/local/etc/zookeeper/

  • Start the ZooKeeper service

nohup zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties &
  • Run vim /usr/local/etc/kafka/server.properties and uncomment: listeners=PLAINTEXT://localhost:9092

  • Start the Kafka service

nohup kafka-server-start /usr/local/etc/kafka/server.properties &
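
The server.properties edit above can also be done without opening an editor. A minimal sketch, assuming the file still contains a commented-out line starting with #listeners=; enable_localhost_listener is a hypothetical helper name:

```shell
# Hypothetical helper: read server.properties on stdin and write it to stdout
# with the commented-out listeners line replaced by the localhost listener.
# Lines such as #advertised.listeners=... are left untouched.
enable_localhost_listener() {
  sed -E 's|^#[[:space:]]*listeners=.*|listeners=PLAINTEXT://localhost:9092|'
}

# Usage (writes a new file, then swaps it in):
#   conf=/usr/local/etc/kafka/server.properties
#   enable_localhost_listener < "$conf" > "$conf.new" && mv "$conf.new" "$conf"
#
# Smoke test once ZooKeeper and Kafka are running, on Kafka versions
# whose tools accept --bootstrap-server:
#   kafka-topics --bootstrap-server localhost:9092 --list
```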
Original work by Doflamingo https://www.cnblogs.com/doflamingo
Original post: https://www.cnblogs.com/doflamingo/p/14336228.html