Building a Big Data Platform: Installing Oozie on CDH 5.11.1

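I. Introduction

Oozie is an open-source workflow scheduling engine for the Hadoop platform, used to manage Hadoop jobs. It is a web application made up of an Oozie server and an Oozie client.

The Oozie server runs inside a Tomcat container.

An Oozie workflow must be a directed acyclic graph (DAG). When you need to run several related MapReduce jobs, you only have to describe them in workflow.xml and submit it to Oozie. Oozie then takes over and runs the tasks in order according to that predefined configuration.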
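
To make the "jobs written into workflow.xml and run in order" idea concrete, here is a minimal sketch that writes a two-step workflow definition. Everything in it is an assumption for illustration and not part of the original setup: the directory ./demo-wf, the workflow and action names, and the parameters ${jobTracker}, ${nameNode}, ${inputDir}, ${stageDir} and ${outputDir}. Mapper and reducer settings are left out, so a real job would need more configuration.

```shell
#!/bin/sh
# Sketch only: every name below is an illustrative placeholder.
APP_DIR="${APP_DIR:-./demo-wf}"
mkdir -p "$APP_DIR"

# The quoted 'EOF' keeps ${...} literal so that Oozie, not the shell, resolves it.
cat > "$APP_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="demo-two-step">
    <start to="job-a"/>
    <!-- First MapReduce job: reads inputDir, writes stageDir -->
    <action name="job-a">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property><name>mapred.input.dir</name><value>${inputDir}</value></property>
                <property><name>mapred.output.dir</name><value>${stageDir}</value></property>
            </configuration>
        </map-reduce>
        <ok to="job-b"/>
        <error to="fail"/>
    </action>
    <!-- Second job: only reached when job-a finishes successfully -->
    <action name="job-b">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property><name>mapred.input.dir</name><value>${stageDir}</value></property>
                <property><name>mapred.output.dir</name><value>${outputDir}</value></property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Job failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF
echo "wrote $APP_DIR/workflow.xml"
```

The ok/error transitions are what form the graph: job-a points to job-b, job-b points to end, and nothing points backwards, so the definition stays acyclic.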

II. Installation

1. Download the prebuilt CDH release

http://archive.cloudera.com/cdh5/cdh/5/

Download the Oozie 4.1.0-cdh5.11.1 package.

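2. Stop HBase and ZooKeeper first

bin/hbase-daemon.sh stop master
bin/hbase-daemon.sh stop regionserver
bin/hbase-daemon.sh stop zookeeper

3. Then stop the Hadoop cluster

sbin/stop-dfs.sh
sbin/stop-yarn.sh

4. Extract the Oozie archive to a local directory

5. Configure a Hadoop proxy user for Oozie by adding the following to Hadoop's core-site.xml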
<!-- OOZIE -->
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>hadoop001</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
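
The property names above follow the pattern hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups, where <user> is the OS user that runs the Oozie server ("hadoop" in this setup). As a small sketch, the helper below prints the pair for any user and host. The function name proxyuser_props is hypothetical and only exists for this example.

```shell
#!/bin/sh
# Print the two proxy-user properties to paste into core-site.xml.
# $1 = OS user that runs the Oozie server, $2 = host(s) it runs on.
# proxyuser_props is a made-up helper name, not a Hadoop tool.
proxyuser_props() {
    user="$1"
    hosts="$2"
    cat <<EOF
<property>
    <name>hadoop.proxyuser.${user}.hosts</name>
    <value>${hosts}</value>
</property>
<property>
    <name>hadoop.proxyuser.${user}.groups</name>
    <value>*</value>
</property>
EOF
}

# Values used in this article: user "hadoop" on host "hadoop001".
proxyuser_props hadoop hadoop001
```

If the Oozie server ran as a different user, only the first argument would change.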

 

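6. In the extracted Oozie root directory, extract oozie-hadooplibs-4.1.0-cdh5.11.1.tar.gz into the current directory. This creates an additional directory named oozie-4.1.0-cdh5.11.1.

7. Create a libext directory under the Oozie root directory

Copy the jars you just extracted into libext: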

cp -r ./oozie-4.1.0-cdh5.11.1/hadooplibs/hadooplib-2.6.0-cdh5.11.1.oozie-4.1.0-cdh5.11.1/* ~/app/oozie/libext/

8. Copy ext-2.2.zip into the libext directory

9. Package Oozie into a WAR file

bin/oozie-setup.sh prepare-war

This command bundles the jars under libext into the WAR file.

10. Start Hadoop

sbin/start-dfs.sh

sbin/start-yarn.sh

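11. Edit oozie-site.xml and add the configuration below. (Newer Oozie versions ship both oozie-default.xml and oozie-site.xml. To change a setting, copy the property into oozie-site.xml. Do not edit oozie-default.xml directly, because changes made there do not take effect.)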


	<property>
        <name>oozie.service.WorkflowAppService.system.libpath</name>
        <value>/user/oozie/share/lib</value>
        <description>
            System library path to use for workflow applications.
            This path is added to workflow application if their job properties sets
            the property 'oozie.use.system.libpath' to true.
        </description>
    </property>
	

	<property>
        <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
        <value>*=/home/hadoop/app/hadoop/etc/hadoop</value>
        <description>
            Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
            the Hadoop service (JobTracker, YARN, HDFS). The wildcard '*' configuration is
            used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
            the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
            the Oozie configuration directory; though the path can be absolute (i.e. to point
            to Hadoop client conf/ directories in the local filesystem).
        </description>
    </property>

<property>
<name>oozie.processing.timezone</name>
<value>GMT+0800</value>
<description>
Oozie server timezone. Valid values are UTC and GMT(+/-)####, for example 'GMT+0530' would be India
timezone. All dates parsed and generated by Oozie Coordinator/Bundle will be done in the specified
timezone. The default value of 'UTC' should not be changed under normal circumstances. If for any reason
it is changed, note that GMT(+/-)#### timezones do not observe DST changes.
</description>
</property>
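
With oozie.processing.timezone set to GMT+0800, Coordinator and Bundle timestamps are written with a +0800 suffix (for example 2018-05-16T08:00+0800) instead of the trailing Z used under UTC. As a small sketch, the helper below prints the current time in that shape. The function name oozie_now is hypothetical, and the POSIX zone string CST-8 (a zone 8 hours ahead of UTC) is an assumption chosen so the example does not depend on an installed timezone database.

```shell
#!/bin/sh
# Print "now" in the yyyy-MM-ddTHH:mm+0800 shape.
# oozie_now is a made-up helper name; 'CST-8' is a POSIX TZ string for UTC+8.
oozie_now() {
    TZ='CST-8' date '+%Y-%m-%dT%H:%M%z'
}

oozie_now
```

A value like this could be pasted into a coordinator's start or end attribute when testing a schedule.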


  


12. Upload the sharelib to HDFS

bin/oozie-setup.sh sharelib create -fs hdfs://hadoop004:8020 -locallib oozie-sharelib-4.1.0-cdh5.11.1-yarn.tar.gz
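
One way to confirm the upload is to list the sharelib path on HDFS. The sketch below assumes the path /user/oozie/share/lib from the oozie-site.xml setting in step 11 and assumes the hadoop command is on the PATH. The guard and its message are additions for this example only.

```shell
#!/bin/sh
# Path taken from oozie.service.WorkflowAppService.system.libpath in step 11.
SHARELIB_PATH="/user/oozie/share/lib"

if command -v hadoop >/dev/null 2>&1; then
    # List what the sharelib upload created under the system libpath.
    hadoop fs -ls "$SHARELIB_PATH"
else
    # Assumption: this branch only exists so the sketch is safe to run anywhere.
    echo "hadoop not found on PATH; run this on a cluster node" >&2
fi
```

An empty listing or a "No such file or directory" error would suggest the sharelib went to a different path than the one configured.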

13. Configure MySQL as the Oozie database

Add the following to oozie-site.xml:


<property>
        <name>oozie.service.JPAService.jdbc.driver</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>
            JDBC driver class.
        </description>
    </property>

    <property>
        <name>oozie.service.JPAService.jdbc.url</name>
        <value>jdbc:mysql://hadoop001:3306/oozie?createDatabaseIfNotExist=true</value>
        <description>
            JDBC URL.
        </description>
    </property>

    <property>
        <name>oozie.service.JPAService.jdbc.username</name>
        <value>root</value>
        <description>
            DB user name.
        </description>
    </property>

    <property>
        <name>oozie.service.JPAService.jdbc.password</name>
        <value>123456</value>
        <description>
            DB user password.

            IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
                       if empty Configuration assumes it is NULL.
        </description>
    </property>

  

 

Create the table schema and initial data in the database with the following command:

bin/ooziedb.sh create -sqlfile oozie.sql -run

14. Start Oozie

bin/oozied.sh start

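15. Access the web console

Once the server is up, the console is reachable at hadoop001:11000 (Oozie normally serves it under the /oozie path, i.e. http://hadoop001:11000/oozie).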
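
Besides opening the page in a browser, the Oozie client can query the server from the command line. The sketch below assumes it is run from the Oozie root directory and that the server URL is http://hadoop001:11000/oozie. The OOZIE_URL variable name and the fallback message are choices made for this example.

```shell
#!/bin/sh
# Assumed server URL for this article's host; override by exporting OOZIE_URL.
OOZIE_URL="${OOZIE_URL:-http://hadoop001:11000/oozie}"

if [ -x bin/oozie ]; then
    # Ask the server for its current system mode.
    bin/oozie admin -oozie "$OOZIE_URL" -status
else
    # Assumption: this branch only exists so the sketch is safe to run anywhere.
    echo "bin/oozie not found; run this from the Oozie root directory" >&2
fi
```

A healthy server is expected to report its system mode as NORMAL. A connection error usually means the server has not finished starting or the URL is wrong.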

Original article: https://www.cnblogs.com/nicekk/p/9043486.html