Integrating Hue with Hadoop (HDFS) and YARN

I. Configure the Hadoop configuration files

There are two cases to configure: HDFS in HA mode and HDFS in non-HA mode.

1.1 Non-HA mode

Use WebHDFS.

1) Edit hdfs-site.xml and add the following:

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

2) Edit core-site.xml and add the following:

<property>
  <name>hadoop.proxyuser.hduser.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hduser.groups</name>
  <value>*</value>
</property>

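After editing, remember to distribute the configuration to the other nodes, then restart the Hadoop cluster. On a production cluster, the nodes have to be taken offline and restarted in batches rather than all at once.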
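Once the cluster is back up, it can help to confirm that WebHDFS answers before touching Hue. The following is a minimal Python sketch, not part of the original setup: it builds a WebHDFS `LISTSTATUS` URL and extracts entry names from the JSON response. The helper names (`webhdfs_url`, `entry_names`, `list_root`) are hypothetical, and the NameNode web port is an assumption (50070 is the Hadoop 2.x default, 9870 the Hadoop 3.x default).

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_url(host, port, path="/", op="LISTSTATUS", user="hduser"):
    """Build a WebHDFS REST URL of the form http://host:port/webhdfs/v1<path>?op=..."""
    query = urlencode({"op": op, "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def entry_names(body):
    """Extract the entry names from a LISTSTATUS JSON response body."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [status["pathSuffix"] for status in statuses]


def list_root(host, port, user="hduser", timeout=5):
    """Fetch the names of the entries directly under / via WebHDFS."""
    with urlopen(webhdfs_url(host, port, user=user), timeout=timeout) as resp:
        return entry_names(resp.read().decode("utf-8"))
```

For example, `list_root("yjt", 50070)` would query the NameNode host used later in hue.ini; adjust the port to whatever `dfs.namenode.http-address` is set to on your cluster.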

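1.2 HA mode

In HA mode, Hue must access HDFS through HttpFS, because a WebHDFS URL points at a single NameNode, which may be the standby after a failover.

1) Edit httpfs-site.xml and add the following: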

<property>
  <name>httpfs.proxyuser.hduser.hosts</name>
  <value>*</value>
</property>
<property>
  <name>httpfs.proxyuser.hduser.groups</name>
  <value>*</value>
</property>

2) Start HttpFS

[hduser@yjt hadoop]$ httpfs.sh start

By default, HttpFS listens on port 14000.
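A quick way to confirm that HttpFS actually came up is to check that something accepts TCP connections on that port. This is a small Python sketch under that assumption; `is_listening` is a hypothetical helper, and it only shows that the port is open, not that the service behind it is healthy.

```python
import socket


def is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

For example, `is_listening("192.168.0.230", 14000)` checks the HttpFS host that appears in the hue.ini settings of this article.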

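II. Modify the Hue configuration

1. Edit hue.ini

The main settings to change are:

default_hdfs_superuser=hduser  # Default is hdfs. Set this to the user that starts the cluster; otherwise, accessing HDFS from the web UI may fail with permission errors.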

[[hdfs_clusters]]
# HA support by using HttpFs

[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://yjt:9000   # URI of the HDFS filesystem

# NameNode logical name.
logical_name=yjt

# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://192.168.0.230:14000/webhdfs/v1  # URL of the HttpFS service

# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false   # Kerberos-related

# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True

# Directory of the Hadoop configuration
## hadoop_conf_dir=$HADOOP_CONF_DIR when set or '/etc/hadoop/conf'
hadoop_conf_dir=${HADOOP_HOME}/etc/hadoop/conf  # directory holding the cluster configuration

Restart the cluster, then restart Hue.

III. Check in the web UI

In the Hue web UI, select Files.

IV. Configure YARN

1. Edit hue.ini

First locate the [[yarn_clusters]] section and configure it as follows:

Non-HA:
[[yarn_clusters]]
[[[default]]]
resourcemanager_host=yjt  # address of the ResourceManager
resourcemanager_port=8032
submit_to=True
resourcemanager_api_url=http://yjt:8088
proxy_api_url=http://yjt:8088
history_server_api_url=http://yjt:19888
HA:
Note: configure one ResourceManager in [[[default]]] and the other in [[[ha]]].
# Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]
 
    [[[default]]]
 
      # Whether to submit jobs to this cluster
      submit_to=True
 
      # Name used when submitting jobs
      logical_name=rm1  # one of the IDs listed in yarn.resourcemanager.ha.rm-ids
 
      # URL of the ResourceManager API
      resourcemanager_api_url=http://log1:8088 # web address; the value of yarn.resourcemanager.webapp.address.rm1
 
      # URL of the ProxyServer API
      proxy_api_url=http://log1:8088
 
      # URL of the HistoryServer API
      history_server_api_url=http://log1:19888 # the value of mapreduce.jobhistory.webapp.address in mapred-site.xml
 
    [[[ha]]]
      # Enter the host on which you are running the failover Resource Manager
      resourcemanager_api_url=http://log2:8088
      logical_name=rm2
      submit_to=True

Restart Hue after making the changes.
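To check that both ResourceManager URLs entered in hue.ini are reachable and to see which one is currently active, the ResourceManager REST endpoint `/ws/v1/cluster/info` can be queried; its response carries a `haState` field. The Python sketch below is an illustration only: the helper names are hypothetical, and it assumes the RM web interface is served over plain HTTP without authentication, as in the URLs above.

```python
import json
from urllib.request import urlopen


def ha_state(body):
    """Return the haState field from a /ws/v1/cluster/info JSON body."""
    return json.loads(body)["clusterInfo"].get("haState", "UNKNOWN")


def fetch_ha_state(api_url, timeout=5):
    """Query one ResourceManager's REST API and return its HA state."""
    with urlopen(f"{api_url}/ws/v1/cluster/info", timeout=timeout) as resp:
        return ha_state(resp.read().decode("utf-8"))


def active_rm(states):
    """Given a mapping {api_url: state}, return the URLs reporting ACTIVE."""
    return [url for url, state in states.items() if state == "ACTIVE"]
```

For example, `active_rm({url: fetch_ha_state(url) for url in ("http://log1:8088", "http://log2:8088")})` would be expected to return exactly one URL on a healthy HA pair.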

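2. Check in the web UI

1) Click the three-line (menu) button in the top-left corner, then click Jobs.

2) Possible error

Clicking Jobs may show the following error: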

Failed to contact an active Resource Manager: YARN RM returned a failed response: { "RemoteException" : { "message" : "User: hue is not allowed to impersonate admin", "exception" : "AuthorizationException", "javaClassName" : "org.apache.hadoop.security.authorize.AuthorizationException" } } (error 403)

Solution:

Edit desktop/conf/hue.ini so that Hue's proxy user is hduser, the user that was granted proxy rights in core-site.xml:

  # Webserver runs as this user
   #server_user=hue
   #server_group=hue

  # This should be the Hue admin and proxy user
   default_user=hduser  # set this to the user that starts the cluster; the default is hue
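The error message itself indicates which user Hadoop refused to treat as a proxy user: Hadoop only allows impersonation by users that have `hadoop.proxyuser.<user>.hosts` and `hadoop.proxyuser.<user>.groups` entries in core-site.xml. As a rough Python sketch (the function name `proxyuser_keys` is hypothetical), the following extracts that user from the message and lists the property names that would have to exist for it:

```python
import re

# Matches the text Hadoop emits in an impersonation AuthorizationException.
_IMPERSONATE = re.compile(r"User: (\S+) is not allowed to impersonate (\S+)")


def proxyuser_keys(message):
    """Return the core-site.xml proxyuser keys implied by an impersonation error."""
    match = _IMPERSONATE.search(message)
    if match is None:
        return []
    proxy_user = match.group(1)
    return [
        f"hadoop.proxyuser.{proxy_user}.hosts",
        f"hadoop.proxyuser.{proxy_user}.groups",
    ]
```

For the error above, the refused user is hue, while core-site.xml in this article only defines the keys for hduser. That mismatch is why switching default_user to hduser makes the error go away.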

References:

https://www.cnblogs.com/zlslch/p/6817226.html

https://www.cnblogs.com/liuchangchun/p/4657520.html

Original article: https://www.cnblogs.com/yjt1993/p/13086508.html