HDFS集群常见报错汇总

                  HDFS集群常见报错汇总

                                           作者:尹正杰

版权声明:原创作品,谢绝转载!否则将追究法律责任。

一.DataXceiver error processing WRITE_BLOCK operation 

报错信息以及截图如下:

calculation112.aggrx:50010:DataXceiver error processing WRITE_BLOCK operation  src: /10.1.1.116:36274 dst: /10.1.1.112:50010
java.io.IOException: Premature EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:901)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
    at java.lang.Thread.run(Thread.java:748)
  ......
 

  报错原因:

    文件操作超租期,实际上就是data stream操作过程中文件被删掉了。

  解决方案:

第一步骤:(修改进程最大文件打开数)

[root@calculation101 ~]# cat /etc/security/limits.conf  | grep -v ^#


*        soft    nofile        1000000
*         hard    nofile        1048576
*        soft    nproc        65536
*        hard    nproc        unlimited
*        soft    memlock        unlimited
*        hard    memlock        unlimited
*         -      nofile          1000000
*         -      nproc           1000000
[root@calculation101 ~]# 

第二步骤:(修改数据传输线程个数)

原文地址:https://www.cnblogs.com/yinzhengjie/p/10400260.html