【异常】org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.:

1 详细异常

org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /wm1/link/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/003993.sst
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:181)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /wm1/link/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/003993.sst
        at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
        at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
        at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:950)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:937)
        at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:210)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        ... 5 more
2020-01-06 10:14:24,136 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at ****。****
************************************************************/

  

 
发现疑似目录:/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state下存在: 005615.sst 005638.log 005640.log CURRENT LOCK MANIFEST-004397移除所有文件。重启nodemanager 成功。 回顾错误原因可能是,我在该nodemanager终止情况下,在集群中添加了新的nodemanager,使得角色数目增加,而启动失败的nodemanager时,它使用存储的状态来恢复,在和数据库校验过程中发现数目不符合而启动失败。因此删除上述目录下的文件。
 
 
 
原文地址:https://www.cnblogs.com/QuestionsZhang/p/12182374.html