RHadoop: REDUCE capability required is more than the supported max container capability in the cluster

I have not used RHadoop. However, I've had a very similar problem on my cluster, and this problem seems to be linked only to MapReduce.

The maxContainerCapability in this log refers to the yarn.scheduler.maximum-allocation-mb property of your yarn-site.xml configuration. It is the maximum amount of memory, in MB, that can be allocated to any single container.

The mapResourceReqt and reduceResourceReqt in your log refer to the mapreduce.map.memory.mb and mapreduce.reduce.memory.mb properties of your mapred-site.xml configuration. They set the memory size, in MB, of the containers that will be created for a Mapper or a Reducer in MapReduce.

If the size of your Reducer's container is set to be greater than yarn.scheduler.maximum-allocation-mb, which seems to be the case here, your job will be killed because YARN is not allowed to allocate that much memory to a single container.
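The rule above can be sketched as a small check. This is only an illustration of the comparison, not Hadoop's own code: the function name `oversized_requests` and the numbers passed to it are made up for the example.

```python
def oversized_requests(max_allocation_mb, map_memory_mb, reduce_memory_mb):
    """Return the MapReduce memory properties whose value exceeds the YARN maximum.

    Hypothetical helper for illustration; all values are in MB.
    """
    requests = {
        "mapreduce.map.memory.mb": map_memory_mb,
        "mapreduce.reduce.memory.mb": reduce_memory_mb,
    }
    # A request is a problem only when it is strictly larger than the maximum.
    return {name: mb for name, mb in requests.items() if mb > max_allocation_mb}


# Illustrative values only: a 4096 MB reducer on a cluster capped at 2048 MB.
print(oversized_requests(2048, 1024, 4096))
# prints {'mapreduce.reduce.memory.mb': 4096}
```

An empty result means both container sizes fit under the maximum.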

Check your configuration at http://[your-resource-manager]:8088/conf. You should find these three values there and be able to confirm that this is the case.
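If you would rather not search the page by hand, here is a minimal sketch that pulls the three values out of the configuration dump. It assumes the /conf page returns the usual Hadoop configuration XML (a `<configuration>` element containing `<property>` entries with `<name>` and `<value>`); the helper names and the sample document are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

WANTED = (
    "yarn.scheduler.maximum-allocation-mb",
    "mapreduce.map.memory.mb",
    "mapreduce.reduce.memory.mb",
)


def parse_memory_settings(conf_xml):
    """Extract the three memory settings (in MB) from Hadoop configuration XML."""
    root = ET.fromstring(conf_xml)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name in WANTED and value:
            settings[name] = int(value)
    return settings


def fetch_memory_settings(conf_url):
    """Download the configuration dump from conf_url and parse it (not called here)."""
    with urllib.request.urlopen(conf_url) as response:
        return parse_memory_settings(response.read())


# Made-up sample standing in for the real /conf output.
SAMPLE = """<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>2048</value></property>
  <property><name>mapreduce.map.memory.mb</name><value>1024</value></property>
  <property><name>mapreduce.reduce.memory.mb</name><value>4096</value></property>
  <property><name>some.other.property</name><value>ignored</value></property>
</configuration>"""

print(parse_memory_settings(SAMPLE))
```

Against a real cluster you would call `fetch_memory_settings` with the /conf URL of your own ResourceManager and compare the two mapreduce values with the yarn one.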

Maybe your new environment has the mapreduce.[map|reduce].memory.mb values set to 4096 MB (which is quite big; the default in Hadoop 2.7.1 is 1024).

Solution

You should either lower the mapreduce.[map|reduce].memory.mb values down to 1024, or, if you have lots of memory and want huge containers, raise the yarn.scheduler.maximum-allocation-mb value to 4096. Only then will MapReduce be able to create containers.

I hope this helps.

In short: the value of yarn.scheduler.maximum-allocation-mb in yarn-site.xml must be >= the values of mapreduce.map.memory.mb and mapreduce.reduce.memory.mb in mapred-site.xml.