Big Data Performance Optimization

HDFS tuning:

https://www.cnblogs.com/yinzhengjie/p/10006880.html

Spark tuning:

https://blog.csdn.net/u012102306/article/details/51637366


MIN_CONTAINER_SIZE = 2048 MB

containers = min(2 * CORES, 1.8 * DISKS, (Total available RAM) / MIN_CONTAINER_SIZE)

For a node with 12 cores, 16 disks, and 60 GB of RAM available to YARN:

# of containers = min(2 * 12, 1.8 * 16, (60 * 1024) / 2048)

# of containers = min(24, 28.8, 30)

# of containers = 24

RAM-per-container = max(MIN_CONTAINER_SIZE, (Total Available RAM) / containers)

RAM-per-container = max(2048, (60 * 1024) / 24)

RAM-per-container = 2560 MB
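The sizing rule above can be sketched as a small helper. This is a minimal sketch; the function and variable names here are mine for illustration, not part of any Hadoop API:

```python
MIN_CONTAINER_SIZE_MB = 2048  # minimum container size assumed in the text

def yarn_container_sizing(cores: int, disks: int, total_ram_mb: int):
    """Compute the container count and per-container RAM for one node,
    using containers = min(2*cores, 1.8*disks, ram/MIN_CONTAINER_SIZE)
    and RAM-per-container = max(MIN_CONTAINER_SIZE, ram/containers)."""
    containers = int(min(2 * cores,
                         1.8 * disks,
                         total_ram_mb / MIN_CONTAINER_SIZE_MB))
    ram_per_container = max(MIN_CONTAINER_SIZE_MB, total_ram_mb // containers)
    return containers, ram_per_container

# Worked example from the text: 12 cores, 16 disks, 60 GB RAM for YARN
containers, ram = yarn_container_sizing(12, 16, 60 * 1024)
print(containers, ram)  # 24 containers of 2560 MB each
```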

YARN configuration:

yarn.nodemanager.resource.memory-mb = containers * RAM-per-container = 24 * 2560 = 61,440 MB

yarn.scheduler.minimum-allocation-mb = RAM-per-container = 2,560 MB

yarn.scheduler.maximum-allocation-mb = containers * RAM-per-container = 24 * 2560 = 61,440 MB

mapreduce.map.memory.mb = RAM-per-container = 2,560 MB

mapreduce.reduce.memory.mb = 2 * RAM-per-container = 5,120 MB

mapreduce.map.java.opts = 0.8 * RAM-per-container = 2,048 MB

mapreduce.reduce.java.opts = 0.8 * 2 * RAM-per-container = 4,096 MB

Example with 22 containers (same formulas, RAM-per-container = 2560 MB):

yarn.nodemanager.resource.memory-mb = 22 * 2560 = 56,320 MB

yarn.scheduler.minimum-allocation-mb = 2,560 MB

yarn.scheduler.maximum-allocation-mb = 22 * 2560 = 56,320 MB

mapreduce.map.memory.mb = 2,560 MB

mapreduce.reduce.memory.mb = 2 * 2560 = 5,120 MB

mapreduce.map.java.opts = 0.8 * 2560 = 2,048 MB

mapreduce.reduce.java.opts = 0.8 * 2 * 2560 = 4,096 MB
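On a real cluster these settings live in yarn-site.xml and mapred-site.xml. A sketch for the 22-container example, using the standard Hadoop property names (memory values in MB; the java.opts properties take a JVM heap flag rather than a bare number):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>56320</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>2560</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>56320</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2560</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>5120</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx2048m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx4096m</value>
</property>
```

Keeping the -Xmx heap at 0.8 of the container size leaves headroom for JVM overhead so tasks are not killed by the NodeManager for exceeding their container limit.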

Original source: https://www.cnblogs.com/houtian2333/p/11726864.html