10. Execution failed with exit status: 3

Error message:

insert overwrite table t_mobile_mid_use_p_tmp4_rcf

select '201411' as month_id,

a.prov_id, a.city, a.client_imsi, a.os_version,

b.install_status, b.install_date, b.unstall_status, b.unstall_date,

a.label_name, a.package_name, a.app_version, a.app_type_id, a.type_label_name,

b.run_time, monthSpace(b.install_date) as install_days,

a.flow, a.use_time, a.run_count, a.active_days, a.is_from_plugin,

from_unixtime(unix_timestamp(),'yyyy-MM-dd HH:mm:ss') as load_date

from t_mobile_mid_use_p_tmp3_1_rcf a

join t_mobile_client_p_rcf b on (a.client_imsi = b.client_imsi and a.label_name = b.label_name);

Query ID = ca_20141218152020_9e4ebfa2-f663-47b8-a0cf-5303b9c0e482

Total jobs = 1

14/12/18 15:21:02 WARN conf.Configuration: file:/tmp/ca/hive_2014-12-18_15-20-54_155_1926187970964040123-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.

Execution log at: /tmp/ca/ca_20141218152020_9e4ebfa2-f663-47b8-a0cf-5303b9c0e482.log

2014-12-18 03:21:03 Starting to launch local task to process map join; maximum memory = 1065484288

2014-12-18 03:21:08 Processing rows: 200000 Hashtable size: 199999 Memory usage: 112049704 percentage: 0.105

2014-12-18 03:21:09 Processing rows: 300000 Hashtable size: 299999 Memory usage: 160367688 percentage: 0.151

2014-12-18 03:21:10 Processing rows: 400000 Hashtable size: 399999 Memory usage: 209294088 percentage: 0.196

2014-12-18 03:21:11 Processing rows: 500000 Hashtable size: 499999 Memory usage: 257089944 percentage: 0.241

2014-12-18 03:21:12 Processing rows: 600000 Hashtable size: 599999 Memory usage: 305440536 percentage: 0.287

2014-12-18 03:21:14 Processing rows: 700000 Hashtable size: 699999 Memory usage: 347305664 percentage: 0.326

2014-12-18 03:21:14 Processing rows: 800000 Hashtable size: 799999 Memory usage: 403916624 percentage: 0.379

2014-12-18 03:21:16 Processing rows: 900000 Hashtable size: 899999 Memory usage: 452238592 percentage: 0.424

2014-12-18 03:21:16 Processing rows: 1000000 Hashtable size: 999999 Memory usage: 499593552 percentage: 0.469

2014-12-18 03:21:18 Processing rows: 1100000 Hashtable size: 1099999 Memory usage: 547966320 percentage: 0.514

2014-12-18 03:21:19 Processing rows: 1200000 Hashtable size: 1199999 Memory usage: 593792800 percentage: 0.557

2014-12-18 03:21:21 Processing rows: 1300000 Hashtable size: 1299999 Memory usage: 641564688 percentage: 0.602

2014-12-18 03:21:21 Processing rows: 1400000 Hashtable size: 1399999 Memory usage: 690130432 percentage: 0.648

2014-12-18 03:21:21 Processing rows: 1500000 Hashtable size: 1499999 Memory usage: 737340976 percentage: 0.692

2014-12-18 03:21:24 Processing rows: 1600000 Hashtable size: 1599999 Memory usage: 793258352 percentage: 0.745

2014-12-18 03:21:25 Processing rows: 1700000 Hashtable size: 1699999 Memory usage: 841009952 percentage: 0.789

2014-12-18 03:21:25 Processing rows: 1800000 Hashtable size: 1799999 Memory usage: 887464680 percentage: 0.833

2014-12-18 03:21:28 Processing rows: 1900000 Hashtable size: 1899999 Memory usage: 934581288 percentage: 0.877

2014-12-18 03:21:28 Processing rows: 2000000 Hashtable size: 1999999 Memory usage: 984062056 percentage: 0.924

Execution failed with exit status: 3

Obtaining error information

Task failed!

Task ID:

Stage-5
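
The progress lines above already hint at why the local task gave up. As a rough sketch, the figures from the log can be checked directly in Hive. This assumes a Hive version that accepts a select without a from clause (0.13 or later), and it assumes hive.mapjoin.localtask.max.memory.usage is at its default of 0.90, which is not stated in the log.

```sql
-- Figures copied from the log above:
--   984062056  = "Memory usage" on the last progress line (2000000 rows)
--   1065484288 = "maximum memory" reported when the local task started
select 984062056 / 2000000    as approx_bytes_per_row,  -- roughly 492 bytes per hashed row
       984062056 / 1065484288 as heap_fraction_used;    -- about 0.924, as logged
```

At roughly 492 bytes per row, a hash table over tens of millions of rows cannot fit in a heap of about 1 GB, and the last logged fraction (0.924) is already past the assumed 0.90 limit.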

Explanation from the official FAQ:

`Hive converted a join into a locally running and faster 'mapjoin', but ran out of memory while doing so. There are two bugs responsible for this.`

Hive's metric for converting joins miscalculated the required amount of memory. This is especially true for compressed files and ORC files, as Hive uses the file size as its metric, but compressed tables require more memory in their uncompressed 'in memory representation'.

The latter option may lead to bug number two if you happen to have an affected Hadoop version.

Hive/Hadoop ignores 'hive.mapred.local.mem'! (more exactly: a bug in Hadoop 2.2 where hadoop-env.cmd sets the -Xmx parameter multiple times, effectively overriding the user-set hive.mapred.local.mem setting. see:

2) & 3) can be set in Big-Bench/engines/hive/conf/hiveSettings.sql
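
Before changing anything, it may help to see what the current session actually uses. The sketch below only reads values. hive.mapjoin.smalltable.filesize and hive.mapjoin.localtask.max.memory.usage are not named in the quoted FAQ text; they are assumed here to be the relevant size threshold and local-task memory limit.

```sql
-- A bare "set <name>;" prints the current value without changing it.
set hive.auto.convert.join;                   -- whether joins are auto-converted to MapJoin
set hive.mapjoin.smalltable.filesize;         -- assumed: file-size threshold for the "small" table
set hive.mapjoin.localtask.max.memory.usage;  -- assumed: heap fraction at which the local task aborts
set hive.mapred.local.mem;                    -- memory setting mentioned in the FAQ
```

If a property is not set in the session, Hive reports it as undefined rather than failing.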

Cause:

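The join involves t_mobile_client_p_rcf and t_mobile_mid_use_p_tmp3_1_rcf. The Hive optimizer judged one of them to be a small table and therefore loaded its data into the DistributedCache, which caused the out-of-memory failure. The row counts below show that neither table is small.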

======select count(1) from t_mobile_mid_use_p_tmp3_1_rcf;

/**

*MapReduce Jobs Launched:

*Job 0: Map: 14 Reduce: 1 Cumulative CPU: 102.42 sec HDFS Read: 172923550 HDFS Write: 9 SUCCESS

*Total MapReduce CPU Time Spent: 1 minutes 42 seconds 420 msec

*OK

*34304843

*Time taken: 33.022 seconds, Fetched: 1 row(s)

*/

======select count(*) from t_mobile_client_p_rcf;

/**

*MapReduce Jobs Launched:

*Job 0: Map: 5 Reduce: 1 Cumulative CPU: 62.47 sec HDFS Read: 116257926 HDFS Write: 10 SUCCESS

*Total MapReduce CPU Time Spent: 1 minutes 2 seconds 470 msec

*OK

*165830880

*Time taken: 37.75 seconds, Fetched: 1 row(s)

*/
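
Row counts alone do not show what the optimizer sees, since the FAQ says it works from file size. A minimal sketch for comparing the two, assuming both tables are unpartitioned (a partitioned table would need a partition clause on the analyze statement):

```sql
-- Gather basic table statistics (numRows, totalSize, rawDataSize).
analyze table t_mobile_client_p_rcf compute statistics;
analyze table t_mobile_mid_use_p_tmp3_1_rcf compute statistics;

-- The "Table Parameters" section of the output lists the gathered statistics.
-- Compare totalSize (bytes on disk) with numRows and rawDataSize for each table.
describe formatted t_mobile_client_p_rcf;
describe formatted t_mobile_mid_use_p_tmp3_1_rcf;
```

A table whose totalSize is small but whose numRows or rawDataSize is large is the kind of input that the file-size metric underestimates.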

Solution:

set hive.auto.convert.join=false; -- disables automatic conversion to MapJoin; the default is true.

set hive.ignore.mapjoin.hint=false; -- stops Hive from ignoring MapJoin hints (hints take effect); the default is true (hints are ignored).
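
As a sketch of how the two settings fit together in one session: with automatic conversion off, the failing statement runs as an ordinary shuffle join, and a MapJoin is used only where a hint asks for it. The tables fact_big and dim_small and their columns are hypothetical, made up for illustration.

```sql
-- Session-level workaround: no automatic MapJoin, but explicit hints are honored.
set hive.auto.convert.join=false;
set hive.ignore.mapjoin.hint=false;

-- Rerun the original insert overwrite statement here; it should no longer
-- start a local map-join task.

-- Hypothetical example of opting in to a MapJoin by hand, for a table that is
-- known to be small enough to fit in memory (dim_small, aliased as d).
select /*+ MAPJOIN(d) */ f.id, d.name
from fact_big f
join dim_small d on (f.dim_id = d.id);
```

Both settings apply only to the current session, so other jobs keep the cluster defaults.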