Spark issue

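  1. The driver reports the error below, and the failure also shows up at the collect call in my own code. The top user query does not fail, but the top file query does. My guess is that this is because there are far more files than users.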
20/08/24 08:37:15 ERROR MicroBatchExecution: Query [id = de341482-5e75-4c34-b924-146a7eb6c9b0, runId = 13007eb2-10eb-4ef0-a799-dc048a7fc0bf] terminated with error
org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 (start at top_n.scala:646) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 2
        at org.apache.spark.MapOutputTracker$.$anonfun$convertMapStatuses$2(MapOutputTracker.scala:1010)
        a

Executor errors:

20/08/24 08:30:25 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:30:43 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:30:58 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:17 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:35 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:53 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:32:12 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:32:25 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.

References:

  https://blog.csdn.net/lingbo229/article/details/84943560

  https://stackoverflow.com/questions/39963946/pyspark-taskmemorymanager-failed-to-allocate-a-page-need-help-in-error-analys

Solution:

  Increased memory from 16G to 24G, switched to the G1 garbage collector, and turned on GC logging.
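
  As a rough illustration, the same settings could be applied programmatically through SparkConf. This is only a sketch under assumptions: it assumes the 16G -> 24G change refers to spark.executor.memory (the failing page allocations were on the executors), and the object name ExecutorGcConf and the application name "top_n" are hypothetical.

```scala
import org.apache.spark.SparkConf

object ExecutorGcConf {
  // Builds a SparkConf with a larger executor heap, G1 GC, and GC logging.
  // Assumption: the 16G -> 24G change refers to spark.executor.memory.
  def build(): SparkConf =
    new SparkConf()
      .setAppName("top_n") // hypothetical application name
      .set("spark.executor.memory", "24g")
      .set(
        "spark.executor.extraJavaOptions",
        "-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
      )

  def main(args: Array[String]): Unit =
    // Print the resulting settings so they can be checked before submitting.
    println(build().toDebugString)
}
```

  Note that -XX:+PrintGCDetails and -XX:+PrintGCDateStamps are Java 8 style flags. On JDK 9 or later, GC logging is configured through -Xlog:gc* instead. The exact option string used in this case was: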

"spark.executor.extraJavaOptions":  "-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps",
Original post: https://www.cnblogs.com/mashuai-191/p/13554718.html