记录一次JVM调优【GC日志的分析】

首先查看服务器版本默认信息:

修改tomcat/bin/catalina.sh,在最顶端加入JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCDetails -Xloggc:/usr/local/java/apache-tomcat-7.0.78/bin/gc.log"

随后在 -Xloggc指定路径下,打开gc.log,部分信息如下:

Java HotSpot(TM) 64-Bit Server VM (25.144-b01) for linux-amd64 JRE (1.8.0_144-b01), built on Jul 21 2017 21:57:33 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 4056480k(1084216k free), swap 0k(0k free)
CommandLine flags: -XX:InitialHeapSize=64903680 -XX:MaxHeapSize=1038458880 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC

[GC (Allocation Failure) [PSYoungGen: 15872K->2544K(18432K)] 15872K->4199K(60928K), 0.0076594 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
0.801: [GC (Allocation Failure) [PSYoungGen: 18416K->2554K(34304K)] 20071K->7458K(76800K), 0.0084303 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1.102: [GC (Allocation Failure) [PSYoungGen: 34298K->2532K(34304K)] 39202K->10385K(76800K), 0.0073978 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1.324: [GC (Allocation Failure) [PSYoungGen: 34276K->2540K(66048K)] 42129K->14849K(108544K), 0.0077529 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1.538: [GC (Allocation Failure) [PSYoungGen: 66028K->2542K(66048K)] 78337K->25278K(108544K), 0.0107662 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1.710: [GC (Allocation Failure) [PSYoungGen: 66030K->11795K(137216K)] 88766K->34540K(179712K), 0.0176510 secs] [Times: user=0.03 sys=0.01, real=0.02 secs]
2.065: [GC (Allocation Failure) [PSYoungGen: 136723K->14318K(139264K)] 159468K->48794K(181760K), 0.0246635 secs] [Times: user=0.04 sys=0.00, real=0.03 secs]
2.090: [Full GC (Ergonomics) [PSYoungGen: 14318K->2377K(139264K)] [ParOldGen: 34476K->42413K(78848K)] 48794K->44791K(218112K), [Metaspace: 13713K->13713K(1062912K)], 0.0826740 secs] [Times: user=0.15 sys=0.01, real=0.08 secs]
2.488: [GC (Allocation Failure) [PSYoungGen: 127305K->22522K(222720K)] 169719K->65057K(301568K), 0.0318258 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
3.086: [GC (Allocation Failure) [PSYoungGen: 222714K->26617K(268288K)] 265249K->87196K(347136K), 0.0409823 secs] [Times: user=0.07 sys=0.02, real=0.05 secs]
3.731: [GC (Allocation Failure) [PSYoungGen: 268281K->37864K(278016K)] 328860K->112315K(356864K), 0.0531917 secs] [Times: user=0.09 sys=0.02, real=0.05 secs]
3.785: [Full GC (Ergonomics) [PSYoungGen: 37864K->0K(278016K)] [ParOldGen: 74450K->59866K(113152K)] 112315K->59866K(391168K), [Metaspace: 13760K->13760K(1062912K)], 0.1234969 secs] [Times: user=0.24 sys=0.00, real=0.13 secs]
4.434: [GC (Allocation Failure) [PSYoungGen: 240128K->45645K(286208K)] 299994K->105511K(399360K), 0.0358239 secs] [Times: user=0.07 sys=0.00, real=0.04 secs]
4.989: [GC (Allocation Failure) [PSYoungGen: 285773K->42527K(269824K)] 345639K->109322K(382976K), 0.0452064 secs] [Times: user=0.09 sys=0.01, real=0.04 secs]
5.489: [GC (Allocation Failure) [PSYoungGen: 260127K->37422K(255488K)] 326922K->110639K(368640K), 0.0331642 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
7.060: [GC (Allocation Failure) [PSYoungGen: 255022K->11882K(278016K)] 328239K->89219K(391168K), 0.0199962 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]


7.328: [GC (Metadata GC Threshold) [PSYoungGen: 65740K->160K(276480K)] 143076K->81525K(389632K), 0.0071875 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
7.336: [Full GC (Metadata GC Threshold) [PSYoungGen: 160K->0K(276480K)] [ParOldGen: 81365K->30917K(90624K)] 81525K->30917K(367104K), [Metaspace: 23265K->23265K(1071104K)], 0.0701499 secs] [Times: user=0.12 sys=0.00, real=0.07 secs]
9.690: [GC (Allocation Failure) [PSYoungGen: 216064K->15373K(272896K)] 246981K->46298K(363520K), 0.0266823 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]


9.819: [GC (Metadata GC Threshold) [PSYoungGen: 25761K->8213K(274944K)] 56686K->39146K(365568K), 0.0153547 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
9.835: [Full GC (Metadata GC Threshold) [PSYoungGen: 8213K->0K(274944K)] [ParOldGen: 30933K->37095K(108032K)] 39146K->37095K(382976K), [Metaspace: 38475K->38475K(1083392K)], 0.0922216 secs] [Times: user=0.16 sys=0.00, real=0.09 secs]


77.664: [GC (Allocation Failure) [PSYoungGen: 217088K->56304K(242176K)] 254183K->119858K(350208K), 0.0250691 secs] [Times: user=0.04 sys=0.01, real=0.03 secs]
80.575: [GC (Allocation Failure) [PSYoungGen: 242160K->3375K(262144K)] 305714K->122388K(387072K), 0.0230121 secs] [Times: user=0.02 sys=0.02, real=0.03 secs]
80.599: [Full GC (Ergonomics) [PSYoungGen: 3375K->0K(262144K)] [ParOldGen: 119013K->108779K(213504K)] 122388K->108779K(475648K), [Metaspace: 54741K->54741K(1099776K)], 0.1379975 secs] [Times: user=0.24 sys=0.00, real=0.13 secs]

这里

-Xmx1038m:等价于-XX:MaxHeapSize,设置JVM最大堆内存为1038M。

-Xms64m:等价于-XX:InitialHeapSize,设置JVM初始堆内存为64M。

使用ParallelGC收集器

总共5次full gc,其中两次是元空间(jdk8默认值21.8M左右)引发的full gc,三次是HotSpot自动选择和调优引发的FullGC,自适应策略

改善:-XX:MetaspaceSize=64m 初始元空间大小

增加  JAVA_OPTS="$JAVA_OPTS -server -showversion -Xms512m -Xmx512m -XX:MetaspaceSize=64m"

如果-Xmx分配过大,会导致无法为其他tomcat分配内存:There is insufficient memory for the Java Runtime Environment to continue.这里尝试之后,选择512m

可以看见

Java HotSpot(TM) 64-Bit Server VM (25.144-b01) for linux-amd64 JRE (1.8.0_144-b01), built on Jul 21 2017 21:57:33 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 4056480k(1077372k free), swap 0k(0k free)
CommandLine flags: -XX:InitialHeapSize=2147483648 -XX:MaxHeapSize=2147483648 -XX:MetaspaceSize=67108864 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
2.563: [GC (Allocation Failure) [PSYoungGen: 524800K->85308K(611840K)] 524800K->85388K(2010112K), 0.1003043 secs] [Times: user=0.17 sys=0.02, real=0.10 secs]
3.991: [GC (Allocation Failure) [PSYoungGen: 610108K->87037K(611840K)] 610188K->103584K(2010112K), 0.1100503 secs] [Times: user=0.17 sys=0.04, real=0.11 secs]
5.245: [GC (Allocation Failure) [PSYoungGen: 611837K->87018K(611840K)] 628384K->140886K(2010112K), 0.1087186 secs] [Times: user=0.12 sys=0.03, real=0.11 secs]
8.794: [GC (Allocation Failure) [PSYoungGen: 611818K->62918K(611840K)] 665686K->116794K(2010112K), 0.0452909 secs] [Times: user=0.08 sys=0.00, real=0.04 secs]
38.881: [GC (Allocation Failure) [PSYoungGen: 587718K->87036K(611840K)] 641594K->186752K(2010112K), 0.1298147 secs] [Times: user=0.09 sys=0.12, real=0.13 secs]

在38ms之前是减少了4次full gc

重点观察,新生代->老年代的增量,避免promotion failed.   新生代的差量(预计回收,因为先minor gc,再old判断)-heap的差量(实际回收)=老年代的增量

eg:5.245: [GC (Allocation Failure) [PSYoungGen: 611837K->87018K(611840K)] 628384K->140886K(2010112K), 0.1087186 secs] [Times: user=0.12 sys=0.03, real=0.11 secs]

old= 611837-87018K-(628384-140886)=37321K,老年代增长了37M左右

继续改进,替换ParallelGC,改为CMS+ParNew的组合://当前物理内存4G,剩余内存1G

JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCDetails -Xloggc:/usr/local/java/apache-tomcat-robot/apache-tomcat-web/bin/gc.log -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/usr/local/java/apache-tomcat-robot/apache-tomcat-web/bin/java.hprof -XX:+PrintHeapAtGC"
JAVA_OPTS="$JAVA_OPTS -XX:+ExplicitGCInvokesConcurrent"
JAVA_OPTS="$JAVA_OPTS -server -showversion -Xms2g -Xmx2g -Xmn256m
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:MaxGCPauseMillis=200
-XX:+UseCompressedOops
-XX:+UseCMSCompactAtFullCollection
-XX:+CMSClassUnloadingEnabled
-XX:MetaspaceSize=128m
-XX:MaxMetaspaceSize=512m
-XX:MaxTenuringThreshold=7
-XX:TargetSurvivorRatio=90
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+ExplicitGCInvokesConcurrent
-XX:LargePageSizeInBytes=128m
-XX:+CMSParallelRemarkEnabled
-XX:CMSInitiatingOccupancyFraction=70"

TIP:

 -XX:TargetSurvivorRatio=90:允许90%的Survivor区被占用(JVM默认为50%)。提高对于Survivor区的使用率

 -XX:+ExplicitGCInvokesConcurrent  //system.gc使用cms算法

 -XX:+DisableExplicitGC//禁止system.gc

-XX:HeapDumpPath=/home/admin/logs/java.hprof //OOM快照文件的地址

-XX:SurvivorRatio=10
Eden与Survivor的占用比例,例如10表示,一个survivor区占用 1/10 的Eden内存,即1/12的新生代内存,
新生代有2个survivor,即S0和S1。所以survivor总共是占用新生代内存的 2/12,Eden与新生代的占比则为 10/12

-XX:MaxTenuringThreshold=6

-XX:+UseCMSCompactAtFullCollection //CMS开启碎片整理,CMS默认不会整理堆碎片,因此为了防止堆碎片引起full gc,可以开启该选项
-XX:+UseCompressedOops  //启用指针压缩,在64位HotSpot中使用32位指针,默认64位会比32位的内存使用多出1.5倍

-XX:+PrintHeapAtGC  //每一次GC前和GC后,都打印堆信息。

-XX:+CMSClassUnloadingEnabled //对方法区gc

-XX:PretenureSizeThreshold=65535 //eden区的大对象阈值65M

产生的正常日志片断:

。。。
{Heap before GC invocations=22 (full 0): par new generation total 235968K, used 235968K [0x0000000080000000, 0x0000000090000000, 0x0000000090000000) eden space 209792K, 100% used [0x0000000080000000, 0x000000008cce0000, 0x000000008cce0000) from space 26176K, 100% used [0x000000008cce0000, 0x000000008e670000, 0x000000008e670000) to space 26176K, 0% used [0x000000008e670000, 0x000000008e670000, 0x0000000090000000) concurrent mark-sweep generation total 1835008K, used 1123030K [0x0000000090000000, 0x0000000100000000, 0x0000000100000000) Metaspace used 64924K, capacity 66906K, committed 67072K, reserved 1107968K class space used 7769K, capacity 8333K, committed 8448K, reserved 1048576K 2223.576: [GC (Allocation Failure) 2223.576: [ParNew: 235968K->26176K(235968K), 0.0583532 secs] 1358998K->1225544K(2070976K), 0.0585896 secs] [Times: user=0.08 sys=0.02, real=0.06 secs] Heap after GC invocations=23 (full 0): par new generation total 235968K, used 26176K [0x0000000080000000, 0x0000000090000000, 0x0000000090000000) eden space 209792K, 0% used [0x0000000080000000, 0x0000000080000000, 0x000000008cce0000) from space 26176K, 100% used [0x000000008e670000, 0x0000000090000000, 0x0000000090000000) to space 26176K, 0% used [0x000000008cce0000, 0x000000008cce0000, 0x000000008e670000) concurrent mark-sweep generation total 1835008K, used 1199368K [0x0000000090000000, 0x0000000100000000, 0x0000000100000000) Metaspace used 64924K, capacity 66906K, committed 67072K, reserved 1107968K class space used 7769K, capacity 8333K, committed 8448K, reserved 1048576K }
。。。

打开GC前后输出堆信息:

黄色(GC前/后):

before:老年代总大小1835008k,已使用1123030k

after:老年代总大小1835008k,已使用1199368k

橘色(新生代GC变化):235968-26176-(1358998-1225544)=黄色差值(1199368-1123030)

当启动服务器后(时隔约1小时),第一次产生full gc:

{Heap before GC invocations=24 (full 0):
 par new generation   total 235968K, used 235967K [0x0000000080000000, 0x0000000090000000, 0x0000000090000000)
  eden space 209792K, 100% used [0x0000000080000000, 0x000000008cce0000, 0x000000008cce0000)
  from space 26176K,  99% used [0x000000008cce0000, 0x000000008e66fff8, 0x000000008e670000)
  to   space 26176K,   0% used [0x000000008e670000, 0x000000008e670000, 0x0000000090000000)
 concurrent mark-sweep generation total 1835008K, used 1275665K [0x0000000090000000, 0x0000000100000000, 0x0000000100000000)
 Metaspace       used 65433K, capacity 67440K, committed 67584K, reserved 1107968K
  class space    used 7770K, capacity 8336K, committed 8448K, reserved 1048576K
2699.930: [GC (Allocation Failure) 2699.930: [ParNew: 235967K->26176K(235968K), 0.0610195 secs] 1511633K->1386131K(2070976K), 0.0612115 secs] [Times: user=0.09 sys=0.02, real=0.06 secs] 
Heap after GC invocations=25 (full 0):
 par new generation   total 235968K, used 26176K [0x0000000080000000, 0x0000000090000000, 0x0000000090000000)
  eden space 209792K,   0% used [0x0000000080000000, 0x0000000080000000, 0x000000008cce0000)
  from space 26176K, 100% used [0x000000008e670000, 0x0000000090000000, 0x0000000090000000)
  to   space 26176K,   0% used [0x000000008cce0000, 0x000000008cce0000, 0x000000008e670000)
 concurrent mark-sweep generation total 1835008K, used 1359955K [0x0000000090000000, 0x0000000100000000, 0x0000000100000000)
 Metaspace       used 65433K, capacity 67440K, committed 67584K, reserved 1107968K
  class space    used 7770K, capacity 8336K, committed 8448K, reserved 1048576K
}
2699.992: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1359955K(1835008K)] 1386239K(2070976K), 0.0043239 secs] [Times: user=0.01 sys=0.02, real=0.00 secs] 
2699.997: [CMS-concurrent-mark-start]
2700.036: [CMS-concurrent-mark: 0.039/0.039 secs] [Times: user=0.08 sys=0.00, real=0.04 secs] 
2700.036: [CMS-concurrent-preclean-start]
2700.040: [CMS-concurrent-preclean: 0.004/0.004 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
2700.040: [GC (CMS Final Remark) [YG occupancy: 26284 K (235968 K)]2700.040: [Rescan (parallel) , 0.0018878 secs]2700.042: [weak refs processing, 0.0027371 secs]2700.045: [class unloading, 0.0212439 secs]2700.066: [scrub symbol table, 0.0075598 secs]2700.074: [scrub string table, 0.0012169 secs][1 CMS-remark: 1359955K(1835008K)] 1386239K(2070976K), 0.0456603 secs] [Times: user=0.06 sys=0.00, real=0.04 secs] 
2700.086: [CMS-concurrent-sweep-start]
2700.170: [CMS-concurrent-sweep: 0.082/0.083 secs] [Times: user=0.09 sys=0.00, real=0.09 secs] 
2700.170: [CMS-concurrent-reset-start]
2700.175: [CMS-concurrent-reset: 0.005/0.005 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
{Heap before GC invocations=25 (full 1):
 par new generation   total 235968K, used 235968K [0x0000000080000000, 0x0000000090000000, 0x0000000090000000)
  eden space 209792K, 100% used [0x0000000080000000, 0x000000008cce0000, 0x000000008cce0000)
  from space 26176K, 100% used [0x000000008e670000, 0x0000000090000000, 0x0000000090000000)
  to   space 26176K,   0% used [0x000000008cce0000, 0x000000008cce0000, 0x000000008e670000)
 concurrent mark-sweep generation total 1835008K, used 431812K [0x0000000090000000, 0x0000000100000000, 0x0000000100000000)
 Metaspace       used 65340K, capacity 67220K, committed 67584K, reserved 1107968K
  class space    used 7748K, capacity 8294K, committed 8448K, reserved 1048576K
2938.667: [GC (Allocation Failure) 2938.667: [ParNew: 235968K->26176K(235968K), 0.0448878 secs] 667780K->541410K(2070976K), 0.0450784 secs] [Times: user=0.09 sys=0.00, real=0.05 secs] 
Heap after GC invocations=26 (full 1):
 par new generation   total 235968K, used 26176K [0x0000000080000000, 0x0000000090000000, 0x0000000090000000)
  eden space 209792K,   0% used [0x0000000080000000, 0x0000000080000000, 0x000000008cce0000)
  from space 26176K, 100% used [0x000000008cce0000, 0x000000008e670000, 0x000000008e670000)
  to   space 26176K,   0% used [0x000000008e670000, 0x000000008e670000, 0x0000000090000000)
 concurrent mark-sweep generation total 1835008K, used 515234K [0x0000000090000000, 0x0000000100000000, 0x0000000100000000)
 Metaspace       used 65340K, capacity 67220K, committed 67584K, reserved 1107968K
  class space    used 7748K, capacity 8294K, committed 8448K, reserved 1048576K
}

可以看到当超过老年代阈值的时候,会触发cms,清理掉约1g的垃圾


GC之日分析其他实例,参考 http://hot66hot.iteye.com/blog/2075819

新生代(ParNew YGC)promotion failed日志 

  1. 2014-02-27T21:19:42.460+0800: 210095.040: [GC 210095.040: [ParNew (promotion failed): 1887487K->1887488K(1887488K), 0.4818790 secs]210095.522: [CMS: 13706434K->7942818K(23068672K), 9.7152990 secs] 15358303K->7942818K(24956160K), [CMS Perm : 27424K->27373K(98304K)], 10.1974110 secs] [Times: user=12.06 sys=0.01, real=10.20 secs]  

 promotion failed一般发生在新生代晋升老年代时,引发Full Gc.  //CMS: 13706434K->7942818K(23068672K)指的是老年代的前后变化

解决思路:YGC晋升对象过大,过老.(mid/long Time Object),

调整-XX:PretenureSizeThreshold=65535,//虚拟机提供了一个-XX:PretenureSizeThreshold参数,令大于这个设置值的对象直接在老年代分配。这样做的目的是避免在Eden区及两个Survivor区之间发生大量的内存复制 ,注意:PretenureSizeThreshold参数只对Serial和ParNew两款收集器有效

 -XX:MaxTenuringThreshold=6,缩短年龄,使新生代空间得以保障

老年代(CMS old gc)分析:

  1. 2014-02-28T23:58:42.314+0800: 25789.661: [GC [1 CMS-initial-mark: 17303356K(23068672K)] 18642315K(24903680K), 1.0400410 secs] [Times: user=1.04 sys=0.00, real=1.04 secs]  
  2. 2014-02-28T23:58:43.354+0800: 25790.701: [CMS-concurrent-mark-start]  
  3. 2014-02-28T23:58:43.717+0800: 25791.064: [CMS-concurrent-mark: 0.315/0.363 secs] [Times: user=1.64 sys=0.02, real=0.37 secs]  
  4. 2014-02-28T23:58:43.717+0800: 25791.064: [CMS-concurrent-preclean-start]  
  5. 2014-02-28T23:58:43.907+0800: 25791.254: [CMS-concurrent-preclean: 0.181/0.190 secs] [Times: user=0.20 sys=0.01, real=0.19 secs]  
  6. 2014-02-28T23:58:43.907+0800: 25791.254: [CMS-concurrent-abortable-preclean-start]  
  7.  CMS: abort preclean due to time 2014-02-28T23:58:49.082+0800: 25796.429: [CMS-concurrent-abortable-preclean: 5.165/5.174 secs] [Times: user=5.40 sys=0.04, real=5.17 secs]  
  8. 2014-02-28T23:58:49.083+0800: 25796.430: [GC[YG occupancy: 1365142 K (1835008 K)]2014-02-28T23:58:49.083+0800: 25796.430: [Rescan (parallel) , 0.9690640 secs]2014-02-28T23:58:50.052+0800: 25797.399: [weak refs processing, 0.0006190 secs]2014-02-28T23:58:50.053+0800: 25797.400: [scrub string table, 0.0006290 secs] [1 CMS-remark: 17355150K(23068672K)] 18720292K(24903680K), 0.9706650 secs] [Times: user=16.49 sys=0.06, real=0.97 secs]  
  9. 2014-02-28T23:58:50.054+0800: 25797.401: [CMS-concurrent-sweep-start]  
  10. 2014-02-28T23:58:51.940+0800: 25799.287: [CMS-concurrent-sweep: 1.875/1.887 secs] [Times: user=2.03 sys=0.03, real=1.89 secs]  
  11. 2014-02-28T23:58:51.941+0800: 25799.288: [CMS-concurrent-reset-start]  
  12. 2014-02-28T23:58:52.067+0800: 25799.414: [CMS-concurrent-reset: 0.127/0.127 secs] [Times: user=0.13 sys=0.00, real=0.13 secs]  
  13. 2014-03-01T00:00:36.293+0800: 25903.640: [GC2014-03-01T00:00:36.293+0800: 25903.640: [ParNew: 1805234K->226801K(1835008K), 0.1020510 secs] 10902912K->9434796K(24903680K), 0.1023150 secs] [Times: user=1.35 sys=0.02, real=0.10 secs]  
  14. 2014-03-01T00:07:13.559+0800: 26300.906: [GC2014-03-01T00:07:13.559+0800: 26300.906: [ParNew: 1799665K->248991K(1835008K), 0.0876870 secs] 14086673K->12612462K(24903680K), 0.0879620 secs] [Times: user=1.24 sys=0.01, real=0.09 secs]  

 CMS的gc日志分为一下几个步骤,重点关注initial-mark和remark这两个阶段,因为这两个阶段会stop进程。

初始标记(init mark):收集根引用,这是一个stop-the-world阶段。

并发标记(concurrent mark):这个阶段可以和用户应用并发进行。遍历老年代的对象图,标记出活着的对象。

并发预清理(concurrent preclean):这同样是一个并发的阶段。主要的用途也是用来标记,用来标记那些在前面标记之后,发生变化的引用。主要是为了缩短remark阶段的stop-the-world的时间。

重新标记(remark):这是一个stop-the-world的操作。暂停各个应用,统计那些在发生变化的标记。

并发清理(concurrent sweep):并发扫描整个老年代,回收一些在对象图中不可达对象所占用的空间。

并发重置(concurrent reset):重置某些数据结果,以备下一个回收周期

老年代(CMS old GC ) concurrent mode failure日志

  1. 2014-03-03T09:38:26.457+0800: 233373.804: [GC [1 CMS-initial-mark: 17319615K(23068672K)] 17351070K(24903680K), 0.0419440 secs]   
  2. [Times: user=0.04 sys=0.00, real=0.04 secs]  
  3. 2014-03-03T09:38:26.499+0800: 233373.846: [CMS-concurrent-mark-start]  
  4. 2014-03-03T09:38:28.175+0800: 233375.522: [GC2014-03-03T09:38:28.175+0800: 233375.522: [CMS2014-03-03T09:38:28.887+0800: 233376.234:   
  5. [CMS-concurrent-mark: 1.989/2.388 secs] [Times: user=14.37 sys=0.24, real=2.39 secs]  
  6.  (concurrent mode failure): 17473174K->8394653K(23068672K), 19.3309170 secs] 18319691K->8394653K(24903680K),   
  7.  [CMS Perm : 23157K->23154K(98304K)], 19.3311700 secs] [Times: user=22.18 sys=0.00, real=19.33 secs]  

 concurrent mode failure一般发生在CMS GC 运行过程中(有新的大对象或者满足old阈值的对象晋升老年代),老年代空间不足,引发MSC(Full GC)

上面的这条发日志说明CMS运行到CMS-concurrent-mark过程中就出现空间不足,产生并发失败(17319615K(23068672K)占77%),

解决思路:降低YGC频率,降低CMS GC触发时机,适当降低CMSInitiatingOccupancyFraction(即分配给老年代更大的预留空间).

由新生代触发的老年代(CMS old GC ) concurrent mode failure日志

197.976: [GC 197.976: [ParNew: 260872K->260872K(261952K), 0.0000688 secs]

197.976: [CMS197.981: [CMS-concurrent-sweep: 0.516/0.531 secs] (concurrent mode failure): 402978K->248977K(786432K), 2.3728734 secs] 663850K->248977K(1048384K), 2.3733725 secs]

这个例子表明:GC算法进入了concurrent mode failure状态,由于老年代的内存碎片太多导致的老年代空间不足,需要调用一个serail Old GC(阻塞了其他线程)来清理系统的Heap

concurrent mode failure和promotion failed差别:

promotion failed是在进行Minor GC时,survivor space放不下、对象只能放入旧生代,而此时旧生代也放不下造成的;

concurrent mode failure是在执行CMS GC的过程中同时有对象要放入旧生代,而此时旧生代空间不足造成的。

当出现concurrent mode failure和promotion failed,会接着引发full gc的。

所以在别一次性占用太多的内存,如果是读文件,可以采用拆分的方法(批次读入缓存区)。或者把GC内存调大点

原文地址:https://www.cnblogs.com/brxHqs/p/9636449.html