KVM virtual machine running out of memory: tuning parameters

Dec 20 21:23:45 vgfs001 kernel: tiotest_AMD_x86 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
Dec 20 21:23:45 vgfs001 kernel: tiotest_AMD_x86 cpuset=/ mems_allowed=0
Dec 20 21:23:45 vgfs001 kernel: Pid: 1937, comm: tiotest_AMD_x86 Not tainted 2.6.32-431.29.2.lustre.el6.x86_64 #1
Dec 20 21:23:45 vgfs001 kernel: Call Trace:
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff810d07b1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81122b80>] ? dump_header+0x90/0x1b0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8122894c>] ? security_real_capable_noaudit+0x3c/0x70
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81123002>] ? oom_kill_process+0x82/0x2a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81122f41>] ? select_bad_process+0xe1/0x120
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81123440>] ? out_of_memory+0x220/0x3c0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8112fd5f>] ? __alloc_pages_nodemask+0x89f/0x8d0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81167cea>] ? alloc_pages_current+0xaa/0x110
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8111ff77>] ? __page_cache_alloc+0x87/0x90
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81120c8e>] ? grab_cache_page_write_begin+0x8e/0xc0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a8f228>] ? ll_write_begin+0x58/0x1a0 [lustre]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff811204f3>] ? generic_file_buffered_write+0x123/0x2e0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81078fd7>] ? current_fs_time+0x27/0x30
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81121f50>] ? __generic_file_aio_write+0x260/0x490
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa05211a5>] ? cl_env_info+0x15/0x20 [obdclass]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81122208>] ? generic_file_aio_write+0x88/0x100
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0aa3907>] ? vvp_io_write_start+0x137/0x2a0 [lustre]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa05301da>] ? cl_io_start+0x6a/0x140 [obdclass]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa05348e4>] ? cl_io_loop+0xb4/0x1b0 [obdclass]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a46306>] ? ll_file_io_generic+0x2a6/0x610 [lustre]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a47192>] ? ll_file_aio_write+0x142/0x2c0 [lustre]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a4747c>] ? ll_file_write+0x16c/0x2a0 [lustre]
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81189298>] ? vfs_write+0xb8/0x1a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81189c61>] ? sys_write+0x51/0x90
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff810e204e>] ? __audit_syscall_exit+0x25e/0x290
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Dec 20 21:23:45 vgfs001 kernel: Mem-Info:
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA per-cpu:
Dec 20 21:23:45 vgfs001 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    4: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    5: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    6: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    7: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    8: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    9: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU   10: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU   11: hi:    0, btch:   1 usd:   0
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA32 per-cpu:
Dec 20 21:23:45 vgfs001 kernel: CPU    0: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    1: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    5: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    6: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    7: hi:  186, btch:  31 usd:  11
Dec 20 21:23:45 vgfs001 kernel: CPU    8: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    9: hi:  186, btch:  31 usd:  46
Dec 20 21:23:45 vgfs001 kernel: CPU   10: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU   11: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: Node 0 Normal per-cpu:
Dec 20 21:23:45 vgfs001 kernel: CPU    0: hi:  186, btch:  31 usd:   2
Dec 20 21:23:45 vgfs001 kernel: CPU    1: hi:  186, btch:  31 usd:   7
Dec 20 21:23:45 vgfs001 kernel: CPU    2: hi:  186, btch:  31 usd:  27
Dec 20 21:23:45 vgfs001 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    4: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    5: hi:  186, btch:  31 usd:  39
Dec 20 21:23:45 vgfs001 kernel: CPU    6: hi:  186, btch:  31 usd:  33
Dec 20 21:23:45 vgfs001 kernel: CPU    7: hi:  186, btch:  31 usd:   0
Dec 20 21:23:45 vgfs001 kernel: CPU    8: hi:  186, btch:  31 usd:   1
Dec 20 21:23:45 vgfs001 kernel: CPU    9: hi:  186, btch:  31 usd:  35
Dec 20 21:23:45 vgfs001 kernel: CPU   10: hi:  186, btch:  31 usd:  29
Dec 20 21:23:45 vgfs001 kernel: CPU   11: hi:  186, btch:  31 usd:   2
Dec 20 21:23:45 vgfs001 kernel: active_anon:1198006 inactive_anon:171400 isolated_anon:96
Dec 20 21:23:45 vgfs001 kernel: active_file:548228 inactive_file:548497 isolated_file:0
Dec 20 21:23:45 vgfs001 kernel: unevictable:0 dirty:899 writeback:2342 unstable:0
Dec 20 21:23:45 vgfs001 kernel: free:29297 slab_reclaimable:10639 slab_unreclaimable:376601
Dec 20 21:23:45 vgfs001 kernel: mapped:1032 shmem:0 pagetables:5613 bounce:0
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA free:15708kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15320kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 20 21:23:45 vgfs001 kernel: lowmem_reserve[]: 0 3512 12097 12097
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA32 free:53892kB min:19596kB low:24492kB high:29392kB active_anon:4kB inactive_anon:44kB active_file:1249260kB inactive_file:1249288kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596496kB mlocked:0kB dirty:3436kB writeback:4180kB mapped:0kB shmem:0kB slab_reclaimable:24608kB slab_unreclaimable:689432kB kernel_stack:8kB pagetables:196kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4212142 all_unreclaimable? no
Dec 20 21:23:45 vgfs001 kernel: lowmem_reserve[]: 0 0 8585 8585
Dec 20 21:23:45 vgfs001 kernel: Node 0 Normal free:47588kB min:47900kB low:59872kB high:71848kB active_anon:4792020kB inactive_anon:685556kB active_file:943652kB inactive_file:944700kB unevictable:0kB isolated(anon):384kB isolated(file):0kB present:8791040kB mlocked:0kB dirty:160kB writeback:5188kB mapped:4128kB shmem:0kB slab_reclaimable:17948kB slab_unreclaimable:816972kB kernel_stack:5040kB pagetables:22256kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2346101 all_unreclaimable? no
Dec 20 21:23:45 vgfs001 kernel: lowmem_reserve[]: 0 0 0 0
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA: 3*4kB 2*8kB 2*16kB 1*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15708kB
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA32: 183*4kB 19*8kB 19*16kB 19*32kB 24*64kB 17*128kB 7*256kB 5*512kB 27*1024kB 8*2048kB 0*4096kB = 53892kB
Dec 20 21:23:45 vgfs001 kernel: Node 0 Normal: 109*4kB 185*8kB 121*16kB 43*32kB 8*64kB 117*128kB 43*256kB 8*512kB 1*1024kB 1*2048kB 2*4096kB = 47084kB
Dec 20 21:23:45 vgfs001 kernel: 1269461 total pagecache pages
Dec 20 21:23:45 vgfs001 kernel: 172616 pages in swap cache
Dec 20 21:23:45 vgfs001 kernel: Swap cache stats: add 1017139, delete 844523, find 444300/457367
Dec 20 21:23:45 vgfs001 kernel: Free swap  = 3377416kB
Dec 20 21:23:45 vgfs001 kernel: Total swap = 4194300kB
Dec 20 21:23:45 vgfs001 kernel: 3145727 pages RAM
Dec 20 21:23:45 vgfs001 kernel: 96633 pages reserved
Dec 20 21:23:45 vgfs001 kernel: 9844603 pages shared
Dec 20 21:23:45 vgfs001 kernel: 528776 pages non-shared
Dec 20 21:23:45 vgfs001 kernel: [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
Dec 20 21:23:45 vgfs001 kernel: [  591]     0   591     2817        4   9     -17         -1000 udevd
Dec 20 21:23:45 vgfs001 kernel: [ 2028]     0  2028     6899       30   0     -17         -1000 auditd
Dec 20 21:23:45 vgfs001 kernel: [ 2058]     0  2058    63875       54   2       0             0 rsyslogd
Dec 20 21:23:45 vgfs001 kernel: [ 2088]     0  2088     2740       38   7       0             0 irqbalance
Dec 20 21:23:45 vgfs001 kernel: [ 2110]    32  2110     4744       22   1       0             0 rpcbind
Dec 20 21:23:45 vgfs001 kernel: [ 2229]    81  2229     8028        9   3       0             0 dbus-daemon
Dec 20 21:23:45 vgfs001 kernel: [ 2251]    29  2251     5837       10   2       0             0 rpc.statd
Dec 20 21:23:45 vgfs001 kernel: [ 2281]     0  2281    47351       11   7       0             0 cupsd
Dec 20 21:23:45 vgfs001 kernel: [ 2317]     0  2317     1020        8   0       0             0 acpid
Dec 20 21:23:45 vgfs001 kernel: [ 2327]    68  2327     9771      123   9       0             0 hald
Dec 20 21:23:45 vgfs001 kernel: [ 2328]     0  2328     5100        9  10       0             0 hald-runner
Dec 20 21:23:45 vgfs001 kernel: [ 2370]     0  2370     5630        8   7       0             0 hald-addon-inpu
Dec 20 21:23:45 vgfs001 kernel: [ 2376]    68  2376     4502        9   0       0             0 hald-addon-acpi
Dec 20 21:23:45 vgfs001 kernel: [ 2396]     0  2396    96535       42  11       0             0 automount
Dec 20 21:23:45 vgfs001 kernel: [ 2425]     0  2425    16671        8   4     -17         -1000 sshd
Dec 20 21:23:45 vgfs001 kernel: [ 2534]     0  2534    20331       28   4       0             0 master
Dec 20 21:23:45 vgfs001 kernel: [ 2549]    89  2549    20397       29  10       0             0 qmgr
Dec 20 21:23:45 vgfs001 kernel: [ 2562]     0  2562    28661        7   1       0             0 abrtd
Dec 20 21:23:45 vgfs001 kernel: [ 2577]     0  2577    27116       77   6       0             0 ksmtuned
Dec 20 21:23:45 vgfs001 kernel: [ 2589]     0  2589    29332       21   6       0             0 crond
Dec 20 21:23:45 vgfs001 kernel: [ 2638]     0  2638     5394        5   4       0             0 atd
Dec 20 21:23:45 vgfs001 kernel: [ 2649]     0  2649   104692     1712   3       0             0 python
Dec 20 21:23:45 vgfs001 kernel: [ 2666]     0  2666   257137      979   3       0             0 libvirtd
Dec 20 21:23:45 vgfs001 kernel: [ 2695]     0  2695    27085        6   5       0             0 rhsmcertd
Dec 20 21:23:45 vgfs001 kernel: [ 2796]    99  2796     3223        9   7       0             0 dnsmasq
Dec 20 21:23:45 vgfs001 kernel: [ 2802]     0  2802    16175        7   1       0             0 certmonger
Dec 20 21:23:45 vgfs001 kernel: [ 2824]     0  2824    33502       11   1       0             0 gdm-binary
Dec 20 21:23:45 vgfs001 kernel: [ 2840]     0  2840     1016        6   3       0             0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2842]     0  2842     1016        6   7       0             0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2844]     0  2844     1016        6   4       0             0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2846]     0  2846     1016        6   4       0             0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2850]     0  2850     1016        6   4       0             0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2862]     0  2862     3212        4   9     -17         -1000 udevd
Dec 20 21:23:45 vgfs001 kernel: [ 2863]     0  2863     3212        4   9     -17         -1000 udevd
Dec 20 21:23:45 vgfs001 kernel: [ 2911]     0  2911    41157       11   6       0             0 gdm-simple-slav
Dec 20 21:23:45 vgfs001 kernel: [ 2929]     0  2929    35211      911   2       0             0 Xorg
Dec 20 21:23:45 vgfs001 kernel: [ 2970]     0  2970  1029163       10   1       0             0 console-kit-dae
Dec 20 21:23:45 vgfs001 kernel: [ 3040]    42  3040     5010        5   9       0             0 dbus-launch
Dec 20 21:23:45 vgfs001 kernel: [ 3041]    42  3041     7951       10   0       0             0 dbus-daemon
Dec 20 21:23:45 vgfs001 kernel: [ 3043]    42  3043    67404       11   8       0             0 gnome-session
Dec 20 21:23:45 vgfs001 kernel: [ 3046]     0  3046    12497       11   3       0             0 devkit-power-da
Dec 20 21:23:45 vgfs001 kernel: [ 3052]    42  3052    33326       64   0       0             0 gconfd-2
Dec 20 21:23:45 vgfs001 kernel: [ 3069]    42  3069    91526     3293   8       0             0 gnome-settings-
Dec 20 21:23:45 vgfs001 kernel: [ 3070]    42  3070    30178       56   0       0             0 at-spi-registry
Dec 20 21:23:45 vgfs001 kernel: [ 3072]    42  3072    89614       11   6       0             0 bonobo-activati
Dec 20 21:23:45 vgfs001 kernel: [ 3080]    42  3080    33821       11   8       0             0 gvfsd
Dec 20 21:23:45 vgfs001 kernel: [ 3081]    42  3081    72400       92   0       0             0 metacity
Dec 20 21:23:45 vgfs001 kernel: [ 3084]    42  3084    68544       64   2       0             0 gnome-power-man
Dec 20 21:23:45 vgfs001 kernel: [ 3085]    42  3085    62195       10   6       0             0 polkit-gnome-au
Dec 20 21:23:45 vgfs001 kernel: [ 3087]    42  3087    96302      288   0       0             0 gdm-simple-gree
Dec 20 21:23:45 vgfs001 kernel: [ 3094]     0  3094    13186       10   9       0             0 polkitd
Dec 20 21:23:45 vgfs001 kernel: [ 3107]    42  3107    86550        9   5       0             0 pulseaudio
Dec 20 21:23:45 vgfs001 kernel: [ 3109]   499  3109    42114       25  10       0             0 rtkit-daemon
Dec 20 21:23:45 vgfs001 kernel: [ 3114]     0  3114    35562       11   6       0             0 gdm-session-wor
Dec 20 21:23:45 vgfs001 kernel: [27425]     0 27425    25109       40   3       0             0 sshd
Dec 20 21:23:45 vgfs001 kernel: [27430]     0 27430    27123       80   6       0             0 bash
Dec 20 21:23:45 vgfs001 kernel: [ 1567]     0  1567  1711609  1190642   1       0             0 lwfsd
Dec 20 21:23:45 vgfs001 kernel: [ 1691]    89  1691    20351       20   5       0             0 pickup
Dec 20 21:23:45 vgfs001 kernel: [ 1926]     0  1926    25227       25   8       0             0 sleep
Dec 20 21:23:45 vgfs001 kernel: [ 1927]     0  1927    46749     4269   7       0             0 tiotest_AMD_x86
Dec 20 21:23:45 vgfs001 kernel: Out of memory: Kill process 1567 (lwfsd) score 306 or sacrifice child
Dec 20 21:23:45 vgfs001 kernel: Killed process 1567, UID 0, (lwfsd) total-vm:6846436kB, anon-rss:4742528kB, file-rss:20040kB


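In this trace the OOM was triggered from a Lustre entry point (a page-cache allocation under ll_write_begin). In practice other entry points, such as the KVM management programs, can trigger it just as well: any point that allocates memory can end up invoking the OOM killer.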

Judging from the analysis, it is indeed the Lustre cache that occupies a large share of memory, leaving too little for new allocations.
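
To put the Mem-Info counters from the log above into perspective, the page counts can be converted to MiB. This is a minimal sketch that assumes the usual 4 kB page size on x86_64 (the buddy lists in the log start at 4kB blocks); pages_to_mib is a helper name introduced here for illustration.

```shell
#!/bin/bash
# Convert a page count from the OOM dump into MiB (integer, rounded down).
# Assumption: 4 kB pages, as on standard x86_64 kernels.
pages_to_mib() {
    echo $(( $1 * 4 / 1024 ))
}

# Counters copied from the Mem-Info section of the log above.
echo "RAM total:          $(pages_to_mib 3145727) MiB"
echo "page cache:         $(pages_to_mib 1269461) MiB"
echo "anonymous memory:   $(pages_to_mib $((1198006 + 171400))) MiB"
echo "unreclaimable slab: $(pages_to_mib 376601) MiB"
echo "free:               $(pages_to_mib 29297) MiB"
```

Under that assumption, page cache plus unreclaimable slab account for roughly half of the 12 GB guest, and anonymous memory (most of it the lwfsd process that was killed) for most of the rest.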

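Three measures were tried.
1. Increase the memory
Increase the guest from 12 GB to 16 GB.
virsh setmaxmem vgfsxxx 16GB --config
Once the guest has been started again:
virsh setmem vgfsxxx 16GB
This did not help: after a few test runs the service still went down.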
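
virsh interprets a bare size as KiB, and as far as the libvirt documentation describes its scaled integers, a suffix ending in B such as GB is read as a power of 1000 rather than 1024. Passing the size explicitly in KiB avoids that ambiguity. A minimal sketch; gib_to_kib, set_max_memory and set_current_memory are helper names introduced here, and vgfsxxx is the placeholder domain name used above.

```shell
#!/bin/bash
# Convert GiB to KiB, the default unit of virsh setmaxmem / setmem.
gib_to_kib() {
    echo $(( $1 * 1024 * 1024 ))
}

# Raise the persistent maximum; it applies from the next guest start.
# Usage: set_max_memory vgfsxxx 16
set_max_memory() {
    virsh setmaxmem "$1" "$(gib_to_kib "$2")" --config
}

# Raise the current allocation of a running guest.
# Usage: set_current_memory vgfsxxx 16
set_current_memory() {
    virsh setmem "$1" "$(gib_to_kib "$2")"
}
```

Running free -m inside the guest afterwards shows whether the new size actually arrived.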

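2. Adjust the OOM priority of lwfsd
Set the oom_adj value of lwfsd to "-17", which exempts the process from the OOM killer.
# a bare "ps" only lists processes of the current terminal, so look the daemon up by name
PID=$(pgrep -x lwfsd)
echo -17 > /proc/$PID/oom_adj
echo -17 > /proc/$PID/task/$PID/oom_adj
This seems to help.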
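
The oom_adj value is a per-process setting, so it has to be reapplied whenever lwfsd is restarted, and a lookup by name may match more than one PID. The following is a sketch of a small helper that covers both cases; protect_from_oom and its proc-root argument are names introduced here for illustration, not part of any existing tool.

```shell
#!/bin/bash
# Write -17 to oom_adj for each given PID.
# Usage: protect_from_oom <proc_root> <pid>...
# proc_root is normally /proc; it is a parameter only so that the helper
# can be tried out against a scratch directory first.
protect_from_oom() {
    local proc_root=$1 pid
    shift
    for pid in "$@"; do
        if [ -w "$proc_root/$pid/oom_adj" ]; then
            echo -17 > "$proc_root/$pid/oom_adj"
        else
            echo "skipping $pid: oom_adj not writable" >&2
        fi
    done
}

# Typical call as root, for example from the script that starts lwfsd:
#   protect_from_oom /proc $(pgrep -x lwfsd)
```

In the process table of the log above, the processes that already carry oom_adj -17 (udevd, auditd, sshd) show oom_score_adj -1000, which is the equivalent setting in the newer interface.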

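3. Change the memory overcommit policy
In addition, run echo "2" >/proc/sys/vm/overcommit_memory, which switches the kernel to strict accounting: an allocation succeeds only if there is enough space to back the mapping.
This also seems to help to some extent. More test runs are needed to confirm it.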
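
With overcommit_memory set to 2 the kernel refuses allocations once the committed address space would exceed a commit limit, which the kernel documentation gives as swap plus overcommit_ratio percent of RAM (huge pages aside). A rough sketch of that arithmetic for this guest, assuming the default overcommit_ratio of 50 and approximating usable RAM from the "pages RAM" minus "pages reserved" lines of the log; commit_limit_kb is a name introduced here.

```shell
#!/bin/bash
# CommitLimit (kB) = swap + RAM * overcommit_ratio / 100, huge pages ignored.
# Usage: commit_limit_kb <ram_kb> <swap_kb> <overcommit_ratio>
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# Approximate usable RAM from the log: (3145727 - 96633) pages of 4 kB each.
ram_kb=$(( (3145727 - 96633) * 4 ))
swap_kb=4194300   # "Total swap" from the log
ratio=50          # assumed default value of vm.overcommit_ratio

echo "estimated CommitLimit: $(commit_limit_kb "$ram_kb" "$swap_kb" "$ratio") kB"
```

On the live system the actual figures are in the CommitLimit and Committed_AS lines of /proc/meminfo. Strict accounting limits virtual address space, so a process such as lwfsd, whose total-vm in the log was about 6.5 GiB, may start to see failed allocations instead of being killed.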

Original article: https://www.cnblogs.com/wangtao1993/p/5999370.html