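Overview

When the system runs short of memory, the kernel invokes the oom-killer.
The oom-killer picks the most suitable process and kills it, which frees memory and increases the amount available to the rest of the system.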

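When is the oom-killer triggered?

The oom-killer is not triggered when malloc allocates memory. It is triggered when the allocated memory is actually used, that is, when the virtual pages have to be backed by physical memory.
The malloc manual page explains this: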

By default, Linux follows an optimistic memory allocation
strategy. This means that when malloc() returns non-NULL
there is no guarantee that the memory really is available.
This is a really bad bug. In case it turns out that the system
is out of memory, one or more processes will be killed by
the infamous OOM killer. In case Linux is employed under
circumstances where it would be less desirable to suddenly lose
some randomly picked processes, and moreover the kernel
version is sufficiently recent, one can switch off this
overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
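
The gap between allocating memory and using it can be observed from user space. The following is a minimal sketch, not a definitive test: it assumes a Linux /proc filesystem, uses an anonymous mmap as a stand-in for a large malloc, and the helper names (parse_vm_rss_kib, current_rss_kib, demo) are made up for this illustration. It compares the resident set size right after the mapping is created with the value after every page has been written once.

```python
import mmap
import re


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def current_rss_kib():
    """Resident set size of this process, or None if /proc is unavailable."""
    try:
        with open("/proc/self/status") as f:
            return parse_vm_rss_kib(f.read())
    except OSError:
        return None


def demo(size=64 * 1024 * 1024):
    """Map `size` bytes, then touch every page; return RSS before and after."""
    region = mmap.mmap(-1, size)      # reserves address space only
    after_map = current_rss_kib()
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1            # first write makes the kernel back the page
    after_touch = current_rss_kib()
    region.close()
    return after_map, after_touch


if __name__ == "__main__":
    before, after = demo()
    print("RSS after mapping (KiB):", before)
    print("RSS after touching (KiB):", after)
```

On a typical Linux system the second figure should be larger than the first by roughly the size of the mapping, because physical pages are only handed out on first access. That first access is the point at which an out-of-memory condition can surface.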

Overcommit is a Linux memory management policy. It is configured through /proc/sys/vm/overcommit_memory, which accepts three values:

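overcommit_memory	Policy	Notes
0	Heuristic overcommit	Obviously excessive allocations are refused. Intended for typical systems. root may allocate slightly more memory than ordinary users.
1	Always overcommit	Suited to applications that cannot tolerate allocation failures, for example some scientific computing workloads.
2	No overcommit (strict accounting)	Total committed address space may not exceed swap + RAM × overcommit_ratio / 100. Once the limit is reached, malloc fails and returns an error. The ratio is set through /proc/sys/vm/overcommit_ratio and defaults to 50.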
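
The limit used in mode 2 can be worked out by hand. The sketch below is illustrative only: the function names commit_limit_kib and read_int are invented here, and the formula ignores huge pages, which the kernel subtracts from RAM before applying the ratio. On Linux the value the kernel actually uses is reported as CommitLimit in /proc/meminfo.

```python
# Human-readable names for the three overcommit_memory values.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "no overcommit (strict accounting)",
}


def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50):
    """Approximate mode-2 limit: swap + RAM * overcommit_ratio / 100 (KiB)."""
    return swap_kib + ram_kib * overcommit_ratio // 100


def read_int(path):
    """Read a single integer from a /proc file; None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_int("/proc/sys/vm/overcommit_memory")
    ratio = read_int("/proc/sys/vm/overcommit_ratio")
    print("mode:", mode, OVERCOMMIT_MODES.get(mode, "unknown"))
    print("overcommit_ratio:", ratio)
    # Hypothetical machine: 8 GiB of RAM, 2 GiB of swap, default ratio.
    print("limit (KiB):", commit_limit_kib(8 * 1024 * 1024, 2 * 1024 * 1024))
```

For the hypothetical machine in the last line, the limit comes to 6 GiB: half of the 8 GiB of RAM plus all 2 GiB of swap.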

As long as overcommit is in effect, the kernel falls back on the oom-killer once memory actually runs out.

Why use the oom-killer?

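Most mainstream distribution kernels set /proc/sys/vm/overcommit_memory to 0, which means processes can request more memory than is actually available. This rests on two heuristics: allocated memory is not necessarily used right away, and a process may never use all of the memory it allocated over its entire lifetime.
If overcommit were disabled, the system could not make full use of its memory, and part of it would be wasted.
Overcommit lets the system use memory more efficiently, at the risk of running out of memory (OOM).
A memory-hungry process can exhaust the system's memory and bring the system to a standstill.
This can lead to a situation where memory is so low that not even a single page can be allocated, whether to let the administrator kill an appropriate process or to let the kernel carry out important operations such as freeing memory. In that situation the oom-killer steps in: it picks a suitable process and kills it for the benefit of the rest of the system.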

How the oom-killer works

How is the most suitable process chosen?

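The most suitable process is chosen by a scoring mechanism, the badness score. The badness score of a process is exposed through /proc/<pid>/oom_score.
The guiding principle is to recover the most memory at the least cost.
The badness score depends on the memory the process occupies, its CPU time, how long it has been running, its /proc/<pid>/oom_adj value, and its nice level.
The more memory a process uses and the shorter it has been running, the higher its badness score.
Conversely, the less memory it uses and the longer it has been running, the lower its badness score.
If the selected process is a parent process, its children can be killed along with it.
For the implementation, see the badness() function in the kernel's oom_kill.c. Kernels from 2.6.36 onward replaced this heuristic with one based mainly on memory footprint, tuned through /proc/<pid>/oom_score_adj.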
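
To see which processes the kernel currently considers the best candidates, the scores can be read straight from /proc. This is a small sketch under the assumption of a Linux /proc filesystem; the function names read_oom_scores and rank_by_oom_score are invented for this example, and the script only reads the scores the kernel publishes rather than reimplementing the kernel's calculation.

```python
import os


def rank_by_oom_score(scores, top=5):
    """Given {pid: oom_score}, return the `top` (pid, score) pairs, highest first.

    Ties are broken by the lower pid so the ordering is deterministic.
    """
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top]


def read_oom_scores(proc_root="/proc"):
    """Collect oom_score for every process that is visible and readable."""
    scores = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return scores                 # no /proc here (not Linux)
    for name in entries:
        if not name.isdigit():
            continue                  # not a process directory
        try:
            with open(os.path.join(proc_root, name, "oom_score")) as f:
                scores[int(name)] = int(f.read().strip())
        except (OSError, ValueError):
            continue                  # process exited or is not readable
    return scores


if __name__ == "__main__":
    for pid, score in rank_by_oom_score(read_oom_scores()):
        print(pid, score)
```

The process printed first has the highest score and would be the most likely victim if the oom-killer ran at that moment.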

References:
Taming the OOM killer
When Linux Runs Out of Memory
How the Linux OOM killer works