Understanding Linux numastat

Below are the NUMA statistics from one machine, followed by my reading of them.

[root@localhost kernel]# numastat
node0 node1
numa_hit 26668467593 28643793617
numa_miss 49206566 19035412
numa_foreign 19035412 49206544
interleave_hit 63894 63259
local_node 26668451458 19175681813
other_node 49222701 9487147404
[root@localhost kernel]# expr 28643793617 + 19035412
28662829029
[root@localhost kernel]# expr 19175681813 + 9487147404
28662829217------------------------For node1, numa_hit + numa_miss does NOT equal local_node + other_node.
[root@localhost kernel]# expr 26668451458 + 49222701
26717674159
[root@localhost kernel]# expr 26668467593 + 49206566
26717674159------------------------For node0, numa_hit + numa_miss equals local_node + other_node.

Put simply, this machine has two CPU sockets, each with several cores, so in terms of memory access paths two nodes should be all that is needed.

Since there are only two nodes, node0's numa_miss and node1's numa_foreign should be equal.

For node0, numa_hit + numa_miss equals local_node + other_node. For node1, however, numa_hit + numa_miss does not equal local_node + other_node, although in principle the two sums should match there as well.
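The two identities discussed above can be checked mechanically. Here is a minimal Python sketch using the figures from the first numastat run; the names SAMPLE, node_gap and cross_gap are my own, chosen only for illustration.

```python
# Counters copied from the first numastat run above.
SAMPLE = {
    "node0": {"numa_hit": 26668467593, "numa_miss": 49206566,
              "numa_foreign": 19035412, "local_node": 26668451458,
              "other_node": 49222701},
    "node1": {"numa_hit": 28643793617, "numa_miss": 19035412,
              "numa_foreign": 49206544, "local_node": 19175681813,
              "other_node": 9487147404},
}


def node_gap(c):
    """(numa_hit + numa_miss) - (local_node + other_node) for one node."""
    return (c["numa_hit"] + c["numa_miss"]) - (c["local_node"] + c["other_node"])


def cross_gap(stats, a, b):
    """numa_miss on node a minus numa_foreign on node b."""
    return stats[a]["numa_miss"] - stats[b]["numa_foreign"]


if __name__ == "__main__":
    for name, counters in SAMPLE.items():
        print(name, "hit+miss - (local+other) =", node_gap(counters))
    print("node0 miss - node1 foreign =", cross_gap(SAMPLE, "node0", "node1"))
    print("node1 miss - node0 foreign =", cross_gap(SAMPLE, "node1", "node0"))
```

On this sample the node0 gap is 0 and the node1 gap is -188. node0's numa_miss also differs from node1's numa_foreign by 22. Both differences are tiny compared with counters in the tens of billions.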

The kernel defines these counters as follows:

enum zone_stat_item {

#ifdef CONFIG_NUMA
NUMA_HIT, /* allocated in intended node */
NUMA_MISS, /* allocated in non intended node */
NUMA_FOREIGN, /* was intended here, hit elsewhere */
NUMA_INTERLEAVE_HIT, /* interleaver preferred this zone */
NUMA_LOCAL, /* allocation from local node */
NUMA_OTHER, /* allocation from other node */
#endif

}

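After reading the code, it occurred to me that these counters change very quickly, so a discrepancy this small is within the expected error and not a real problem.

The difference can be expected to stay small, because it depends only on the exact moment at which each counter is read. If the sums do not match, simply run numastat a few more times.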

For example, when I ran it again a little later, the sums matched:

[root@localhost kernel]# numastat
node0 node1
numa_hit 27490751188 29654323053
numa_miss 52691771 19585046
numa_foreign 19585046 52691771
interleave_hit 63894 63259
local_node 27490734704 19826774263
other_node 52708255 9847133836

[root@localhost kernel]# expr 27490734704 + 52708255
27543442959
[root@localhost kernel]# expr 27490751188 + 52691771
27543442959
[root@localhost kernel]# expr 29654323053 + 19585046
29673908099
[root@localhost kernel]# expr 19826774263 + 9847133836
29673908099
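Re-running the check by hand gets tedious, so here is a small Python sketch that repeats it. It assumes the per-node counters are exposed as "name value" lines in /sys/devices/system/node/nodeN/numastat; the function names are mine and purely illustrative. The counters are still not read atomically, so an occasional small non-zero gap is to be expected on a busy machine.

```python
import glob
import os
import time


def parse_numastat(text):
    """Parse 'name value' lines into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def read_all_nodes(base="/sys/devices/system/node"):
    """Return {node_name: counters} for every node directory found under base."""
    stats = {}
    pattern = os.path.join(base, "node[0-9]*", "numastat")
    for path in sorted(glob.glob(pattern)):
        node = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            stats[node] = parse_numastat(f.read())
    return stats


def gap(c):
    """(numa_hit + numa_miss) - (local_node + other_node) for one node."""
    return (c["numa_hit"] + c["numa_miss"]) - (c["local_node"] + c["other_node"])


if __name__ == "__main__":
    # Sample five times, one second apart, and print the gap per node.
    for _ in range(5):
        for node, counters in read_all_nodes().items():
            print(node, gap(counters))
        time.sleep(1)
```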

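Looking at the data, some readers may ask why numa_hit and local_node are so close to each other on node0. I was puzzled by this at first too, so I took a closer look

at the numastat man page: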

numa_hit is memory successfully allocated on this node as intended.

numa_miss is memory allocated on this node despite the process preferring some different node. Each numa_miss has a numa_foreign on another node.

numa_foreign is memory intended for this node, but actually allocated on some different node. Each numa_foreign has a numa_miss on another node.

interleave_hit is interleaved memory successfully allocated on this node as intended.

local_node is memory allocated on this node while a process was running on it.

other_node is memory allocated on this node while a process was running on some other node.

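numa_hit counts the allocations that I intended to make on this node and that actually landed on this node. local_node counts the allocations that succeeded on this node while my process was running on a CPU belonging to this node. The distinction is a little convoluted.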

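For example, suppose I allocate memory and specify that it must come from node0, and the allocation succeeds. node0's numa_hit is then incremented by 1. If my process is running on node0 at that moment, node0's local_node is incremented by 1; if it is running on node1, node0's other_node is incremented by 1 instead.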
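The accounting rule described above can be modeled in a few lines of Python. This is a simplified model that follows the man page wording, not the kernel's code; account, new_stats and the three node arguments are names I made up for illustration.

```python
from collections import defaultdict


def new_stats():
    """Per-node counters, all starting at zero."""
    return defaultdict(lambda: defaultdict(int))


def account(stats, preferred, actual, running):
    """Update the counters for a single allocation.

    preferred: node the allocation was intended for
    actual:    node the memory really came from
    running:   node of the CPU the process was running on
    """
    if actual == preferred:
        stats[actual]["numa_hit"] += 1
    else:
        stats[actual]["numa_miss"] += 1          # landed here, wanted elsewhere
        stats[preferred]["numa_foreign"] += 1    # wanted here, landed elsewhere
    if actual == running:
        stats[actual]["local_node"] += 1
    else:
        stats[actual]["other_node"] += 1


if __name__ == "__main__":
    s = new_stats()
    account(s, preferred=0, actual=0, running=0)  # hit and local on node0
    account(s, preferred=0, actual=0, running=1)  # hit and other on node0
    account(s, preferred=0, actual=1, running=0)  # miss on node1, foreign on node0
    for node in sorted(s):
        print("node%d" % node, dict(s[node]))
```

In this model every allocation adds exactly one to either numa_hit or numa_miss, and exactly one to either local_node or other_node, on the node that served it. That is why the two sums agree per node whenever the counters are read at a consistent moment.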