Understanding Linux numastat

Below are the NUMA statistics from one machine, together with my reading of them.

[root@localhost kernel]# numastat
                    node0           node1
numa_hit 26668467593 28643793617
numa_miss 49206566 19035412
numa_foreign 19035412 49206544
interleave_hit 63894 63259
local_node 26668451458 19175681813
other_node 49222701 9487147404
[root@localhost kernel]# expr 28643793617 + 19035412
28662829029
[root@localhost kernel]# expr 19175681813 + 9487147404
28662829217------------------------ On node1, numa_hit + numa_miss (28662829029) does not equal local_node + other_node (28662829217).
[root@localhost kernel]# expr 26668451458 + 49222701
26717674159
[root@localhost kernel]# expr 26668467593 + 49206566
26717674159------------------------ On node0, numa_hit + numa_miss equals local_node + other_node.

Put simply, this machine has two CPUs with several cores each, so from the point of view of memory access paths, two nodes are all it needs.

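Because there are only two nodes, node0's numa_miss should equal node1's numa_foreign, and vice versa. In the output above, node1's numa_miss (19035412) matches node0's numa_foreign exactly, while node0's numa_miss (49206566) is 22 ahead of node1's numa_foreign (49206544).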

On node0, numa_hit + numa_miss equals local_node + other_node. On node1 the two sums differ (by 188 in this sample), although in principle they should be equal there as well.
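The two checks above can be scripted instead of done by hand with expr. The following is a minimal sketch, assuming the default two-node numastat layout shown earlier; the helper names parse_numastat and check are made up for illustration and are not part of any tool.

```python
def parse_numastat(text):
    """Parse default `numastat` output into {counter_name: [value per node]}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Counter rows look like "numa_hit 123 456"; the "node0 node1" header
        # and any other non-numeric rows are skipped.
        if len(parts) >= 2 and all(p.isdigit() for p in parts[1:]):
            stats[parts[0]] = [int(p) for p in parts[1:]]
    return stats


def check(stats):
    """Return per-node gaps for the two identities (0 means the identity holds)."""
    nodes = range(len(stats["numa_hit"]))
    # (local_node + other_node) - (numa_hit + numa_miss), per node.
    sum_gap = [
        (stats["local_node"][n] + stats["other_node"][n])
        - (stats["numa_hit"][n] + stats["numa_miss"][n])
        for n in nodes
    ]
    # Assumption: exactly two nodes, so a miss on one node is a foreign on the other.
    miss_gap = [stats["numa_miss"][n] - stats["numa_foreign"][1 - n] for n in nodes]
    return sum_gap, miss_gap


# First sample from this article.
SAMPLE = """\
                    node0           node1
numa_hit 26668467593 28643793617
numa_miss 49206566 19035412
numa_foreign 19035412 49206544
interleave_hit 63894 63259
local_node 26668451458 19175681813
other_node 49222701 9487147404
"""

sum_gap, miss_gap = check(parse_numastat(SAMPLE))
print("sum gap per node:", sum_gap)
print("miss/foreign gap per node:", miss_gap)
```

For this sample the sum gap is 0 on node0 and 188 on node1, and the miss/foreign gap is 22 on node0 and 0 on node1.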

The kernel defines these counters as follows:

enum zone_stat_item {
	/* ... */
#ifdef CONFIG_NUMA
	NUMA_HIT,		/* allocated in intended node */
	NUMA_MISS,		/* allocated in non intended node */
	NUMA_FOREIGN,		/* was intended here, hit elsewhere */
	NUMA_INTERLEAVE_HIT,	/* interleaver preferred this zone */
	NUMA_LOCAL,		/* allocation from local node */
	NUMA_OTHER,		/* allocation from other node */
#endif
	/* ... */
};

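After reading the code, it occurred to me that these counters change very quickly and are not read as a single atomic snapshot, so a discrepancy this small is within the expected margin of error and not a real problem.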

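The difference can also be tiny or zero, since it depends on the exact moment the counters are read. If the sums do not match, just run numastat a few more times.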

For example, when I ran it again a little later, everything matched:

[root@localhost kernel]# numastat
                  node0              node1
numa_hit 27490751188 29654323053
numa_miss 52691771 19585046
numa_foreign 19585046 52691771
interleave_hit 63894 63259
local_node 27490734704 19826774263
other_node 52708255 9847133836

[root@localhost kernel]# expr 27490734704 + 52708255
27543442959
[root@localhost kernel]# expr 27490751188 + 52691771
27543442959
[root@localhost kernel]# expr 29654323053 + 19585046
29673908099
[root@localhost kernel]# expr 19826774263 + 9847133836
29673908099
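
Re-running numastat until the sums agree can also be automated. The sketch below is only an illustration: it assumes the per-node counters are exposed as "name value" lines in /sys/devices/system/node/node*/numastat, and the function names, retry count and delay are arbitrary choices of mine.

```python
import glob
import time


def read_node_stats(pattern="/sys/devices/system/node/node*/numastat"):
    """Read each node's counters from sysfs into {node_name: {counter: value}}."""
    stats = {}
    for path in sorted(glob.glob(pattern)):
        node = path.split("/")[-2]  # e.g. "node0"
        with open(path) as f:
            stats[node] = {
                name: int(value)
                for name, value in (line.split() for line in f if line.strip())
            }
    return stats


def sums_match(stats):
    """True if numa_hit + numa_miss == local_node + other_node on every node."""
    return all(
        s["numa_hit"] + s["numa_miss"] == s["local_node"] + s["other_node"]
        for s in stats.values()
    )


def wait_for_match(read=read_node_stats, attempts=5, delay=1.0):
    """Re-read up to `attempts` times; return the first consistent sample, else None."""
    for i in range(attempts):
        stats = read()
        if sums_match(stats):
            return stats
        if i + 1 < attempts:
            time.sleep(delay)
    return None
```

On a NUMA machine, calling wait_for_match() with no arguments reads the live counters. On a busy system a consistent sample may never be observed within the given attempts, in which case the function returns None.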

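Some readers may ask why, judging by the data, numa_hit and local_node are so close to each other (on node0 in the second sample they are only 16484 apart). I was puzzled by this at first too, until I read the numastat man page carefully: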

numa_hit is memory successfully allocated on this node as intended.

numa_miss is memory allocated on this node despite the process preferring some different node. Each numa_miss has a numa_foreign on another node.

numa_foreign is memory intended for this node, but actually allocated on some different node. Each numa_foreign has a numa_miss on another node.

interleave_hit is interleaved memory successfully allocated on this node as intended.

local_node is memory allocated on this node while a process was running on it.

other_node is memory allocated on this node while a process was running on some other node.

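numa_hit counts the allocations that were intended for this node and were in fact satisfied on it. local_node counts the allocations satisfied on this node while the requesting process was running on one of this node's CPUs. The distinction is a little convoluted.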

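For example, suppose I request memory from node0 and the allocation succeeds there: node0's numa_hit goes up by 1. If my process is running on node0 at that moment, node0's local_node goes up by 1; if it is running on node1, node0's other_node goes up by 1 instead.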
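
The example can be turned into a small toy model. This is a simplified sketch built only from the man page definitions quoted above, not the kernel's actual accounting code; record_allocation and its parameters are hypothetical names. It also shows why numa_hit + numa_miss should equal local_node + other_node on each node: every allocation bumps exactly one counter from each pair on the node where the memory actually landed.

```python
from collections import Counter


def record_allocation(stats, preferred, actual, running):
    """Update per-node counters for one allocation (simplified model).

    preferred: node the allocation was intended for
    actual:    node the memory really came from
    running:   node whose CPU the process was running on
    """
    if actual == preferred:
        stats[actual]["numa_hit"] += 1
    else:
        stats[actual]["numa_miss"] += 1
        stats[preferred]["numa_foreign"] += 1
    if actual == running:
        stats[actual]["local_node"] += 1
    else:
        stats[actual]["other_node"] += 1


stats = {0: Counter(), 1: Counter()}
# Wanted node0, got node0, running on node0: numa_hit and local_node on node0.
record_allocation(stats, preferred=0, actual=0, running=0)
# Wanted node0, got node0, running on node1: numa_hit and other_node on node0.
record_allocation(stats, preferred=0, actual=0, running=1)
# Wanted node0, got node1, running on node0: numa_miss and other_node on node1,
# numa_foreign on node0.
record_allocation(stats, preferred=0, actual=1, running=0)

for node, counters in stats.items():
    print(node, dict(counters))
```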

Original article: https://www.cnblogs.com/10087622blog/p/7346170.html