NUMA CPU affinity issues

Rama27 · 04-18-2024

I have an Ubuntu 22.04 server and have run into an issue with NUMA CPU affinity binding.

1.) The server has two NUMA nodes:

server-1:# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
node 0 size: 1031782 MB
node 0 free: 989481 MB
node 1 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191
node 1 size: 1021920 MB
node 1 free: 1018313 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

2.) When I set the CPU affinity of a new process to CPU 1, which is part of NUMA node 0, the affinity spans multiple CPUs across both NUMA nodes.

server-1:# taskset -c 1 cat /dev/random >& /dev/null &
[1] 691247

server-1:# taskset -cp 691247
pid 691247's current affinity list: 0,1,30,48-51,96,97,144-147

I think it should stick to CPU 1 only rather than multiple CPUs. I am not sure why taskset always shows "current affinity list: 0,1,30,48-51,96,97,144-147" even though we changed the core. Can you please let me know how this happened?
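To make the cross-node spread explicit, here is a minimal Python sketch that expands the affinity list reported by taskset and groups the CPUs by node. It assumes the CPU ranges shown in the numactl -H output above; the names parse_cpu_list, NODE_CPUS and group_by_node are illustrative only and not part of any tool.

```python
def parse_cpu_list(text):
    """Expand a taskset-style list such as '0,1,30,48-51' into sorted ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# CPU ranges per node, transcribed from the numactl -H output in this post.
NODE_CPUS = {
    0: set(range(0, 48)) | set(range(96, 144)),
    1: set(range(48, 96)) | set(range(144, 192)),
}


def group_by_node(cpus, node_cpus=NODE_CPUS):
    """Return {node: [cpus on that node]} for the given CPU numbers."""
    grouped = {node: [] for node in node_cpus}
    for cpu in cpus:
        for node, members in node_cpus.items():
            if cpu in members:
                grouped[node].append(cpu)
    return grouped


affinity = parse_cpu_list("0,1,30,48-51,96,97,144-147")
print(group_by_node(affinity))
```

With the list from this post, CPUs 0, 1, 30, 96 and 97 fall on node 0, while 48-51 and 144-147 fall on node 1.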

Vipin_Singh1 · 05-09-2024

Hi Rama, we would like to inform you that we are routing your query to the dedicated team for further assistance.

Pintu · 05-10-2024

Hello Rama27,

Greetings for the day!

Regarding these NUMA CPU affinity issues, kindly help us with the questions below:

1. What specific performance problems or challenges are you encountering with your application or workload?

2. Are you experiencing any slowdowns, latency issues, or unexpected behavior?

3. Could you provide details about your system's hardware configuration, including the number of CPUs and memory setup?

4. Kindly confirm if you are aware of any NUMA architecture in your system design.

5. Have you observed any patterns or behaviors suggesting that CPU affinity might be impacting performance?

6. Please confirm if certain processes or threads are consistently running slower or experiencing higher latency.

Kindly answer the above questions so that we can proceed further.

Thank you for choosing Intel products and services.

Best Regards,

Manoranjan Das.

Rama27 · 05-12-2024

Hi, thank you for looking into it. My answers are as follows:

1. What specific performance problems or challenges are you encountering with your application or workload?

--- The server runs Ubuntu 22.04 and has 4 NUMA nodes. Each time we deploy, the memory allocations are random, i.e. MemUsed is different each time. We don't have explicit NUMA rules configured.

2. Are you experiencing any slowdowns, latency issues, or unexpected behavior?

--- Since the allocations are random, there is obviously overhead from accessing memory on remote nodes.

3. Could you provide details about your system's hardware configuration, including the number of CPUs and memory setup?

--- The server is a Lenovo MB, "Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz", 4 sockets (NUMA nodes) with 192 CPUs. RAM is 748 GB.

4. Kindly confirm if you are aware of any NUMA architecture in your system design.

--- I'm aware of NUMA.

5. Have you observed any patterns or behaviors suggesting that CPU affinity might be impacting performance?

--- Although we don't have explicit NUMA configurations, memory allocations are random, which causes some performance issues, i.e. processes run on one NUMA node while their memory is allocated on different nodes.

6. Please confirm if certain processes or threads are consistently running slower or experiencing higher latency.

a.) Queried the cron process's current affinity.

root@server-1:~# taskset -cp $(pidof cron)

pid 8225's current affinity list: 0,24,25,48,49,72,73,96,120,121,144,145,168,169

b.) Deliberately changed the CPU to 2; the output shows that it succeeded.

root@server-1:~# taskset -cp 2 $(pidof cron)

pid 8225's current affinity list: 0,24,25,48,49,72,73,96,120,121,144,145,168,169

pid 8225's new affinity list: 2

c.) Querying again shows the previous state.

root@server-1:~# taskset -cp $(pidof cron)

pid 8225's current affinity list: 0,24,25,48,49,72,73,96,120,121,144,145,168,169
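Steps a.) to c.) can be repeated in a loop to see when the mask changes back. The following is a minimal Python sketch, assuming a Linux host where os.sched_setaffinity and /proc/<pid>/status are available; watch_affinity and allowed_list_from_proc are hypothetical helper names. It only observes the mask; what rewrites it is not established in this thread.

```python
import os
import time


def allowed_list_from_proc(pid="self"):
    """Return the Cpus_allowed_list field from /proc/<pid>/status as text."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("Cpus_allowed_list:"):
                return line.split(":", 1)[1].strip()
    return None


def watch_affinity(pid, cpu, checks=5, interval=1.0):
    """Pin pid to a single CPU, then re-read its mask several times.

    Returns every mask observed after pinning, so a later change
    (something else rewriting the mask) becomes visible.
    """
    os.sched_setaffinity(pid, {cpu})
    seen = []
    for _ in range(checks):
        seen.append(os.sched_getaffinity(pid))
        time.sleep(interval)
    return seen


if __name__ == "__main__":
    # Demo on the current process (pid 0 means "the caller"): pin to the
    # lowest allowed CPU, watch briefly, then restore the original mask.
    original = os.sched_getaffinity(0)
    masks = watch_affinity(0, min(original), checks=3, interval=0.1)
    print("observed masks:", masks)
    print("kernel view:", allowed_list_from_proc())
    os.sched_setaffinity(0, original)
```

To reproduce the cron case, the same function could be called as root with cron's PID and CPU 2, using a longer interval; any mask other than {2} in the returned list marks the point at which the affinity reverted.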

Pintu · 05-13-2024

Hello Rama27,

Greetings for the day!

Regarding this issue, kindly confirm whether you are getting any errors on your screen. If so, please share a screenshot with us so that we can proceed further, and please confirm whether you are facing any additional issues.

Thank you for choosing Intel products and services.

Regards,

Manoranjan Das.

Rama27 · 05-13-2024

Hi Manoranjan,

We are starting to tune the system to enforce CPU and memory policies so that processes can be confined to a particular node and avoid remote memory access. Before that, we are doing some benchmarking to find out where performance can be gained.

As per my previous update, when I tried to confine the process to a specific CPU, the change succeeded at first, but the affinity reverted by itself after a while.

My server's memory policy is 'default' (prefer the local node for memory allocation). However, the memory of a process (cron) is spread across two nodes even under the 'default' policy:

558204b35000 default file=/usr/sbin/cron dirty=3 N2=3 kernelpagesize_kB=4
7f94216cd000 default file=/usr/lib/x86_64-linux-gnu/libcap-ng.so.0.0.0 dirty=2 mapmax=23 N0=2 kernelpagesize_kB=4
7f94216d3000 default file=/usr/lib/x86_64-linux-gnu/libcap-ng.so.0.0.0 anon=1 dirty=1 active=0 N2=1 kernelpagesize_kB=4
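The per-node page counts in these numa_maps lines can be totalled with a short script. This is a minimal Python sketch that reads only the N<node>=<pages> fields; parse_numa_maps_line and pages_per_node are illustrative names, and SAMPLE is the excerpt quoted above.

```python
import re
from collections import Counter

# Matches per-node page counts such as "N2=3".
NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")


def parse_numa_maps_line(line):
    """Return (address, policy, {node: pages}) for one numa_maps line."""
    fields = line.split()
    address, policy = fields[0], fields[1]
    pages = {}
    for field in fields[2:]:
        match = NODE_FIELD.match(field)
        if match:
            pages[int(match.group(1))] = int(match.group(2))
    return address, policy, pages


def pages_per_node(lines):
    """Sum the page counts per node over all non-empty lines."""
    totals = Counter()
    for line in lines:
        if line.strip():
            _, _, pages = parse_numa_maps_line(line)
            totals.update(pages)
    return dict(totals)


SAMPLE = """\
558204b35000 default file=/usr/sbin/cron dirty=3 N2=3 kernelpagesize_kB=4
7f94216cd000 default file=/usr/lib/x86_64-linux-gnu/libcap-ng.so.0.0.0 dirty=2 mapmax=23 N0=2 kernelpagesize_kB=4
7f94216d3000 default file=/usr/lib/x86_64-linux-gnu/libcap-ng.so.0.0.0 anon=1 dirty=1 active=0 N2=1 kernelpagesize_kB=4
"""

print(pages_per_node(SAMPLE.splitlines()))
```

For the three lines above this gives 2 pages on node 0 and 4 pages on node 2, all with policy 'default'. On a live system the same functions could be fed the contents of /proc/<pid>/numa_maps.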

The numastat statistics also show that some memory was allocated with 'interleave'. I've checked numa_maps for all running processes on the server, and all of them show 'default'. If 'default' is the policy in effect, how could the system have allocated memory with 'interleave'? How does the OS decide on allocations by itself when there are no explicit rules?

root@server-1:~# numastat
                           node0           node1           node2           node3
numa_hit                23888413       107767334        88675695        36266619
numa_miss                      0               0               0               0
numa_foreign                   0               0               0               0
interleave_hit             28652           28388           28641           28375
local_node              23382768       107605787        88631499        36191765
other_node                505645          161547           44196           74854
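To put the interleave_hit numbers in proportion, here is a minimal Python sketch that parses the numastat table above and computes, per node, the fraction of numa_hit pages that were interleave hits. parse_numastat and interleave_share are illustrative names; the figures are copied from this post.

```python
def parse_numastat(text):
    """Parse numastat's default table into (node_names, {counter: [values]})."""
    lines = [line for line in text.splitlines() if line.strip()]
    nodes = lines[0].split()
    stats = {}
    for line in lines[1:]:
        name, *values = line.split()
        stats[name] = [int(value) for value in values]
    return nodes, stats


def interleave_share(stats):
    """Fraction of numa_hit pages counted as interleave_hit, per node."""
    return [i / h for i, h in zip(stats["interleave_hit"], stats["numa_hit"])]


NUMASTAT = """\
                           node0           node1           node2           node3
numa_hit                23888413       107767334        88675695        36266619
numa_miss                      0               0               0               0
numa_foreign                   0               0               0               0
interleave_hit             28652           28388           28641           28375
local_node              23382768       107605787        88631499        36191765
other_node                505645          161547           44196           74854
"""

nodes, stats = parse_numastat(NUMASTAT)
for node, share in zip(nodes, interleave_share(stats)):
    print(f"{node}: interleave_hit is {share:.4%} of numa_hit")
```

On these figures the interleave share is well under 1% on every node, and local_node plus other_node adds up exactly to numa_hit for each node.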

Pintu · 05-13-2024

Hello Rama27,

Greetings for the day!

We appreciate your patience. Please allow some more time to examine this matter.

Thank you for choosing Intel products and services.

Regards,

Manoranjan Das.