8 K80 but there are two groups in P2P

Hi,

I may have an incorrect topology; the GPU-to-GPU communication looks strange. Do you have any idea? Thank you.

Running “lspci -t” on the GPU machine, I get the following:

|           +-1f.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 VCU
|           \-1f.2  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 VCU
+-[0000:80]-+-02.0-[81-84]----00.0-[82-84]--+-08.0-[83]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU2
|           |                               \-10.0-[84]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU3
|           +-03.0-[85-8e]----00.0-[86-8e]--+-08.0-[87-8a]----00.0-[88-8a]--+-08.0-[89]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU4
|           |                               |                               \-10.0-[8a]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU5
|           |                               \-10.0-[8b-8e]----00.0-[8c-8e]--+-08.0-[8d]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU6
|           |                                                               \-10.0-[8e]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU7
|           +-04.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMA Channel 0
|           +-04.1  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMA Channel 1
|           +-04.2  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMA Channel 2
….
|           +-1e.3  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 Power Control Unit
|           +-1e.4  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 Power Control Unit
|           +-1f.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 VCU
|           \-1f.2  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 VCU
\-[0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
             +-01.0-[01]--+-00.0  Intel Corporation Ethernet Controller 10-Gigabit X540-AT2
             |            \-00.1  Intel Corporation Ethernet Controller 10-Gigabit X540-AT2
             +-02.0-[02-05]----00.0-[03-05]--+-08.0-[04]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU0
             |                               \-10.0-[05]----00.0  NVIDIA Corporation GK210GL [Tesla K80] ---- GPU1
             +-03.0-[06]--
             +-04.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMA Channel 0
             +-04.1  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMA Channel 1

The matrix of connection types between GPUs is shown below. It seems GPU0–GPU1 form one group and GPU2–GPU7 a second group, and P2P does not work correctly between the two groups.

nvidia-smi topo -m
        GPU0    GPU1    GPU2    GPU3    GPU4    GPU5    GPU6    GPU7    CPU Affinity
GPU0     X      PIX     SOC     SOC     SOC     SOC     SOC     SOC     0-7,16-23
GPU1    PIX      X      SOC     SOC     SOC     SOC     SOC     SOC     0-7,16-23
GPU2    SOC     SOC      X      PIX     PHB     PHB     PHB     PHB     8-15,24-31
GPU3    SOC     SOC     PIX      X      PHB     PHB     PHB     PHB     8-15,24-31
GPU4    SOC     SOC     PHB     PHB      X      PIX     PXB     PXB     8-15,24-31
GPU5    SOC     SOC     PHB     PHB     PIX      X      PXB     PXB     8-15,24-31
GPU6    SOC     SOC     PHB     PHB     PXB     PXB      X      PIX     8-15,24-31
GPU7    SOC     SOC     PHB     PHB     PXB     PXB     PIX      X      8-15,24-31
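
For reference, the grouping can be read off the matrix mechanically: any pair marked SOC has to traverse the socket-to-socket interconnect, so GPUs connected by anything better than SOC form a "P2P island". A minimal Python sketch of that idea, with the matrix above pasted in as a string (the column-parsing logic is an assumption about the text layout of `nvidia-smi topo -m`, not an official tool):

```python
# Group GPUs into "P2P islands" from pasted `nvidia-smi topo -m` output.
# Two GPUs land in the same island if their link type is anything other
# than SOC (SOC = path crosses the CPU socket interconnect).

TOPO = """\
        GPU0    GPU1    GPU2    GPU3    GPU4    GPU5    GPU6    GPU7
GPU0     X      PIX     SOC     SOC     SOC     SOC     SOC     SOC
GPU1    PIX      X      SOC     SOC     SOC     SOC     SOC     SOC
GPU2    SOC     SOC      X      PIX     PHB     PHB     PHB     PHB
GPU3    SOC     SOC     PIX      X      PHB     PHB     PHB     PHB
GPU4    SOC     SOC     PHB     PHB      X      PIX     PXB     PXB
GPU5    SOC     SOC     PHB     PHB     PIX      X      PXB     PXB
GPU6    SOC     SOC     PHB     PHB     PXB     PXB      X      PIX
GPU7    SOC     SOC     PHB     PHB     PXB     PXB     PIX      X
"""

def islands(topo_text):
    rows = [line.split() for line in topo_text.strip().splitlines()[1:]]
    n = len(rows)
    parent = list(range(n))          # union-find over GPU indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, row in enumerate(rows):
        for j, link in enumerate(row[1:]):   # row[0] is the "GPUi" label
            if i != j and link != "SOC":
                parent[find(i)] = find(j)    # same-socket pair: merge

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

print(islands(TOPO))  # [[0, 1], [2, 3, 4, 5, 6, 7]]
```

This reproduces the split you observed: GPUs 0–1 on one socket, GPUs 2–7 on the other.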

This is a function of the system (motherboard) design. This is presumably a dual-socket motherboard, where some of the PCIe slots are connected to one CPU socket and the remainder are connected to the other.

How can PCIe slots be connected to a single socket? Thank you so much.

By socket I meant CPU socket.

You can have multiple PCIe slots connected to a single CPU socket in a variety of ways. Modern CPUs may have up to 40 PCIe lanes, and you can “expand” that capability using PCIe switches.

Thank you txbob. What do you mean by “expand” the capability using PCIe switches? Is it hardware or software? We cannot find any additional PCIe slots on our motherboard.

The Tesla GPUs in your system use a PCIe gen3 x16 interconnect, so each GPU requires 16 PCIe lanes. Intel Xeon processors for dual-socket systems (which is what you appear to have, as best I can tell) typically offer a total of 40 PCIe lanes per socket, which is sufficient to connect two Teslas per CPU socket at full speed, plus assorted peripherals such as SSDs or network cards (which may use 4 PCIe lanes each).
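
The lane budget works out as a back-of-the-envelope sketch (the 40-lane figure is the typical per-socket count for these Xeons, as noted above):

```python
lanes_per_socket = 40   # typical Xeon E5 v3 (Haswell-EP) PCIe lane count per socket
lanes_per_gpu = 16      # each Tesla K80 board uses a PCIe gen3 x16 link
gpus_direct = 2         # GPUs attachable at full width without a switch

used = gpus_direct * lanes_per_gpu
spare = lanes_per_socket - used
print(used, spare)  # 32 lanes for GPUs, 8 left over for NICs/SSDs (x4 each)
```

So without switches, two GPUs per socket is the practical ceiling at full link width.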

Since PCIe is a packetized full-duplex interconnect, one is not limited to straight one-to-one connections, but can build more complicated topologies using PCIe switches, in a manner analogous to the use of Ethernet switches. There are various companies, such as PLX, that offer chips that work as PCIe switches. These may offer up to 96 PCIe lanes across a number of ports.

The large number of GPUs in your system suggests that the system manufacturer used one or several PCIe switches to construct the PCIe “network” topology. This topology may be fixed by design, or configurable. Your system vendor should be able to document which topologies are supported in this particular system; I don’t think there is a way for us to know. From the printout above, it seems the eight GPUs are currently distributed unevenly over the two CPU sockets, with two GPUs hanging off one socket and the remaining six off the other.