- ./p2pBandwidthLatencyTest
- [P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
- Device: 0, NVIDIA GeForce RTX 4090, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
- Device: 1, NVIDIA GeForce RTX 4090, pciBusID: 23, pciDeviceID: 0, pciDomainID:0
- Device: 2, NVIDIA GeForce RTX 4090, pciBusID: 41, pciDeviceID: 0, pciDomainID:0
- Device: 3, NVIDIA GeForce RTX 4090, pciBusID: 61, pciDeviceID: 0, pciDomainID:0
- Device: 4, NVIDIA GeForce RTX 4090, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
- Device: 5, NVIDIA GeForce RTX 4090, pciBusID: a1, pciDeviceID: 0, pciDomainID:0
- Device: 6, NVIDIA GeForce RTX 4090, pciBusID: c1, pciDeviceID: 0, pciDomainID:0
- Device: 7, NVIDIA GeForce RTX 4090, pciBusID: e1, pciDeviceID: 0, pciDomainID:0
- Device=0 CAN Access Peer Device=1
- Device=0 CAN Access Peer Device=2
- Device=0 CAN Access Peer Device=3
- Device=0 CAN Access Peer Device=4
- Device=0 CAN Access Peer Device=5
- Device=0 CAN Access Peer Device=6
- Device=0 CAN Access Peer Device=7
- Device=1 CAN Access Peer Device=0
- Device=1 CAN Access Peer Device=2
- Device=1 CAN Access Peer Device=3
- Device=1 CAN Access Peer Device=4
- Device=1 CAN Access Peer Device=5
- Device=1 CAN Access Peer Device=6
- Device=1 CAN Access Peer Device=7
- Device=2 CAN Access Peer Device=0
- Device=2 CAN Access Peer Device=1
- Device=2 CAN Access Peer Device=3
- Device=2 CAN Access Peer Device=4
- Device=2 CAN Access Peer Device=5
- Device=2 CAN Access Peer Device=6
- Device=2 CAN Access Peer Device=7
- Device=3 CAN Access Peer Device=0
- Device=3 CAN Access Peer Device=1
- Device=3 CAN Access Peer Device=2
- Device=3 CAN Access Peer Device=4
- Device=3 CAN Access Peer Device=5
- Device=3 CAN Access Peer Device=6
- Device=3 CAN Access Peer Device=7
- Device=4 CAN Access Peer Device=0
- Device=4 CAN Access Peer Device=1
- Device=4 CAN Access Peer Device=2
- Device=4 CAN Access Peer Device=3
- Device=4 CAN Access Peer Device=5
- Device=4 CAN Access Peer Device=6
- Device=4 CAN Access Peer Device=7
- Device=5 CAN Access Peer Device=0
- Device=5 CAN Access Peer Device=1
- Device=5 CAN Access Peer Device=2
- Device=5 CAN Access Peer Device=3
- Device=5 CAN Access Peer Device=4
- Device=5 CAN Access Peer Device=6
- Device=5 CAN Access Peer Device=7
- Device=6 CAN Access Peer Device=0
- Device=6 CAN Access Peer Device=1
- Device=6 CAN Access Peer Device=2
- Device=6 CAN Access Peer Device=3
- Device=6 CAN Access Peer Device=4
- Device=6 CAN Access Peer Device=5
- Device=6 CAN Access Peer Device=7
- Device=7 CAN Access Peer Device=0
- Device=7 CAN Access Peer Device=1
- Device=7 CAN Access Peer Device=2
- Device=7 CAN Access Peer Device=3
- Device=7 CAN Access Peer Device=4
- Device=7 CAN Access Peer Device=5
- Device=7 CAN Access Peer Device=6
- ***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
- So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
- P2P Connectivity Matrix
- D\D 0 1 2 3 4 5 6 7
- 0 1 1 1 1 1 1 1 1
- 1 1 1 1 1 1 1 1 1
- 2 1 1 1 1 1 1 1 1
- 3 1 1 1 1 1 1 1 1
- 4 1 1 1 1 1 1 1 1
- 5 1 1 1 1 1 1 1 1
- 6 1 1 1 1 1 1 1 1
- 7 1 1 1 1 1 1 1 1
- Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
- D\D 0 1 2 3 4 5 6 7
- 0 913.74 19.42 19.76 19.23 20.71 20.47 20.55 19.78
- 1 19.80 922.92 19.77 19.74 20.68 20.49 20.52 19.71
- 2 19.84 19.83 921.44 19.74 20.72 20.45 20.54 19.70
- 3 19.66 19.84 19.73 921.37 20.69 20.49 20.49 19.75
- 4 20.28 19.80 20.17 19.90 925.21 20.28 20.50 19.77
- 5 20.16 19.77 20.18 19.82 20.65 922.92 20.26 19.89
- 6 20.19 19.77 20.15 19.92 20.65 20.37 923.46 19.76
- 7 20.21 19.71 20.16 19.83 20.70 20.37 20.21 923.46
- Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
- D\D 0 1 2 3 4 5 6 7
- 0 913.69 26.23 26.39 26.39 25.41 25.56 24.64 25.56
- 1 26.26 940.13 26.39 26.39 24.50 25.56 25.56 24.65
- 2 26.26 26.39 941.37 26.39 25.41 24.64 25.56 25.56
- 3 26.26 26.39 26.39 940.70 25.41 25.56 24.63 25.57
- 4 25.57 24.64 25.57 25.57 940.89 26.26 26.39 26.39
- 5 25.41 25.56 24.64 25.56 26.26 940.27 26.39 26.39
- 6 24.49 25.56 25.57 24.64 26.26 26.39 941.16 26.40
- 7 25.41 24.64 25.57 25.56 26.26 26.39 26.39 941.27
- Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
- D\D 0 1 2 3 4 5 6 7
- 0 918.27 19.99 20.26 19.21 22.01 21.89 21.94 21.77
- 1 19.98 923.74 19.72 19.96 21.89 21.78 21.79 21.62
- 2 20.04 19.72 922.65 20.13 21.96 21.88 21.90 21.74
- 3 19.18 20.09 20.11 923.74 21.84 21.69 21.77 21.57
- 4 21.43 21.42 21.47 21.49 922.92 21.90 21.59 21.77
- 5 21.17 21.36 21.40 21.16 21.73 922.92 21.72 21.54
- 6 21.33 21.31 21.38 21.36 21.58 21.81 923.19 21.69
- 7 20.94 21.03 21.16 20.93 21.52 21.44 21.47 922.92
- Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
- D\D 0 1 2 3 4 5 6 7
- 0 917.71 50.91 51.23 51.21 46.24 47.10 46.07 47.06
- 1 50.98 920.47 51.24 51.24 44.88 47.08 47.06 46.12
- 2 50.99 51.23 921.01 51.24 46.40 46.09 47.07 47.08
- 3 50.99 51.24 51.25 920.74 46.29 47.08 46.09 47.08
- 4 47.08 46.12 47.06 47.11 921.01 50.97 51.21 51.21
- 5 46.82 47.07 46.09 47.09 50.97 922.54 51.22 51.22
- 6 45.84 47.08 47.08 46.10 50.97 51.20 921.01 51.22
- 7 46.82 46.13 47.07 47.09 50.95 51.20 51.19 919.66
- P2P=Disabled Latency Matrix (us)
- GPU 0 1 2 3 4 5 6 7
- 0 1.36 11.50 11.43 11.59 11.34 12.58 11.95 11.83
- 1 19.98 1.37 20.09 20.10 12.94 11.56 11.63 13.74
- 2 11.32 11.26 1.27 11.34 12.39 14.32 12.97 11.85
- 3 11.50 12.49 11.59 1.34 12.02 14.60 13.25 11.62
- 4 13.04 12.66 12.72 12.85 1.31 11.00 11.29 10.40
- 5 11.85 14.43 11.36 12.12 11.10 1.37 10.41 10.39
- 6 11.26 13.52 12.85 11.15 11.03 10.41 1.32 10.41
- 7 13.57 12.85 12.95 13.66 10.33 10.28 10.31 1.29
- CPU 0 1 2 3 4 5 6 7
- 0 2.56 8.68 8.46 8.32 7.49 7.42 7.62 7.43
- 1 8.69 2.42 8.21 8.10 7.22 7.21 7.23 7.18
- 2 8.43 8.11 2.48 8.13 7.27 7.24 7.29 7.23
- 3 8.27 8.00 8.08 2.42 7.24 7.24 7.23 7.23
- 4 7.79 7.52 7.58 7.51 2.12 6.67 6.83 6.68
- 5 7.82 7.44 7.57 7.40 6.65 2.11 6.71 6.67
- 6 7.77 7.46 7.58 7.45 6.65 6.64 2.14 6.72
- 7 7.80 7.45 7.54 7.44 6.68 6.65 6.69 2.14
- P2P=Enabled Latency (P2P Writes) Matrix (us)
- GPU 0 1 2 3 4 5 6 7
- 0 1.48 1.24 1.28 1.30 1.40 1.37 1.41 1.39
- 1 1.23 1.36 1.24 1.20 1.31 1.28 1.31 1.28
- 2 1.30 1.29 1.28 1.25 1.39 1.37 1.36 1.36
- 3 1.24 1.22 1.22 1.34 1.36 1.31 1.35 1.31
- 4 1.20 1.18 1.13 1.16 1.32 0.99 1.04 1.07
- 5 1.11 1.10 1.04 1.05 0.95 1.36 0.96 0.97
- 6 1.19 1.16 1.16 1.17 1.06 1.00 1.32 1.03
- 7 1.11 1.01 1.10 1.11 0.95 0.99 0.99 1.29
- CPU 0 1 2 3 4 5 6 7
- 0 2.51 2.21 2.11 2.12 2.14 2.12 2.10 2.09
- 1 2.21 2.39 2.05 2.12 2.10 2.12 2.09 2.11
- 2 2.26 2.14 2.48 2.14 2.16 2.15 2.24 2.11
- 3 2.24 2.10 2.12 2.55 2.10 2.10 2.08 2.09
- 4 2.00 1.86 1.88 1.85 2.19 1.87 1.84 1.84
- 5 1.97 1.93 1.87 1.88 1.87 2.17 1.84 1.84
- 6 2.00 1.89 1.89 1.88 1.89 1.86 2.20 1.87
- 7 2.05 1.90 1.89 1.87 1.88 1.87 1.87 2.19
- NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
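
The matrices above are easier to compare once reduced to a few averages. The following is a minimal sketch, not part of the CUDA sample: it parses the rows of the "Unidirectional P2P=Enabled Bandwidth (P2P Writes)" matrix as pasted above and averages the diagonal (device-local) entries, the entries between devices in the same group, and the entries between devices in different groups. The helper names `parse_matrix` and `summarize` are hypothetical, and the grouping of devices 0-3 and 4-7 (`group_size=4`) is an assumption read off the block pattern in the numbers, not something the tool reports.

```python
from statistics import mean

# Rows copied from the "Unidirectional P2P=Enabled Bandwidth (P2P Writes)" matrix above.
P2P_WRITE_ROWS = """\
- 0 913.69 26.23 26.39 26.39 25.41 25.56 24.64 25.56
- 1 26.26 940.13 26.39 26.39 24.50 25.56 25.56 24.65
- 2 26.26 26.39 941.37 26.39 25.41 24.64 25.56 25.56
- 3 26.26 26.39 26.39 940.70 25.41 25.56 24.63 25.57
- 4 25.57 24.64 25.57 25.57 940.89 26.26 26.39 26.39
- 5 25.41 25.56 24.64 25.56 26.26 940.27 26.39 26.39
- 6 24.49 25.56 25.57 24.64 26.26 26.39 941.16 26.40
- 7 25.41 24.64 25.57 25.56 26.26 26.39 26.39 941.27
""".splitlines()


def parse_matrix(lines):
    """Turn rows like '- 0 913.69 26.23 ...' into a list of float rows."""
    rows = []
    for line in lines:
        fields = line.lstrip("- ").split()
        # The first field is the row's device index; the rest are GB/s values.
        rows.append([float(value) for value in fields[1:]])
    return rows


def summarize(matrix, group_size=4):
    """Return mean GB/s for (device-local, same-group, cross-group) entries.

    group_size=4 is an assumption: devices 0-3 and 4-7 are treated as two groups.
    """
    local, same_group, cross_group = [], [], []
    for src, row in enumerate(matrix):
        for dst, value in enumerate(row):
            if src == dst:
                local.append(value)
            elif src // group_size == dst // group_size:
                same_group.append(value)
            else:
                cross_group.append(value)
    return mean(local), mean(same_group), mean(cross_group)


matrix = parse_matrix(P2P_WRITE_ROWS)
local, same_group, cross_group = summarize(matrix)
print(f"device-local: {local:.2f} GB/s")
print(f"same group:   {same_group:.2f} GB/s")
print(f"cross group:  {cross_group:.2f} GB/s")
```

The same two helpers can be pointed at any of the other bandwidth or latency matrices by swapping in that matrix's rows; for the latency matrices the values are microseconds rather than GB/s.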