gugus

forwarding output problem

Nov 8th, 2013
  1. ==== Diagram ====
  2.  
  3. Generator => (igb2) router (igb3) => receiver
  4.  
  5. The router has 2 routes:
  6. - 1.0.0.0/8 to the generator
  7. - 2.0.0.0/8 to the receiver
  8.  
  9. router hardware:
  10. FreeBSD 10.0-BETA2 #0 r257562M: Sun Nov 3 07:11:01 CET 2013
  11. [email protected]:/usr/obj/BSDRP.amd64/usr/local/BSDRP/BSDRP/FreeBSD/src/sys/amd64 amd64
  12. FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
  13. CPU: Intel(R) Xeon(R) CPU L5630 @ 2.13GHz (2133.46-MHz K8-class CPU)
  14. Origin = "GenuineIntel" Id = 0x206c2 Family = 0x6 Model = 0x2c Stepping = 2
  15.  
  16. real memory = 17179869184 (16384 MB)
  17. avail memory = 16550580224 (15783 MB)
  18.  
  19. FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
  20. FreeBSD/SMP: 1 package(s) x 4 core(s)
  21. cpu0 (BSP): APIC ID: 0
  22. cpu1 (AP): APIC ID: 2
  23. cpu2 (AP): APIC ID: 18
  24. cpu3 (AP): APIC ID: 20
  25.  
  26. igb2: <Intel(R) PRO/1000 Network Connection version - 2.4.0> mem 0x97a80000-0x97afffff,0x97c04000-0x97c07fff irq 39 at device 0.2 on pci26
  27. igb2: Using MSIX interrupts with 5 vectors
  28. igb2: Ethernet address: 00:1b:21:d3:8f:3e
  29. igb2: Bound queue 0 to cpu 0
  30. igb2: Bound queue 1 to cpu 1
  31. igb2: Bound queue 2 to cpu 2
  32. igb2: Bound queue 3 to cpu 3
  33. 001.000011 netmap_attach [2244] success for igb2
  34. igb3: <Intel(R) PRO/1000 Network Connection version - 2.4.0> mem 0x97a00000-0x97a7ffff,0x97c00000-0x97c03fff irq 38 at device 0.3 on pci26
  35. igb3: Using MSIX interrupts with 5 vectors
  36. igb3: Ethernet address: 00:1b:21:d3:8f:3f
  37. igb3: Bound queue 0 to cpu 0
  38. igb3: Bound queue 1 to cpu 1
  39. igb3: Bound queue 2 to cpu 2
  40. igb3: Bound queue 3 to cpu 3
  41.  
  42. ==== One flow generator ====
  43.  
  44. == On the generator ==
  45. [root@generator]~# pkt-gen -i igb2 -f tx -n 80000000 -l 42 -d 2.3.3.3 -D 00:1b:21:d3:8f:3e -s 1.3.3.3 -w 10
  46. (...)
  47. main_thread [1335] 1493279 pps (1496125 pkts in 1001906 usec)
  48. main_thread [1335] 1488173 pps (1489660 pkts in 1000999 usec)
  49. main_thread [1335] 1488187 pps (1491162 pkts in 1001999 usec)
  50. main_thread [1335] 1488176 pps (1489664 pkts in 1001000 usec)
  51. main_thread [1335] 1488183 pps (1489674 pkts in 1001002 usec)
  52. main_thread [1335] 1488180 pps (1489667 pkts in 1000999 usec)
  53. main_thread [1335] 1488170 pps (1489658 pkts in 1001000 usec)
  54. main_thread [1335] 1488196 pps (1489683 pkts in 1000999 usec)
  55. main_thread [1335] 1488177 pps (1489665 pkts in 1001000 usec)
  56. main_thread [1335] 1488178 pps (1489666 pkts in 1001000 usec)
  57. main_thread [1335] 1488177 pps (1491155 pkts in 1002001 usec)
  58. main_thread [1335] 1488181 pps (1489668 pkts in 1000999 usec)
  59. main_thread [1335] 1488176 pps (1489664 pkts in 1001000 usec)
  60. main_thread [1335] 1488185 pps (1489676 pkts in 1001002 usec)
  61. main_thread [1335] 1488177 pps (1489662 pkts in 1000998 usec)
  62. main_thread [1335] 1488179 pps (1503061 pkts in 1010000 usec)
  63. main_thread [1335] 1488182 pps (1491158 pkts in 1002000 usec)
  64. main_thread [1335] 1488186 pps (1489676 pkts in 1001001 usec)
  65. main_thread [1335] 1488174 pps (1489659 pkts in 1000998 usec)
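Note on the generator rate: ~1488180 pps is what a saturated gigabit link should carry with minimum-size frames. A minimal sketch of the arithmetic, assuming the 42-byte packets requested with -l 42 are padded to the 60-byte Ethernet minimum (the router's netstat output counts about 60 bytes per input packet). The constant and function names are made up for this example.

```python
# Gigabit Ethernet line rate for minimum-size frames (back-of-envelope).
LINK_BPS = 1_000_000_000  # 1 Gbit/s link
FRAME_NO_CRC = 60         # assumed: minimum frame size without CRC
CRC = 4                   # frame check sequence
PREAMBLE_SFD = 8          # preamble + start-of-frame delimiter
INTERFRAME_GAP = 12       # mandatory idle time between frames, in byte times


def line_rate_pps(frame_no_crc=FRAME_NO_CRC, link_bps=LINK_BPS):
    """Return the maximum packets per second for one frame size."""
    wire_bytes = frame_no_crc + CRC + PREAMBLE_SFD + INTERFRAME_GAP
    return link_bps / (wire_bytes * 8)


if __name__ == "__main__":
    print(f"{line_rate_pps():.0f} pps")  # prints "1488095 pps"
```

So the generator is sending at line rate, and anything the router cannot forward is lost on the router.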
  66.  
  67. == on the router ==
  68.  
  69. [root@router]~# netstat -iw 1
  70. input (Total) output
  71. packets errs idrops bytes packets errs bytes colls
  72. 742229 749909 0 44533746 742401 0 31173806 0
  73. 754557 750022 0 45273426 753678 0 31691534 0
  74. 721921 749787 0 43315266 722895 0 30320822 0
  75. 754685 749909 0 45281106 753702 0 31696910 0
  76. 721925 749815 0 43315506 722474 0 30320990 0
  77. 749386 749937 0 44963166 749016 0 31474352 0
  78. 727144 749796 0 43628646 727041 0 30540188 0
  79. 749366 749993 0 44961966 749569 0 31473470 0
  80. 747877 749667 0 44872626 748545 0 31410974 0
  81. 725660 750102 0 43539606 724993 0 30477902 0
  82. 747805 749739 0 44868306 748499 0 31407950 0
  83. 720579 750002 0 43234746 719919 0 30264458 0
  84. 754482 749722 0 45268926 754689 0 31688342 0
  85. 721971 750074 0 43318266 721921 0 30322922 0
  86.  
  87. => Notice the input errors: only one NIC queue is used because there is only one flow, and this queue is overloaded.
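A quick way to check that the input errors account for the missing packets is to add "packets" and "errs" on the input side and compare the sum with the generator rate. A rough sketch, using the first three rows of the netstat output above; the field names are labels chosen for this example, not netstat identifiers.

```python
# First three rows of the router's `netstat -iw 1` output, copied from above.
ROWS = """\
742229 749909 0 44533746 742401 0 31173806 0
754557 750022 0 45273426 753678 0 31691534 0
721921 749787 0 43315266 722895 0 30320822 0
"""

# Column labels (hypothetical names, in the order netstat prints them).
FIELDS = ("in_pkts", "in_errs", "in_drops", "in_bytes",
          "out_pkts", "out_errs", "out_bytes", "colls")


def parse(rows):
    """Turn each non-empty line into a dict of integer counters."""
    return [dict(zip(FIELDS, map(int, line.split())))
            for line in rows.splitlines() if line.strip()]


def offered(row):
    """Packets accepted plus packets counted as input errors."""
    return row["in_pkts"] + row["in_errs"]


if __name__ == "__main__":
    for row in parse(ROWS):
        print(offered(row), row["out_pkts"])
```

For the first row the sum is 1492138, close to the ~1488180 pps sent by the generator, while only about half of that is forwarded.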
  88.  
  89. == on the receiver ==
  90.  
  91. [root@receiver]~# pkt-gen -i igb3 -f rx -w 10
  92. (...)
  93. main_thread [1335] 738269 pps (739007 pkts in 1000999 usec)
  94. main_thread [1335] 738373 pps (739112 pkts in 1001001 usec)
  95. main_thread [1335] 738315 pps (739053 pkts in 1000999 usec)
  96. main_thread [1335] 738365 pps (739103 pkts in 1001000 usec)
  97. main_thread [1335] 738378 pps (739116 pkts in 1000999 usec)
  98. main_thread [1335] 738181 pps (738920 pkts in 1001001 usec)
  99. main_thread [1335] 738422 pps (739898 pkts in 1001999 usec)
  100. main_thread [1335] 738312 pps (739050 pkts in 1001000 usec)
  101. main_thread [1335] 738298 pps (759709 pkts in 1029001 usec)
  102. main_thread [1335] 738232 pps (748567 pkts in 1014000 usec)
  103. main_thread [1335] 738357 pps (748693 pkts in 1013999 usec)
  104. main_thread [1335] 738378 pps (739855 pkts in 1002000 usec)
  105. main_thread [1335] 738245 pps (739721 pkts in 1002000 usec)
  106. main_thread [1335] 738277 pps (739015 pkts in 1001000 usec)
  107. main_thread [1335] 738291 pps (739029 pkts in 1001000 usec)
  108. main_thread [1335] 738265 pps (739003 pkts in 1001000 usec)
  109. main_thread [1335] 738238 pps (738976 pkts in 1000999 usec)
  110.  
  111. == conclusion ==
  112.  
  113. With one IP flow (same src/dst addresses, to a directly attached host): 738Kpps forwarded.
  114. The limit is the single input NIC queue filling up (one flow means one queue).
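Why one flow means one queue: the NIC picks the receive queue from a hash of the packet's addresses, so identical addresses always land on the same queue. The sketch below only illustrates the idea. It uses zlib.crc32 as a stand-in hash (real NICs use a Toeplitz hash with a driver-supplied key, so actual queue numbers will differ), and the function names are made up for this example.

```python
import ipaddress
import zlib

NUM_QUEUES = 4  # igb2 reports 4 queues in the dmesg output above


def queue_for(src, dst, num_queues=NUM_QUEUES):
    """Pick a queue from a hash of the address pair.

    zlib.crc32 is only a stand-in for the NIC's real hash function.
    """
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return zlib.crc32(key) % num_queues


def queues_used(pairs):
    """Return the set of queues hit by a list of (src, dst) pairs."""
    return {queue_for(src, dst) for src, dst in pairs}


# One flow, as in the test above.
single_flow = [("1.3.3.3", "2.3.3.3")]
# 10 sources x 10 destinations, as in the multiple-flows test.
multi_flow = [(f"1.3.3.{s}", f"2.3.3.{d}")
              for s in range(1, 11) for d in range(1, 11)]

if __name__ == "__main__":
    print(len(queues_used(single_flow)))  # prints 1
    print(len(queues_used(multi_flow)))
```

A single address pair can only ever hit one queue, whatever the hash is; many pairs spread over several queues.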
  115.  
  116. ==== Multiple flows ====
  117.  
  118. == On the generator ==
  119. [root@generator]~# pkt-gen -i igb2 -f tx -n 80000000 -l 42 -d 2.3.3.1-2.3.3.10 -D 00:1b:21:d3:8f:3e -s 1.3.3.1-1.3.3.10 -w 10
  120. (...)
  121. main_thread [1335] 1493177 pps (1496020 pkts in 1001904 usec)
  122. main_thread [1335] 1488179 pps (1491154 pkts in 1001999 usec)
  123. main_thread [1335] 1488179 pps (1489667 pkts in 1001000 usec)
  124. main_thread [1335] 1488174 pps (1491150 pkts in 1002000 usec)
  125. main_thread [1335] 1488185 pps (1489673 pkts in 1001000 usec)
  126. main_thread [1335] 1488181 pps (1489669 pkts in 1001000 usec)
  127. main_thread [1335] 1488179 pps (1489667 pkts in 1001000 usec)
  128. main_thread [1335] 1488179 pps (1491155 pkts in 1002000 usec)
  129. main_thread [1335] 1488180 pps (1489668 pkts in 1001000 usec)
  130. main_thread [1335] 1488184 pps (1491119 pkts in 1001972 usec)
  131. main_thread [1335] 1488173 pps (1489703 pkts in 1001028 usec)
  132. main_thread [1335] 1488182 pps (1491157 pkts in 1001999 usec)
  133. main_thread [1335] 1488177 pps (1491161 pkts in 1002005 usec)
  134. main_thread [1335] 1488184 pps (1503060 pkts in 1009996 usec)
  135. main_thread [1335] 1488175 pps (1503057 pkts in 1010000 usec)
  136. main_thread [1335] 1488186 pps (1503068 pkts in 1010000 usec)
  137. main_thread [1335] 1488173 pps (1489661 pkts in 1001000 usec)
  138. main_thread [1335] 1488184 pps (1491159 pkts in 1001999 usec)
  139. main_thread [1335] 1488183 pps (1490180 pkts in 1001342 usec)
  140.  
  141. == on the router ==
  142.  
  143. [root@router]~# netstat -iw 1
  144. input (Total) output
  145. packets errs idrops bytes packets errs bytes colls
  146. 1461084 0 0 88830966 14897 0 625862 0
  147. 1472049 0 0 88876506 14897 0 625814 0
  148. 1452239 0 0 88695306 14897 0 625814 0
  149. 1450187 0 0 88685106 14897 0 625814 0
  150. 1450042 0 0 88686426 14897 0 625814 0
  151. 1480851 0 0 89169906 14897 0 625814 0
  152. 1474965 0 0 88928526 14897 0 625814 0
  153. 1483039 0 0 89185326 14896 0 625772 0
  154. 1474005 0 0 89013246 14898 0 625856 0
  155. 1473229 0 0 89001246 14897 0 625814 0
  156. 1477642 0 0 89050506 14896 0 625814 0
  157. 1477920 0 0 89055006 14898 0 625814 0
  158. 1478163 0 0 89059986 14897 0 625814 0
  159. 1475385 0 0 89040066 14896 0 625772 0
  160. 1474920 0 0 89035626 14897 0 625814 0
  161. 1474316 0 0 89074566 14897 0 625814 0
  162. 1486104 0 0 89256366 14897 0 625814 0
  163. 1462398 0 0 88874946 14898 0 625856 0
  164.  
  165. ==> No more input starvation: the NIC uses all its input queues and manages to handle all the input traffic… But WTF regarding the output performance ?????
  166.  
  167. == on the receiver ==
  168.  
  169. main_thread [1335] 14896 pps (14918 pkts in 1001462 usec)
  170. main_thread [1335] 14896 pps (14911 pkts in 1000999 usec)
  171. main_thread [1335] 14896 pps (14911 pkts in 1001001 usec)
  172. main_thread [1335] 14896 pps (14926 pkts in 1002002 usec)
  173. main_thread [1335] 14896 pps (14911 pkts in 1000998 usec)
  174. main_thread [1335] 14896 pps (14917 pkts in 1001412 usec)
  175. main_thread [1335] 14895 pps (14919 pkts in 1001587 usec)
  176. main_thread [1335] 14896 pps (14926 pkts in 1002000 usec)
  177. main_thread [1335] 14896 pps (14926 pkts in 1002000 usec)
  178. main_thread [1335] 14896 pps (14911 pkts in 1001000 usec)
  179. main_thread [1335] 14896 pps (15384 pkts in 1032784 usec)
  180. main_thread [1335] 14895 pps (15130 pkts in 1015748 usec)
  181. main_thread [1335] 14896 pps (15171 pkts in 1018468 usec)
  182. main_thread [1335] 14896 pps (14926 pkts in 1002000 usec)
  183. main_thread [1335] 14896 pps (14911 pkts in 1001000 usec)
  184. main_thread [1335] 14896 pps (14941 pkts in 1003001 usec)
  185. main_thread [1335] 14896 pps (14911 pkts in 1001001 usec)
  186. main_thread [1335] 14896 pps (14926 pkts in 1001997 usec)
  187. main_thread [1335] 14895 pps (14955 pkts in 1004001 usec)
  188. main_thread [1335] 14896 pps (14926 pkts in 1001999 usec)
  189. main_thread [1335] 14896 pps (14926 pkts in 1002000 usec)
  190.  
  191. => The receiver confirms the very low pps rate received.
  192.  
  193. == conclusion ==
  194.  
  195. WTF ???
  196.  
  197. Router load:
  198.  
  199. last pid: 3208; load averages: 1.11, 0.95, 0.50 up 0+00:53:18 23:06:30
  200. 153 processes: 7 running, 97 sleeping, 49 waiting
  201. CPU: 0.0% user, 0.0% nice, 0.0% system, 33.0% interrupt, 67.0% idle
  202. Mem: 8176K Active, 14M Inact, 236M Wired, 10M Buf, 15G Free
  203. Swap:
  204.  
  205. PID USERNAME PRI NICE SIZE RES STATE C TIME CPU COMMAND
  206. 11 root -92 - 0K 816K WAIT 3 3:00 33.25% intr{irq281: igb2:que}
  207. 11 root -92 - 0K 816K WAIT 0 2:54 33.25% intr{irq278: igb2:que}
  208. 11 root -92 - 0K 816K CPU1 1 2:54 33.15% intr{irq279: igb2:que}
  209. 11 root -92 - 0K 816K CPU2 2 3:46 31.30% intr{irq280: igb2:que}
  210. 11 root -92 - 0K 816K WAIT 1 0:06 1.17% intr{irq284: igb3:que}
  211.  
  212. == test 1: Limiting NIC queues for better NIC/queue/CPU binding ==
  213.  
  214. echo hw.igb.num_queues="2" > /boot/loader.conf.local
  215. reboot
  216.  
  217. igb2: <Intel(R) PRO/1000 Network Connection version - 2.4.0> mem 0x97a80000-0x97afffff,0x97c04000-0x97c07fff irq 39 at device 0.2 on pci26
  218. igb2: Using MSIX interrupts with 3 vectors
  219. igb2: Ethernet address: 00:1b:21:d3:8f:3e
  220. igb2: Bound queue 0 to cpu 0
  221. igb2: Bound queue 1 to cpu 1
  222. 001.000011 netmap_attach [2244] success for igb2
  223. igb3: <Intel(R) PRO/1000 Network Connection version - 2.4.0> mem 0x97a00000-0x97a7ffff,0x97c00000-0x97c03fff irq 38 at device 0.3 on pci26
  224. igb3: Using MSIX interrupts with 3 vectors
  225. igb3: Ethernet address: 00:1b:21:d3:8f:3f
  226. igb3: Bound queue 0 to cpu 2
  227. igb3: Bound queue 1 to cpu 3
  228.  
  229. => Great !
  230.  
  231. cpu0&1 are managing queue 0&1 of igb2
  232. cpu2&3 are managing queue 0&1 of igb3
  233.  
  234. And 2 queues are enough for these CPUs:
  235.  
  236. [root@bsdrp2]~# netstat -iw 1
  237. input (Total) output
  238. packets errs idrops bytes packets errs bytes colls
  239. 1472988 0 0 88868526 14910 0 626408 0
  240. 1489289 0 0 89566146 14986 0 629552 0
  241. 1468382 0 0 88416966 14808 0 622076 0
  242. 1487442 0 0 89546286 15000 0 630140 0
  243. 1468619 0 0 88310406 14793 0 621446 0
  244. 1482099 0 0 89145246 14897 0 625814 0
  245.  
  246. [root@bsdrp2]~# top -nCHSIzs1
  247. last pid: 2933; load averages: 0.75, 0.57, 0.31 up 0+00:05:43 23:25:24
  248. 137 processes: 5 running, 89 sleeping, 43 waiting
  249.  
  250. Mem: 13M Active, 8860K Inact, 213M Wired, 9760K Buf, 15G Free
  251. Swap:
  252.  
  253.  
  254. PID USERNAME PRI NICE SIZE RES STATE C TIME CPU COMMAND
  255. 11 root -92 - 0K 688K WAIT 1 1:05 52.39% intr{irq275: igb2:que}
  256. 11 root -92 - 0K 688K WAIT 0 1:02 48.49% intr{irq274: igb2:que}
  257. 11 root -92 - 0K 688K WAIT 3 0:02 0.88% intr{irq278: igb3:que}
  258.  
  259. ==> Output performance is still very bad.
  260.  
  261.  
  262. ==== TCPdump ====
  263.  
  264. [root@bsdrp2]~# tcpdump -npi igb2 -c 100
  265. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  266. listening on igb2, link-type EN10MB (Ethernet), capture size 65535 bytes
  267. 23:30:50.714198 IP 1.3.3.3.0 > 2.3.3.1.0: UDP, length 0
  268. 23:30:50.714198 IP 1.3.3.3.0 > 2.3.3.2.0: UDP, length 0
  269. 23:30:50.714199 IP 1.3.3.2.0 > 2.3.3.3.0: UDP, length 0
  270. 23:30:50.714199 IP 1.3.3.2.0 > 2.3.3.4.0: UDP, length 0
  271. 23:30:50.714200 IP 1.3.3.1.0 > 2.3.3.3.0: UDP, length 0
  272. 23:30:50.714201 IP 1.3.3.4.0 > 2.3.3.3.0: UDP, length 0
  273. 23:30:50.714201 IP 1.3.3.4.0 > 2.3.3.4.0: UDP, length 0
  274. 23:30:50.714202 IP 1.3.3.1.0 > 2.3.3.4.0: UDP, length 0
  275. 23:30:50.714206 IP 1.3.3.3.0 > 2.3.3.3.0: UDP, length 0
  276. 23:30:50.714206 IP 1.3.3.3.0 > 2.3.3.4.0: UDP, length 0
  277. 23:30:50.714207 IP 1.3.3.2.0 > 2.3.3.1.0: UDP, length 0
  278. 23:30:50.714207 IP 1.3.3.3.0 > 2.3.3.1.0: UDP, length 0
  279. 23:30:50.714208 IP 1.3.3.2.0 > 2.3.3.2.0: UDP, length 0
  280. 23:30:50.714208 IP 1.3.3.2.0 > 2.3.3.3.0: UDP, length 0
  281. 23:30:50.714210 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  282. 23:30:50.714211 IP 1.3.3.4.0 > 2.3.3.1.0: UDP, length 0
  283. 23:30:50.714212 IP 1.3.3.4.0 > 2.3.3.2.0: UDP, length 0
  284. 23:30:50.714213 IP 1.3.3.1.0 > 2.3.3.2.0: UDP, length 0
  285. 23:30:50.714213 IP 1.3.3.1.0 > 2.3.3.3.0: UDP, length 0
  286. 23:30:50.714214 IP 1.3.3.4.0 > 2.3.3.3.0: UDP, length 0
  287.  
  288. [root@bsdrp2]~# tcpdump -npi igb3 -c 100
  289. tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  290. listening on igb3, link-type EN10MB (Ethernet), capture size 65535 bytes
  291. 23:30:57.284708 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  292. 23:30:57.284712 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  293. 23:30:57.284720 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  294. 23:30:57.284741 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  295. 23:30:57.284746 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  296. 23:30:57.284749 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  297. 23:30:57.284767 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  298. 23:30:57.284776 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  299. 23:30:57.284788 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  300. 23:30:57.284796 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  301. 23:30:57.284818 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  302. 23:30:57.284823 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  303. 23:30:57.284831 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  304. 23:30:57.284842 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  305. 23:30:57.284851 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  306. 23:30:57.284861 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  307. 23:30:57.284879 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  308. 23:30:57.284885 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  309. 23:30:57.284891 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  310. 23:30:57.284910 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  311. 23:30:57.284915 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  312. 23:30:57.284930 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  313. 23:30:57.284942 IP 1.3.3.1.0 > 2.3.3.1.0: UDP, length 0
  314.  
  315. ==>> WTF !?!?!?! Why are only packets from source 1.3.3.1 to dest 2.3.3.1 forwarded ??????
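If only one address pair gets through, as the igb3 capture suggests, the receiver should see roughly 1/100 of the generator rate, since 10 sources x 10 destinations gives 100 pairs. A back-of-envelope check, assuming pkt-gen sends all pairs equally often (not verified here); the two rates are typical values from the outputs above.

```python
SRC_COUNT = 10            # -s 1.3.3.1-1.3.3.10
DST_COUNT = 10            # -d 2.3.3.1-2.3.3.10
GENERATOR_PPS = 1488180   # typical rate reported on the generator
RECEIVER_PPS = 14896      # typical rate reported on the receiver

pairs = SRC_COUNT * DST_COUNT
# Assumption: every pair is sent equally often.
expected_pps = GENERATOR_PPS / pairs

if __name__ == "__main__":
    print(pairs, round(expected_pps), RECEIVER_PPS)
```

The expected ~14882 pps is within 1% of the ~14896 pps measured on the receiver, which fits a single forwarded pair out of 100.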
  316.  
  317. ip:
  318. 46954310 total packets received
  319. 1111696437 bad header checksums
  320. 11270 packets for this host
  321. 46943040 packets forwarded (46943040 packets fast forwarded)
  322. 13451 packets sent from this host
  323.  
  324. ==> "bad header checksums" ????????
  325.  
  326. pkt-gen generates a good IP header checksum only for the first source/destination IP pair!
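To illustrate the symptom (this is not pkt-gen's code): the IPv4 header checksum covers the source and destination addresses, so a checksum computed for one pair is wrong for any other pair. The sketch below assumes the checksum is computed once for the first pair and then reused, which is a guess at the mechanism. The header values (TTL 64, no options) and all function names are illustrative.

```python
import ipaddress
import struct


def checksum16(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words, inverted."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, checksum: int = 0) -> bytes:
    """Build a minimal 20-byte IPv4 header for an empty UDP packet."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0,   # version 4 + header length 5 words, TOS
        28,        # total length: 20 (IP) + 8 (UDP), no payload
        0, 0,      # identification, flags/fragment offset
        64, 17,    # TTL (illustrative), protocol 17 = UDP
        checksum,
        ipaddress.ip_address(src).packed,
        ipaddress.ip_address(dst).packed,
    )


def is_valid(header: bytes) -> bool:
    """A received header is valid when its checksum computes to zero."""
    return checksum16(header) == 0


# Checksum computed once, for the first address pair only.
csum = checksum16(ipv4_header("1.3.3.1", "2.3.3.1"))

good = ipv4_header("1.3.3.1", "2.3.3.1", csum)
stale = ipv4_header("1.3.3.2", "2.3.3.1", csum)  # new address, old checksum

if __name__ == "__main__":
    print(is_valid(good), is_valid(stale))  # prints "True False"
```

A router that verifies the header checksum drops the second kind of packet and counts it under "bad header checksums", which matches the counters above.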