Performance results dot product via C++ algorithms, HPX

Jul 11th, 2018
number = 1024.
8.00 kB
dot(x1, x2, number) : 3.308 cycles per operation (best) 2.419 bytes per cycle (best) 3.322 cycles per operation (avg)
dot128(x1, x2, number) : 0.843 cycles per operation (best) 9.492 bytes per cycle (best) 0.852 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.919 cycles per operation (best) 4.169 bytes per cycle (best) 1.985 cycles per operation (avg)
dot256(x1, x2, number) : 0.520 cycles per operation (best) 15.398 bytes per cycle (best) 0.553 cycles per operation (avg)
dot128fma(x1, x2, number) : 0.839 cycles per operation (best) 9.537 bytes per cycle (best) 0.854 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.526 cycles per operation (best) 15.199 bytes per cycle (best) 0.554 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.538 cycles per operation (best) 2.261 bytes per cycle (best) 3.576 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 94.146 cycles per operation (best) 0.085 bytes per cycle (best) 183.408 cycles per operation (avg)

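The two "best" columns in the figures above appear to be tied together by a fixed factor of 8 bytes per operation: 1024 operations are reported as 8.00 kB, and 3.308 cycles per operation times 2.419 bytes per cycle is roughly 8. The following is a minimal sketch of that conversion, not the benchmark's own code; the constant and both function names are assumptions inferred from the log.

```cpp
#include <cstddef>

// Assumption: every "operation" accounts for 8 bytes of input,
// inferred from the log reporting 1024 operations as 8.00 kB.
constexpr double kBytesPerOperation = 8.0;

// Hypothetical helper: convert "cycles per operation" into "bytes per cycle".
constexpr double bytes_per_cycle(double cycles_per_operation) {
    return kBytesPerOperation / cycles_per_operation;
}

// Hypothetical helper: total input volume in bytes for a given element count.
constexpr std::size_t input_bytes(std::size_t number) {
    return number * static_cast<std::size_t>(kBytesPerOperation);
}
```

For instance, `bytes_per_cycle(3.308)` lands within rounding of the 2.419 printed for `dot`, and `input_bytes(2097152)` gives 16777216 bytes, matching the 16.00 MB header of the larger run. Small mismatches in the last digit are expected because the log prints rounded cycle counts.
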
number = 1024.
8.00 kB
dot(x1, x2, number) : 3.313 cycles per operation (best) 2.414 bytes per cycle (best) 3.330 cycles per operation (avg)
dot128(x1, x2, number) : 0.850 cycles per operation (best) 9.416 bytes per cycle (best) 0.862 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.871 cycles per operation (best) 4.276 bytes per cycle (best) 1.992 cycles per operation (avg)
dot256(x1, x2, number) : 0.495 cycles per operation (best) 16.158 bytes per cycle (best) 0.558 cycles per operation (avg)
dot128fma(x1, x2, number) : 0.849 cycles per operation (best) 9.427 bytes per cycle (best) 0.862 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.525 cycles per operation (best) 15.227 bytes per cycle (best) 0.563 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.534 cycles per operation (best) 2.264 bytes per cycle (best) 3.581 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 119.325 cycles per operation (best) 0.067 bytes per cycle (best) 199.220 cycles per operation (avg)

number = 2097152.
16.00 MB
dot(x1, x2, number) : 3.472 cycles per operation (best) 2.304 bytes per cycle (best) 3.981 cycles per operation (avg)
dot128(x1, x2, number) : 1.067 cycles per operation (best) 7.497 bytes per cycle (best) 1.651 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.472 cycles per operation (best) 5.434 bytes per cycle (best) 2.436 cycles per operation (avg)
dot256(x1, x2, number) : 0.966 cycles per operation (best) 8.280 bytes per cycle (best) 1.525 cycles per operation (avg)
dot128fma(x1, x2, number) : 1.035 cycles per operation (best) 7.728 bytes per cycle (best) 1.333 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.971 cycles per operation (best) 8.239 bytes per cycle (best) 1.042 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.491 cycles per operation (best) 2.291 bytes per cycle (best) 3.732 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 0.724 cycles per operation (best) 11.043 bytes per cycle (best) 1.325 cycles per operation (avg)

number = 4194304.
32.00 MB
dot(x1, x2, number) : 3.437 cycles per operation (best) 2.328 bytes per cycle (best) 3.523 cycles per operation (avg)
dot128(x1, x2, number) : 1.115 cycles per operation (best) 7.177 bytes per cycle (best) 1.187 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.455 cycles per operation (best) 5.497 bytes per cycle (best) 1.944 cycles per operation (avg)
dot256(x1, x2, number) : 0.916 cycles per operation (best) 8.735 bytes per cycle (best) 1.043 cycles per operation (avg)
dot128fma(x1, x2, number) : 0.997 cycles per operation (best) 8.024 bytes per cycle (best) 1.104 cycles per operation (avg)
dot256fma(x1, x2, number) : 1.043 cycles per operation (best) 7.667 bytes per cycle (best) 1.075 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.263 cycles per operation (best) 2.452 bytes per cycle (best) 3.628 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 0.756 cycles per operation (best) 10.578 bytes per cycle (best) 1.045 cycles per operation (avg)

number = 8388608.
64.00 MB
dot(x1, x2, number) : 3.437 cycles per operation (best) 2.328 bytes per cycle (best) 3.520 cycles per operation (avg)
dot128(x1, x2, number) : 1.057 cycles per operation (best) 7.569 bytes per cycle (best) 1.141 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.481 cycles per operation (best) 5.401 bytes per cycle (best) 1.876 cycles per operation (avg)
dot256(x1, x2, number) : 0.976 cycles per operation (best) 8.195 bytes per cycle (best) 1.071 cycles per operation (avg)
dot128fma(x1, x2, number) : 1.054 cycles per operation (best) 7.592 bytes per cycle (best) 1.145 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.946 cycles per operation (best) 8.459 bytes per cycle (best) 1.065 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.364 cycles per operation (best) 2.378 bytes per cycle (best) 3.582 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 0.770 cycles per operation (best) 10.392 bytes per cycle (best) 1.246 cycles per operation (avg)

number = 16777216.
128.00 MB
dot(x1, x2, number) : 3.208 cycles per operation (best) 2.494 bytes per cycle (best) 3.501 cycles per operation (avg)
dot128(x1, x2, number) : 1.059 cycles per operation (best) 7.552 bytes per cycle (best) 1.154 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.453 cycles per operation (best) 5.505 bytes per cycle (best) 1.769 cycles per operation (avg)
dot256(x1, x2, number) : 0.992 cycles per operation (best) 8.068 bytes per cycle (best) 1.077 cycles per operation (avg)
dot128fma(x1, x2, number) : 1.047 cycles per operation (best) 7.642 bytes per cycle (best) 1.147 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.986 cycles per operation (best) 8.116 bytes per cycle (best) 1.071 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.204 cycles per operation (best) 2.497 bytes per cycle (best) 3.556 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 0.769 cycles per operation (best) 10.404 bytes per cycle (best) 0.868 cycles per operation (avg)

number = 33554432.
256.00 MB
dot(x1, x2, number) : 3.453 cycles per operation (best) 2.317 bytes per cycle (best) 3.515 cycles per operation (avg)
dot128(x1, x2, number) : 1.032 cycles per operation (best) 7.754 bytes per cycle (best) 1.134 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.427 cycles per operation (best) 5.608 bytes per cycle (best) 1.690 cycles per operation (avg)
dot256(x1, x2, number) : 0.998 cycles per operation (best) 8.014 bytes per cycle (best) 1.060 cycles per operation (avg)
dot128fma(x1, x2, number) : 1.044 cycles per operation (best) 7.664 bytes per cycle (best) 1.115 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.991 cycles per operation (best) 8.074 bytes per cycle (best) 1.050 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.255 cycles per operation (best) 2.457 bytes per cycle (best) 3.445 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 0.766 cycles per operation (best) 10.443 bytes per cycle (best) 0.796 cycles per operation (avg)

number = 134217728.
1024.00 MB
dot(x1, x2, number) : 3.383 cycles per operation (best) 2.365 bytes per cycle (best) 3.478 cycles per operation (avg)
dot128(x1, x2, number) : 1.034 cycles per operation (best) 7.737 bytes per cycle (best) 1.084 cycles per operation (avg)
dot128dt(x1, x2, number) : 1.395 cycles per operation (best) 5.733 bytes per cycle (best) 1.555 cycles per operation (avg)
dot256(x1, x2, number) : 0.962 cycles per operation (best) 8.314 bytes per cycle (best) 1.025 cycles per operation (avg)
dot128fma(x1, x2, number) : 1.001 cycles per operation (best) 7.990 bytes per cycle (best) 1.063 cycles per operation (avg)
dot256fma(x1, x2, number) : 0.990 cycles per operation (best) 8.078 bytes per cycle (best) 1.023 cycles per operation (avg)
dot_inner_product(x1, x2, number) : 3.187 cycles per operation (best) 2.510 bytes per cycle (best) 3.337 cycles per operation (avg)
dot_transform_reduce(x1, x2, number) : 0.758 cycles per operation (best) 10.558 bytes per cycle (best) 0.805 cycles per operation (avg)
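
The log gives only function names, not their bodies. As a rough illustration of what the three non-intrinsic variants might look like, here is a sketch using the C++17 standard library. Everything in it is an assumption: the `float` element type (chosen because two 4-byte inputs would match the apparent 8 bytes per operation), the signatures, and the use of the sequential `std::transform_reduce` in place of whatever HPX parallel algorithm the benchmark actually called.

```cpp
#include <cstddef>
#include <numeric>

// Hypothetical scalar baseline: a plain accumulation loop.
float dot(const float* x1, const float* x2, std::size_t number) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < number; ++i) {
        sum += x1[i] * x2[i];
    }
    return sum;
}

// Hypothetical variant built on std::inner_product, which multiplies
// element pairs and adds them strictly left to right.
float dot_inner_product(const float* x1, const float* x2, std::size_t number) {
    return std::inner_product(x1, x1 + number, x2, 0.0f);
}

// Hypothetical variant built on std::transform_reduce (C++17). With this
// overload the default operations are multiply and add, but the reduction
// order is unspecified, which is what allows a parallel implementation.
// Sequential standard-library version shown here, not the HPX one.
float dot_transform_reduce(const float* x1, const float* x2, std::size_t number) {
    return std::transform_reduce(x1, x1 + number, x2, 0.0f);
}
```

Called on `{1, 2, 3, 4}` and `{5, 6, 7, 8}`, all three return 70. If the measured `dot_transform_reduce` did run in parallel, that would be consistent with the log: roughly 94 to 119 cycles per operation at `number = 1024`, where fixed startup cost would dominate, against about 0.72 to 0.77 at the larger sizes.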