- number = 1024.
- 8.00 kB
- dot(x1, x2, number) : 3.308 cycles per operation (best) 2.419 bytes per cycle (best) 3.322 cycles per operation (avg)
- dot128(x1, x2, number) : 0.843 cycles per operation (best) 9.492 bytes per cycle (best) 0.852 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.919 cycles per operation (best) 4.169 bytes per cycle (best) 1.985 cycles per operation (avg)
- dot256(x1, x2, number) : 0.520 cycles per operation (best) 15.398 bytes per cycle (best) 0.553 cycles per operation (avg)
- dot128fma(x1, x2, number) : 0.839 cycles per operation (best) 9.537 bytes per cycle (best) 0.854 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.526 cycles per operation (best) 15.199 bytes per cycle (best) 0.554 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.538 cycles per operation (best) 2.261 bytes per cycle (best) 3.576 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 94.146 cycles per operation (best) 0.085 bytes per cycle (best) 183.408 cycles per operation (avg)
- number = 1024.
- 8.00 kB
- dot(x1, x2, number) : 3.313 cycles per operation (best) 2.414 bytes per cycle (best) 3.330 cycles per operation (avg)
- dot128(x1, x2, number) : 0.850 cycles per operation (best) 9.416 bytes per cycle (best) 0.862 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.871 cycles per operation (best) 4.276 bytes per cycle (best) 1.992 cycles per operation (avg)
- dot256(x1, x2, number) : 0.495 cycles per operation (best) 16.158 bytes per cycle (best) 0.558 cycles per operation (avg)
- dot128fma(x1, x2, number) : 0.849 cycles per operation (best) 9.427 bytes per cycle (best) 0.862 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.525 cycles per operation (best) 15.227 bytes per cycle (best) 0.563 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.534 cycles per operation (best) 2.264 bytes per cycle (best) 3.581 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 119.325 cycles per operation (best) 0.067 bytes per cycle (best) 199.220 cycles per operation (avg)
- number = 2097152.
- 16.00 MB
- dot(x1, x2, number) : 3.472 cycles per operation (best) 2.304 bytes per cycle (best) 3.981 cycles per operation (avg)
- dot128(x1, x2, number) : 1.067 cycles per operation (best) 7.497 bytes per cycle (best) 1.651 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.472 cycles per operation (best) 5.434 bytes per cycle (best) 2.436 cycles per operation (avg)
- dot256(x1, x2, number) : 0.966 cycles per operation (best) 8.280 bytes per cycle (best) 1.525 cycles per operation (avg)
- dot128fma(x1, x2, number) : 1.035 cycles per operation (best) 7.728 bytes per cycle (best) 1.333 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.971 cycles per operation (best) 8.239 bytes per cycle (best) 1.042 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.491 cycles per operation (best) 2.291 bytes per cycle (best) 3.732 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 0.724 cycles per operation (best) 11.043 bytes per cycle (best) 1.325 cycles per operation (avg)
- number = 4194304.
- 32.00 MB
- dot(x1, x2, number) : 3.437 cycles per operation (best) 2.328 bytes per cycle (best) 3.523 cycles per operation (avg)
- dot128(x1, x2, number) : 1.115 cycles per operation (best) 7.177 bytes per cycle (best) 1.187 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.455 cycles per operation (best) 5.497 bytes per cycle (best) 1.944 cycles per operation (avg)
- dot256(x1, x2, number) : 0.916 cycles per operation (best) 8.735 bytes per cycle (best) 1.043 cycles per operation (avg)
- dot128fma(x1, x2, number) : 0.997 cycles per operation (best) 8.024 bytes per cycle (best) 1.104 cycles per operation (avg)
- dot256fma(x1, x2, number) : 1.043 cycles per operation (best) 7.667 bytes per cycle (best) 1.075 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.263 cycles per operation (best) 2.452 bytes per cycle (best) 3.628 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 0.756 cycles per operation (best) 10.578 bytes per cycle (best) 1.045 cycles per operation (avg)
- number = 8388608.
- 64.00 MB
- dot(x1, x2, number) : 3.437 cycles per operation (best) 2.328 bytes per cycle (best) 3.520 cycles per operation (avg)
- dot128(x1, x2, number) : 1.057 cycles per operation (best) 7.569 bytes per cycle (best) 1.141 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.481 cycles per operation (best) 5.401 bytes per cycle (best) 1.876 cycles per operation (avg)
- dot256(x1, x2, number) : 0.976 cycles per operation (best) 8.195 bytes per cycle (best) 1.071 cycles per operation (avg)
- dot128fma(x1, x2, number) : 1.054 cycles per operation (best) 7.592 bytes per cycle (best) 1.145 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.946 cycles per operation (best) 8.459 bytes per cycle (best) 1.065 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.364 cycles per operation (best) 2.378 bytes per cycle (best) 3.582 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 0.770 cycles per operation (best) 10.392 bytes per cycle (best) 1.246 cycles per operation (avg)
- number = 16777216.
- 128.00 MB
- dot(x1, x2, number) : 3.208 cycles per operation (best) 2.494 bytes per cycle (best) 3.501 cycles per operation (avg)
- dot128(x1, x2, number) : 1.059 cycles per operation (best) 7.552 bytes per cycle (best) 1.154 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.453 cycles per operation (best) 5.505 bytes per cycle (best) 1.769 cycles per operation (avg)
- dot256(x1, x2, number) : 0.992 cycles per operation (best) 8.068 bytes per cycle (best) 1.077 cycles per operation (avg)
- dot128fma(x1, x2, number) : 1.047 cycles per operation (best) 7.642 bytes per cycle (best) 1.147 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.986 cycles per operation (best) 8.116 bytes per cycle (best) 1.071 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.204 cycles per operation (best) 2.497 bytes per cycle (best) 3.556 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 0.769 cycles per operation (best) 10.404 bytes per cycle (best) 0.868 cycles per operation (avg)
- number = 33554432.
- 256.00 MB
- dot(x1, x2, number) : 3.453 cycles per operation (best) 2.317 bytes per cycle (best) 3.515 cycles per operation (avg)
- dot128(x1, x2, number) : 1.032 cycles per operation (best) 7.754 bytes per cycle (best) 1.134 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.427 cycles per operation (best) 5.608 bytes per cycle (best) 1.690 cycles per operation (avg)
- dot256(x1, x2, number) : 0.998 cycles per operation (best) 8.014 bytes per cycle (best) 1.060 cycles per operation (avg)
- dot128fma(x1, x2, number) : 1.044 cycles per operation (best) 7.664 bytes per cycle (best) 1.115 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.991 cycles per operation (best) 8.074 bytes per cycle (best) 1.050 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.255 cycles per operation (best) 2.457 bytes per cycle (best) 3.445 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 0.766 cycles per operation (best) 10.443 bytes per cycle (best) 0.796 cycles per operation (avg)
- number = 134217728.
- 1024.00 MB
- dot(x1, x2, number) : 3.383 cycles per operation (best) 2.365 bytes per cycle (best) 3.478 cycles per operation (avg)
- dot128(x1, x2, number) : 1.034 cycles per operation (best) 7.737 bytes per cycle (best) 1.084 cycles per operation (avg)
- dot128dt(x1, x2, number) : 1.395 cycles per operation (best) 5.733 bytes per cycle (best) 1.555 cycles per operation (avg)
- dot256(x1, x2, number) : 0.962 cycles per operation (best) 8.314 bytes per cycle (best) 1.025 cycles per operation (avg)
- dot128fma(x1, x2, number) : 1.001 cycles per operation (best) 7.990 bytes per cycle (best) 1.063 cycles per operation (avg)
- dot256fma(x1, x2, number) : 0.990 cycles per operation (best) 8.078 bytes per cycle (best) 1.023 cycles per operation (avg)
- dot_inner_product(x1, x2, number) : 3.187 cycles per operation (best) 2.510 bytes per cycle (best) 3.337 cycles per operation (avg)
- dot_transform_reduce(x1, x2, number) : 0.758 cycles per operation (best) 10.558 bytes per cycle (best) 0.805 cycles per operation (avg)
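The two "best" columns in the log appear to be reciprocal views of the same measurement. The sketch below assumes each operation consumes 8 bytes of input in total, which is inferred from the reported sizes (8.00 kB for number = 1024) and is not stated by the benchmark itself. The helper names `bytes_per_cycle` and `speedup` are hypothetical and are not part of the benchmarked code.

```python
# Assumption: 8 bytes of input per operation, inferred from
# "8.00 kB" being reported for number = 1024.
BYTES_PER_OPERATION = 8.0


def bytes_per_cycle(cycles_per_operation: float) -> float:
    """Convert a cycles-per-operation figure into bytes per cycle."""
    return BYTES_PER_OPERATION / cycles_per_operation


def speedup(baseline_cycles: float, candidate_cycles: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_cycles / candidate_cycles


# Best-case cycles per operation copied from the first number = 1024 block.
best_cycles_1024 = {"dot": 3.308, "dot128": 0.843, "dot256": 0.520}

for name, cycles in best_cycles_1024.items():
    print(
        f"{name}: {bytes_per_cycle(cycles):.3f} bytes/cycle, "
        f"{speedup(best_cycles_1024['dot'], cycles):.2f}x vs dot"
    )
```

The computed throughput agrees with the logged bytes-per-cycle values only up to rounding, because the log prints cycles per operation to three decimals.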