- FriendlyELEC's arm64 Ubuntu Xenial image for the NEO Plus 2 (kernel 4.11.2, no cpufreq support but CPU running at 816 MHz, DRAM clocked at 672 MHz):
- root@NanoPi-M1-Plus2:~/tinymembench# ./tinymembench
- tinymembench v0.4.9 (simple benchmark for memory throughput and latency)
- ==========================================================================
- == Memory bandwidth tests ==
- == ==
- == Note 1: 1MB = 1000000 bytes ==
- == Note 2: Results for 'copy' tests show how many bytes can be ==
- == copied per second (adding together read and writen ==
- == bytes would have provided twice higher numbers) ==
- == Note 3: 2-pass copy means that we are using a small temporary buffer ==
- == to first fetch data into it, and only then write it to the ==
- == destination (source -> L1 cache, L1 cache -> destination) ==
- == Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
- == brackets ==
- ==========================================================================
- C copy backwards : 866.4 MB/s (1.1%)
- C copy backwards (32 byte blocks) : 867.0 MB/s (1.2%)
- C copy backwards (64 byte blocks) : 884.9 MB/s (0.8%)
- C copy : 891.3 MB/s (1.0%)
- C copy prefetched (32 bytes step) : 718.2 MB/s
- C copy prefetched (64 bytes step) : 812.4 MB/s
- C 2-pass copy : 871.3 MB/s
- C 2-pass copy prefetched (32 bytes step) : 649.6 MB/s
- C 2-pass copy prefetched (64 bytes step) : 336.3 MB/s (0.2%)
- C fill : 2193.5 MB/s
- C fill (shuffle within 16 byte blocks) : 2194.5 MB/s
- C fill (shuffle within 32 byte blocks) : 2194.6 MB/s
- C fill (shuffle within 64 byte blocks) : 2193.1 MB/s
- ---
- standard memcpy : 906.7 MB/s
- standard memset : 2195.6 MB/s
- ---
- NEON LDP/STP copy : 897.6 MB/s (0.4%)
- NEON LDP/STP copy pldl2strm (32 bytes step) : 683.5 MB/s (0.6%)
- NEON LDP/STP copy pldl2strm (64 bytes step) : 798.6 MB/s
- NEON LDP/STP copy pldl1keep (32 bytes step) : 951.5 MB/s
- NEON LDP/STP copy pldl1keep (64 bytes step) : 952.1 MB/s
- NEON LD1/ST1 copy : 900.9 MB/s
- NEON STP fill : 2195.3 MB/s
- NEON STNP fill : 1994.3 MB/s (0.4%)
- ARM LDP/STP copy : 898.3 MB/s (0.3%)
- ARM STP fill : 2194.8 MB/s
- ARM STNP fill : 1996.3 MB/s (0.3%)
- ==========================================================================
- == Framebuffer read tests. ==
- == ==
- == Many ARM devices use a part of the system memory as the framebuffer, ==
- == typically mapped as uncached but with write-combining enabled. ==
- == Writes to such framebuffers are quite fast, but reads are much ==
- == slower and very sensitive to the alignment and the selection of ==
- == CPU instructions which are used for accessing memory. ==
- == ==
- == Many x86 systems allocate the framebuffer in the GPU memory, ==
- == accessible for the CPU via a relatively slow PCI-E bus. Moreover, ==
- == PCI-E is asymmetric and handles reads a lot worse than writes. ==
- == ==
- == If uncached framebuffer reads are reasonably fast (at least 100 MB/s ==
- == or preferably >300 MB/s), then using the shadow framebuffer layer ==
- == is not necessary in Xorg DDX drivers, resulting in a nice overall ==
- == performance improvement. For example, the xf86-video-fbturbo DDX ==
- == uses this trick. ==
- ==========================================================================
- NEON LDP/STP copy (from framebuffer) : 165.7 MB/s
- NEON LDP/STP 2-pass copy (from framebuffer) : 156.6 MB/s
- NEON LD1/ST1 copy (from framebuffer) : 43.0 MB/s
- NEON LD1/ST1 2-pass copy (from framebuffer) : 42.5 MB/s
- ARM LDP/STP copy (from framebuffer) : 85.5 MB/s
- ARM LDP/STP 2-pass copy (from framebuffer) : 83.2 MB/s
- ==========================================================================
- == Memory latency test ==
- == ==
- == Average time is measured for random memory accesses in the buffers ==
- == of different sizes. The larger is the buffer, the more significant ==
- == are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
- == accesses. For extremely large buffer sizes we are expecting to see ==
- == page table walk with several requests to SDRAM for almost every ==
- == memory access (though 64MiB is not nearly large enough to experience ==
- == this effect to its fullest). ==
- == ==
- == Note 1: All the numbers are representing extra time, which needs to ==
- == be added to L1 cache latency. The cycle timings for L1 cache ==
- == latency can be usually found in the processor documentation. ==
- == Note 2: Dual random read means that we are simultaneously performing ==
- == two independent memory accesses at a time. In the case if ==
- == the memory subsystem can't handle multiple outstanding ==
- == requests, dual random read has the same timings as two ==
- == single reads performed one after another. ==
- ==========================================================================
- block size : single random read / dual random read, [MADV_NOHUGEPAGE]
- 1024 : 0.0 ns / 0.0 ns
- 2048 : 0.0 ns / 0.0 ns
- 4096 : 0.0 ns / 0.0 ns
- 8192 : 0.0 ns / 0.0 ns
- 16384 : 0.0 ns / 0.0 ns
- 32768 : 0.1 ns / 0.1 ns
- 65536 : 8.3 ns / 14.2 ns
- 131072 : 12.8 ns / 19.7 ns
- 262144 : 15.1 ns / 21.9 ns
- 524288 : 17.6 ns / 25.5 ns
- 1048576 : 104.5 ns / 162.5 ns
- 2097152 : 151.3 ns / 210.6 ns
- 4194304 : 184.0 ns / 237.0 ns
- 8388608 : 201.8 ns / 250.0 ns
- 16777216 : 212.7 ns / 258.8 ns
- 33554432 : 219.2 ns / 265.6 ns
- 67108864 : 222.8 ns / 269.2 ns
- block size : single random read / dual random read, [MADV_HUGEPAGE]
- 1024 : 0.0 ns / 0.0 ns
- 2048 : 0.0 ns / 0.0 ns
- 4096 : 0.0 ns / 0.0 ns
- 8192 : 0.0 ns / 0.0 ns
- 16384 : 0.0 ns / 0.0 ns
- 32768 : 0.1 ns / 0.1 ns
- 65536 : 8.3 ns / 14.2 ns
- 131072 : 12.8 ns / 19.7 ns
- 262144 : 15.1 ns / 21.9 ns
- 524288 : 17.7 ns / 24.8 ns
- 1048576 : 104.5 ns / 162.5 ns
- 2097152 : 151.4 ns / 210.6 ns
- 4194304 : 184.3 ns / 237.5 ns
- 8388608 : 201.9 ns / 250.4 ns
- 16777216 : 212.6 ns / 258.7 ns
- 33554432 : 219.2 ns / 265.6 ns
- 67108864 : 222.9 ns / 269.5 ns
- sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=4:
- execution time (avg/stddev): 11.2186/0.00
- root@NanoPi-M1-Plus2:~/tinymembench# 7zr b
- 7-Zip (A) 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
- p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,4 CPUs)
- RAM size: 482 MB, # CPU hardware threads: 4
- RAM usage: 434 MB, # Benchmark threads: 4
- Dict Compressing | Decompressing
- Speed Usage R/U Rating | Speed Usage R/U Rating
- KB/s % MIPS MIPS | KB/s % MIPS MIPS
- 22: 1165 296 383 1133 | 30716 399 694 2771
- 23: 1158 302 390 1180 | 30260 399 694 2769
- Killed
- for i in 128 192 256 ; do openssl speed -elapsed -evp aes-${i}-cbc ; done
- type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
- aes-128-cbc 102619.34k 274383.85k 458681.60k 570289.15k 613662.72k
- aes-192-cbc 95828.78k 236630.63k 366491.39k 436039.34k 461564.59k
- aes-256-cbc 91768.22k 213249.41k 313543.59k 363093.33k 380613.97k
- root@NanoPi-M1-Plus2:~# iperf3 -c 192.168.83.61 -t 60 && iperf3 -R -c 192.168.83.61 -t 60
- Connecting to host 192.168.83.61, port 5201
- [ 4] local 192.168.83.63 port 44298 connected to 192.168.83.61 port 5201
- [ ID] Interval Transfer Bandwidth Retr Cwnd
- [ 4] 0.00-1.01 sec 82.6 MBytes 690 Mbits/sec 0 156 KBytes
- [ 4] 1.01-2.01 sec 82.5 MBytes 691 Mbits/sec 0 164 KBytes
- [ 4] 2.01-3.00 sec 85.5 MBytes 722 Mbits/sec 0 334 KBytes
- [ 4] 3.00-4.00 sec 102 MBytes 852 Mbits/sec 0 389 KBytes
- [ 4] 4.00-5.00 sec 102 MBytes 853 Mbits/sec 0 389 KBytes
- [ 4] 5.00-6.00 sec 101 MBytes 844 Mbits/sec 0 389 KBytes
- [ 4] 6.00-7.00 sec 102 MBytes 856 Mbits/sec 0 389 KBytes
- [ 4] 7.00-8.00 sec 101 MBytes 845 Mbits/sec 0 389 KBytes
- [ 4] 8.00-9.00 sec 102 MBytes 853 Mbits/sec 0 389 KBytes
- [ 4] 9.00-10.00 sec 101 MBytes 844 Mbits/sec 0 389 KBytes
- [ 4] 10.00-11.00 sec 101 MBytes 846 Mbits/sec 0 389 KBytes
- [ 4] 11.00-12.00 sec 100 MBytes 839 Mbits/sec 0 389 KBytes
- [ 4] 12.00-13.00 sec 100 MBytes 842 Mbits/sec 0 389 KBytes
- [ 4] 13.00-14.00 sec 101 MBytes 848 Mbits/sec 0 389 KBytes
- [ 4] 14.00-15.00 sec 102 MBytes 854 Mbits/sec 0 389 KBytes
- [ 4] 15.00-16.00 sec 101 MBytes 850 Mbits/sec 0 389 KBytes
- [ 4] 16.00-17.00 sec 102 MBytes 855 Mbits/sec 0 389 KBytes
- [ 4] 17.00-18.00 sec 101 MBytes 849 Mbits/sec 0 389 KBytes
- [ 4] 18.00-19.00 sec 101 MBytes 846 Mbits/sec 0 389 KBytes
- [ 4] 19.00-20.00 sec 102 MBytes 860 Mbits/sec 0 389 KBytes
- [ 4] 20.00-21.00 sec 101 MBytes 845 Mbits/sec 0 389 KBytes
- [ 4] 21.00-22.00 sec 102 MBytes 855 Mbits/sec 0 389 KBytes
- [ 4] 22.00-23.00 sec 102 MBytes 856 Mbits/sec 0 389 KBytes
- [ 4] 23.00-24.00 sec 101 MBytes 844 Mbits/sec 0 389 KBytes
- [ 4] 24.00-25.00 sec 102 MBytes 855 Mbits/sec 0 389 KBytes
- [ 4] 25.00-26.00 sec 102 MBytes 854 Mbits/sec 0 389 KBytes
- [ 4] 26.00-27.00 sec 102 MBytes 852 Mbits/sec 0 389 KBytes
- [ 4] 27.00-28.00 sec 101 MBytes 849 Mbits/sec 0 389 KBytes
- [ 4] 28.00-29.00 sec 102 MBytes 853 Mbits/sec 0 389 KBytes
- [ 4] 29.00-30.00 sec 101 MBytes 850 Mbits/sec 0 389 KBytes
- [ 4] 30.00-31.00 sec 101 MBytes 851 Mbits/sec 0 389 KBytes
- [ 4] 31.00-32.00 sec 102 MBytes 854 Mbits/sec 0 389 KBytes
- [ 4] 32.00-33.00 sec 101 MBytes 851 Mbits/sec 0 389 KBytes
- [ 4] 33.00-34.00 sec 102 MBytes 855 Mbits/sec 0 389 KBytes
- [ 4] 34.00-35.00 sec 101 MBytes 851 Mbits/sec 0 389 KBytes
- [ 4] 35.00-36.00 sec 101 MBytes 851 Mbits/sec 0 389 KBytes
- [ 4] 36.00-37.00 sec 102 MBytes 852 Mbits/sec 0 389 KBytes
- [ 4] 37.00-38.00 sec 102 MBytes 854 Mbits/sec 0 389 KBytes
- [ 4] 38.00-39.00 sec 102 MBytes 853 Mbits/sec 0 389 KBytes
- [ 4] 39.00-40.00 sec 102 MBytes 852 Mbits/sec 0 389 KBytes
- [ 4] 40.00-41.00 sec 102 MBytes 853 Mbits/sec 0 389 KBytes
- [ 4] 41.00-42.00 sec 101 MBytes 846 Mbits/sec 0 389 KBytes
- [ 4] 42.00-43.00 sec 101 MBytes 847 Mbits/sec 0 389 KBytes
- [ 4] 43.00-44.00 sec 101 MBytes 846 Mbits/sec 0 389 KBytes
- [ 4] 44.00-45.00 sec 103 MBytes 861 Mbits/sec 0 570 KBytes
- [ 4] 45.00-46.00 sec 101 MBytes 843 Mbits/sec 0 570 KBytes
- [ 4] 46.00-47.00 sec 103 MBytes 860 Mbits/sec 0 570 KBytes
- [ 4] 47.00-48.00 sec 101 MBytes 848 Mbits/sec 0 570 KBytes
- [ 4] 48.00-49.00 sec 102 MBytes 853 Mbits/sec 0 570 KBytes
- [ 4] 49.00-50.00 sec 102 MBytes 855 Mbits/sec 0 570 KBytes
- [ 4] 50.00-51.00 sec 102 MBytes 853 Mbits/sec 0 570 KBytes
- [ 4] 51.00-52.00 sec 101 MBytes 851 Mbits/sec 0 570 KBytes
- [ 4] 52.00-53.00 sec 102 MBytes 855 Mbits/sec 0 570 KBytes
- [ 4] 53.00-54.00 sec 103 MBytes 863 Mbits/sec 0 570 KBytes
- [ 4] 54.00-55.00 sec 102 MBytes 853 Mbits/sec 0 570 KBytes
- [ 4] 55.00-56.00 sec 101 MBytes 848 Mbits/sec 0 570 KBytes
- [ 4] 56.00-57.00 sec 102 MBytes 854 Mbits/sec 0 570 KBytes
- [ 4] 57.00-58.00 sec 102 MBytes 855 Mbits/sec 0 570 KBytes
- [ 4] 58.00-59.00 sec 102 MBytes 859 Mbits/sec 0 570 KBytes
- [ 4] 59.00-60.00 sec 102 MBytes 852 Mbits/sec 0 570 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - - -
- [ ID] Interval Transfer Bandwidth Retr
- [ 4] 0.00-60.00 sec 5.89 GBytes 844 Mbits/sec 0 sender
- [ 4] 0.00-60.00 sec 5.89 GBytes 844 Mbits/sec receiver
- iperf Done.
- Connecting to host 192.168.83.61, port 5201
- Reverse mode, remote host 192.168.83.61 is sending
- [ 4] local 192.168.83.63 port 44302 connected to 192.168.83.61 port 5201
- [ ID] Interval Transfer Bandwidth
- [ 4] 0.00-1.00 sec 112 MBytes 941 Mbits/sec
- [ 4] 1.00-2.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 2.00-3.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 3.00-4.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 4.00-5.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 5.00-6.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 6.00-7.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 7.00-8.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 8.00-9.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 9.00-10.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 10.00-11.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 11.00-12.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 12.00-13.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 13.00-14.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 14.00-15.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 15.00-16.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 16.00-17.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 17.00-18.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 18.00-19.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 19.00-20.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 20.00-21.00 sec 78.0 MBytes 654 Mbits/sec
- [ 4] 21.00-22.00 sec 94.8 MBytes 795 Mbits/sec
- [ 4] 22.00-23.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 23.00-24.00 sec 111 MBytes 935 Mbits/sec
- [ 4] 24.00-25.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 25.00-26.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 26.00-27.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 27.00-28.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 28.00-29.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 29.00-30.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 30.00-31.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 31.00-32.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 32.00-33.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 33.00-34.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 34.00-35.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 35.00-36.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 36.00-37.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 37.00-38.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 38.00-39.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 39.00-40.00 sec 112 MBytes 937 Mbits/sec
- [ 4] 40.00-41.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 41.00-42.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 42.00-43.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 43.00-44.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 44.00-45.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 45.00-46.00 sec 93.2 MBytes 782 Mbits/sec
- [ 4] 46.00-47.00 sec 67.3 MBytes 562 Mbits/sec
- [ 4] 47.00-48.00 sec 95.9 MBytes 807 Mbits/sec
- [ 4] 48.00-49.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 49.00-50.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 50.00-51.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 51.00-52.00 sec 112 MBytes 939 Mbits/sec
- [ 4] 52.00-53.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 53.00-54.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 54.00-55.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 55.00-56.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 56.00-57.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 57.00-58.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 58.00-59.00 sec 112 MBytes 940 Mbits/sec
- [ 4] 59.00-60.00 sec 112 MBytes 940 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - - -
- [ ID] Interval Transfer Bandwidth
- [ 4] 0.00-60.00 sec 6.44 GBytes 922 Mbits/sec sender
- [ 4] 0.00-60.00 sec 6.44 GBytes 922 Mbits/sec receiver
- iperf Done.
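
The two iperf3 summary lines above can be cross-checked with a little arithmetic. The sketch below is not part of the original run. It assumes that iperf3 reports transfer sizes in binary units (1 GByte = 1024^3 bytes) and bandwidth in decimal units (1 Mbit = 10^6 bits). The helper name `mbits_per_sec` and the labels are made up for illustration.

```python
# Cross-check the iperf3 summaries: total transfer over 60 s vs. reported bandwidth.
# Assumption: GBytes are binary (1024**3 bytes), Mbits are decimal (10**6 bits).
GIB = 1024 ** 3


def mbits_per_sec(gbytes: float, seconds: float) -> float:
    """Convert a transfer total in GBytes over a duration to Mbits/sec."""
    return gbytes * GIB * 8 / seconds / 1e6


# (transfer in GBytes, duration in s, bandwidth reported by iperf3 in Mbits/sec)
summaries = {
    "board sending (iperf3 -c)": (5.89, 60.0, 844),
    "board receiving (iperf3 -R -c)": (6.44, 60.0, 922),
}

for label, (gbytes, seconds, reported) in summaries.items():
    computed = mbits_per_sec(gbytes, seconds)
    print(f"{label}: computed {computed:.1f} Mbits/sec, reported {reported} Mbits/sec")
```

Both computed values should land within about 1 Mbit/sec of the reported figures. The small gap comes from the transfer totals being rounded to three significant digits in the log.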