shupeif

llamacpp-orangepi

Nov 7th, 2024
| model | size | params | backend | threads | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | pp32 | 30.90 ± 0.61 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | pp64 | 31.02 ± 0.45 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | pp128 | 30.57 ± 0.11 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | tg32 | 9.44 ± 0.01 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | pp32 | 61.34 ± 0.91 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | pp64 | 62.83 ± 0.01 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | pp128 | 61.99 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | tg32 | 17.86 ± 0.01 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | pp32 | 90.36 ± 1.40 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | pp64 | 92.84 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | pp128 | 91.63 ± 0.01 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | tg32 | 22.50 ± 0.05 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp32 | 116.23 ± 0.02 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp64 | 118.64 ± 0.17 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp128 | 117.15 ± 0.15 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | tg32 | 23.16 ± 0.03 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | pp32 | 26.66 ± 0.11 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | pp64 | 26.72 ± 0.08 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | pp128 | 26.56 ± 0.01 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | tg32 | 8.23 ± 0.14 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | pp32 | 54.21 ± 0.01 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | pp64 | 54.60 ± 0.02 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | pp128 | 53.84 ± 0.00 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | tg32 | 15.97 ± 0.08 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 3 | pp32 | 80.30 ± 0.07 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 3 | pp64 | 80.96 ± 0.06 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 3 | pp128 | 79.27 ± 0.02 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 3 | tg32 | 22.13 ± 0.13 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | pp32 | 104.02 ± 0.23 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | pp64 | 105.38 ± 0.04 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | pp128 | 103.86 ± 0.04 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | tg32 | 22.92 ± 0.04 |

build: 32e0862a (4037)

| model | size | params | backend | threads | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | pp32 | 11.52 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | pp64 | 11.86 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | pp128 | 12.11 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | tg32 | 5.81 ± 0.02 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | pp32 | 22.85 ± 0.05 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | pp64 | 23.45 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | pp128 | 23.91 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | tg32 | 11.15 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 3 | pp32 | 33.93 ± 0.12 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 3 | pp64 | 34.54 ± 0.02 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 3 | pp128 | 35.39 ± 0.01 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 3 | tg32 | 16.00 ± 0.12 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | pp32 | 43.46 ± 0.31 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | pp64 | 44.95 ± 0.07 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | pp128 | 46.19 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | tg32 | 20.43 ± 0.05 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | pp32 | 8.70 ± 0.00 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | pp64 | 8.72 ± 0.00 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | pp128 | 8.68 ± 0.00 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | tg32 | 5.81 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | pp32 | 17.42 ± 0.01 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | pp64 | 17.52 ± 0.00 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | pp128 | 17.41 ± 0.00 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | tg32 | 11.15 ± 0.05 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 3 | pp32 | 25.68 ± 0.03 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 3 | pp64 | 25.95 ± 0.01 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 3 | pp128 | 25.83 ± 0.00 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 3 | tg32 | 16.09 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | pp32 | 33.72 ± 0.05 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | pp64 | 34.11 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | pp128 | 33.92 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | tg32 | 20.17 ± 0.34 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | pp32 | 36.64 ± 0.18 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | pp64 | 36.66 ± 0.08 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | pp128 | 36.46 ± 0.01 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | tg32 | 9.60 ± 0.10 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | pp32 | 71.99 ± 3.23 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | pp64 | 72.51 ± 2.17 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | pp128 | 70.95 ± 0.81 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | tg32 | 18.14 ± 0.06 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | pp32 | 110.57 ± 3.03 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | pp64 | 114.50 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | pp128 | 112.56 ± 0.02 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 3 | tg32 | 22.41 ± 0.07 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp32 | 146.28 ± 0.74 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp64 | 149.91 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp128 | 146.72 ± 0.05 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | tg32 | 22.43 ± 1.04 |

build: a9e8a9a0 (4033)
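
The tables above can be compared programmatically. The following is only a rough sketch, not part of the original benchmark run: `parse_rows` and `speedup` are hypothetical helper names, and `SAMPLE` copies four rows from the build a9e8a9a0 table. It assumes all rows passed in come from a single build, since the same model name appears under both builds.

```python
import re

# Four rows copied from the build a9e8a9a0 (4033) table above.
SAMPLE = """\
| model | size | params | backend | threads | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | pp128 | 46.19 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | tg32 | 20.43 ± 0.05 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | pp128 | 146.72 ± 0.05 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | tg32 | 22.43 ± 1.04 |
"""


def parse_rows(text):
    """Parse llama-bench style markdown table rows into dicts (hypothetical helper)."""
    rows, header = [], None
    for line in text.splitlines():
        start = line.find("|")
        if start < 0:
            continue  # not a table line (blank line, "build: ..." and so on)
        # Anything before the first "|" (such as a pasted line number) is ignored.
        cells = [c.strip() for c in line[start:].strip().strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if cells == header or all(re.fullmatch(r":?-+:?", c) for c in cells):
            continue  # repeated header or alignment row
        row = dict(zip(header, cells))
        mean, _, std = row["t/s"].partition("±")
        row["threads"] = int(row["threads"])
        row["mean"] = float(mean)
        row["std"] = float(std)
        rows.append(row)
    return rows


def speedup(rows, baseline, candidate):
    """Return {(threads, test): candidate t/s divided by baseline t/s}."""
    mean = {(r["model"], r["threads"], r["test"]): r["mean"] for r in rows}
    return {
        (threads, test): mean[(candidate, threads, test)] / value
        for (model, threads, test), value in mean.items()
        if model == baseline and (candidate, threads, test) in mean
    }


rows = parse_rows(SAMPLE)
ratios = speedup(rows, "llama 1B Q4_0", "llama 1B Q4_0_4_4")
for (threads, test), ratio in sorted(ratios.items()):
    print(f"{threads} threads {test}: {ratio:.2f}x")
```

On these four rows the ratio works out to roughly 3.2x for pp128 and 1.1x for tg32, which matches the pattern visible in the table: the Q4_0_4_4 layout helps prompt processing far more than token generation.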