| model | size | params | backend | threads | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | -: | ------------: | -------------------: |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | pp64 | 153.97 ± 6.55 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | pp128 | 232.21 ± 0.77 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | pp256 | 272.61 ± 0.27 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | tg64 | 26.45 ± 0.00 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | pp64 | 176.57 ± 0.26 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | pp128 | 265.21 ± 0.28 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | pp256 | 324.88 ± 0.43 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | tg64 | 46.52 ± 0.01 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | pp64 | 156.49 ± 1.24 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | pp128 | 244.99 ± 1.93 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | pp256 | 322.09 ± 3.23 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | tg64 | 70.68 ± 0.21 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 1 | 1 | pp64 | 96.27 ± 3.51 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 1 | 1 | pp128 | 161.08 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 1 | 1 | pp256 | 215.83 ± 0.04 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 1 | 1 | tg64 | 22.52 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 2 | 1 | pp64 | 130.64 ± 0.38 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 2 | 1 | pp128 | 211.95 ± 0.41 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 2 | 1 | pp256 | 281.87 ± 0.31 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 2 | 1 | tg64 | 40.52 ± 0.01 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 4 | 1 | pp64 | 109.06 ± 0.16 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 4 | 1 | pp128 | 190.29 ± 0.76 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 4 | 1 | pp256 | 275.51 ± 0.88 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | BLAS | 4 | 1 | tg64 | 62.50 ± 0.31 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | pp64 | 130.07 ± 2.56 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | pp128 | 130.55 ± 0.07 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | pp256 | 126.01 ± 0.10 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 1 | 1 | tg64 | 40.52 ± 0.46 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | pp64 | 253.15 ± 0.31 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | pp128 | 252.51 ± 0.60 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | pp256 | 243.98 ± 0.19 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 2 | 1 | tg64 | 66.56 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | pp64 | 468.86 ± 0.43 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | pp128 | 454.57 ± 11.13 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | pp256 | 444.80 ± 1.81 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | BLAS | 4 | 1 | tg64 | 77.89 ± 0.33 |

build: b6453c3a (4039)

| model | size | params | backend | threads | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | -: | ------------: | -------------------: |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 62.57 ± 1.23 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp128 | 62.23 ± 0.25 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp256 | 60.40 ± 0.54 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | tg64 | 26.58 ± 0.02 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp64 | 120.63 ± 0.12 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp128 | 119.39 ± 0.09 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp256 | 118.31 ± 0.03 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | tg64 | 46.76 ± 0.03 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp64 | 227.80 ± 1.40 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp128 | 222.03 ± 3.12 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp256 | 225.47 ± 0.76 |
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | tg64 | 70.94 ± 0.41 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 31.05 ± 0.22 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | 1 | pp128 | 31.14 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | 1 | pp256 | 30.86 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 1 | 1 | tg64 | 22.53 ± 0.01 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | 1 | pp64 | 60.70 ± 0.03 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | 1 | pp128 | 60.64 ± 0.01 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | 1 | pp256 | 60.02 ± 0.02 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 2 | 1 | tg64 | 40.74 ± 0.03 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | 1 | pp64 | 116.31 ± 0.43 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | 1 | pp128 | 116.29 ± 0.31 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | 1 | pp256 | 115.24 ± 0.09 |
| llama 1B IQ4_NL - 4.5 bpw | 733.75 MiB | 1.24 B | CPU | 4 | 1 | tg64 | 66.17 ± 0.21 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 135.46 ± 3.95 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp128 | 136.77 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp256 | 131.45 ± 0.24 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | tg64 | 40.74 ± 0.08 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp64 | 266.49 ± 0.17 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp128 | 263.95 ± 2.09 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp256 | 254.62 ± 0.83 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | tg64 | 66.56 ± 0.02 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp64 | 504.90 ± 0.23 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp128 | 495.84 ± 13.62 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp256 | 480.09 ± 3.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | tg64 | 77.70 ± 0.21 |

build: a9e8a9a0 (4033)

| model | size | params | backend | threads | fa | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | -: | ------------: | -------------------: |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 121.95 ± 3.38 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp128 | 122.95 ± 0.14 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp256 | 118.84 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | tg64 | 40.83 ± 0.03 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp64 | 240.60 ± 0.08 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp128 | 239.15 ± 0.04 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp256 | 230.74 ± 0.13 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 2 | 1 | tg64 | 66.48 ± 0.05 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp64 | 452.19 ± 5.66 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp128 | 453.66 ± 1.15 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp256 | 440.32 ± 0.05 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 4 | 1 | tg64 | 78.13 ± 0.71 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 108.30 ± 1.95 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp128 | 108.88 ± 0.03 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp256 | 105.11 ± 0.65 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 1 | 1 | tg64 | 34.85 ± 0.03 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp64 | 212.50 ± 0.52 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp128 | 211.68 ± 0.12 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | 1 | pp256 | 205.00 ± 0.07 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 2 | 1 | tg64 | 62.39 ± 0.06 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp64 | 404.10 ± 0.13 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp128 | 399.49 ± 4.20 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | 1 | pp256 | 388.26 ± 3.66 |
| llama 1B IQ4_NL_4_4 - 4.5 bpw | 727.75 MiB | 1.24 B | CPU | 4 | 1 | tg64 | 76.75 ± 1.06 |

build: 32e0862a (4037)
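
The rows above can be compared programmatically rather than by eye. What follows is a minimal sketch in Python, assuming the pipe-delimited eight-column layout shown in the tables; the names `parse_rows`, `lookup`, and `SAMPLE` are hypothetical helpers for illustration and are not part of any benchmark tool. The two sample rows are copied from the build a9e8a9a0 (4033) table.

```python
def parse_rows(text):
    """Parse pipe-delimited benchmark rows into dicts.

    Header rows, separator rows, and non-table lines (e.g. "build: ...")
    are skipped. A leading "- " list marker is tolerated.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip().lstrip("-").strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Expect exactly 8 columns; skip the header and the dashed separator.
        if len(cells) != 8 or cells[0] == "model" or set(cells[0]) <= {"-"}:
            continue
        mean, _, std = cells[7].partition("±")
        rows.append({
            "model": cells[0],
            "backend": cells[3],
            "threads": int(cells[4]),
            "test": cells[6],
            "tps": float(mean),   # mean tokens per second
            "std": float(std),    # reported spread
        })
    return rows


def lookup(rows, model, threads, test):
    """Return the mean t/s of the first row matching model, threads, and test."""
    for r in rows:
        if (r["model"], r["threads"], r["test"]) == (model, threads, test):
            return r["tps"]
    raise KeyError((model, threads, test))


# Two rows copied from the CPU table for build a9e8a9a0 (4033).
SAMPLE = """
| llama 1B Q4_0 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 62.57 ± 1.23 |
| llama 1B Q4_0_4_4 | 727.75 MiB | 1.24 B | CPU | 1 | 1 | pp64 | 135.46 ± 3.95 |
build: a9e8a9a0 (4033)
"""

rows = parse_rows(SAMPLE)
base = lookup(rows, "llama 1B Q4_0", 1, "pp64")
fast = lookup(rows, "llama 1B Q4_0_4_4", 1, "pp64")
ratio = fast / base
print(f"Q4_0_4_4 vs Q4_0, pp64, 1 thread: {ratio:.2f}x")
```

Since each table comes from a different build, ratios are most meaningful between rows of the same table; comparing across tables mixes build differences with format or backend differences.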