Untitled

a guest
Jul 9th, 2025
  1. Tensor blk.2.ffn_up.weight buffer type overriden to CUDA0
  2. Tensor blk.3.attn_norm.weight buffer type overriden to CUDA0
  3. Tensor blk.3.attn_q_a_norm.weight buffer type overriden to CUDA0
  4. Tensor blk.3.attn_kv_a_norm.weight buffer type overriden to CUDA0
  5. Tensor blk.3.attn_q_a.weight buffer type overriden to CUDA0
  6. Tensor blk.3.attn_q_b.weight buffer type overriden to CUDA0
  7. Tensor blk.3.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  8. Tensor blk.3.attn_kv_b.weight buffer type overriden to CUDA0
  9. Tensor blk.3.attn_k_b.weight buffer type overriden to CUDA0
  10. Tensor blk.3.attn_v_b.weight buffer type overriden to CUDA0
  11. Tensor blk.3.attn_output.weight buffer type overriden to CUDA0
  12. Tensor blk.3.ffn_norm.weight buffer type overriden to CUDA0
  13. Tensor blk.3.ffn_gate_inp.weight buffer type overriden to CUDA0
  14. Tensor blk.3.ffn_gate_exps.weight buffer type overriden to CUDA0
  15. Tensor blk.3.ffn_down_exps.weight buffer type overriden to CUDA0
  16. Tensor blk.3.ffn_up_exps.weight buffer type overriden to CUDA0
  17. Tensor blk.3.ffn_gate_shexp.weight buffer type overriden to CUDA0
  18. Tensor blk.3.ffn_down_shexp.weight buffer type overriden to CUDA0
  19. Tensor blk.3.ffn_up_shexp.weight buffer type overriden to CUDA0
  20. Tensor blk.4.attn_norm.weight buffer type overriden to CUDA0
  21. Tensor blk.4.attn_q_a_norm.weight buffer type overriden to CUDA0
  22. Tensor blk.4.attn_kv_a_norm.weight buffer type overriden to CUDA0
  23. Tensor blk.4.attn_q_a.weight buffer type overriden to CUDA0
  24. Tensor blk.4.attn_q_b.weight buffer type overriden to CUDA0
  25. Tensor blk.4.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  26. Tensor blk.4.attn_kv_b.weight buffer type overriden to CUDA0
  27. Tensor blk.4.attn_k_b.weight buffer type overriden to CUDA0
  28. Tensor blk.4.attn_v_b.weight buffer type overriden to CUDA0
  29. Tensor blk.4.attn_output.weight buffer type overriden to CUDA0
  30. Tensor blk.4.ffn_norm.weight buffer type overriden to CUDA0
  31. Tensor blk.4.ffn_gate_inp.weight buffer type overriden to CUDA0
  32. Tensor blk.4.ffn_gate_exps.weight buffer type overriden to CUDA0
  33. Tensor blk.4.ffn_down_exps.weight buffer type overriden to CUDA0
  34. Tensor blk.4.ffn_up_exps.weight buffer type overriden to CUDA0
  35. Tensor blk.4.ffn_gate_shexp.weight buffer type overriden to CUDA0
  36. Tensor blk.4.ffn_down_shexp.weight buffer type overriden to CUDA0
  37. Tensor blk.4.ffn_up_shexp.weight buffer type overriden to CUDA0
  38. Tensor blk.5.attn_norm.weight buffer type overriden to CUDA0
  39. Tensor blk.5.attn_q_a_norm.weight buffer type overriden to CUDA0
  40. Tensor blk.5.attn_kv_a_norm.weight buffer type overriden to CUDA0
  41. Tensor blk.5.attn_q_a.weight buffer type overriden to CUDA0
  42. Tensor blk.5.attn_q_b.weight buffer type overriden to CUDA0
  43. Tensor blk.5.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  44. Tensor blk.5.attn_kv_b.weight buffer type overriden to CUDA0
  45. Tensor blk.5.attn_k_b.weight buffer type overriden to CUDA0
  46. Tensor blk.5.attn_v_b.weight buffer type overriden to CUDA0
  47. Tensor blk.5.attn_output.weight buffer type overriden to CUDA0
  48. Tensor blk.5.ffn_norm.weight buffer type overriden to CUDA0
  49. Tensor blk.5.ffn_gate_inp.weight buffer type overriden to CUDA0
  50. Tensor blk.5.ffn_gate_exps.weight buffer type overriden to CUDA0
  51. Tensor blk.5.ffn_down_exps.weight buffer type overriden to CUDA0
  52. Tensor blk.5.ffn_up_exps.weight buffer type overriden to CUDA0
  53. Tensor blk.5.ffn_gate_shexp.weight buffer type overriden to CUDA0
  54. Tensor blk.5.ffn_down_shexp.weight buffer type overriden to CUDA0
  55. Tensor blk.5.ffn_up_shexp.weight buffer type overriden to CUDA0
  56. Tensor blk.6.attn_norm.weight buffer type overriden to CUDA0
  57. Tensor blk.6.attn_q_a_norm.weight buffer type overriden to CUDA0
  58. Tensor blk.6.attn_kv_a_norm.weight buffer type overriden to CUDA0
  59. Tensor blk.6.attn_q_a.weight buffer type overriden to CUDA0
  60. Tensor blk.6.attn_q_b.weight buffer type overriden to CUDA0
  61. Tensor blk.6.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  62. Tensor blk.6.attn_kv_b.weight buffer type overriden to CUDA0
  63. Tensor blk.6.attn_k_b.weight buffer type overriden to CUDA0
  64. Tensor blk.6.attn_v_b.weight buffer type overriden to CUDA0
  65. Tensor blk.6.attn_output.weight buffer type overriden to CUDA0
  66. Tensor blk.6.ffn_norm.weight buffer type overriden to CUDA1
  67. Tensor blk.6.ffn_gate_inp.weight buffer type overriden to CUDA1
  68. Tensor blk.6.ffn_gate_exps.weight buffer type overriden to CUDA1
  69. Tensor blk.6.ffn_down_exps.weight buffer type overriden to CUDA1
  70. Tensor blk.6.ffn_up_exps.weight buffer type overriden to CUDA1
  71. Tensor blk.6.ffn_gate_shexp.weight buffer type overriden to CUDA0
  72. Tensor blk.6.ffn_down_shexp.weight buffer type overriden to CUDA0
  73. Tensor blk.6.ffn_up_shexp.weight buffer type overriden to CUDA0
  74. Tensor blk.7.attn_norm.weight buffer type overriden to CUDA0
  75. Tensor blk.7.attn_q_a_norm.weight buffer type overriden to CUDA0
  76. Tensor blk.7.attn_kv_a_norm.weight buffer type overriden to CUDA0
  77. Tensor blk.7.attn_q_a.weight buffer type overriden to CUDA0
  78. Tensor blk.7.attn_q_b.weight buffer type overriden to CUDA0
  79. Tensor blk.7.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  80. Tensor blk.7.attn_kv_b.weight buffer type overriden to CUDA0
  81. Tensor blk.7.attn_k_b.weight buffer type overriden to CUDA0
  82. Tensor blk.7.attn_v_b.weight buffer type overriden to CUDA0
  83. Tensor blk.7.attn_output.weight buffer type overriden to CUDA0
  84. Tensor blk.7.ffn_norm.weight buffer type overriden to CUDA1
  85. Tensor blk.7.ffn_gate_inp.weight buffer type overriden to CUDA1
  86. Tensor blk.7.ffn_gate_exps.weight buffer type overriden to CUDA1
  87. Tensor blk.7.ffn_down_exps.weight buffer type overriden to CUDA1
  88. Tensor blk.7.ffn_up_exps.weight buffer type overriden to CUDA1
  89. Tensor blk.7.ffn_gate_shexp.weight buffer type overriden to CUDA0
  90. Tensor blk.7.ffn_down_shexp.weight buffer type overriden to CUDA0
  91. Tensor blk.7.ffn_up_shexp.weight buffer type overriden to CUDA0
  92. Tensor blk.8.attn_norm.weight buffer type overriden to CUDA0
  93. Tensor blk.8.attn_q_a_norm.weight buffer type overriden to CUDA0
  94. Tensor blk.8.attn_kv_a_norm.weight buffer type overriden to CUDA0
  95. Tensor blk.8.attn_q_a.weight buffer type overriden to CUDA0
  96. Tensor blk.8.attn_q_b.weight buffer type overriden to CUDA0
  97. Tensor blk.8.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  98. Tensor blk.8.attn_kv_b.weight buffer type overriden to CUDA0
  99. Tensor blk.8.attn_k_b.weight buffer type overriden to CUDA0
  100. Tensor blk.8.attn_v_b.weight buffer type overriden to CUDA0
  101. Tensor blk.8.attn_output.weight buffer type overriden to CUDA0
  102. Tensor blk.8.ffn_norm.weight buffer type overriden to CUDA1
  103. Tensor blk.8.ffn_gate_inp.weight buffer type overriden to CUDA1
  104. Tensor blk.8.ffn_gate_exps.weight buffer type overriden to CUDA1
  105. Tensor blk.8.ffn_down_exps.weight buffer type overriden to CUDA1
  106. Tensor blk.8.ffn_up_exps.weight buffer type overriden to CUDA1
  107. Tensor blk.8.ffn_gate_shexp.weight buffer type overriden to CUDA0
  108. Tensor blk.8.ffn_down_shexp.weight buffer type overriden to CUDA0
  109. Tensor blk.8.ffn_up_shexp.weight buffer type overriden to CUDA0
  110. Tensor blk.9.attn_norm.weight buffer type overriden to CUDA0
  111. Tensor blk.9.attn_q_a_norm.weight buffer type overriden to CUDA0
  112. Tensor blk.9.attn_kv_a_norm.weight buffer type overriden to CUDA0
  113. Tensor blk.9.attn_q_a.weight buffer type overriden to CUDA0
  114. Tensor blk.9.attn_q_b.weight buffer type overriden to CUDA0
  115. Tensor blk.9.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  116. Tensor blk.9.attn_kv_b.weight buffer type overriden to CUDA0
  117. Tensor blk.9.attn_k_b.weight buffer type overriden to CUDA0
  118. Tensor blk.9.attn_v_b.weight buffer type overriden to CUDA0
  119. Tensor blk.9.attn_output.weight buffer type overriden to CUDA0
  120. Tensor blk.9.ffn_norm.weight buffer type overriden to CUDA1
  121. Tensor blk.9.ffn_gate_inp.weight buffer type overriden to CUDA1
  122. Tensor blk.9.ffn_gate_exps.weight buffer type overriden to CUDA1
  123. Tensor blk.9.ffn_down_exps.weight buffer type overriden to CUDA1
  124. Tensor blk.9.ffn_up_exps.weight buffer type overriden to CUDA1
  125. Tensor blk.9.ffn_gate_shexp.weight buffer type overriden to CUDA0
  126. Tensor blk.9.ffn_down_shexp.weight buffer type overriden to CUDA0
  127. Tensor blk.9.ffn_up_shexp.weight buffer type overriden to CUDA0
  128. Tensor blk.10.attn_norm.weight buffer type overriden to CUDA0
  129. Tensor blk.10.attn_q_a_norm.weight buffer type overriden to CUDA0
  130. Tensor blk.10.attn_kv_a_norm.weight buffer type overriden to CUDA0
  131. Tensor blk.10.attn_q_a.weight buffer type overriden to CUDA0
  132. Tensor blk.10.attn_q_b.weight buffer type overriden to CUDA0
  133. Tensor blk.10.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  134. Tensor blk.10.attn_kv_b.weight buffer type overriden to CUDA0
  135. Tensor blk.10.attn_k_b.weight buffer type overriden to CUDA0
  136. Tensor blk.10.attn_v_b.weight buffer type overriden to CUDA0
  137. Tensor blk.10.attn_output.weight buffer type overriden to CUDA0
  138. Tensor blk.10.ffn_norm.weight buffer type overriden to CUDA2
  139. Tensor blk.10.ffn_gate_inp.weight buffer type overriden to CUDA2
  140. Tensor blk.10.ffn_gate_exps.weight buffer type overriden to CUDA2
  141. Tensor blk.10.ffn_down_exps.weight buffer type overriden to CUDA2
  142. Tensor blk.10.ffn_up_exps.weight buffer type overriden to CUDA2
  143. Tensor blk.10.ffn_gate_shexp.weight buffer type overriden to CUDA0
  144. Tensor blk.10.ffn_down_shexp.weight buffer type overriden to CUDA0
  145. Tensor blk.10.ffn_up_shexp.weight buffer type overriden to CUDA0
  146. Tensor blk.11.attn_norm.weight buffer type overriden to CUDA0
  147. Tensor blk.11.attn_q_a_norm.weight buffer type overriden to CUDA0
  148. Tensor blk.11.attn_kv_a_norm.weight buffer type overriden to CUDA0
  149. Tensor blk.11.attn_q_a.weight buffer type overriden to CUDA0
  150. Tensor blk.11.attn_q_b.weight buffer type overriden to CUDA0
  151. Tensor blk.11.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  152. Tensor blk.11.attn_kv_b.weight buffer type overriden to CUDA0
  153. Tensor blk.11.attn_k_b.weight buffer type overriden to CUDA0
  154. Tensor blk.11.attn_v_b.weight buffer type overriden to CUDA0
  155. Tensor blk.11.attn_output.weight buffer type overriden to CUDA0
  156. Tensor blk.11.ffn_norm.weight buffer type overriden to CUDA2
  157. Tensor blk.11.ffn_gate_inp.weight buffer type overriden to CUDA2
  158. Tensor blk.11.ffn_gate_exps.weight buffer type overriden to CUDA2
  159. Tensor blk.11.ffn_down_exps.weight buffer type overriden to CUDA2
  160. Tensor blk.11.ffn_up_exps.weight buffer type overriden to CUDA2
  161. Tensor blk.11.ffn_gate_shexp.weight buffer type overriden to CUDA0
  162. Tensor blk.11.ffn_down_shexp.weight buffer type overriden to CUDA0
  163. Tensor blk.11.ffn_up_shexp.weight buffer type overriden to CUDA0
  164. Tensor blk.12.attn_norm.weight buffer type overriden to CUDA0
  165. Tensor blk.12.attn_q_a_norm.weight buffer type overriden to CUDA0
  166. Tensor blk.12.attn_kv_a_norm.weight buffer type overriden to CUDA0
  167. Tensor blk.12.attn_q_a.weight buffer type overriden to CUDA0
  168. Tensor blk.12.attn_q_b.weight buffer type overriden to CUDA0
  169. Tensor blk.12.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  170. Tensor blk.12.attn_kv_b.weight buffer type overriden to CUDA0
  171. Tensor blk.12.attn_k_b.weight buffer type overriden to CUDA0
  172. Tensor blk.12.attn_v_b.weight buffer type overriden to CUDA0
  173. Tensor blk.12.attn_output.weight buffer type overriden to CUDA0
  174. Tensor blk.12.ffn_norm.weight buffer type overriden to CUDA2
  175. Tensor blk.12.ffn_gate_inp.weight buffer type overriden to CUDA2
  176. Tensor blk.12.ffn_gate_exps.weight buffer type overriden to CUDA2
  177. Tensor blk.12.ffn_down_exps.weight buffer type overriden to CUDA2
  178. Tensor blk.12.ffn_up_exps.weight buffer type overriden to CUDA2
  179. Tensor blk.12.ffn_gate_shexp.weight buffer type overriden to CUDA0
  180. Tensor blk.12.ffn_down_shexp.weight buffer type overriden to CUDA0
  181. Tensor blk.12.ffn_up_shexp.weight buffer type overriden to CUDA0
  182. Tensor blk.13.attn_norm.weight buffer type overriden to CUDA0
  183. Tensor blk.13.attn_q_a_norm.weight buffer type overriden to CUDA0
  184. Tensor blk.13.attn_kv_a_norm.weight buffer type overriden to CUDA0
  185. Tensor blk.13.attn_q_a.weight buffer type overriden to CUDA0
  186. Tensor blk.13.attn_q_b.weight buffer type overriden to CUDA0
  187. Tensor blk.13.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  188. Tensor blk.13.attn_kv_b.weight buffer type overriden to CUDA0
  189. Tensor blk.13.attn_k_b.weight buffer type overriden to CUDA0
  190. Tensor blk.13.attn_v_b.weight buffer type overriden to CUDA0
  191. Tensor blk.13.attn_output.weight buffer type overriden to CUDA0
  192. Tensor blk.13.ffn_norm.weight buffer type overriden to CUDA2
  193. Tensor blk.13.ffn_gate_inp.weight buffer type overriden to CUDA2
  194. Tensor blk.13.ffn_gate_exps.weight buffer type overriden to CUDA2
  195. Tensor blk.13.ffn_down_exps.weight buffer type overriden to CUDA2
  196. Tensor blk.13.ffn_up_exps.weight buffer type overriden to CUDA2
  197. Tensor blk.13.ffn_gate_shexp.weight buffer type overriden to CUDA0
  198. Tensor blk.13.ffn_down_shexp.weight buffer type overriden to CUDA0
  199. Tensor blk.13.ffn_up_shexp.weight buffer type overriden to CUDA0
  200. Tensor blk.14.attn_norm.weight buffer type overriden to CUDA0
  201. Tensor blk.14.attn_q_a_norm.weight buffer type overriden to CUDA0
  202. Tensor blk.14.attn_kv_a_norm.weight buffer type overriden to CUDA0
  203. Tensor blk.14.attn_q_a.weight buffer type overriden to CUDA0
  204. Tensor blk.14.attn_q_b.weight buffer type overriden to CUDA0
  205. Tensor blk.14.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  206. Tensor blk.14.attn_kv_b.weight buffer type overriden to CUDA0
  207. Tensor blk.14.attn_k_b.weight buffer type overriden to CUDA0
  208. Tensor blk.14.attn_v_b.weight buffer type overriden to CUDA0
  209. Tensor blk.14.attn_output.weight buffer type overriden to CUDA0
  210. Tensor blk.14.ffn_norm.weight buffer type overriden to CUDA3
  211. Tensor blk.14.ffn_gate_inp.weight buffer type overriden to CUDA3
  212. Tensor blk.14.ffn_gate_exps.weight buffer type overriden to CUDA3
  213. Tensor blk.14.ffn_down_exps.weight buffer type overriden to CUDA3
  214. Tensor blk.14.ffn_up_exps.weight buffer type overriden to CUDA3
  215. Tensor blk.14.ffn_gate_shexp.weight buffer type overriden to CUDA0
  216. Tensor blk.14.ffn_down_shexp.weight buffer type overriden to CUDA0
  217. Tensor blk.14.ffn_up_shexp.weight buffer type overriden to CUDA0
  218. Tensor blk.15.attn_norm.weight buffer type overriden to CUDA0
  219. Tensor blk.15.attn_q_a_norm.weight buffer type overriden to CUDA0
  220. Tensor blk.15.attn_kv_a_norm.weight buffer type overriden to CUDA0
  221. Tensor blk.15.attn_q_a.weight buffer type overriden to CUDA0
  222. Tensor blk.15.attn_q_b.weight buffer type overriden to CUDA0
  223. Tensor blk.15.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  224. Tensor blk.15.attn_kv_b.weight buffer type overriden to CUDA0
  225. Tensor blk.15.attn_k_b.weight buffer type overriden to CUDA0
  226. Tensor blk.15.attn_v_b.weight buffer type overriden to CUDA0
  227. Tensor blk.15.attn_output.weight buffer type overriden to CUDA0
  228. Tensor blk.15.ffn_norm.weight buffer type overriden to CUDA3
  229. Tensor blk.15.ffn_gate_inp.weight buffer type overriden to CUDA3
  230. Tensor blk.15.ffn_gate_exps.weight buffer type overriden to CUDA3
  231. Tensor blk.15.ffn_down_exps.weight buffer type overriden to CUDA3
  232. Tensor blk.15.ffn_up_exps.weight buffer type overriden to CUDA3
  233. Tensor blk.15.ffn_gate_shexp.weight buffer type overriden to CUDA0
  234. Tensor blk.15.ffn_down_shexp.weight buffer type overriden to CUDA0
  235. Tensor blk.15.ffn_up_shexp.weight buffer type overriden to CUDA0
  236. Tensor blk.16.attn_norm.weight buffer type overriden to CUDA0
  237. Tensor blk.16.attn_q_a_norm.weight buffer type overriden to CUDA0
  238. Tensor blk.16.attn_kv_a_norm.weight buffer type overriden to CUDA0
  239. Tensor blk.16.attn_q_a.weight buffer type overriden to CUDA0
  240. Tensor blk.16.attn_q_b.weight buffer type overriden to CUDA0
  241. Tensor blk.16.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  242. Tensor blk.16.attn_kv_b.weight buffer type overriden to CUDA0
  243. Tensor blk.16.attn_k_b.weight buffer type overriden to CUDA0
  244. Tensor blk.16.attn_v_b.weight buffer type overriden to CUDA0
  245. Tensor blk.16.attn_output.weight buffer type overriden to CUDA0
  246. Tensor blk.16.ffn_norm.weight buffer type overriden to CUDA3
  247. Tensor blk.16.ffn_gate_inp.weight buffer type overriden to CUDA3
  248. Tensor blk.16.ffn_gate_exps.weight buffer type overriden to CUDA3
  249. Tensor blk.16.ffn_down_exps.weight buffer type overriden to CUDA3
  250. Tensor blk.16.ffn_up_exps.weight buffer type overriden to CUDA3
  251. Tensor blk.16.ffn_gate_shexp.weight buffer type overriden to CUDA0
  252. Tensor blk.16.ffn_down_shexp.weight buffer type overriden to CUDA0
  253. Tensor blk.16.ffn_up_shexp.weight buffer type overriden to CUDA0
  254. Tensor blk.17.attn_norm.weight buffer type overriden to CUDA0
  255. Tensor blk.17.attn_q_a_norm.weight buffer type overriden to CUDA0
  256. Tensor blk.17.attn_kv_a_norm.weight buffer type overriden to CUDA0
  257. Tensor blk.17.attn_q_a.weight buffer type overriden to CUDA0
  258. Tensor blk.17.attn_q_b.weight buffer type overriden to CUDA0
  259. Tensor blk.17.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  260. Tensor blk.17.attn_kv_b.weight buffer type overriden to CUDA0
  261. Tensor blk.17.attn_k_b.weight buffer type overriden to CUDA0
  262. Tensor blk.17.attn_v_b.weight buffer type overriden to CUDA0
  263. Tensor blk.17.attn_output.weight buffer type overriden to CUDA0
  264. Tensor blk.17.ffn_norm.weight buffer type overriden to CUDA3
  265. Tensor blk.17.ffn_gate_inp.weight buffer type overriden to CUDA3
  266. Tensor blk.17.ffn_gate_exps.weight buffer type overriden to CUDA3
  267. Tensor blk.17.ffn_down_exps.weight buffer type overriden to CUDA3
  268. Tensor blk.17.ffn_up_exps.weight buffer type overriden to CUDA3
  269. Tensor blk.17.ffn_gate_shexp.weight buffer type overriden to CUDA0
  270. Tensor blk.17.ffn_down_shexp.weight buffer type overriden to CUDA0
  271. Tensor blk.17.ffn_up_shexp.weight buffer type overriden to CUDA0
  272. Tensor blk.18.attn_norm.weight buffer type overriden to CUDA0
  273. Tensor blk.18.attn_q_a_norm.weight buffer type overriden to CUDA0
  274. Tensor blk.18.attn_kv_a_norm.weight buffer type overriden to CUDA0
  275. Tensor blk.18.attn_q_a.weight buffer type overriden to CUDA0
  276. Tensor blk.18.attn_q_b.weight buffer type overriden to CUDA0
  277. Tensor blk.18.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  278. Tensor blk.18.attn_kv_b.weight buffer type overriden to CUDA0
  279. Tensor blk.18.attn_k_b.weight buffer type overriden to CUDA0
  280. Tensor blk.18.attn_v_b.weight buffer type overriden to CUDA0
  281. Tensor blk.18.attn_output.weight buffer type overriden to CUDA0
  282. Tensor blk.18.ffn_norm.weight buffer type overriden to CUDA3
  283. Tensor blk.18.ffn_gate_inp.weight buffer type overriden to CUDA3
  284. Tensor blk.18.ffn_gate_exps.weight buffer type overriden to CUDA3
  285. Tensor blk.18.ffn_down_exps.weight buffer type overriden to CUDA3
  286. Tensor blk.18.ffn_up_exps.weight buffer type overriden to CUDA3
  287. Tensor blk.18.ffn_gate_shexp.weight buffer type overriden to CUDA0
  288. Tensor blk.18.ffn_down_shexp.weight buffer type overriden to CUDA0
  289. Tensor blk.18.ffn_up_shexp.weight buffer type overriden to CUDA0
  290. Tensor blk.19.attn_norm.weight buffer type overriden to CUDA0
  291. Tensor blk.19.attn_q_a_norm.weight buffer type overriden to CUDA0
  292. Tensor blk.19.attn_kv_a_norm.weight buffer type overriden to CUDA0
  293. Tensor blk.19.attn_q_a.weight buffer type overriden to CUDA0
  294. Tensor blk.19.attn_q_b.weight buffer type overriden to CUDA0
  295. Tensor blk.19.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  296. Tensor blk.19.attn_kv_b.weight buffer type overriden to CUDA0
  297. Tensor blk.19.attn_k_b.weight buffer type overriden to CUDA0
  298. Tensor blk.19.attn_v_b.weight buffer type overriden to CUDA0
  299. Tensor blk.19.attn_output.weight buffer type overriden to CUDA0
  300. Tensor blk.19.ffn_norm.weight buffer type overriden to CUDA4
  301. Tensor blk.19.ffn_gate_inp.weight buffer type overriden to CUDA4
  302. Tensor blk.19.ffn_gate_exps.weight buffer type overriden to CUDA4
  303. Tensor blk.19.ffn_down_exps.weight buffer type overriden to CUDA4
  304. Tensor blk.19.ffn_up_exps.weight buffer type overriden to CUDA4
  305. Tensor blk.19.ffn_gate_shexp.weight buffer type overriden to CUDA0
  306. Tensor blk.19.ffn_down_shexp.weight buffer type overriden to CUDA0
  307. Tensor blk.19.ffn_up_shexp.weight buffer type overriden to CUDA0
  308. Tensor blk.20.attn_norm.weight buffer type overriden to CUDA0
  309. Tensor blk.20.attn_q_a_norm.weight buffer type overriden to CUDA0
  310. Tensor blk.20.attn_kv_a_norm.weight buffer type overriden to CUDA0
  311. Tensor blk.20.attn_q_a.weight buffer type overriden to CUDA0
  312. Tensor blk.20.attn_q_b.weight buffer type overriden to CUDA0
  313. Tensor blk.20.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  314. Tensor blk.20.attn_kv_b.weight buffer type overriden to CUDA0
  315. Tensor blk.20.attn_k_b.weight buffer type overriden to CUDA0
  316. Tensor blk.20.attn_v_b.weight buffer type overriden to CUDA0
  317. Tensor blk.20.attn_output.weight buffer type overriden to CUDA0
  318. Tensor blk.20.ffn_norm.weight buffer type overriden to CUDA4
  319. Tensor blk.20.ffn_gate_inp.weight buffer type overriden to CUDA4
  320. Tensor blk.20.ffn_gate_exps.weight buffer type overriden to CUDA4
  321. Tensor blk.20.ffn_down_exps.weight buffer type overriden to CUDA4
  322. Tensor blk.20.ffn_up_exps.weight buffer type overriden to CUDA4
  323. Tensor blk.20.ffn_gate_shexp.weight buffer type overriden to CUDA0
  324. Tensor blk.20.ffn_down_shexp.weight buffer type overriden to CUDA0
  325. Tensor blk.20.ffn_up_shexp.weight buffer type overriden to CUDA0
  326. Tensor blk.21.attn_norm.weight buffer type overriden to CUDA0
  327. Tensor blk.21.attn_q_a_norm.weight buffer type overriden to CUDA0
  328. Tensor blk.21.attn_kv_a_norm.weight buffer type overriden to CUDA0
  329. Tensor blk.21.attn_q_a.weight buffer type overriden to CUDA0
  330. Tensor blk.21.attn_q_b.weight buffer type overriden to CUDA0
  331. Tensor blk.21.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  332. Tensor blk.21.attn_kv_b.weight buffer type overriden to CUDA0
  333. Tensor blk.21.attn_k_b.weight buffer type overriden to CUDA0
  334. Tensor blk.21.attn_v_b.weight buffer type overriden to CUDA0
  335. Tensor blk.21.attn_output.weight buffer type overriden to CUDA0
  336. Tensor blk.21.ffn_norm.weight buffer type overriden to CUDA4
  337. Tensor blk.21.ffn_gate_inp.weight buffer type overriden to CUDA4
  338. Tensor blk.21.ffn_gate_exps.weight buffer type overriden to CUDA4
  339. Tensor blk.21.ffn_down_exps.weight buffer type overriden to CUDA4
  340. Tensor blk.21.ffn_up_exps.weight buffer type overriden to CUDA4
  341. Tensor blk.21.ffn_gate_shexp.weight buffer type overriden to CUDA0
  342. Tensor blk.21.ffn_down_shexp.weight buffer type overriden to CUDA0
  343. Tensor blk.21.ffn_up_shexp.weight buffer type overriden to CUDA0
  344. Tensor blk.22.attn_norm.weight buffer type overriden to CUDA0
  345. Tensor blk.22.attn_q_a_norm.weight buffer type overriden to CUDA0
  346. Tensor blk.22.attn_kv_a_norm.weight buffer type overriden to CUDA0
  347. Tensor blk.22.attn_q_a.weight buffer type overriden to CUDA0
  348. Tensor blk.22.attn_q_b.weight buffer type overriden to CUDA0
  349. Tensor blk.22.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  350. Tensor blk.22.attn_kv_b.weight buffer type overriden to CUDA0
  351. Tensor blk.22.attn_k_b.weight buffer type overriden to CUDA0
  352. Tensor blk.22.attn_v_b.weight buffer type overriden to CUDA0
  353. Tensor blk.22.attn_output.weight buffer type overriden to CUDA0
  354. Tensor blk.22.ffn_norm.weight buffer type overriden to CUDA4
  355. Tensor blk.22.ffn_gate_inp.weight buffer type overriden to CUDA4
  356. Tensor blk.22.ffn_gate_exps.weight buffer type overriden to CUDA4
  357. Tensor blk.22.ffn_down_exps.weight buffer type overriden to CUDA4
  358. Tensor blk.22.ffn_up_exps.weight buffer type overriden to CUDA4
  359. Tensor blk.22.ffn_gate_shexp.weight buffer type overriden to CUDA0
  360. Tensor blk.22.ffn_down_shexp.weight buffer type overriden to CUDA0
  361. Tensor blk.22.ffn_up_shexp.weight buffer type overriden to CUDA0
  362. Tensor blk.23.attn_norm.weight buffer type overriden to CUDA0
  363. Tensor blk.23.attn_q_a_norm.weight buffer type overriden to CUDA0
  364. Tensor blk.23.attn_kv_a_norm.weight buffer type overriden to CUDA0
  365. Tensor blk.23.attn_q_a.weight buffer type overriden to CUDA0
  366. Tensor blk.23.attn_q_b.weight buffer type overriden to CUDA0
  367. Tensor blk.23.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  368. Tensor blk.23.attn_kv_b.weight buffer type overriden to CUDA0
  369. Tensor blk.23.attn_k_b.weight buffer type overriden to CUDA0
  370. Tensor blk.23.attn_v_b.weight buffer type overriden to CUDA0
  371. Tensor blk.23.attn_output.weight buffer type overriden to CUDA0
  372. Tensor blk.23.ffn_norm.weight buffer type overriden to CUDA5
  373. Tensor blk.23.ffn_gate_inp.weight buffer type overriden to CUDA5
  374. Tensor blk.23.ffn_gate_exps.weight buffer type overriden to CUDA5
  375. Tensor blk.23.ffn_down_exps.weight buffer type overriden to CUDA5
  376. Tensor blk.23.ffn_up_exps.weight buffer type overriden to CUDA5
  377. Tensor blk.23.ffn_gate_shexp.weight buffer type overriden to CUDA0
  378. Tensor blk.23.ffn_down_shexp.weight buffer type overriden to CUDA0
  379. Tensor blk.23.ffn_up_shexp.weight buffer type overriden to CUDA0
  380. Tensor blk.24.attn_norm.weight buffer type overriden to CUDA0
  381. Tensor blk.24.attn_q_a_norm.weight buffer type overriden to CUDA0
  382. Tensor blk.24.attn_kv_a_norm.weight buffer type overriden to CUDA0
  383. Tensor blk.24.attn_q_a.weight buffer type overriden to CUDA0
  384. Tensor blk.24.attn_q_b.weight buffer type overriden to CUDA0
  385. Tensor blk.24.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  386. Tensor blk.24.attn_kv_b.weight buffer type overriden to CUDA0
  387. Tensor blk.24.attn_k_b.weight buffer type overriden to CUDA0
  388. Tensor blk.24.attn_v_b.weight buffer type overriden to CUDA0
  389. Tensor blk.24.attn_output.weight buffer type overriden to CUDA0
  390. Tensor blk.24.ffn_norm.weight buffer type overriden to CUDA5
  391. Tensor blk.24.ffn_gate_inp.weight buffer type overriden to CUDA5
  392. Tensor blk.24.ffn_gate_exps.weight buffer type overriden to CUDA5
  393. Tensor blk.24.ffn_down_exps.weight buffer type overriden to CUDA5
  394. Tensor blk.24.ffn_up_exps.weight buffer type overriden to CUDA5
  395. Tensor blk.24.ffn_gate_shexp.weight buffer type overriden to CUDA0
  396. Tensor blk.24.ffn_down_shexp.weight buffer type overriden to CUDA0
  397. Tensor blk.24.ffn_up_shexp.weight buffer type overriden to CUDA0
  398. Tensor blk.25.attn_norm.weight buffer type overriden to CUDA0
  399. Tensor blk.25.attn_q_a_norm.weight buffer type overriden to CUDA0
  400. Tensor blk.25.attn_kv_a_norm.weight buffer type overriden to CUDA0
  401. Tensor blk.25.attn_q_a.weight buffer type overriden to CUDA0
  402. Tensor blk.25.attn_q_b.weight buffer type overriden to CUDA0
  403. Tensor blk.25.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  404. Tensor blk.25.attn_kv_b.weight buffer type overriden to CUDA0
  405. Tensor blk.25.attn_k_b.weight buffer type overriden to CUDA0
  406. Tensor blk.25.attn_v_b.weight buffer type overriden to CUDA0
  407. Tensor blk.25.attn_output.weight buffer type overriden to CUDA0
  408. Tensor blk.25.ffn_norm.weight buffer type overriden to CUDA5
  409. Tensor blk.25.ffn_gate_inp.weight buffer type overriden to CUDA5
  410. Tensor blk.25.ffn_gate_exps.weight buffer type overriden to CUDA5
  411. Tensor blk.25.ffn_down_exps.weight buffer type overriden to CUDA5
  412. Tensor blk.25.ffn_up_exps.weight buffer type overriden to CUDA5
  413. Tensor blk.25.ffn_gate_shexp.weight buffer type overriden to CUDA0
  414. Tensor blk.25.ffn_down_shexp.weight buffer type overriden to CUDA0
  415. Tensor blk.25.ffn_up_shexp.weight buffer type overriden to CUDA0
  416. Tensor blk.26.attn_norm.weight buffer type overriden to CUDA0
  417. Tensor blk.26.attn_q_a_norm.weight buffer type overriden to CUDA0
  418. Tensor blk.26.attn_kv_a_norm.weight buffer type overriden to CUDA0
  419. Tensor blk.26.attn_q_a.weight buffer type overriden to CUDA0
  420. Tensor blk.26.attn_q_b.weight buffer type overriden to CUDA0
  421. Tensor blk.26.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  422. Tensor blk.26.attn_kv_b.weight buffer type overriden to CUDA0
  423. Tensor blk.26.attn_k_b.weight buffer type overriden to CUDA0
  424. Tensor blk.26.attn_v_b.weight buffer type overriden to CUDA0
  425. Tensor blk.26.attn_output.weight buffer type overriden to CUDA0
  426. Tensor blk.26.ffn_norm.weight buffer type overriden to CUDA5
  427. Tensor blk.26.ffn_gate_inp.weight buffer type overriden to CUDA5
  428. Tensor blk.26.ffn_gate_exps.weight buffer type overriden to CUDA5
  429. Tensor blk.26.ffn_down_exps.weight buffer type overriden to CUDA5
  430. Tensor blk.26.ffn_up_exps.weight buffer type overriden to CUDA5
  431. Tensor blk.26.ffn_gate_shexp.weight buffer type overriden to CUDA0
  432. Tensor blk.26.ffn_down_shexp.weight buffer type overriden to CUDA0
  433. Tensor blk.26.ffn_up_shexp.weight buffer type overriden to CUDA0
  434. Tensor blk.27.attn_norm.weight buffer type overriden to CUDA0
  435. Tensor blk.27.attn_q_a_norm.weight buffer type overriden to CUDA0
  436. Tensor blk.27.attn_kv_a_norm.weight buffer type overriden to CUDA0
  437. Tensor blk.27.attn_q_a.weight buffer type overriden to CUDA0
  438. Tensor blk.27.attn_q_b.weight buffer type overriden to CUDA0
  439. Tensor blk.27.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  440. Tensor blk.27.attn_kv_b.weight buffer type overriden to CUDA0
  441. Tensor blk.27.attn_k_b.weight buffer type overriden to CUDA0
  442. Tensor blk.27.attn_v_b.weight buffer type overriden to CUDA0
  443. Tensor blk.27.attn_output.weight buffer type overriden to CUDA0
  444. Tensor blk.27.ffn_norm.weight buffer type overriden to CUDA6
  445. Tensor blk.27.ffn_gate_inp.weight buffer type overriden to CUDA6
  446. Tensor blk.27.ffn_gate_exps.weight buffer type overriden to CUDA6
  447. Tensor blk.27.ffn_down_exps.weight buffer type overriden to CUDA6
  448. Tensor blk.27.ffn_up_exps.weight buffer type overriden to CUDA6
  449. Tensor blk.27.ffn_gate_shexp.weight buffer type overriden to CUDA0
  450. Tensor blk.27.ffn_down_shexp.weight buffer type overriden to CUDA0
  451. Tensor blk.27.ffn_up_shexp.weight buffer type overriden to CUDA0
  452. Tensor blk.28.attn_norm.weight buffer type overriden to CUDA0
  453. Tensor blk.28.attn_q_a_norm.weight buffer type overriden to CUDA0
  454. Tensor blk.28.attn_kv_a_norm.weight buffer type overriden to CUDA0
  455. Tensor blk.28.attn_q_a.weight buffer type overriden to CUDA0
  456. Tensor blk.28.attn_q_b.weight buffer type overriden to CUDA0
  457. Tensor blk.28.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  458. Tensor blk.28.attn_kv_b.weight buffer type overriden to CUDA0
  459. Tensor blk.28.attn_k_b.weight buffer type overriden to CUDA0
  460. Tensor blk.28.attn_v_b.weight buffer type overriden to CUDA0
  461. Tensor blk.28.attn_output.weight buffer type overriden to CUDA0
  462. Tensor blk.28.ffn_norm.weight buffer type overriden to CUDA6
  463. Tensor blk.28.ffn_gate_inp.weight buffer type overriden to CUDA6
  464. Tensor blk.28.ffn_gate_exps.weight buffer type overriden to CUDA6
  465. Tensor blk.28.ffn_down_exps.weight buffer type overriden to CUDA6
  466. Tensor blk.28.ffn_up_exps.weight buffer type overriden to CUDA6
  467. Tensor blk.28.ffn_gate_shexp.weight buffer type overriden to CUDA0
  468. Tensor blk.28.ffn_down_shexp.weight buffer type overriden to CUDA0
  469. Tensor blk.28.ffn_up_shexp.weight buffer type overriden to CUDA0
  470. Tensor blk.29.attn_norm.weight buffer type overriden to CUDA0
  471. Tensor blk.29.attn_q_a_norm.weight buffer type overriden to CUDA0
  472. Tensor blk.29.attn_kv_a_norm.weight buffer type overriden to CUDA0
  473. Tensor blk.29.attn_q_a.weight buffer type overriden to CUDA0
  474. Tensor blk.29.attn_q_b.weight buffer type overriden to CUDA0
  475. Tensor blk.29.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  476. Tensor blk.29.attn_kv_b.weight buffer type overriden to CUDA0
  477. Tensor blk.29.attn_k_b.weight buffer type overriden to CUDA0
  478. Tensor blk.29.attn_v_b.weight buffer type overriden to CUDA0
  479. Tensor blk.29.attn_output.weight buffer type overriden to CUDA0
  480. Tensor blk.29.ffn_norm.weight buffer type overriden to CUDA6
  481. Tensor blk.29.ffn_gate_inp.weight buffer type overriden to CUDA6
  482. Tensor blk.29.ffn_gate_exps.weight buffer type overriden to CUDA6
  483. Tensor blk.29.ffn_down_exps.weight buffer type overriden to CUDA6
  484. Tensor blk.29.ffn_up_exps.weight buffer type overriden to CUDA6
  485. Tensor blk.29.ffn_gate_shexp.weight buffer type overriden to CUDA0
  486. Tensor blk.29.ffn_down_shexp.weight buffer type overriden to CUDA0
  487. Tensor blk.29.ffn_up_shexp.weight buffer type overriden to CUDA0
  488. Tensor blk.30.attn_norm.weight buffer type overriden to CUDA0
  489. Tensor blk.30.attn_q_a_norm.weight buffer type overriden to CUDA0
  490. Tensor blk.30.attn_kv_a_norm.weight buffer type overriden to CUDA0
  491. Tensor blk.30.attn_q_a.weight buffer type overriden to CUDA0
  492. Tensor blk.30.attn_q_b.weight buffer type overriden to CUDA0
  493. Tensor blk.30.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  494. Tensor blk.30.attn_kv_b.weight buffer type overriden to CUDA0
  495. Tensor blk.30.attn_k_b.weight buffer type overriden to CUDA0
  496. Tensor blk.30.attn_v_b.weight buffer type overriden to CUDA0
  497. Tensor blk.30.attn_output.weight buffer type overriden to CUDA0
  498. Tensor blk.30.ffn_norm.weight buffer type overriden to CUDA6
  499. Tensor blk.30.ffn_gate_inp.weight buffer type overriden to CUDA6
  500. Tensor blk.30.ffn_gate_exps.weight buffer type overriden to CUDA6
  501. Tensor blk.30.ffn_down_exps.weight buffer type overriden to CUDA6
  502. Tensor blk.30.ffn_up_exps.weight buffer type overriden to CUDA6
  503. Tensor blk.30.ffn_gate_shexp.weight buffer type overriden to CUDA0
  504. Tensor blk.30.ffn_down_shexp.weight buffer type overriden to CUDA0
  505. Tensor blk.30.ffn_up_shexp.weight buffer type overriden to CUDA0
  506. Tensor blk.31.attn_norm.weight buffer type overriden to CUDA0
  507. Tensor blk.31.attn_q_a_norm.weight buffer type overriden to CUDA0
  508. Tensor blk.31.attn_kv_a_norm.weight buffer type overriden to CUDA0
  509. Tensor blk.31.attn_q_a.weight buffer type overriden to CUDA0
  510. Tensor blk.31.attn_q_b.weight buffer type overriden to CUDA0
  511. Tensor blk.31.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  512. Tensor blk.31.attn_kv_b.weight buffer type overriden to CUDA0
  513. Tensor blk.31.attn_k_b.weight buffer type overriden to CUDA0
  514. Tensor blk.31.attn_v_b.weight buffer type overriden to CUDA0
  515. Tensor blk.31.attn_output.weight buffer type overriden to CUDA0
  516. Tensor blk.31.ffn_norm.weight buffer type overriden to CUDA6
  517. Tensor blk.31.ffn_gate_inp.weight buffer type overriden to CUDA6
  518. Tensor blk.31.ffn_gate_exps.weight buffer type overriden to CUDA6
  519. Tensor blk.31.ffn_down_exps.weight buffer type overriden to CUDA6
  520. Tensor blk.31.ffn_up_exps.weight buffer type overriden to CUDA6
  521. Tensor blk.31.ffn_gate_shexp.weight buffer type overriden to CUDA0
  522. Tensor blk.31.ffn_down_shexp.weight buffer type overriden to CUDA0
  523. Tensor blk.31.ffn_up_shexp.weight buffer type overriden to CUDA0
  524. Tensor blk.32.attn_norm.weight buffer type overriden to CUDA0
  525. Tensor blk.32.attn_q_a_norm.weight buffer type overriden to CUDA0
  526. Tensor blk.32.attn_kv_a_norm.weight buffer type overriden to CUDA0
  527. Tensor blk.32.attn_q_a.weight buffer type overriden to CUDA0
  528. Tensor blk.32.attn_q_b.weight buffer type overriden to CUDA0
  529. Tensor blk.32.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  530. Tensor blk.32.attn_kv_b.weight buffer type overriden to CUDA0
  531. Tensor blk.32.attn_k_b.weight buffer type overriden to CUDA0
  532. Tensor blk.32.attn_v_b.weight buffer type overriden to CUDA0
  533. Tensor blk.32.attn_output.weight buffer type overriden to CUDA0
  534. Tensor blk.32.ffn_norm.weight buffer type overriden to CUDA6
  535. Tensor blk.32.ffn_gate_inp.weight buffer type overriden to CUDA6
  536. Tensor blk.32.ffn_gate_exps.weight buffer type overriden to CUDA6
  537. Tensor blk.32.ffn_down_exps.weight buffer type overriden to CUDA6
  538. Tensor blk.32.ffn_up_exps.weight buffer type overriden to CUDA6
  539. Tensor blk.32.ffn_gate_shexp.weight buffer type overriden to CUDA0
  540. Tensor blk.32.ffn_down_shexp.weight buffer type overriden to CUDA0
  541. Tensor blk.32.ffn_up_shexp.weight buffer type overriden to CUDA0
  542. Tensor blk.33.attn_norm.weight buffer type overriden to CUDA0
  543. Tensor blk.33.attn_q_a_norm.weight buffer type overriden to CUDA0
  544. Tensor blk.33.attn_kv_a_norm.weight buffer type overriden to CUDA0
  545. Tensor blk.33.attn_q_a.weight buffer type overriden to CUDA0
  546. Tensor blk.33.attn_q_b.weight buffer type overriden to CUDA0
  547. Tensor blk.33.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  548. Tensor blk.33.attn_kv_b.weight buffer type overriden to CUDA0
  549. Tensor blk.33.attn_k_b.weight buffer type overriden to CUDA0
  550. Tensor blk.33.attn_v_b.weight buffer type overriden to CUDA0
  551. Tensor blk.33.attn_output.weight buffer type overriden to CUDA0
  552. Tensor blk.33.ffn_norm.weight buffer type overriden to CUDA6
  553. Tensor blk.33.ffn_gate_inp.weight buffer type overriden to CUDA6
  554. Tensor blk.33.ffn_gate_exps.weight buffer type overriden to CUDA6
  555. Tensor blk.33.ffn_down_exps.weight buffer type overriden to CUDA6
  556. Tensor blk.33.ffn_up_exps.weight buffer type overriden to CUDA6
  557. Tensor blk.33.ffn_gate_shexp.weight buffer type overriden to CUDA0
  558. Tensor blk.33.ffn_down_shexp.weight buffer type overriden to CUDA0
  559. Tensor blk.33.ffn_up_shexp.weight buffer type overriden to CUDA0
  560. Tensor blk.34.attn_norm.weight buffer type overriden to CUDA0
  561. Tensor blk.34.attn_q_a_norm.weight buffer type overriden to CUDA0
  562. Tensor blk.34.attn_kv_a_norm.weight buffer type overriden to CUDA0
  563. Tensor blk.34.attn_q_a.weight buffer type overriden to CUDA0
  564. Tensor blk.34.attn_q_b.weight buffer type overriden to CUDA0
  565. Tensor blk.34.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  566. Tensor blk.34.attn_kv_b.weight buffer type overriden to CUDA0
  567. Tensor blk.34.attn_k_b.weight buffer type overriden to CUDA0
  568. Tensor blk.34.attn_v_b.weight buffer type overriden to CUDA0
  569. Tensor blk.34.attn_output.weight buffer type overriden to CUDA0
  570. Tensor blk.34.ffn_norm.weight buffer type overriden to CUDA6
  571. Tensor blk.34.ffn_gate_inp.weight buffer type overriden to CUDA6
  572. Tensor blk.34.ffn_gate_exps.weight buffer type overriden to CUDA6
  573. Tensor blk.34.ffn_down_exps.weight buffer type overriden to CUDA6
  574. Tensor blk.34.ffn_up_exps.weight buffer type overriden to CUDA6
  575. Tensor blk.34.ffn_gate_shexp.weight buffer type overriden to CUDA0
  576. Tensor blk.34.ffn_down_shexp.weight buffer type overriden to CUDA0
  577. Tensor blk.34.ffn_up_shexp.weight buffer type overriden to CUDA0
  578. Tensor blk.35.attn_norm.weight buffer type overriden to CUDA0
  579. Tensor blk.35.attn_q_a_norm.weight buffer type overriden to CUDA0
  580. Tensor blk.35.attn_kv_a_norm.weight buffer type overriden to CUDA0
  581. Tensor blk.35.attn_q_a.weight buffer type overriden to CUDA0
  582. Tensor blk.35.attn_q_b.weight buffer type overriden to CUDA0
  583. Tensor blk.35.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  584. Tensor blk.35.attn_kv_b.weight buffer type overriden to CUDA0
  585. Tensor blk.35.attn_k_b.weight buffer type overriden to CUDA0
  586. Tensor blk.35.attn_v_b.weight buffer type overriden to CUDA0
  587. Tensor blk.35.attn_output.weight buffer type overriden to CUDA0
  588. Tensor blk.35.ffn_gate_exps.weight buffer type overriden to CPU
  589. Tensor blk.35.ffn_down_exps.weight buffer type overriden to CPU
  590. Tensor blk.35.ffn_up_exps.weight buffer type overriden to CPU
  591. Tensor blk.35.ffn_gate_shexp.weight buffer type overriden to CUDA0
  592. Tensor blk.35.ffn_down_shexp.weight buffer type overriden to CUDA0
  593. Tensor blk.35.ffn_up_shexp.weight buffer type overriden to CUDA0
  594. Tensor blk.36.attn_norm.weight buffer type overriden to CUDA0
  595. Tensor blk.36.attn_q_a_norm.weight buffer type overriden to CUDA0
  596. Tensor blk.36.attn_kv_a_norm.weight buffer type overriden to CUDA0
  597. Tensor blk.36.attn_q_a.weight buffer type overriden to CUDA0
  598. Tensor blk.36.attn_q_b.weight buffer type overriden to CUDA0
  599. Tensor blk.36.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  600. Tensor blk.36.attn_kv_b.weight buffer type overriden to CUDA0
  601. Tensor blk.36.attn_k_b.weight buffer type overriden to CUDA0
  602. Tensor blk.36.attn_v_b.weight buffer type overriden to CUDA0
  603. Tensor blk.36.attn_output.weight buffer type overriden to CUDA0
  604. Tensor blk.36.ffn_gate_exps.weight buffer type overriden to CPU
  605. Tensor blk.36.ffn_down_exps.weight buffer type overriden to CPU
  606. Tensor blk.36.ffn_up_exps.weight buffer type overriden to CPU
  607. Tensor blk.36.ffn_gate_shexp.weight buffer type overriden to CUDA0
  608. Tensor blk.36.ffn_down_shexp.weight buffer type overriden to CUDA0
  609. Tensor blk.36.ffn_up_shexp.weight buffer type overriden to CUDA0
  610. Tensor blk.37.attn_norm.weight buffer type overriden to CUDA0
  611. Tensor blk.37.attn_q_a_norm.weight buffer type overriden to CUDA0
  612. Tensor blk.37.attn_kv_a_norm.weight buffer type overriden to CUDA0
  613. Tensor blk.37.attn_q_a.weight buffer type overriden to CUDA0
  614. Tensor blk.37.attn_q_b.weight buffer type overriden to CUDA0
  615. Tensor blk.37.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  616. Tensor blk.37.attn_kv_b.weight buffer type overriden to CUDA0
  617. Tensor blk.37.attn_k_b.weight buffer type overriden to CUDA0
  618. Tensor blk.37.attn_v_b.weight buffer type overriden to CUDA0
  619. Tensor blk.37.attn_output.weight buffer type overriden to CUDA0
  620. Tensor blk.37.ffn_gate_exps.weight buffer type overriden to CPU
  621. Tensor blk.37.ffn_down_exps.weight buffer type overriden to CPU
  622. Tensor blk.37.ffn_up_exps.weight buffer type overriden to CPU
  623. Tensor blk.37.ffn_gate_shexp.weight buffer type overriden to CUDA0
  624. Tensor blk.37.ffn_down_shexp.weight buffer type overriden to CUDA0
  625. Tensor blk.37.ffn_up_shexp.weight buffer type overriden to CUDA0
  626. Tensor blk.38.attn_norm.weight buffer type overriden to CUDA0
  627. Tensor blk.38.attn_q_a_norm.weight buffer type overriden to CUDA0
  628. Tensor blk.38.attn_kv_a_norm.weight buffer type overriden to CUDA0
  629. Tensor blk.38.attn_q_a.weight buffer type overriden to CUDA0
  630. Tensor blk.38.attn_q_b.weight buffer type overriden to CUDA0
  631. Tensor blk.38.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  632. Tensor blk.38.attn_kv_b.weight buffer type overriden to CUDA0
  633. Tensor blk.38.attn_k_b.weight buffer type overriden to CUDA0
  634. Tensor blk.38.attn_v_b.weight buffer type overriden to CUDA0
  635. Tensor blk.38.attn_output.weight buffer type overriden to CUDA0
  636. Tensor blk.38.ffn_gate_exps.weight buffer type overriden to CPU
  637. Tensor blk.38.ffn_down_exps.weight buffer type overriden to CPU
  638. Tensor blk.38.ffn_up_exps.weight buffer type overriden to CPU
  639. Tensor blk.38.ffn_gate_shexp.weight buffer type overriden to CUDA0
  640. Tensor blk.38.ffn_down_shexp.weight buffer type overriden to CUDA0
  641. Tensor blk.38.ffn_up_shexp.weight buffer type overriden to CUDA0
  642. Tensor blk.39.attn_norm.weight buffer type overriden to CUDA0
  643. Tensor blk.39.attn_q_a_norm.weight buffer type overriden to CUDA0
  644. Tensor blk.39.attn_kv_a_norm.weight buffer type overriden to CUDA0
  645. Tensor blk.39.attn_q_a.weight buffer type overriden to CUDA0
  646. Tensor blk.39.attn_q_b.weight buffer type overriden to CUDA0
  647. Tensor blk.39.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  648. Tensor blk.39.attn_kv_b.weight buffer type overriden to CUDA0
  649. Tensor blk.39.attn_k_b.weight buffer type overriden to CUDA0
  650. Tensor blk.39.attn_v_b.weight buffer type overriden to CUDA0
  651. Tensor blk.39.attn_output.weight buffer type overriden to CUDA0
  652. Tensor blk.39.ffn_gate_exps.weight buffer type overriden to CPU
  653. Tensor blk.39.ffn_down_exps.weight buffer type overriden to CPU
  654. Tensor blk.39.ffn_up_exps.weight buffer type overriden to CPU
  655. Tensor blk.39.ffn_gate_shexp.weight buffer type overriden to CUDA0
  656. Tensor blk.39.ffn_down_shexp.weight buffer type overriden to CUDA0
  657. Tensor blk.39.ffn_up_shexp.weight buffer type overriden to CUDA0
  658. Tensor blk.40.attn_norm.weight buffer type overriden to CUDA0
  659. Tensor blk.40.attn_q_a_norm.weight buffer type overriden to CUDA0
  660. Tensor blk.40.attn_kv_a_norm.weight buffer type overriden to CUDA0
  661. Tensor blk.40.attn_q_a.weight buffer type overriden to CUDA0
  662. Tensor blk.40.attn_q_b.weight buffer type overriden to CUDA0
  663. Tensor blk.40.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  664. Tensor blk.40.attn_kv_b.weight buffer type overriden to CUDA0
  665. Tensor blk.40.attn_k_b.weight buffer type overriden to CUDA0
  666. Tensor blk.40.attn_v_b.weight buffer type overriden to CUDA0
  667. Tensor blk.40.attn_output.weight buffer type overriden to CUDA0
  668. Tensor blk.40.ffn_gate_exps.weight buffer type overriden to CPU
  669. Tensor blk.40.ffn_down_exps.weight buffer type overriden to CPU
  670. Tensor blk.40.ffn_up_exps.weight buffer type overriden to CPU
  671. Tensor blk.40.ffn_gate_shexp.weight buffer type overriden to CUDA0
  672. Tensor blk.40.ffn_down_shexp.weight buffer type overriden to CUDA0
  673. Tensor blk.40.ffn_up_shexp.weight buffer type overriden to CUDA0
  674. Tensor blk.41.attn_norm.weight buffer type overriden to CUDA0
  675. Tensor blk.41.attn_q_a_norm.weight buffer type overriden to CUDA0
  676. Tensor blk.41.attn_kv_a_norm.weight buffer type overriden to CUDA0
  677. Tensor blk.41.attn_q_a.weight buffer type overriden to CUDA0
  678. Tensor blk.41.attn_q_b.weight buffer type overriden to CUDA0
  679. Tensor blk.41.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  680. Tensor blk.41.attn_kv_b.weight buffer type overriden to CUDA0
  681. Tensor blk.41.attn_k_b.weight buffer type overriden to CUDA0
  682. Tensor blk.41.attn_v_b.weight buffer type overriden to CUDA0
  683. Tensor blk.41.attn_output.weight buffer type overriden to CUDA0
  684. Tensor blk.41.ffn_gate_exps.weight buffer type overriden to CPU
  685. Tensor blk.41.ffn_down_exps.weight buffer type overriden to CPU
  686. Tensor blk.41.ffn_up_exps.weight buffer type overriden to CPU
  687. Tensor blk.41.ffn_gate_shexp.weight buffer type overriden to CUDA0
  688. Tensor blk.41.ffn_down_shexp.weight buffer type overriden to CUDA0
  689. Tensor blk.41.ffn_up_shexp.weight buffer type overriden to CUDA0
  690. Tensor blk.42.attn_norm.weight buffer type overriden to CUDA0
  691. Tensor blk.42.attn_q_a_norm.weight buffer type overriden to CUDA0
  692. Tensor blk.42.attn_kv_a_norm.weight buffer type overriden to CUDA0
  693. Tensor blk.42.attn_q_a.weight buffer type overriden to CUDA0
  694. Tensor blk.42.attn_q_b.weight buffer type overriden to CUDA0
  695. Tensor blk.42.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  696. Tensor blk.42.attn_kv_b.weight buffer type overriden to CUDA0
  697. Tensor blk.42.attn_k_b.weight buffer type overriden to CUDA0
  698. Tensor blk.42.attn_v_b.weight buffer type overriden to CUDA0
  699. Tensor blk.42.attn_output.weight buffer type overriden to CUDA0
  700. Tensor blk.42.ffn_gate_exps.weight buffer type overriden to CPU
  701. Tensor blk.42.ffn_down_exps.weight buffer type overriden to CPU
  702. Tensor blk.42.ffn_up_exps.weight buffer type overriden to CPU
  703. Tensor blk.42.ffn_gate_shexp.weight buffer type overriden to CUDA0
  704. Tensor blk.42.ffn_down_shexp.weight buffer type overriden to CUDA0
  705. Tensor blk.42.ffn_up_shexp.weight buffer type overriden to CUDA0
  706. Tensor blk.43.attn_norm.weight buffer type overriden to CUDA0
  707. Tensor blk.43.attn_q_a_norm.weight buffer type overriden to CUDA0
  708. Tensor blk.43.attn_kv_a_norm.weight buffer type overriden to CUDA0
  709. Tensor blk.43.attn_q_a.weight buffer type overriden to CUDA0
  710. Tensor blk.43.attn_q_b.weight buffer type overriden to CUDA0
  711. Tensor blk.43.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  712. Tensor blk.43.attn_kv_b.weight buffer type overriden to CUDA0
  713. Tensor blk.43.attn_k_b.weight buffer type overriden to CUDA0
  714. Tensor blk.43.attn_v_b.weight buffer type overriden to CUDA0
  715. Tensor blk.43.attn_output.weight buffer type overriden to CUDA0
  716. Tensor blk.43.ffn_gate_exps.weight buffer type overriden to CPU
  717. Tensor blk.43.ffn_down_exps.weight buffer type overriden to CPU
  718. Tensor blk.43.ffn_up_exps.weight buffer type overriden to CPU
  719. Tensor blk.43.ffn_gate_shexp.weight buffer type overriden to CUDA0
  720. Tensor blk.43.ffn_down_shexp.weight buffer type overriden to CUDA0
  721. Tensor blk.43.ffn_up_shexp.weight buffer type overriden to CUDA0
  722. Tensor blk.44.attn_norm.weight buffer type overriden to CUDA0
  723. Tensor blk.44.attn_q_a_norm.weight buffer type overriden to CUDA0
  724. Tensor blk.44.attn_kv_a_norm.weight buffer type overriden to CUDA0
  725. Tensor blk.44.attn_q_a.weight buffer type overriden to CUDA0
  726. Tensor blk.44.attn_q_b.weight buffer type overriden to CUDA0
  727. Tensor blk.44.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  728. Tensor blk.44.attn_kv_b.weight buffer type overriden to CUDA0
  729. Tensor blk.44.attn_k_b.weight buffer type overriden to CUDA0
  730. Tensor blk.44.attn_v_b.weight buffer type overriden to CUDA0
  731. Tensor blk.44.attn_output.weight buffer type overriden to CUDA0
  732. Tensor blk.44.ffn_gate_exps.weight buffer type overriden to CPU
  733. Tensor blk.44.ffn_down_exps.weight buffer type overriden to CPU
  734. Tensor blk.44.ffn_up_exps.weight buffer type overriden to CPU
  735. Tensor blk.44.ffn_gate_shexp.weight buffer type overriden to CUDA0
  736. Tensor blk.44.ffn_down_shexp.weight buffer type overriden to CUDA0
  737. Tensor blk.44.ffn_up_shexp.weight buffer type overriden to CUDA0
  738. Tensor blk.45.attn_norm.weight buffer type overriden to CUDA0
  739. Tensor blk.45.attn_q_a_norm.weight buffer type overriden to CUDA0
  740. Tensor blk.45.attn_kv_a_norm.weight buffer type overriden to CUDA0
  741. Tensor blk.45.attn_q_a.weight buffer type overriden to CUDA0
  742. Tensor blk.45.attn_q_b.weight buffer type overriden to CUDA0
  743. Tensor blk.45.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  744. Tensor blk.45.attn_kv_b.weight buffer type overriden to CUDA0
  745. Tensor blk.45.attn_k_b.weight buffer type overriden to CUDA0
  746. Tensor blk.45.attn_v_b.weight buffer type overriden to CUDA0
  747. Tensor blk.45.attn_output.weight buffer type overriden to CUDA0
  748. Tensor blk.45.ffn_gate_exps.weight buffer type overriden to CPU
  749. Tensor blk.45.ffn_down_exps.weight buffer type overriden to CPU
  750. Tensor blk.45.ffn_up_exps.weight buffer type overriden to CPU
  751. Tensor blk.45.ffn_gate_shexp.weight buffer type overriden to CUDA0
  752. Tensor blk.45.ffn_down_shexp.weight buffer type overriden to CUDA0
  753. Tensor blk.45.ffn_up_shexp.weight buffer type overriden to CUDA0
  754. Tensor blk.46.attn_norm.weight buffer type overriden to CUDA0
  755. Tensor blk.46.attn_q_a_norm.weight buffer type overriden to CUDA0
  756. Tensor blk.46.attn_kv_a_norm.weight buffer type overriden to CUDA0
  757. Tensor blk.46.attn_q_a.weight buffer type overriden to CUDA0
  758. Tensor blk.46.attn_q_b.weight buffer type overriden to CUDA0
  759. Tensor blk.46.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  760. Tensor blk.46.attn_kv_b.weight buffer type overriden to CUDA0
  761. Tensor blk.46.attn_k_b.weight buffer type overriden to CUDA0
  762. Tensor blk.46.attn_v_b.weight buffer type overriden to CUDA0
  763. Tensor blk.46.attn_output.weight buffer type overriden to CUDA0
  764. Tensor blk.46.ffn_gate_exps.weight buffer type overriden to CPU
  765. Tensor blk.46.ffn_down_exps.weight buffer type overriden to CPU
  766. Tensor blk.46.ffn_up_exps.weight buffer type overriden to CPU
  767. Tensor blk.46.ffn_gate_shexp.weight buffer type overriden to CUDA0
  768. Tensor blk.46.ffn_down_shexp.weight buffer type overriden to CUDA0
  769. Tensor blk.46.ffn_up_shexp.weight buffer type overriden to CUDA0
  770. Tensor blk.47.attn_norm.weight buffer type overriden to CUDA0
  771. Tensor blk.47.attn_q_a_norm.weight buffer type overriden to CUDA0
  772. Tensor blk.47.attn_kv_a_norm.weight buffer type overriden to CUDA0
  773. Tensor blk.47.attn_q_a.weight buffer type overriden to CUDA0
  774. Tensor blk.47.attn_q_b.weight buffer type overriden to CUDA0
  775. Tensor blk.47.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  776. Tensor blk.47.attn_kv_b.weight buffer type overriden to CUDA0
  777. Tensor blk.47.attn_k_b.weight buffer type overriden to CUDA0
  778. Tensor blk.47.attn_v_b.weight buffer type overriden to CUDA0
  779. Tensor blk.47.attn_output.weight buffer type overriden to CUDA0
  780. Tensor blk.47.ffn_gate_exps.weight buffer type overriden to CPU
  781. Tensor blk.47.ffn_down_exps.weight buffer type overriden to CPU
  782. Tensor blk.47.ffn_up_exps.weight buffer type overriden to CPU
  783. Tensor blk.47.ffn_gate_shexp.weight buffer type overriden to CUDA0
  784. Tensor blk.47.ffn_down_shexp.weight buffer type overriden to CUDA0
  785. Tensor blk.47.ffn_up_shexp.weight buffer type overriden to CUDA0
  786. Tensor blk.48.attn_norm.weight buffer type overriden to CUDA0
  787. Tensor blk.48.attn_q_a_norm.weight buffer type overriden to CUDA0
  788. Tensor blk.48.attn_kv_a_norm.weight buffer type overriden to CUDA0
  789. Tensor blk.48.attn_q_a.weight buffer type overriden to CUDA0
  790. Tensor blk.48.attn_q_b.weight buffer type overriden to CUDA0
  791. Tensor blk.48.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  792. Tensor blk.48.attn_kv_b.weight buffer type overriden to CUDA0
  793. Tensor blk.48.attn_k_b.weight buffer type overriden to CUDA0
  794. Tensor blk.48.attn_v_b.weight buffer type overriden to CUDA0
  795. Tensor blk.48.attn_output.weight buffer type overriden to CUDA0
  796. Tensor blk.48.ffn_gate_exps.weight buffer type overriden to CPU
  797. Tensor blk.48.ffn_down_exps.weight buffer type overriden to CPU
  798. Tensor blk.48.ffn_up_exps.weight buffer type overriden to CPU
  799. Tensor blk.48.ffn_gate_shexp.weight buffer type overriden to CUDA0
  800. Tensor blk.48.ffn_down_shexp.weight buffer type overriden to CUDA0
  801. Tensor blk.48.ffn_up_shexp.weight buffer type overriden to CUDA0
  802. Tensor blk.49.attn_norm.weight buffer type overriden to CUDA0
  803. Tensor blk.49.attn_q_a_norm.weight buffer type overriden to CUDA0
  804. Tensor blk.49.attn_kv_a_norm.weight buffer type overriden to CUDA0
  805. Tensor blk.49.attn_q_a.weight buffer type overriden to CUDA0
  806. Tensor blk.49.attn_q_b.weight buffer type overriden to CUDA0
  807. Tensor blk.49.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  808. Tensor blk.49.attn_kv_b.weight buffer type overriden to CUDA0
  809. Tensor blk.49.attn_k_b.weight buffer type overriden to CUDA0
  810. Tensor blk.49.attn_v_b.weight buffer type overriden to CUDA0
  811. Tensor blk.49.attn_output.weight buffer type overriden to CUDA0
  812. Tensor blk.49.ffn_gate_exps.weight buffer type overriden to CPU
  813. Tensor blk.49.ffn_down_exps.weight buffer type overriden to CPU
  814. Tensor blk.49.ffn_up_exps.weight buffer type overriden to CPU
  815. Tensor blk.49.ffn_gate_shexp.weight buffer type overriden to CUDA0
  816. Tensor blk.49.ffn_down_shexp.weight buffer type overriden to CUDA0
  817. Tensor blk.49.ffn_up_shexp.weight buffer type overriden to CUDA0
  818. Tensor blk.50.attn_norm.weight buffer type overriden to CUDA0
  819. Tensor blk.50.attn_q_a_norm.weight buffer type overriden to CUDA0
  820. Tensor blk.50.attn_kv_a_norm.weight buffer type overriden to CUDA0
  821. Tensor blk.50.attn_q_a.weight buffer type overriden to CUDA0
  822. Tensor blk.50.attn_q_b.weight buffer type overriden to CUDA0
  823. Tensor blk.50.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  824. Tensor blk.50.attn_kv_b.weight buffer type overriden to CUDA0
  825. Tensor blk.50.attn_k_b.weight buffer type overriden to CUDA0
  826. Tensor blk.50.attn_v_b.weight buffer type overriden to CUDA0
  827. Tensor blk.50.attn_output.weight buffer type overriden to CUDA0
  828. Tensor blk.50.ffn_gate_exps.weight buffer type overriden to CPU
  829. Tensor blk.50.ffn_down_exps.weight buffer type overriden to CPU
  830. Tensor blk.50.ffn_up_exps.weight buffer type overriden to CPU
  831. Tensor blk.50.ffn_gate_shexp.weight buffer type overriden to CUDA0
  832. Tensor blk.50.ffn_down_shexp.weight buffer type overriden to CUDA0
  833. Tensor blk.50.ffn_up_shexp.weight buffer type overriden to CUDA0
  834. Tensor blk.51.attn_norm.weight buffer type overriden to CUDA0
  835. Tensor blk.51.attn_q_a_norm.weight buffer type overriden to CUDA0
  836. Tensor blk.51.attn_kv_a_norm.weight buffer type overriden to CUDA0
  837. Tensor blk.51.attn_q_a.weight buffer type overriden to CUDA0
  838. Tensor blk.51.attn_q_b.weight buffer type overriden to CUDA0
  839. Tensor blk.51.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  840. Tensor blk.51.attn_kv_b.weight buffer type overriden to CUDA0
  841. Tensor blk.51.attn_k_b.weight buffer type overriden to CUDA0
  842. Tensor blk.51.attn_v_b.weight buffer type overriden to CUDA0
  843. Tensor blk.51.attn_output.weight buffer type overriden to CUDA0
  844. Tensor blk.51.ffn_gate_exps.weight buffer type overriden to CPU
  845. Tensor blk.51.ffn_down_exps.weight buffer type overriden to CPU
  846. Tensor blk.51.ffn_up_exps.weight buffer type overriden to CPU
  847. Tensor blk.51.ffn_gate_shexp.weight buffer type overriden to CUDA0
  848. Tensor blk.51.ffn_down_shexp.weight buffer type overriden to CUDA0
  849. Tensor blk.51.ffn_up_shexp.weight buffer type overriden to CUDA0
  850. Tensor blk.52.attn_norm.weight buffer type overriden to CUDA0
  851. Tensor blk.52.attn_q_a_norm.weight buffer type overriden to CUDA0
  852. Tensor blk.52.attn_kv_a_norm.weight buffer type overriden to CUDA0
  853. Tensor blk.52.attn_q_a.weight buffer type overriden to CUDA0
  854. Tensor blk.52.attn_q_b.weight buffer type overriden to CUDA0
  855. Tensor blk.52.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  856. Tensor blk.52.attn_kv_b.weight buffer type overriden to CUDA0
  857. Tensor blk.52.attn_k_b.weight buffer type overriden to CUDA0
  858. Tensor blk.52.attn_v_b.weight buffer type overriden to CUDA0
  859. Tensor blk.52.attn_output.weight buffer type overriden to CUDA0
  860. Tensor blk.52.ffn_gate_exps.weight buffer type overriden to CPU
  861. Tensor blk.52.ffn_down_exps.weight buffer type overriden to CPU
  862. Tensor blk.52.ffn_up_exps.weight buffer type overriden to CPU
  863. Tensor blk.52.ffn_gate_shexp.weight buffer type overriden to CUDA0
  864. Tensor blk.52.ffn_down_shexp.weight buffer type overriden to CUDA0
  865. Tensor blk.52.ffn_up_shexp.weight buffer type overriden to CUDA0
  866. Tensor blk.53.attn_norm.weight buffer type overriden to CUDA0
  867. Tensor blk.53.attn_q_a_norm.weight buffer type overriden to CUDA0
  868. Tensor blk.53.attn_kv_a_norm.weight buffer type overriden to CUDA0
  869. Tensor blk.53.attn_q_a.weight buffer type overriden to CUDA0
  870. Tensor blk.53.attn_q_b.weight buffer type overriden to CUDA0
  871. Tensor blk.53.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  872. Tensor blk.53.attn_kv_b.weight buffer type overriden to CUDA0
  873. Tensor blk.53.attn_k_b.weight buffer type overriden to CUDA0
  874. Tensor blk.53.attn_v_b.weight buffer type overriden to CUDA0
  875. Tensor blk.53.attn_output.weight buffer type overriden to CUDA0
  876. Tensor blk.53.ffn_gate_exps.weight buffer type overriden to CPU
  877. Tensor blk.53.ffn_down_exps.weight buffer type overriden to CPU
  878. Tensor blk.53.ffn_up_exps.weight buffer type overriden to CPU
  879. Tensor blk.53.ffn_gate_shexp.weight buffer type overriden to CUDA0
  880. Tensor blk.53.ffn_down_shexp.weight buffer type overriden to CUDA0
  881. Tensor blk.53.ffn_up_shexp.weight buffer type overriden to CUDA0
  882. Tensor blk.54.attn_norm.weight buffer type overriden to CUDA0
  883. Tensor blk.54.attn_q_a_norm.weight buffer type overriden to CUDA0
  884. Tensor blk.54.attn_kv_a_norm.weight buffer type overriden to CUDA0
  885. Tensor blk.54.attn_q_a.weight buffer type overriden to CUDA0
  886. Tensor blk.54.attn_q_b.weight buffer type overriden to CUDA0
  887. Tensor blk.54.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  888. Tensor blk.54.attn_kv_b.weight buffer type overriden to CUDA0
  889. Tensor blk.54.attn_k_b.weight buffer type overriden to CUDA0
  890. Tensor blk.54.attn_v_b.weight buffer type overriden to CUDA0
  891. Tensor blk.54.attn_output.weight buffer type overriden to CUDA0
  892. Tensor blk.54.ffn_gate_exps.weight buffer type overriden to CPU
  893. Tensor blk.54.ffn_down_exps.weight buffer type overriden to CPU
  894. Tensor blk.54.ffn_up_exps.weight buffer type overriden to CPU
  895. Tensor blk.54.ffn_gate_shexp.weight buffer type overriden to CUDA0
  896. Tensor blk.54.ffn_down_shexp.weight buffer type overriden to CUDA0
  897. Tensor blk.54.ffn_up_shexp.weight buffer type overriden to CUDA0
  898. Tensor blk.55.attn_norm.weight buffer type overriden to CUDA0
  899. Tensor blk.55.attn_q_a_norm.weight buffer type overriden to CUDA0
  900. Tensor blk.55.attn_kv_a_norm.weight buffer type overriden to CUDA0
  901. Tensor blk.55.attn_q_a.weight buffer type overriden to CUDA0
  902. Tensor blk.55.attn_q_b.weight buffer type overriden to CUDA0
  903. Tensor blk.55.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  904. Tensor blk.55.attn_kv_b.weight buffer type overriden to CUDA0
  905. Tensor blk.55.attn_k_b.weight buffer type overriden to CUDA0
  906. Tensor blk.55.attn_v_b.weight buffer type overriden to CUDA0
  907. Tensor blk.55.attn_output.weight buffer type overriden to CUDA0
  908. Tensor blk.55.ffn_gate_exps.weight buffer type overriden to CPU
  909. Tensor blk.55.ffn_down_exps.weight buffer type overriden to CPU
  910. Tensor blk.55.ffn_up_exps.weight buffer type overriden to CPU
  911. Tensor blk.55.ffn_gate_shexp.weight buffer type overriden to CUDA0
  912. Tensor blk.55.ffn_down_shexp.weight buffer type overriden to CUDA0
  913. Tensor blk.55.ffn_up_shexp.weight buffer type overriden to CUDA0
  914. Tensor blk.56.attn_norm.weight buffer type overriden to CUDA0
  915. Tensor blk.56.attn_q_a_norm.weight buffer type overriden to CUDA0
  916. Tensor blk.56.attn_kv_a_norm.weight buffer type overriden to CUDA0
  917. Tensor blk.56.attn_q_a.weight buffer type overriden to CUDA0
  918. Tensor blk.56.attn_q_b.weight buffer type overriden to CUDA0
  919. Tensor blk.56.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  920. Tensor blk.56.attn_kv_b.weight buffer type overriden to CUDA0
  921. Tensor blk.56.attn_k_b.weight buffer type overriden to CUDA0
  922. Tensor blk.56.attn_v_b.weight buffer type overriden to CUDA0
  923. Tensor blk.56.attn_output.weight buffer type overriden to CUDA0
  924. Tensor blk.56.ffn_gate_exps.weight buffer type overriden to CPU
  925. Tensor blk.56.ffn_down_exps.weight buffer type overriden to CPU
  926. Tensor blk.56.ffn_up_exps.weight buffer type overriden to CPU
  927. Tensor blk.56.ffn_gate_shexp.weight buffer type overriden to CUDA0
  928. Tensor blk.56.ffn_down_shexp.weight buffer type overriden to CUDA0
  929. Tensor blk.56.ffn_up_shexp.weight buffer type overriden to CUDA0
  930. Tensor blk.57.attn_norm.weight buffer type overriden to CUDA0
  931. Tensor blk.57.attn_q_a_norm.weight buffer type overriden to CUDA0
  932. Tensor blk.57.attn_kv_a_norm.weight buffer type overriden to CUDA0
  933. Tensor blk.57.attn_q_a.weight buffer type overriden to CUDA0
  934. Tensor blk.57.attn_q_b.weight buffer type overriden to CUDA0
  935. Tensor blk.57.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  936. Tensor blk.57.attn_kv_b.weight buffer type overriden to CUDA0
  937. Tensor blk.57.attn_k_b.weight buffer type overriden to CUDA0
  938. Tensor blk.57.attn_v_b.weight buffer type overriden to CUDA0
  939. Tensor blk.57.attn_output.weight buffer type overriden to CUDA0
  940. Tensor blk.57.ffn_gate_exps.weight buffer type overriden to CPU
  941. Tensor blk.57.ffn_down_exps.weight buffer type overriden to CPU
  942. Tensor blk.57.ffn_up_exps.weight buffer type overriden to CPU
  943. Tensor blk.57.ffn_gate_shexp.weight buffer type overriden to CUDA0
  944. Tensor blk.57.ffn_down_shexp.weight buffer type overriden to CUDA0
  945. Tensor blk.57.ffn_up_shexp.weight buffer type overriden to CUDA0
  946. Tensor blk.58.attn_norm.weight buffer type overriden to CUDA0
  947. Tensor blk.58.attn_q_a_norm.weight buffer type overriden to CUDA0
  948. Tensor blk.58.attn_kv_a_norm.weight buffer type overriden to CUDA0
  949. Tensor blk.58.attn_q_a.weight buffer type overriden to CUDA0
  950. Tensor blk.58.attn_q_b.weight buffer type overriden to CUDA0
  951. Tensor blk.58.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  952. Tensor blk.58.attn_kv_b.weight buffer type overriden to CUDA0
  953. Tensor blk.58.attn_k_b.weight buffer type overriden to CUDA0
  954. Tensor blk.58.attn_v_b.weight buffer type overriden to CUDA0
  955. Tensor blk.58.attn_output.weight buffer type overriden to CUDA0
  956. Tensor blk.58.ffn_gate_exps.weight buffer type overriden to CPU
  957. Tensor blk.58.ffn_down_exps.weight buffer type overriden to CPU
  958. Tensor blk.58.ffn_up_exps.weight buffer type overriden to CPU
  959. Tensor blk.58.ffn_gate_shexp.weight buffer type overriden to CUDA0
  960. Tensor blk.58.ffn_down_shexp.weight buffer type overriden to CUDA0
  961. Tensor blk.58.ffn_up_shexp.weight buffer type overriden to CUDA0
  962. Tensor blk.59.attn_norm.weight buffer type overriden to CUDA0
  963. Tensor blk.59.attn_q_a_norm.weight buffer type overriden to CUDA0
  964. Tensor blk.59.attn_kv_a_norm.weight buffer type overriden to CUDA0
  965. Tensor blk.59.attn_q_a.weight buffer type overriden to CUDA0
  966. Tensor blk.59.attn_q_b.weight buffer type overriden to CUDA0
  967. Tensor blk.59.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  968. Tensor blk.59.attn_kv_b.weight buffer type overriden to CUDA0
  969. Tensor blk.59.attn_k_b.weight buffer type overriden to CUDA0
  970. Tensor blk.59.attn_v_b.weight buffer type overriden to CUDA0
  971. Tensor blk.59.attn_output.weight buffer type overriden to CUDA0
  972. Tensor blk.59.ffn_gate_exps.weight buffer type overriden to CPU
  973. Tensor blk.59.ffn_down_exps.weight buffer type overriden to CPU
  974. Tensor blk.59.ffn_up_exps.weight buffer type overriden to CPU
  975. Tensor blk.59.ffn_gate_shexp.weight buffer type overriden to CUDA0
  976. Tensor blk.59.ffn_down_shexp.weight buffer type overriden to CUDA0
  977. Tensor blk.59.ffn_up_shexp.weight buffer type overriden to CUDA0
  978. Tensor blk.60.attn_norm.weight buffer type overriden to CUDA0
  979. Tensor blk.60.attn_q_a_norm.weight buffer type overriden to CUDA0
  980. Tensor blk.60.attn_kv_a_norm.weight buffer type overriden to CUDA0
  981. Tensor blk.60.attn_q_a.weight buffer type overriden to CUDA0
  982. Tensor blk.60.attn_q_b.weight buffer type overriden to CUDA0
  983. Tensor blk.60.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  984. Tensor blk.60.attn_kv_b.weight buffer type overriden to CUDA0
  985. Tensor blk.60.attn_k_b.weight buffer type overriden to CUDA0
  986. Tensor blk.60.attn_v_b.weight buffer type overriden to CUDA0
  987. Tensor blk.60.attn_output.weight buffer type overriden to CUDA0
  988. Tensor blk.60.ffn_gate_exps.weight buffer type overriden to CPU
  989. Tensor blk.60.ffn_down_exps.weight buffer type overriden to CPU
  990. Tensor blk.60.ffn_up_exps.weight buffer type overriden to CPU
  991. Tensor blk.60.ffn_gate_shexp.weight buffer type overriden to CUDA0
  992. Tensor blk.60.ffn_down_shexp.weight buffer type overriden to CUDA0
  993. Tensor blk.60.ffn_up_shexp.weight buffer type overriden to CUDA0
  994. llm_load_tensors: offloading 61 repeating layers to GPU
  995. llm_load_tensors: offloading non-repeating layers to GPU
  996. llm_load_tensors: offloaded 62/62 layers to GPU
  997. llm_load_tensors: CPU buffer size = 123994.00 MiB
  998. llm_load_tensors: CUDA_Host buffer size = 607.58 MiB
  999. llm_load_tensors: CUDA0 buffer size = 24196.79 MiB
  1000. llm_load_tensors: CUDA1 buffer size = 19104.12 MiB
  1001. llm_load_tensors: CUDA2 buffer size = 19104.12 MiB
  1002. llm_load_tensors: CUDA3 buffer size = 23880.15 MiB
  1003. llm_load_tensors: CUDA4 buffer size = 19146.28 MiB
  1004. llm_load_tensors: CUDA5 buffer size = 19153.31 MiB
  1005. llm_load_tensors: CUDA6 buffer size = 39031.47 MiB
  1006. ....................................................................................................
  1007. llama_new_context_with_model: n_ctx = 16384
  1008. llama_new_context_with_model: n_batch = 2048
  1009. llama_new_context_with_model: n_ubatch = 2048
  1010. llama_new_context_with_model: flash_attn = 1
  1011. llama_new_context_with_model: mla_attn = 3
  1012. llama_new_context_with_model: attn_max_b = 256
  1013. llama_new_context_with_model: fused_moe = 1
  1014. llama_new_context_with_model: ser = -1, 0
  1015. llama_new_context_with_model: freq_base = 10000.0
  1016. llama_new_context_with_model: freq_scale = 0.025
  1017. llama_kv_cache_init: CUDA0 KV buffer size = 180.00 MiB
  1018. llama_kv_cache_init: CUDA1 KV buffer size = 126.00 MiB
  1019. llama_kv_cache_init: CUDA2 KV buffer size = 126.00 MiB
  1020. llama_kv_cache_init: CUDA3 KV buffer size = 162.00 MiB
  1021. llama_kv_cache_init: CUDA4 KV buffer size = 144.00 MiB
  1022. llama_kv_cache_init: CUDA5 KV buffer size = 126.00 MiB
  1023. llama_kv_cache_init: CUDA6 KV buffer size = 234.00 MiB
  1024. llama_new_context_with_model: KV self size = 1098.00 MiB, c^KV (f16): 1098.00 MiB, kv^T: not used
  1025. llama_new_context_with_model: CUDA_Host output buffer size = 0.49 MiB
  1026. llama_new_context_with_model: pipeline parallelism enabled (n_copies=1)
  1027. llama_new_context_with_model: CUDA0 compute buffer size = 3566.01 MiB
  1028. llama_new_context_with_model: CUDA1 compute buffer size = 688.00 MiB
  1029. llama_new_context_with_model: CUDA2 compute buffer size = 688.00 MiB
  1030. llama_new_context_with_model: CUDA3 compute buffer size = 688.00 MiB
  1031. llama_new_context_with_model: CUDA4 compute buffer size = 688.00 MiB
  1032. llama_new_context_with_model: CUDA5 compute buffer size = 688.00 MiB
  1033. llama_new_context_with_model: CUDA6 compute buffer size = 1122.00 MiB
  1034. llama_new_context_with_model: CUDA_Host compute buffer size = 184.02 MiB
  1035. llama_new_context_with_model: graph nodes = 8184
  1036. llama_new_context_with_model: graph splits = 299
  1037.  
  1038. main: n_kv_max = 16384, n_batch = 2048, n_ubatch = 2048, flash_attn = 1, n_gpu_layers = 999, n_threads = 8, n_threads_batch = 8
  1039.  