Jul 9th, 2025
  1. Tensor blk.0.attn_norm.weight buffer type overriden to CUDA0
  2. Tensor blk.0.attn_q_a_norm.weight buffer type overriden to CUDA0
  3. Tensor blk.0.attn_kv_a_norm.weight buffer type overriden to CUDA0
  4. Tensor blk.0.attn_q_a.weight buffer type overriden to CUDA0
  5. Tensor blk.0.attn_q_b.weight buffer type overriden to CUDA0
  6. Tensor blk.0.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  7. Tensor blk.0.attn_kv_b.weight buffer type overriden to CUDA0
  8. Tensor blk.0.attn_k_b.weight buffer type overriden to CUDA0
  9. Tensor blk.0.attn_v_b.weight buffer type overriden to CUDA0
  10. Tensor blk.0.attn_output.weight buffer type overriden to CUDA0
  11. Tensor blk.0.ffn_norm.weight buffer type overriden to CUDA0
  12. Tensor blk.0.ffn_gate.weight buffer type overriden to CUDA0
  13. Tensor blk.0.ffn_down.weight buffer type overriden to CUDA0
  14. Tensor blk.0.ffn_up.weight buffer type overriden to CUDA0
  15. Tensor blk.1.attn_norm.weight buffer type overriden to CUDA0
  16. Tensor blk.1.attn_q_a_norm.weight buffer type overriden to CUDA0
  17. Tensor blk.1.attn_kv_a_norm.weight buffer type overriden to CUDA0
  18. Tensor blk.1.attn_q_a.weight buffer type overriden to CUDA0
  19. Tensor blk.1.attn_q_b.weight buffer type overriden to CUDA0
  20. Tensor blk.1.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  21. Tensor blk.1.attn_kv_b.weight buffer type overriden to CUDA0
  22. Tensor blk.1.attn_k_b.weight buffer type overriden to CUDA0
  23. Tensor blk.1.attn_v_b.weight buffer type overriden to CUDA0
  24. Tensor blk.1.attn_output.weight buffer type overriden to CUDA0
  25. Tensor blk.1.ffn_norm.weight buffer type overriden to CUDA0
  26. Tensor blk.1.ffn_gate.weight buffer type overriden to CUDA0
  27. Tensor blk.1.ffn_down.weight buffer type overriden to CUDA0
  28. Tensor blk.1.ffn_up.weight buffer type overriden to CUDA0
  29. Tensor blk.2.attn_norm.weight buffer type overriden to CUDA0
  30. Tensor blk.2.attn_q_a_norm.weight buffer type overriden to CUDA0
  31. Tensor blk.2.attn_kv_a_norm.weight buffer type overriden to CUDA0
  32. Tensor blk.2.attn_q_a.weight buffer type overriden to CUDA0
  33. Tensor blk.2.attn_q_b.weight buffer type overriden to CUDA0
  34. Tensor blk.2.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  35. Tensor blk.2.attn_kv_b.weight buffer type overriden to CUDA0
  36. Tensor blk.2.attn_k_b.weight buffer type overriden to CUDA0
  37. Tensor blk.2.attn_v_b.weight buffer type overriden to CUDA0
  38. Tensor blk.2.attn_output.weight buffer type overriden to CUDA0
  39. Tensor blk.2.ffn_norm.weight buffer type overriden to CUDA0
  40. Tensor blk.2.ffn_gate.weight buffer type overriden to CUDA0
  41. Tensor blk.2.ffn_down.weight buffer type overriden to CUDA0
  42. Tensor blk.2.ffn_up.weight buffer type overriden to CUDA0
  43. Tensor blk.3.attn_norm.weight buffer type overriden to CUDA0
  44. Tensor blk.3.attn_q_a_norm.weight buffer type overriden to CUDA0
  45. Tensor blk.3.attn_kv_a_norm.weight buffer type overriden to CUDA0
  46. Tensor blk.3.attn_q_a.weight buffer type overriden to CUDA0
  47. Tensor blk.3.attn_q_b.weight buffer type overriden to CUDA0
  48. Tensor blk.3.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  49. Tensor blk.3.attn_kv_b.weight buffer type overriden to CUDA0
  50. Tensor blk.3.attn_k_b.weight buffer type overriden to CUDA0
  51. Tensor blk.3.attn_v_b.weight buffer type overriden to CUDA0
  52. Tensor blk.3.attn_output.weight buffer type overriden to CUDA0
  53. Tensor blk.3.ffn_norm.weight buffer type overriden to CUDA0
  54. Tensor blk.3.ffn_gate_inp.weight buffer type overriden to CUDA0
  55. Tensor blk.3.ffn_gate_exps.weight buffer type overriden to CUDA0
  56. Tensor blk.3.ffn_down_exps.weight buffer type overriden to CUDA0
  57. Tensor blk.3.ffn_up_exps.weight buffer type overriden to CUDA0
  58. Tensor blk.3.ffn_gate_shexp.weight buffer type overriden to CUDA0
  59. Tensor blk.3.ffn_down_shexp.weight buffer type overriden to CUDA0
  60. Tensor blk.3.ffn_up_shexp.weight buffer type overriden to CUDA0
  61. Tensor blk.4.attn_norm.weight buffer type overriden to CUDA0
  62. Tensor blk.4.attn_q_a_norm.weight buffer type overriden to CUDA0
  63. Tensor blk.4.attn_kv_a_norm.weight buffer type overriden to CUDA0
  64. Tensor blk.4.attn_q_a.weight buffer type overriden to CUDA0
  65. Tensor blk.4.attn_q_b.weight buffer type overriden to CUDA0
  66. Tensor blk.4.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  67. Tensor blk.4.attn_kv_b.weight buffer type overriden to CUDA0
  68. Tensor blk.4.attn_k_b.weight buffer type overriden to CUDA0
  69. Tensor blk.4.attn_v_b.weight buffer type overriden to CUDA0
  70. Tensor blk.4.attn_output.weight buffer type overriden to CUDA0
  71. Tensor blk.4.ffn_norm.weight buffer type overriden to CUDA0
  72. Tensor blk.4.ffn_gate_inp.weight buffer type overriden to CUDA0
  73. Tensor blk.4.ffn_gate_exps.weight buffer type overriden to CUDA0
  74. Tensor blk.4.ffn_down_exps.weight buffer type overriden to CUDA0
  75. Tensor blk.4.ffn_up_exps.weight buffer type overriden to CUDA0
  76. Tensor blk.4.ffn_gate_shexp.weight buffer type overriden to CUDA0
  77. Tensor blk.4.ffn_down_shexp.weight buffer type overriden to CUDA0
  78. Tensor blk.4.ffn_up_shexp.weight buffer type overriden to CUDA0
  79. Tensor blk.5.attn_norm.weight buffer type overriden to CUDA0
  80. Tensor blk.5.attn_q_a_norm.weight buffer type overriden to CUDA0
  81. Tensor blk.5.attn_kv_a_norm.weight buffer type overriden to CUDA0
  82. Tensor blk.5.attn_q_a.weight buffer type overriden to CUDA0
  83. Tensor blk.5.attn_q_b.weight buffer type overriden to CUDA0
  84. Tensor blk.5.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  85. Tensor blk.5.attn_kv_b.weight buffer type overriden to CUDA0
  86. Tensor blk.5.attn_k_b.weight buffer type overriden to CUDA0
  87. Tensor blk.5.attn_v_b.weight buffer type overriden to CUDA0
  88. Tensor blk.5.attn_output.weight buffer type overriden to CUDA0
  89. Tensor blk.5.ffn_norm.weight buffer type overriden to CUDA0
  90. Tensor blk.5.ffn_gate_inp.weight buffer type overriden to CUDA0
  91. Tensor blk.5.ffn_gate_exps.weight buffer type overriden to CUDA0
  92. Tensor blk.5.ffn_down_exps.weight buffer type overriden to CUDA0
  93. Tensor blk.5.ffn_up_exps.weight buffer type overriden to CUDA0
  94. Tensor blk.5.ffn_gate_shexp.weight buffer type overriden to CUDA0
  95. Tensor blk.5.ffn_down_shexp.weight buffer type overriden to CUDA0
  96. Tensor blk.5.ffn_up_shexp.weight buffer type overriden to CUDA0
  97. Tensor blk.6.attn_norm.weight buffer type overriden to CUDA0
  98. Tensor blk.6.attn_q_a_norm.weight buffer type overriden to CUDA0
  99. Tensor blk.6.attn_kv_a_norm.weight buffer type overriden to CUDA0
  100. Tensor blk.6.attn_q_a.weight buffer type overriden to CUDA0
  101. Tensor blk.6.attn_q_b.weight buffer type overriden to CUDA0
  102. Tensor blk.6.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  103. Tensor blk.6.attn_kv_b.weight buffer type overriden to CUDA0
  104. Tensor blk.6.attn_k_b.weight buffer type overriden to CUDA0
  105. Tensor blk.6.attn_v_b.weight buffer type overriden to CUDA0
  106. Tensor blk.6.attn_output.weight buffer type overriden to CUDA0
  107. Tensor blk.6.ffn_norm.weight buffer type overriden to CUDA1
  108. Tensor blk.6.ffn_gate_inp.weight buffer type overriden to CUDA1
  109. Tensor blk.6.ffn_gate_exps.weight buffer type overriden to CUDA1
  110. Tensor blk.6.ffn_down_exps.weight buffer type overriden to CUDA1
  111. Tensor blk.6.ffn_up_exps.weight buffer type overriden to CUDA1
  112. Tensor blk.6.ffn_gate_shexp.weight buffer type overriden to CUDA0
  113. Tensor blk.6.ffn_down_shexp.weight buffer type overriden to CUDA0
  114. Tensor blk.6.ffn_up_shexp.weight buffer type overriden to CUDA0
  115. Tensor blk.7.attn_norm.weight buffer type overriden to CUDA0
  116. Tensor blk.7.attn_q_a_norm.weight buffer type overriden to CUDA0
  117. Tensor blk.7.attn_kv_a_norm.weight buffer type overriden to CUDA0
  118. Tensor blk.7.attn_q_a.weight buffer type overriden to CUDA0
  119. Tensor blk.7.attn_q_b.weight buffer type overriden to CUDA0
  120. Tensor blk.7.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  121. Tensor blk.7.attn_kv_b.weight buffer type overriden to CUDA0
  122. Tensor blk.7.attn_k_b.weight buffer type overriden to CUDA0
  123. Tensor blk.7.attn_v_b.weight buffer type overriden to CUDA0
  124. Tensor blk.7.attn_output.weight buffer type overriden to CUDA0
  125. Tensor blk.7.ffn_norm.weight buffer type overriden to CUDA1
  126. Tensor blk.7.ffn_gate_inp.weight buffer type overriden to CUDA1
  127. Tensor blk.7.ffn_gate_exps.weight buffer type overriden to CUDA1
  128. Tensor blk.7.ffn_down_exps.weight buffer type overriden to CUDA1
  129. Tensor blk.7.ffn_up_exps.weight buffer type overriden to CUDA1
  130. Tensor blk.7.ffn_gate_shexp.weight buffer type overriden to CUDA0
  131. Tensor blk.7.ffn_down_shexp.weight buffer type overriden to CUDA0
  132. Tensor blk.7.ffn_up_shexp.weight buffer type overriden to CUDA0
  133. Tensor blk.8.attn_norm.weight buffer type overriden to CUDA0
  134. Tensor blk.8.attn_q_a_norm.weight buffer type overriden to CUDA0
  135. Tensor blk.8.attn_kv_a_norm.weight buffer type overriden to CUDA0
  136. Tensor blk.8.attn_q_a.weight buffer type overriden to CUDA0
  137. Tensor blk.8.attn_q_b.weight buffer type overriden to CUDA0
  138. Tensor blk.8.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  139. Tensor blk.8.attn_kv_b.weight buffer type overriden to CUDA0
  140. Tensor blk.8.attn_k_b.weight buffer type overriden to CUDA0
  141. Tensor blk.8.attn_v_b.weight buffer type overriden to CUDA0
  142. Tensor blk.8.attn_output.weight buffer type overriden to CUDA0
  143. Tensor blk.8.ffn_norm.weight buffer type overriden to CUDA1
  144. Tensor blk.8.ffn_gate_inp.weight buffer type overriden to CUDA1
  145. Tensor blk.8.ffn_gate_exps.weight buffer type overriden to CUDA1
  146. Tensor blk.8.ffn_down_exps.weight buffer type overriden to CUDA1
  147. Tensor blk.8.ffn_up_exps.weight buffer type overriden to CUDA1
  148. Tensor blk.8.ffn_gate_shexp.weight buffer type overriden to CUDA0
  149. Tensor blk.8.ffn_down_shexp.weight buffer type overriden to CUDA0
  150. Tensor blk.8.ffn_up_shexp.weight buffer type overriden to CUDA0
  151. Tensor blk.9.attn_norm.weight buffer type overriden to CUDA0
  152. Tensor blk.9.attn_q_a_norm.weight buffer type overriden to CUDA0
  153. Tensor blk.9.attn_kv_a_norm.weight buffer type overriden to CUDA0
  154. Tensor blk.9.attn_q_a.weight buffer type overriden to CUDA0
  155. Tensor blk.9.attn_q_b.weight buffer type overriden to CUDA0
  156. Tensor blk.9.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  157. Tensor blk.9.attn_kv_b.weight buffer type overriden to CUDA0
  158. Tensor blk.9.attn_k_b.weight buffer type overriden to CUDA0
  159. Tensor blk.9.attn_v_b.weight buffer type overriden to CUDA0
  160. Tensor blk.9.attn_output.weight buffer type overriden to CUDA0
  161. Tensor blk.9.ffn_norm.weight buffer type overriden to CUDA1
  162. Tensor blk.9.ffn_gate_inp.weight buffer type overriden to CUDA1
  163. Tensor blk.9.ffn_gate_exps.weight buffer type overriden to CUDA1
  164. Tensor blk.9.ffn_down_exps.weight buffer type overriden to CUDA1
  165. Tensor blk.9.ffn_up_exps.weight buffer type overriden to CUDA1
  166. Tensor blk.9.ffn_gate_shexp.weight buffer type overriden to CUDA0
  167. Tensor blk.9.ffn_down_shexp.weight buffer type overriden to CUDA0
  168. Tensor blk.9.ffn_up_shexp.weight buffer type overriden to CUDA0
  169. Tensor blk.10.attn_norm.weight buffer type overriden to CUDA0
  170. Tensor blk.10.attn_q_a_norm.weight buffer type overriden to CUDA0
  171. Tensor blk.10.attn_kv_a_norm.weight buffer type overriden to CUDA0
  172. Tensor blk.10.attn_q_a.weight buffer type overriden to CUDA0
  173. Tensor blk.10.attn_q_b.weight buffer type overriden to CUDA0
  174. Tensor blk.10.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  175. Tensor blk.10.attn_kv_b.weight buffer type overriden to CUDA0
  176. Tensor blk.10.attn_k_b.weight buffer type overriden to CUDA0
  177. Tensor blk.10.attn_v_b.weight buffer type overriden to CUDA0
  178. Tensor blk.10.attn_output.weight buffer type overriden to CUDA0
  179. Tensor blk.10.ffn_norm.weight buffer type overriden to CUDA2
  180. Tensor blk.10.ffn_gate_inp.weight buffer type overriden to CUDA2
  181. Tensor blk.10.ffn_gate_exps.weight buffer type overriden to CUDA2
  182. Tensor blk.10.ffn_down_exps.weight buffer type overriden to CUDA2
  183. Tensor blk.10.ffn_up_exps.weight buffer type overriden to CUDA2
  184. Tensor blk.10.ffn_gate_shexp.weight buffer type overriden to CUDA0
  185. Tensor blk.10.ffn_down_shexp.weight buffer type overriden to CUDA0
  186. Tensor blk.10.ffn_up_shexp.weight buffer type overriden to CUDA0
  187. Tensor blk.11.attn_norm.weight buffer type overriden to CUDA0
  188. Tensor blk.11.attn_q_a_norm.weight buffer type overriden to CUDA0
  189. Tensor blk.11.attn_kv_a_norm.weight buffer type overriden to CUDA0
  190. Tensor blk.11.attn_q_a.weight buffer type overriden to CUDA0
  191. Tensor blk.11.attn_q_b.weight buffer type overriden to CUDA0
  192. Tensor blk.11.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  193. Tensor blk.11.attn_kv_b.weight buffer type overriden to CUDA0
  194. Tensor blk.11.attn_k_b.weight buffer type overriden to CUDA0
  195. Tensor blk.11.attn_v_b.weight buffer type overriden to CUDA0
  196. Tensor blk.11.attn_output.weight buffer type overriden to CUDA0
  197. Tensor blk.11.ffn_norm.weight buffer type overriden to CUDA2
  198. Tensor blk.11.ffn_gate_inp.weight buffer type overriden to CUDA2
  199. Tensor blk.11.ffn_gate_exps.weight buffer type overriden to CUDA2
  200. Tensor blk.11.ffn_down_exps.weight buffer type overriden to CUDA2
  201. Tensor blk.11.ffn_up_exps.weight buffer type overriden to CUDA2
  202. Tensor blk.11.ffn_gate_shexp.weight buffer type overriden to CUDA0
  203. Tensor blk.11.ffn_down_shexp.weight buffer type overriden to CUDA0
  204. Tensor blk.11.ffn_up_shexp.weight buffer type overriden to CUDA0
  205. Tensor blk.12.attn_norm.weight buffer type overriden to CUDA0
  206. Tensor blk.12.attn_q_a_norm.weight buffer type overriden to CUDA0
  207. Tensor blk.12.attn_kv_a_norm.weight buffer type overriden to CUDA0
  208. Tensor blk.12.attn_q_a.weight buffer type overriden to CUDA0
  209. Tensor blk.12.attn_q_b.weight buffer type overriden to CUDA0
  210. Tensor blk.12.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  211. Tensor blk.12.attn_kv_b.weight buffer type overriden to CUDA0
  212. Tensor blk.12.attn_k_b.weight buffer type overriden to CUDA0
  213. Tensor blk.12.attn_v_b.weight buffer type overriden to CUDA0
  214. Tensor blk.12.attn_output.weight buffer type overriden to CUDA0
  215. Tensor blk.12.ffn_norm.weight buffer type overriden to CUDA2
  216. Tensor blk.12.ffn_gate_inp.weight buffer type overriden to CUDA2
  217. Tensor blk.12.ffn_gate_exps.weight buffer type overriden to CUDA2
  218. Tensor blk.12.ffn_down_exps.weight buffer type overriden to CUDA2
  219. Tensor blk.12.ffn_up_exps.weight buffer type overriden to CUDA2
  220. Tensor blk.12.ffn_gate_shexp.weight buffer type overriden to CUDA0
  221. Tensor blk.12.ffn_down_shexp.weight buffer type overriden to CUDA0
  222. Tensor blk.12.ffn_up_shexp.weight buffer type overriden to CUDA0
  223. Tensor blk.13.attn_norm.weight buffer type overriden to CUDA0
  224. Tensor blk.13.attn_q_a_norm.weight buffer type overriden to CUDA0
  225. Tensor blk.13.attn_kv_a_norm.weight buffer type overriden to CUDA0
  226. Tensor blk.13.attn_q_a.weight buffer type overriden to CUDA0
  227. Tensor blk.13.attn_q_b.weight buffer type overriden to CUDA0
  228. Tensor blk.13.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  229. Tensor blk.13.attn_kv_b.weight buffer type overriden to CUDA0
  230. Tensor blk.13.attn_k_b.weight buffer type overriden to CUDA0
  231. Tensor blk.13.attn_v_b.weight buffer type overriden to CUDA0
  232. Tensor blk.13.attn_output.weight buffer type overriden to CUDA0
  233. Tensor blk.13.ffn_norm.weight buffer type overriden to CUDA2
  234. Tensor blk.13.ffn_gate_inp.weight buffer type overriden to CUDA2
  235. Tensor blk.13.ffn_gate_exps.weight buffer type overriden to CUDA2
  236. Tensor blk.13.ffn_down_exps.weight buffer type overriden to CUDA2
  237. Tensor blk.13.ffn_up_exps.weight buffer type overriden to CUDA2
  238. Tensor blk.13.ffn_gate_shexp.weight buffer type overriden to CUDA0
  239. Tensor blk.13.ffn_down_shexp.weight buffer type overriden to CUDA0
  240. Tensor blk.13.ffn_up_shexp.weight buffer type overriden to CUDA0
  241. Tensor blk.14.attn_norm.weight buffer type overriden to CUDA0
  242. Tensor blk.14.attn_q_a_norm.weight buffer type overriden to CUDA0
  243. Tensor blk.14.attn_kv_a_norm.weight buffer type overriden to CUDA0
  244. Tensor blk.14.attn_q_a.weight buffer type overriden to CUDA0
  245. Tensor blk.14.attn_q_b.weight buffer type overriden to CUDA0
  246. Tensor blk.14.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  247. Tensor blk.14.attn_kv_b.weight buffer type overriden to CUDA0
  248. Tensor blk.14.attn_k_b.weight buffer type overriden to CUDA0
  249. Tensor blk.14.attn_v_b.weight buffer type overriden to CUDA0
  250. Tensor blk.14.attn_output.weight buffer type overriden to CUDA0
  251. Tensor blk.14.ffn_norm.weight buffer type overriden to CUDA3
  252. Tensor blk.14.ffn_gate_inp.weight buffer type overriden to CUDA3
  253. Tensor blk.14.ffn_gate_exps.weight buffer type overriden to CUDA3
  254. Tensor blk.14.ffn_down_exps.weight buffer type overriden to CUDA3
  255. Tensor blk.14.ffn_up_exps.weight buffer type overriden to CUDA3
  256. Tensor blk.14.ffn_gate_shexp.weight buffer type overriden to CUDA0
  257. Tensor blk.14.ffn_down_shexp.weight buffer type overriden to CUDA0
  258. Tensor blk.14.ffn_up_shexp.weight buffer type overriden to CUDA0
  259. Tensor blk.15.attn_norm.weight buffer type overriden to CUDA0
  260. Tensor blk.15.attn_q_a_norm.weight buffer type overriden to CUDA0
  261. Tensor blk.15.attn_kv_a_norm.weight buffer type overriden to CUDA0
  262. Tensor blk.15.attn_q_a.weight buffer type overriden to CUDA0
  263. Tensor blk.15.attn_q_b.weight buffer type overriden to CUDA0
  264. Tensor blk.15.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  265. Tensor blk.15.attn_kv_b.weight buffer type overriden to CUDA0
  266. Tensor blk.15.attn_k_b.weight buffer type overriden to CUDA0
  267. Tensor blk.15.attn_v_b.weight buffer type overriden to CUDA0
  268. Tensor blk.15.attn_output.weight buffer type overriden to CUDA0
  269. Tensor blk.15.ffn_norm.weight buffer type overriden to CUDA3
  270. Tensor blk.15.ffn_gate_inp.weight buffer type overriden to CUDA3
  271. Tensor blk.15.ffn_gate_exps.weight buffer type overriden to CUDA3
  272. Tensor blk.15.ffn_down_exps.weight buffer type overriden to CUDA3
  273. Tensor blk.15.ffn_up_exps.weight buffer type overriden to CUDA3
  274. Tensor blk.15.ffn_gate_shexp.weight buffer type overriden to CUDA0
  275. Tensor blk.15.ffn_down_shexp.weight buffer type overriden to CUDA0
  276. Tensor blk.15.ffn_up_shexp.weight buffer type overriden to CUDA0
  277. Tensor blk.16.attn_norm.weight buffer type overriden to CUDA0
  278. Tensor blk.16.attn_q_a_norm.weight buffer type overriden to CUDA0
  279. Tensor blk.16.attn_kv_a_norm.weight buffer type overriden to CUDA0
  280. Tensor blk.16.attn_q_a.weight buffer type overriden to CUDA0
  281. Tensor blk.16.attn_q_b.weight buffer type overriden to CUDA0
  282. Tensor blk.16.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  283. Tensor blk.16.attn_kv_b.weight buffer type overriden to CUDA0
  284. Tensor blk.16.attn_k_b.weight buffer type overriden to CUDA0
  285. Tensor blk.16.attn_v_b.weight buffer type overriden to CUDA0
  286. Tensor blk.16.attn_output.weight buffer type overriden to CUDA0
  287. Tensor blk.16.ffn_norm.weight buffer type overriden to CUDA3
  288. Tensor blk.16.ffn_gate_inp.weight buffer type overriden to CUDA3
  289. Tensor blk.16.ffn_gate_exps.weight buffer type overriden to CUDA3
  290. Tensor blk.16.ffn_down_exps.weight buffer type overriden to CUDA3
  291. Tensor blk.16.ffn_up_exps.weight buffer type overriden to CUDA3
  292. Tensor blk.16.ffn_gate_shexp.weight buffer type overriden to CUDA0
  293. Tensor blk.16.ffn_down_shexp.weight buffer type overriden to CUDA0
  294. Tensor blk.16.ffn_up_shexp.weight buffer type overriden to CUDA0
  295. Tensor blk.17.attn_norm.weight buffer type overriden to CUDA0
  296. Tensor blk.17.attn_q_a_norm.weight buffer type overriden to CUDA0
  297. Tensor blk.17.attn_kv_a_norm.weight buffer type overriden to CUDA0
  298. Tensor blk.17.attn_q_a.weight buffer type overriden to CUDA0
  299. Tensor blk.17.attn_q_b.weight buffer type overriden to CUDA0
  300. Tensor blk.17.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  301. Tensor blk.17.attn_kv_b.weight buffer type overriden to CUDA0
  302. Tensor blk.17.attn_k_b.weight buffer type overriden to CUDA0
  303. Tensor blk.17.attn_v_b.weight buffer type overriden to CUDA0
  304. Tensor blk.17.attn_output.weight buffer type overriden to CUDA0
  305. Tensor blk.17.ffn_norm.weight buffer type overriden to CUDA3
  306. Tensor blk.17.ffn_gate_inp.weight buffer type overriden to CUDA3
  307. Tensor blk.17.ffn_gate_exps.weight buffer type overriden to CUDA3
  308. Tensor blk.17.ffn_down_exps.weight buffer type overriden to CUDA3
  309. Tensor blk.17.ffn_up_exps.weight buffer type overriden to CUDA3
  310. Tensor blk.17.ffn_gate_shexp.weight buffer type overriden to CUDA0
  311. Tensor blk.17.ffn_down_shexp.weight buffer type overriden to CUDA0
  312. Tensor blk.17.ffn_up_shexp.weight buffer type overriden to CUDA0
  313. Tensor blk.18.attn_norm.weight buffer type overriden to CUDA0
  314. Tensor blk.18.attn_q_a_norm.weight buffer type overriden to CUDA0
  315. Tensor blk.18.attn_kv_a_norm.weight buffer type overriden to CUDA0
  316. Tensor blk.18.attn_q_a.weight buffer type overriden to CUDA0
  317. Tensor blk.18.attn_q_b.weight buffer type overriden to CUDA0
  318. Tensor blk.18.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  319. Tensor blk.18.attn_kv_b.weight buffer type overriden to CUDA0
  320. Tensor blk.18.attn_k_b.weight buffer type overriden to CUDA0
  321. Tensor blk.18.attn_v_b.weight buffer type overriden to CUDA0
  322. Tensor blk.18.attn_output.weight buffer type overriden to CUDA0
  323. Tensor blk.18.ffn_norm.weight buffer type overriden to CUDA3
  324. Tensor blk.18.ffn_gate_inp.weight buffer type overriden to CUDA3
  325. Tensor blk.18.ffn_gate_exps.weight buffer type overriden to CUDA3
  326. Tensor blk.18.ffn_down_exps.weight buffer type overriden to CUDA3
  327. Tensor blk.18.ffn_up_exps.weight buffer type overriden to CUDA3
  328. Tensor blk.18.ffn_gate_shexp.weight buffer type overriden to CUDA0
  329. Tensor blk.18.ffn_down_shexp.weight buffer type overriden to CUDA0
  330. Tensor blk.18.ffn_up_shexp.weight buffer type overriden to CUDA0
  331. Tensor blk.19.attn_norm.weight buffer type overriden to CUDA0
  332. Tensor blk.19.attn_q_a_norm.weight buffer type overriden to CUDA0
  333. Tensor blk.19.attn_kv_a_norm.weight buffer type overriden to CUDA0
  334. Tensor blk.19.attn_q_a.weight buffer type overriden to CUDA0
  335. Tensor blk.19.attn_q_b.weight buffer type overriden to CUDA0
  336. Tensor blk.19.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  337. Tensor blk.19.attn_kv_b.weight buffer type overriden to CUDA0
  338. Tensor blk.19.attn_k_b.weight buffer type overriden to CUDA0
  339. Tensor blk.19.attn_v_b.weight buffer type overriden to CUDA0
  340. Tensor blk.19.attn_output.weight buffer type overriden to CUDA0
  341. Tensor blk.19.ffn_norm.weight buffer type overriden to CUDA4
  342. Tensor blk.19.ffn_gate_inp.weight buffer type overriden to CUDA4
  343. Tensor blk.19.ffn_gate_exps.weight buffer type overriden to CUDA4
  344. Tensor blk.19.ffn_down_exps.weight buffer type overriden to CUDA4
  345. Tensor blk.19.ffn_up_exps.weight buffer type overriden to CUDA4
  346. Tensor blk.19.ffn_gate_shexp.weight buffer type overriden to CUDA0
  347. Tensor blk.19.ffn_down_shexp.weight buffer type overriden to CUDA0
  348. Tensor blk.19.ffn_up_shexp.weight buffer type overriden to CUDA0
  349. Tensor blk.20.attn_norm.weight buffer type overriden to CUDA0
  350. Tensor blk.20.attn_q_a_norm.weight buffer type overriden to CUDA0
  351. Tensor blk.20.attn_kv_a_norm.weight buffer type overriden to CUDA0
  352. Tensor blk.20.attn_q_a.weight buffer type overriden to CUDA0
  353. Tensor blk.20.attn_q_b.weight buffer type overriden to CUDA0
  354. Tensor blk.20.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  355. Tensor blk.20.attn_kv_b.weight buffer type overriden to CUDA0
  356. Tensor blk.20.attn_k_b.weight buffer type overriden to CUDA0
  357. Tensor blk.20.attn_v_b.weight buffer type overriden to CUDA0
  358. Tensor blk.20.attn_output.weight buffer type overriden to CUDA0
  359. Tensor blk.20.ffn_norm.weight buffer type overriden to CUDA4
  360. Tensor blk.20.ffn_gate_inp.weight buffer type overriden to CUDA4
  361. Tensor blk.20.ffn_gate_exps.weight buffer type overriden to CUDA4
  362. Tensor blk.20.ffn_down_exps.weight buffer type overriden to CUDA4
  363. Tensor blk.20.ffn_up_exps.weight buffer type overriden to CUDA4
  364. Tensor blk.20.ffn_gate_shexp.weight buffer type overriden to CUDA0
  365. Tensor blk.20.ffn_down_shexp.weight buffer type overriden to CUDA0
  366. Tensor blk.20.ffn_up_shexp.weight buffer type overriden to CUDA0
  367. Tensor blk.21.attn_norm.weight buffer type overriden to CUDA0
  368. Tensor blk.21.attn_q_a_norm.weight buffer type overriden to CUDA0
  369. Tensor blk.21.attn_kv_a_norm.weight buffer type overriden to CUDA0
  370. Tensor blk.21.attn_q_a.weight buffer type overriden to CUDA0
  371. Tensor blk.21.attn_q_b.weight buffer type overriden to CUDA0
  372. Tensor blk.21.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  373. Tensor blk.21.attn_kv_b.weight buffer type overriden to CUDA0
  374. Tensor blk.21.attn_k_b.weight buffer type overriden to CUDA0
  375. Tensor blk.21.attn_v_b.weight buffer type overriden to CUDA0
  376. Tensor blk.21.attn_output.weight buffer type overriden to CUDA0
  377. Tensor blk.21.ffn_norm.weight buffer type overriden to CUDA4
  378. Tensor blk.21.ffn_gate_inp.weight buffer type overriden to CUDA4
  379. Tensor blk.21.ffn_gate_exps.weight buffer type overriden to CUDA4
  380. Tensor blk.21.ffn_down_exps.weight buffer type overriden to CUDA4
  381. Tensor blk.21.ffn_up_exps.weight buffer type overriden to CUDA4
  382. Tensor blk.21.ffn_gate_shexp.weight buffer type overriden to CUDA0
  383. Tensor blk.21.ffn_down_shexp.weight buffer type overriden to CUDA0
  384. Tensor blk.21.ffn_up_shexp.weight buffer type overriden to CUDA0
  385. Tensor blk.22.attn_norm.weight buffer type overriden to CUDA0
  386. Tensor blk.22.attn_q_a_norm.weight buffer type overriden to CUDA0
  387. Tensor blk.22.attn_kv_a_norm.weight buffer type overriden to CUDA0
  388. Tensor blk.22.attn_q_a.weight buffer type overriden to CUDA0
  389. Tensor blk.22.attn_q_b.weight buffer type overriden to CUDA0
  390. Tensor blk.22.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  391. Tensor blk.22.attn_kv_b.weight buffer type overriden to CUDA0
  392. Tensor blk.22.attn_k_b.weight buffer type overriden to CUDA0
  393. Tensor blk.22.attn_v_b.weight buffer type overriden to CUDA0
  394. Tensor blk.22.attn_output.weight buffer type overriden to CUDA0
  395. Tensor blk.22.ffn_norm.weight buffer type overriden to CUDA5
  396. Tensor blk.22.ffn_gate_inp.weight buffer type overriden to CUDA5
  397. Tensor blk.22.ffn_gate_exps.weight buffer type overriden to CUDA5
  398. Tensor blk.22.ffn_down_exps.weight buffer type overriden to CUDA5
  399. Tensor blk.22.ffn_up_exps.weight buffer type overriden to CUDA5
  400. Tensor blk.22.ffn_gate_shexp.weight buffer type overriden to CUDA0
  401. Tensor blk.22.ffn_down_shexp.weight buffer type overriden to CUDA0
  402. Tensor blk.22.ffn_up_shexp.weight buffer type overriden to CUDA0
  403. Tensor blk.23.attn_norm.weight buffer type overriden to CUDA0
  404. Tensor blk.23.attn_q_a_norm.weight buffer type overriden to CUDA0
  405. Tensor blk.23.attn_kv_a_norm.weight buffer type overriden to CUDA0
  406. Tensor blk.23.attn_q_a.weight buffer type overriden to CUDA0
  407. Tensor blk.23.attn_q_b.weight buffer type overriden to CUDA0
  408. Tensor blk.23.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  409. Tensor blk.23.attn_kv_b.weight buffer type overriden to CUDA0
  410. Tensor blk.23.attn_k_b.weight buffer type overriden to CUDA0
  411. Tensor blk.23.attn_v_b.weight buffer type overriden to CUDA0
  412. Tensor blk.23.attn_output.weight buffer type overriden to CUDA0
  413. Tensor blk.23.ffn_norm.weight buffer type overriden to CUDA5
  414. Tensor blk.23.ffn_gate_inp.weight buffer type overriden to CUDA5
  415. Tensor blk.23.ffn_gate_exps.weight buffer type overriden to CUDA5
  416. Tensor blk.23.ffn_down_exps.weight buffer type overriden to CUDA5
  417. Tensor blk.23.ffn_up_exps.weight buffer type overriden to CUDA5
  418. Tensor blk.23.ffn_gate_shexp.weight buffer type overriden to CUDA0
  419. Tensor blk.23.ffn_down_shexp.weight buffer type overriden to CUDA0
  420. Tensor blk.23.ffn_up_shexp.weight buffer type overriden to CUDA0
  421. Tensor blk.24.attn_norm.weight buffer type overriden to CUDA0
  422. Tensor blk.24.attn_q_a_norm.weight buffer type overriden to CUDA0
  423. Tensor blk.24.attn_kv_a_norm.weight buffer type overriden to CUDA0
  424. Tensor blk.24.attn_q_a.weight buffer type overriden to CUDA0
  425. Tensor blk.24.attn_q_b.weight buffer type overriden to CUDA0
  426. Tensor blk.24.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  427. Tensor blk.24.attn_kv_b.weight buffer type overriden to CUDA0
  428. Tensor blk.24.attn_k_b.weight buffer type overriden to CUDA0
  429. Tensor blk.24.attn_v_b.weight buffer type overriden to CUDA0
  430. Tensor blk.24.attn_output.weight buffer type overriden to CUDA0
  431. Tensor blk.24.ffn_norm.weight buffer type overriden to CUDA5
  432. Tensor blk.24.ffn_gate_inp.weight buffer type overriden to CUDA5
  433. Tensor blk.24.ffn_gate_exps.weight buffer type overriden to CUDA5
  434. Tensor blk.24.ffn_down_exps.weight buffer type overriden to CUDA5
  435. Tensor blk.24.ffn_up_exps.weight buffer type overriden to CUDA5
  436. Tensor blk.24.ffn_gate_shexp.weight buffer type overriden to CUDA0
  437. Tensor blk.24.ffn_down_shexp.weight buffer type overriden to CUDA0
  438. Tensor blk.24.ffn_up_shexp.weight buffer type overriden to CUDA0
  439. Tensor blk.25.attn_norm.weight buffer type overriden to CUDA0
  440. Tensor blk.25.attn_q_a_norm.weight buffer type overriden to CUDA0
  441. Tensor blk.25.attn_kv_a_norm.weight buffer type overriden to CUDA0
  442. Tensor blk.25.attn_q_a.weight buffer type overriden to CUDA0
  443. Tensor blk.25.attn_q_b.weight buffer type overriden to CUDA0
  444. Tensor blk.25.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  445. Tensor blk.25.attn_kv_b.weight buffer type overriden to CUDA0
  446. Tensor blk.25.attn_k_b.weight buffer type overriden to CUDA0
  447. Tensor blk.25.attn_v_b.weight buffer type overriden to CUDA0
  448. Tensor blk.25.attn_output.weight buffer type overriden to CUDA0
  449. Tensor blk.25.ffn_norm.weight buffer type overriden to CUDA6
  450. Tensor blk.25.ffn_gate_inp.weight buffer type overriden to CUDA6
  451. Tensor blk.25.ffn_gate_exps.weight buffer type overriden to CUDA6
  452. Tensor blk.25.ffn_down_exps.weight buffer type overriden to CUDA6
  453. Tensor blk.25.ffn_up_exps.weight buffer type overriden to CUDA6
  454. Tensor blk.25.ffn_gate_shexp.weight buffer type overriden to CUDA0
  455. Tensor blk.25.ffn_down_shexp.weight buffer type overriden to CUDA0
  456. Tensor blk.25.ffn_up_shexp.weight buffer type overriden to CUDA0
  457. Tensor blk.26.attn_norm.weight buffer type overriden to CUDA0
  458. Tensor blk.26.attn_q_a_norm.weight buffer type overriden to CUDA0
  459. Tensor blk.26.attn_kv_a_norm.weight buffer type overriden to CUDA0
  460. Tensor blk.26.attn_q_a.weight buffer type overriden to CUDA0
  461. Tensor blk.26.attn_q_b.weight buffer type overriden to CUDA0
  462. Tensor blk.26.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  463. Tensor blk.26.attn_kv_b.weight buffer type overriden to CUDA0
  464. Tensor blk.26.attn_k_b.weight buffer type overriden to CUDA0
  465. Tensor blk.26.attn_v_b.weight buffer type overriden to CUDA0
  466. Tensor blk.26.attn_output.weight buffer type overriden to CUDA0
  467. Tensor blk.26.ffn_norm.weight buffer type overriden to CUDA6
  468. Tensor blk.26.ffn_gate_inp.weight buffer type overriden to CUDA6
  469. Tensor blk.26.ffn_gate_exps.weight buffer type overriden to CUDA6
  470. Tensor blk.26.ffn_down_exps.weight buffer type overriden to CUDA6
  471. Tensor blk.26.ffn_up_exps.weight buffer type overriden to CUDA6
  472. Tensor blk.26.ffn_gate_shexp.weight buffer type overriden to CUDA0
  473. Tensor blk.26.ffn_down_shexp.weight buffer type overriden to CUDA0
  474. Tensor blk.26.ffn_up_shexp.weight buffer type overriden to CUDA0
  475. Tensor blk.27.attn_norm.weight buffer type overriden to CUDA0
  476. Tensor blk.27.attn_q_a_norm.weight buffer type overriden to CUDA0
  477. Tensor blk.27.attn_kv_a_norm.weight buffer type overriden to CUDA0
  478. Tensor blk.27.attn_q_a.weight buffer type overriden to CUDA0
  479. Tensor blk.27.attn_q_b.weight buffer type overriden to CUDA0
  480. Tensor blk.27.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  481. Tensor blk.27.attn_kv_b.weight buffer type overriden to CUDA0
  482. Tensor blk.27.attn_k_b.weight buffer type overriden to CUDA0
  483. Tensor blk.27.attn_v_b.weight buffer type overriden to CUDA0
  484. Tensor blk.27.attn_output.weight buffer type overriden to CUDA0
  485. Tensor blk.27.ffn_norm.weight buffer type overriden to CUDA6
  486. Tensor blk.27.ffn_gate_inp.weight buffer type overriden to CUDA6
  487. Tensor blk.27.ffn_gate_exps.weight buffer type overriden to CUDA6
  488. Tensor blk.27.ffn_down_exps.weight buffer type overriden to CUDA6
  489. Tensor blk.27.ffn_up_exps.weight buffer type overriden to CUDA6
  490. Tensor blk.27.ffn_gate_shexp.weight buffer type overriden to CUDA0
  491. Tensor blk.27.ffn_down_shexp.weight buffer type overriden to CUDA0
  492. Tensor blk.27.ffn_up_shexp.weight buffer type overriden to CUDA0
  493. Tensor blk.28.attn_norm.weight buffer type overriden to CUDA0
  494. Tensor blk.28.attn_q_a_norm.weight buffer type overriden to CUDA0
  495. Tensor blk.28.attn_kv_a_norm.weight buffer type overriden to CUDA0
  496. Tensor blk.28.attn_q_a.weight buffer type overriden to CUDA0
  497. Tensor blk.28.attn_q_b.weight buffer type overriden to CUDA0
  498. Tensor blk.28.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  499. Tensor blk.28.attn_kv_b.weight buffer type overriden to CUDA0
  500. Tensor blk.28.attn_k_b.weight buffer type overriden to CUDA0
  501. Tensor blk.28.attn_v_b.weight buffer type overriden to CUDA0
  502. Tensor blk.28.attn_output.weight buffer type overriden to CUDA0
  503. Tensor blk.28.ffn_norm.weight buffer type overriden to CUDA6
  504. Tensor blk.28.ffn_gate_inp.weight buffer type overriden to CUDA6
  505. Tensor blk.28.ffn_gate_exps.weight buffer type overriden to CUDA6
  506. Tensor blk.28.ffn_down_exps.weight buffer type overriden to CUDA6
  507. Tensor blk.28.ffn_up_exps.weight buffer type overriden to CUDA6
  508. Tensor blk.28.ffn_gate_shexp.weight buffer type overriden to CUDA0
  509. Tensor blk.28.ffn_down_shexp.weight buffer type overriden to CUDA0
  510. Tensor blk.28.ffn_up_shexp.weight buffer type overriden to CUDA0
  511. Tensor blk.29.attn_norm.weight buffer type overriden to CUDA0
  512. Tensor blk.29.attn_q_a_norm.weight buffer type overriden to CUDA0
  513. Tensor blk.29.attn_kv_a_norm.weight buffer type overriden to CUDA0
  514. Tensor blk.29.attn_q_a.weight buffer type overriden to CUDA0
  515. Tensor blk.29.attn_q_b.weight buffer type overriden to CUDA0
  516. Tensor blk.29.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  517. Tensor blk.29.attn_kv_b.weight buffer type overriden to CUDA0
  518. Tensor blk.29.attn_k_b.weight buffer type overriden to CUDA0
  519. Tensor blk.29.attn_v_b.weight buffer type overriden to CUDA0
  520. Tensor blk.29.attn_output.weight buffer type overriden to CUDA0
  521. Tensor blk.29.ffn_norm.weight buffer type overriden to CUDA6
  522. Tensor blk.29.ffn_gate_inp.weight buffer type overriden to CUDA6
  523. Tensor blk.29.ffn_gate_exps.weight buffer type overriden to CUDA6
  524. Tensor blk.29.ffn_down_exps.weight buffer type overriden to CUDA6
  525. Tensor blk.29.ffn_up_exps.weight buffer type overriden to CUDA6
  526. Tensor blk.29.ffn_gate_shexp.weight buffer type overriden to CUDA0
  527. Tensor blk.29.ffn_down_shexp.weight buffer type overriden to CUDA0
  528. Tensor blk.29.ffn_up_shexp.weight buffer type overriden to CUDA0
  529. Tensor blk.30.attn_norm.weight buffer type overriden to CUDA0
  530. Tensor blk.30.attn_q_a_norm.weight buffer type overriden to CUDA0
  531. Tensor blk.30.attn_kv_a_norm.weight buffer type overriden to CUDA0
  532. Tensor blk.30.attn_q_a.weight buffer type overriden to CUDA0
  533. Tensor blk.30.attn_q_b.weight buffer type overriden to CUDA0
  534. Tensor blk.30.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  535. Tensor blk.30.attn_kv_b.weight buffer type overriden to CUDA0
  536. Tensor blk.30.attn_k_b.weight buffer type overriden to CUDA0
  537. Tensor blk.30.attn_v_b.weight buffer type overriden to CUDA0
  538. Tensor blk.30.attn_output.weight buffer type overriden to CUDA0
  539. Tensor blk.30.ffn_norm.weight buffer type overriden to CUDA6
  540. Tensor blk.30.ffn_gate_inp.weight buffer type overriden to CUDA6
  541. Tensor blk.30.ffn_gate_exps.weight buffer type overriden to CUDA6
  542. Tensor blk.30.ffn_down_exps.weight buffer type overriden to CUDA6
  543. Tensor blk.30.ffn_up_exps.weight buffer type overriden to CUDA6
  544. Tensor blk.30.ffn_gate_shexp.weight buffer type overriden to CUDA0
  545. Tensor blk.30.ffn_down_shexp.weight buffer type overriden to CUDA0
  546. Tensor blk.30.ffn_up_shexp.weight buffer type overriden to CUDA0
  547. Tensor blk.31.attn_norm.weight buffer type overriden to CUDA0
  548. Tensor blk.31.attn_q_a_norm.weight buffer type overriden to CUDA0
  549. Tensor blk.31.attn_kv_a_norm.weight buffer type overriden to CUDA0
  550. Tensor blk.31.attn_q_a.weight buffer type overriden to CUDA0
  551. Tensor blk.31.attn_q_b.weight buffer type overriden to CUDA0
  552. Tensor blk.31.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  553. Tensor blk.31.attn_kv_b.weight buffer type overriden to CUDA0
  554. Tensor blk.31.attn_k_b.weight buffer type overriden to CUDA0
  555. Tensor blk.31.attn_v_b.weight buffer type overriden to CUDA0
  556. Tensor blk.31.attn_output.weight buffer type overriden to CUDA0
  557. Tensor blk.31.ffn_norm.weight buffer type overriden to CUDA6
  558. Tensor blk.31.ffn_gate_inp.weight buffer type overriden to CUDA6
  559. Tensor blk.31.ffn_gate_exps.weight buffer type overriden to CUDA6
  560. Tensor blk.31.ffn_down_exps.weight buffer type overriden to CUDA6
  561. Tensor blk.31.ffn_up_exps.weight buffer type overriden to CUDA6
  562. Tensor blk.31.ffn_gate_shexp.weight buffer type overriden to CUDA0
  563. Tensor blk.31.ffn_down_shexp.weight buffer type overriden to CUDA0
  564. Tensor blk.31.ffn_up_shexp.weight buffer type overriden to CUDA0
  565. Tensor blk.32.attn_norm.weight buffer type overriden to CUDA0
  566. Tensor blk.32.attn_q_a_norm.weight buffer type overriden to CUDA0
  567. Tensor blk.32.attn_kv_a_norm.weight buffer type overriden to CUDA0
  568. Tensor blk.32.attn_q_a.weight buffer type overriden to CUDA0
  569. Tensor blk.32.attn_q_b.weight buffer type overriden to CUDA0
  570. Tensor blk.32.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  571. Tensor blk.32.attn_kv_b.weight buffer type overriden to CUDA0
  572. Tensor blk.32.attn_k_b.weight buffer type overriden to CUDA0
  573. Tensor blk.32.attn_v_b.weight buffer type overriden to CUDA0
  574. Tensor blk.32.attn_output.weight buffer type overriden to CUDA0
  575. Tensor blk.32.ffn_gate_exps.weight buffer type overriden to CPU
  576. Tensor blk.32.ffn_down_exps.weight buffer type overriden to CPU
  577. Tensor blk.32.ffn_up_exps.weight buffer type overriden to CPU
  578. Tensor blk.32.ffn_gate_shexp.weight buffer type overriden to CUDA0
  579. Tensor blk.32.ffn_down_shexp.weight buffer type overriden to CUDA0
  580. Tensor blk.32.ffn_up_shexp.weight buffer type overriden to CUDA0
  581. Tensor blk.33.attn_norm.weight buffer type overriden to CUDA0
  582. Tensor blk.33.attn_q_a_norm.weight buffer type overriden to CUDA0
  583. Tensor blk.33.attn_kv_a_norm.weight buffer type overriden to CUDA0
  584. Tensor blk.33.attn_q_a.weight buffer type overriden to CUDA0
  585. Tensor blk.33.attn_q_b.weight buffer type overriden to CUDA0
  586. Tensor blk.33.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  587. Tensor blk.33.attn_kv_b.weight buffer type overriden to CUDA0
  588. Tensor blk.33.attn_k_b.weight buffer type overriden to CUDA0
  589. Tensor blk.33.attn_v_b.weight buffer type overriden to CUDA0
  590. Tensor blk.33.attn_output.weight buffer type overriden to CUDA0
  591. Tensor blk.33.ffn_gate_exps.weight buffer type overriden to CPU
  592. Tensor blk.33.ffn_down_exps.weight buffer type overriden to CPU
  593. Tensor blk.33.ffn_up_exps.weight buffer type overriden to CPU
  594. Tensor blk.33.ffn_gate_shexp.weight buffer type overriden to CUDA0
  595. Tensor blk.33.ffn_down_shexp.weight buffer type overriden to CUDA0
  596. Tensor blk.33.ffn_up_shexp.weight buffer type overriden to CUDA0
  597. Tensor blk.34.attn_norm.weight buffer type overriden to CUDA0
  598. Tensor blk.34.attn_q_a_norm.weight buffer type overriden to CUDA0
  599. Tensor blk.34.attn_kv_a_norm.weight buffer type overriden to CUDA0
  600. Tensor blk.34.attn_q_a.weight buffer type overriden to CUDA0
  601. Tensor blk.34.attn_q_b.weight buffer type overriden to CUDA0
  602. Tensor blk.34.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  603. Tensor blk.34.attn_kv_b.weight buffer type overriden to CUDA0
  604. Tensor blk.34.attn_k_b.weight buffer type overriden to CUDA0
  605. Tensor blk.34.attn_v_b.weight buffer type overriden to CUDA0
  606. Tensor blk.34.attn_output.weight buffer type overriden to CUDA0
  607. Tensor blk.34.ffn_gate_exps.weight buffer type overriden to CPU
  608. Tensor blk.34.ffn_down_exps.weight buffer type overriden to CPU
  609. Tensor blk.34.ffn_up_exps.weight buffer type overriden to CPU
  610. Tensor blk.34.ffn_gate_shexp.weight buffer type overriden to CUDA0
  611. Tensor blk.34.ffn_down_shexp.weight buffer type overriden to CUDA0
  612. Tensor blk.34.ffn_up_shexp.weight buffer type overriden to CUDA0
  613. Tensor blk.35.attn_norm.weight buffer type overriden to CUDA0
  614. Tensor blk.35.attn_q_a_norm.weight buffer type overriden to CUDA0
  615. Tensor blk.35.attn_kv_a_norm.weight buffer type overriden to CUDA0
  616. Tensor blk.35.attn_q_a.weight buffer type overriden to CUDA0
  617. Tensor blk.35.attn_q_b.weight buffer type overriden to CUDA0
  618. Tensor blk.35.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  619. Tensor blk.35.attn_kv_b.weight buffer type overriden to CUDA0
  620. Tensor blk.35.attn_k_b.weight buffer type overriden to CUDA0
  621. Tensor blk.35.attn_v_b.weight buffer type overriden to CUDA0
  622. Tensor blk.35.attn_output.weight buffer type overriden to CUDA0
  623. Tensor blk.35.ffn_gate_exps.weight buffer type overriden to CPU
  624. Tensor blk.35.ffn_down_exps.weight buffer type overriden to CPU
  625. Tensor blk.35.ffn_up_exps.weight buffer type overriden to CPU
  626. Tensor blk.35.ffn_gate_shexp.weight buffer type overriden to CUDA0
  627. Tensor blk.35.ffn_down_shexp.weight buffer type overriden to CUDA0
  628. Tensor blk.35.ffn_up_shexp.weight buffer type overriden to CUDA0
  629. Tensor blk.36.attn_norm.weight buffer type overriden to CUDA0
  630. Tensor blk.36.attn_q_a_norm.weight buffer type overriden to CUDA0
  631. Tensor blk.36.attn_kv_a_norm.weight buffer type overriden to CUDA0
  632. Tensor blk.36.attn_q_a.weight buffer type overriden to CUDA0
  633. Tensor blk.36.attn_q_b.weight buffer type overriden to CUDA0
  634. Tensor blk.36.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  635. Tensor blk.36.attn_kv_b.weight buffer type overriden to CUDA0
  636. Tensor blk.36.attn_k_b.weight buffer type overriden to CUDA0
  637. Tensor blk.36.attn_v_b.weight buffer type overriden to CUDA0
  638. Tensor blk.36.attn_output.weight buffer type overriden to CUDA0
  639. Tensor blk.36.ffn_gate_exps.weight buffer type overriden to CPU
  640. Tensor blk.36.ffn_down_exps.weight buffer type overriden to CPU
  641. Tensor blk.36.ffn_up_exps.weight buffer type overriden to CPU
  642. Tensor blk.36.ffn_gate_shexp.weight buffer type overriden to CUDA0
  643. Tensor blk.36.ffn_down_shexp.weight buffer type overriden to CUDA0
  644. Tensor blk.36.ffn_up_shexp.weight buffer type overriden to CUDA0
  645. Tensor blk.37.attn_norm.weight buffer type overriden to CUDA0
  646. Tensor blk.37.attn_q_a_norm.weight buffer type overriden to CUDA0
  647. Tensor blk.37.attn_kv_a_norm.weight buffer type overriden to CUDA0
  648. Tensor blk.37.attn_q_a.weight buffer type overriden to CUDA0
  649. Tensor blk.37.attn_q_b.weight buffer type overriden to CUDA0
  650. Tensor blk.37.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  651. Tensor blk.37.attn_kv_b.weight buffer type overriden to CUDA0
  652. Tensor blk.37.attn_k_b.weight buffer type overriden to CUDA0
  653. Tensor blk.37.attn_v_b.weight buffer type overriden to CUDA0
  654. Tensor blk.37.attn_output.weight buffer type overriden to CUDA0
  655. Tensor blk.37.ffn_gate_exps.weight buffer type overriden to CPU
  656. Tensor blk.37.ffn_down_exps.weight buffer type overriden to CPU
  657. Tensor blk.37.ffn_up_exps.weight buffer type overriden to CPU
  658. Tensor blk.37.ffn_gate_shexp.weight buffer type overriden to CUDA0
  659. Tensor blk.37.ffn_down_shexp.weight buffer type overriden to CUDA0
  660. Tensor blk.37.ffn_up_shexp.weight buffer type overriden to CUDA0
  661. Tensor blk.38.attn_norm.weight buffer type overriden to CUDA0
  662. Tensor blk.38.attn_q_a_norm.weight buffer type overriden to CUDA0
  663. Tensor blk.38.attn_kv_a_norm.weight buffer type overriden to CUDA0
  664. Tensor blk.38.attn_q_a.weight buffer type overriden to CUDA0
  665. Tensor blk.38.attn_q_b.weight buffer type overriden to CUDA0
  666. Tensor blk.38.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  667. Tensor blk.38.attn_kv_b.weight buffer type overriden to CUDA0
  668. Tensor blk.38.attn_k_b.weight buffer type overriden to CUDA0
  669. Tensor blk.38.attn_v_b.weight buffer type overriden to CUDA0
  670. Tensor blk.38.attn_output.weight buffer type overriden to CUDA0
  671. Tensor blk.38.ffn_gate_exps.weight buffer type overriden to CPU
  672. Tensor blk.38.ffn_down_exps.weight buffer type overriden to CPU
  673. Tensor blk.38.ffn_up_exps.weight buffer type overriden to CPU
  674. Tensor blk.38.ffn_gate_shexp.weight buffer type overriden to CUDA0
  675. Tensor blk.38.ffn_down_shexp.weight buffer type overriden to CUDA0
  676. Tensor blk.38.ffn_up_shexp.weight buffer type overriden to CUDA0
  677. Tensor blk.39.attn_norm.weight buffer type overriden to CUDA0
  678. Tensor blk.39.attn_q_a_norm.weight buffer type overriden to CUDA0
  679. Tensor blk.39.attn_kv_a_norm.weight buffer type overriden to CUDA0
  680. Tensor blk.39.attn_q_a.weight buffer type overriden to CUDA0
  681. Tensor blk.39.attn_q_b.weight buffer type overriden to CUDA0
  682. Tensor blk.39.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  683. Tensor blk.39.attn_kv_b.weight buffer type overriden to CUDA0
  684. Tensor blk.39.attn_k_b.weight buffer type overriden to CUDA0
  685. Tensor blk.39.attn_v_b.weight buffer type overriden to CUDA0
  686. Tensor blk.39.attn_output.weight buffer type overriden to CUDA0
  687. Tensor blk.39.ffn_gate_exps.weight buffer type overriden to CPU
  688. Tensor blk.39.ffn_down_exps.weight buffer type overriden to CPU
  689. Tensor blk.39.ffn_up_exps.weight buffer type overriden to CPU
  690. Tensor blk.39.ffn_gate_shexp.weight buffer type overriden to CUDA0
  691. Tensor blk.39.ffn_down_shexp.weight buffer type overriden to CUDA0
  692. Tensor blk.39.ffn_up_shexp.weight buffer type overriden to CUDA0
  693. Tensor blk.40.attn_norm.weight buffer type overriden to CUDA0
  694. Tensor blk.40.attn_q_a_norm.weight buffer type overriden to CUDA0
  695. Tensor blk.40.attn_kv_a_norm.weight buffer type overriden to CUDA0
  696. Tensor blk.40.attn_q_a.weight buffer type overriden to CUDA0
  697. Tensor blk.40.attn_q_b.weight buffer type overriden to CUDA0
  698. Tensor blk.40.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  699. Tensor blk.40.attn_kv_b.weight buffer type overriden to CUDA0
  700. Tensor blk.40.attn_k_b.weight buffer type overriden to CUDA0
  701. Tensor blk.40.attn_v_b.weight buffer type overriden to CUDA0
  702. Tensor blk.40.attn_output.weight buffer type overriden to CUDA0
  703. Tensor blk.40.ffn_gate_exps.weight buffer type overriden to CPU
  704. Tensor blk.40.ffn_down_exps.weight buffer type overriden to CPU
  705. Tensor blk.40.ffn_up_exps.weight buffer type overriden to CPU
  706. Tensor blk.40.ffn_gate_shexp.weight buffer type overriden to CUDA0
  707. Tensor blk.40.ffn_down_shexp.weight buffer type overriden to CUDA0
  708. Tensor blk.40.ffn_up_shexp.weight buffer type overriden to CUDA0
  709. Tensor blk.41.attn_norm.weight buffer type overriden to CUDA0
  710. Tensor blk.41.attn_q_a_norm.weight buffer type overriden to CUDA0
  711. Tensor blk.41.attn_kv_a_norm.weight buffer type overriden to CUDA0
  712. Tensor blk.41.attn_q_a.weight buffer type overriden to CUDA0
  713. Tensor blk.41.attn_q_b.weight buffer type overriden to CUDA0
  714. Tensor blk.41.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  715. Tensor blk.41.attn_kv_b.weight buffer type overriden to CUDA0
  716. Tensor blk.41.attn_k_b.weight buffer type overriden to CUDA0
  717. Tensor blk.41.attn_v_b.weight buffer type overriden to CUDA0
  718. Tensor blk.41.attn_output.weight buffer type overriden to CUDA0
  719. Tensor blk.41.ffn_gate_exps.weight buffer type overriden to CPU
  720. Tensor blk.41.ffn_down_exps.weight buffer type overriden to CPU
  721. Tensor blk.41.ffn_up_exps.weight buffer type overriden to CPU
  722. Tensor blk.41.ffn_gate_shexp.weight buffer type overriden to CUDA0
  723. Tensor blk.41.ffn_down_shexp.weight buffer type overriden to CUDA0
  724. Tensor blk.41.ffn_up_shexp.weight buffer type overriden to CUDA0
  725. Tensor blk.42.attn_norm.weight buffer type overriden to CUDA0
  726. Tensor blk.42.attn_q_a_norm.weight buffer type overriden to CUDA0
  727. Tensor blk.42.attn_kv_a_norm.weight buffer type overriden to CUDA0
  728. Tensor blk.42.attn_q_a.weight buffer type overriden to CUDA0
  729. Tensor blk.42.attn_q_b.weight buffer type overriden to CUDA0
  730. Tensor blk.42.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  731. Tensor blk.42.attn_kv_b.weight buffer type overriden to CUDA0
  732. Tensor blk.42.attn_k_b.weight buffer type overriden to CUDA0
  733. Tensor blk.42.attn_v_b.weight buffer type overriden to CUDA0
  734. Tensor blk.42.attn_output.weight buffer type overriden to CUDA0
  735. Tensor blk.42.ffn_gate_exps.weight buffer type overriden to CPU
  736. Tensor blk.42.ffn_down_exps.weight buffer type overriden to CPU
  737. Tensor blk.42.ffn_up_exps.weight buffer type overriden to CPU
  738. Tensor blk.42.ffn_gate_shexp.weight buffer type overriden to CUDA0
  739. Tensor blk.42.ffn_down_shexp.weight buffer type overriden to CUDA0
  740. Tensor blk.42.ffn_up_shexp.weight buffer type overriden to CUDA0
  741. Tensor blk.43.attn_norm.weight buffer type overriden to CUDA0
  742. Tensor blk.43.attn_q_a_norm.weight buffer type overriden to CUDA0
  743. Tensor blk.43.attn_kv_a_norm.weight buffer type overriden to CUDA0
  744. Tensor blk.43.attn_q_a.weight buffer type overriden to CUDA0
  745. Tensor blk.43.attn_q_b.weight buffer type overriden to CUDA0
  746. Tensor blk.43.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  747. Tensor blk.43.attn_kv_b.weight buffer type overriden to CUDA0
  748. Tensor blk.43.attn_k_b.weight buffer type overriden to CUDA0
  749. Tensor blk.43.attn_v_b.weight buffer type overriden to CUDA0
  750. Tensor blk.43.attn_output.weight buffer type overriden to CUDA0
  751. Tensor blk.43.ffn_gate_exps.weight buffer type overriden to CPU
  752. Tensor blk.43.ffn_down_exps.weight buffer type overriden to CPU
  753. Tensor blk.43.ffn_up_exps.weight buffer type overriden to CPU
  754. Tensor blk.43.ffn_gate_shexp.weight buffer type overriden to CUDA0
  755. Tensor blk.43.ffn_down_shexp.weight buffer type overriden to CUDA0
  756. Tensor blk.43.ffn_up_shexp.weight buffer type overriden to CUDA0
  757. Tensor blk.44.attn_norm.weight buffer type overriden to CUDA0
  758. Tensor blk.44.attn_q_a_norm.weight buffer type overriden to CUDA0
  759. Tensor blk.44.attn_kv_a_norm.weight buffer type overriden to CUDA0
  760. Tensor blk.44.attn_q_a.weight buffer type overriden to CUDA0
  761. Tensor blk.44.attn_q_b.weight buffer type overriden to CUDA0
  762. Tensor blk.44.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  763. Tensor blk.44.attn_kv_b.weight buffer type overriden to CUDA0
  764. Tensor blk.44.attn_k_b.weight buffer type overriden to CUDA0
  765. Tensor blk.44.attn_v_b.weight buffer type overriden to CUDA0
  766. Tensor blk.44.attn_output.weight buffer type overriden to CUDA0
  767. Tensor blk.44.ffn_gate_exps.weight buffer type overriden to CPU
  768. Tensor blk.44.ffn_down_exps.weight buffer type overriden to CPU
  769. Tensor blk.44.ffn_up_exps.weight buffer type overriden to CPU
  770. Tensor blk.44.ffn_gate_shexp.weight buffer type overriden to CUDA0
  771. Tensor blk.44.ffn_down_shexp.weight buffer type overriden to CUDA0
  772. Tensor blk.44.ffn_up_shexp.weight buffer type overriden to CUDA0
  773. Tensor blk.45.attn_norm.weight buffer type overriden to CUDA0
  774. Tensor blk.45.attn_q_a_norm.weight buffer type overriden to CUDA0
  775. Tensor blk.45.attn_kv_a_norm.weight buffer type overriden to CUDA0
  776. Tensor blk.45.attn_q_a.weight buffer type overriden to CUDA0
  777. Tensor blk.45.attn_q_b.weight buffer type overriden to CUDA0
  778. Tensor blk.45.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  779. Tensor blk.45.attn_kv_b.weight buffer type overriden to CUDA0
  780. Tensor blk.45.attn_k_b.weight buffer type overriden to CUDA0
  781. Tensor blk.45.attn_v_b.weight buffer type overriden to CUDA0
  782. Tensor blk.45.attn_output.weight buffer type overriden to CUDA0
  783. Tensor blk.45.ffn_gate_exps.weight buffer type overriden to CPU
  784. Tensor blk.45.ffn_down_exps.weight buffer type overriden to CPU
  785. Tensor blk.45.ffn_up_exps.weight buffer type overriden to CPU
  786. Tensor blk.45.ffn_gate_shexp.weight buffer type overriden to CUDA0
  787. Tensor blk.45.ffn_down_shexp.weight buffer type overriden to CUDA0
  788. Tensor blk.45.ffn_up_shexp.weight buffer type overriden to CUDA0
  789. Tensor blk.46.attn_norm.weight buffer type overriden to CUDA0
  790. Tensor blk.46.attn_q_a_norm.weight buffer type overriden to CUDA0
  791. Tensor blk.46.attn_kv_a_norm.weight buffer type overriden to CUDA0
  792. Tensor blk.46.attn_q_a.weight buffer type overriden to CUDA0
  793. Tensor blk.46.attn_q_b.weight buffer type overriden to CUDA0
  794. Tensor blk.46.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  795. Tensor blk.46.attn_kv_b.weight buffer type overriden to CUDA0
  796. Tensor blk.46.attn_k_b.weight buffer type overriden to CUDA0
  797. Tensor blk.46.attn_v_b.weight buffer type overriden to CUDA0
  798. Tensor blk.46.attn_output.weight buffer type overriden to CUDA0
  799. Tensor blk.46.ffn_gate_exps.weight buffer type overriden to CPU
  800. Tensor blk.46.ffn_down_exps.weight buffer type overriden to CPU
  801. Tensor blk.46.ffn_up_exps.weight buffer type overriden to CPU
  802. Tensor blk.46.ffn_gate_shexp.weight buffer type overriden to CUDA0
  803. Tensor blk.46.ffn_down_shexp.weight buffer type overriden to CUDA0
  804. Tensor blk.46.ffn_up_shexp.weight buffer type overriden to CUDA0
  805. Tensor blk.47.attn_norm.weight buffer type overriden to CUDA0
  806. Tensor blk.47.attn_q_a_norm.weight buffer type overriden to CUDA0
  807. Tensor blk.47.attn_kv_a_norm.weight buffer type overriden to CUDA0
  808. Tensor blk.47.attn_q_a.weight buffer type overriden to CUDA0
  809. Tensor blk.47.attn_q_b.weight buffer type overriden to CUDA0
  810. Tensor blk.47.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  811. Tensor blk.47.attn_kv_b.weight buffer type overriden to CUDA0
  812. Tensor blk.47.attn_k_b.weight buffer type overriden to CUDA0
  813. Tensor blk.47.attn_v_b.weight buffer type overriden to CUDA0
  814. Tensor blk.47.attn_output.weight buffer type overriden to CUDA0
  815. Tensor blk.47.ffn_gate_exps.weight buffer type overriden to CPU
  816. Tensor blk.47.ffn_down_exps.weight buffer type overriden to CPU
  817. Tensor blk.47.ffn_up_exps.weight buffer type overriden to CPU
  818. Tensor blk.47.ffn_gate_shexp.weight buffer type overriden to CUDA0
  819. Tensor blk.47.ffn_down_shexp.weight buffer type overriden to CUDA0
  820. Tensor blk.47.ffn_up_shexp.weight buffer type overriden to CUDA0
  821. Tensor blk.48.attn_norm.weight buffer type overriden to CUDA0
  822. Tensor blk.48.attn_q_a_norm.weight buffer type overriden to CUDA0
  823. Tensor blk.48.attn_kv_a_norm.weight buffer type overriden to CUDA0
  824. Tensor blk.48.attn_q_a.weight buffer type overriden to CUDA0
  825. Tensor blk.48.attn_q_b.weight buffer type overriden to CUDA0
  826. Tensor blk.48.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  827. Tensor blk.48.attn_kv_b.weight buffer type overriden to CUDA0
  828. Tensor blk.48.attn_k_b.weight buffer type overriden to CUDA0
  829. Tensor blk.48.attn_v_b.weight buffer type overriden to CUDA0
  830. Tensor blk.48.attn_output.weight buffer type overriden to CUDA0
  831. Tensor blk.48.ffn_gate_exps.weight buffer type overriden to CPU
  832. Tensor blk.48.ffn_down_exps.weight buffer type overriden to CPU
  833. Tensor blk.48.ffn_up_exps.weight buffer type overriden to CPU
  834. Tensor blk.48.ffn_gate_shexp.weight buffer type overriden to CUDA0
  835. Tensor blk.48.ffn_down_shexp.weight buffer type overriden to CUDA0
  836. Tensor blk.48.ffn_up_shexp.weight buffer type overriden to CUDA0
  837. Tensor blk.49.attn_norm.weight buffer type overriden to CUDA0
  838. Tensor blk.49.attn_q_a_norm.weight buffer type overriden to CUDA0
  839. Tensor blk.49.attn_kv_a_norm.weight buffer type overriden to CUDA0
  840. Tensor blk.49.attn_q_a.weight buffer type overriden to CUDA0
  841. Tensor blk.49.attn_q_b.weight buffer type overriden to CUDA0
  842. Tensor blk.49.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  843. Tensor blk.49.attn_kv_b.weight buffer type overriden to CUDA0
  844. Tensor blk.49.attn_k_b.weight buffer type overriden to CUDA0
  845. Tensor blk.49.attn_v_b.weight buffer type overriden to CUDA0
  846. Tensor blk.49.attn_output.weight buffer type overriden to CUDA0
  847. Tensor blk.49.ffn_gate_exps.weight buffer type overriden to CPU
  848. Tensor blk.49.ffn_down_exps.weight buffer type overriden to CPU
  849. Tensor blk.49.ffn_up_exps.weight buffer type overriden to CPU
  850. Tensor blk.49.ffn_gate_shexp.weight buffer type overriden to CUDA0
  851. Tensor blk.49.ffn_down_shexp.weight buffer type overriden to CUDA0
  852. Tensor blk.49.ffn_up_shexp.weight buffer type overriden to CUDA0
  853. Tensor blk.50.attn_norm.weight buffer type overriden to CUDA0
  854. Tensor blk.50.attn_q_a_norm.weight buffer type overriden to CUDA0
  855. Tensor blk.50.attn_kv_a_norm.weight buffer type overriden to CUDA0
  856. Tensor blk.50.attn_q_a.weight buffer type overriden to CUDA0
  857. Tensor blk.50.attn_q_b.weight buffer type overriden to CUDA0
  858. Tensor blk.50.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  859. Tensor blk.50.attn_kv_b.weight buffer type overriden to CUDA0
  860. Tensor blk.50.attn_k_b.weight buffer type overriden to CUDA0
  861. Tensor blk.50.attn_v_b.weight buffer type overriden to CUDA0
  862. Tensor blk.50.attn_output.weight buffer type overriden to CUDA0
  863. Tensor blk.50.ffn_gate_exps.weight buffer type overriden to CPU
  864. Tensor blk.50.ffn_down_exps.weight buffer type overriden to CPU
  865. Tensor blk.50.ffn_up_exps.weight buffer type overriden to CPU
  866. Tensor blk.50.ffn_gate_shexp.weight buffer type overriden to CUDA0
  867. Tensor blk.50.ffn_down_shexp.weight buffer type overriden to CUDA0
  868. Tensor blk.50.ffn_up_shexp.weight buffer type overriden to CUDA0
  869. Tensor blk.51.attn_norm.weight buffer type overriden to CUDA0
  870. Tensor blk.51.attn_q_a_norm.weight buffer type overriden to CUDA0
  871. Tensor blk.51.attn_kv_a_norm.weight buffer type overriden to CUDA0
  872. Tensor blk.51.attn_q_a.weight buffer type overriden to CUDA0
  873. Tensor blk.51.attn_q_b.weight buffer type overriden to CUDA0
  874. Tensor blk.51.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  875. Tensor blk.51.attn_kv_b.weight buffer type overriden to CUDA0
  876. Tensor blk.51.attn_k_b.weight buffer type overriden to CUDA0
  877. Tensor blk.51.attn_v_b.weight buffer type overriden to CUDA0
  878. Tensor blk.51.attn_output.weight buffer type overriden to CUDA0
  879. Tensor blk.51.ffn_gate_exps.weight buffer type overriden to CPU
  880. Tensor blk.51.ffn_down_exps.weight buffer type overriden to CPU
  881. Tensor blk.51.ffn_up_exps.weight buffer type overriden to CPU
  882. Tensor blk.51.ffn_gate_shexp.weight buffer type overriden to CUDA0
  883. Tensor blk.51.ffn_down_shexp.weight buffer type overriden to CUDA0
  884. Tensor blk.51.ffn_up_shexp.weight buffer type overriden to CUDA0
  885. Tensor blk.52.attn_norm.weight buffer type overriden to CUDA0
  886. Tensor blk.52.attn_q_a_norm.weight buffer type overriden to CUDA0
  887. Tensor blk.52.attn_kv_a_norm.weight buffer type overriden to CUDA0
  888. Tensor blk.52.attn_q_a.weight buffer type overriden to CUDA0
  889. Tensor blk.52.attn_q_b.weight buffer type overriden to CUDA0
  890. Tensor blk.52.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  891. Tensor blk.52.attn_kv_b.weight buffer type overriden to CUDA0
  892. Tensor blk.52.attn_k_b.weight buffer type overriden to CUDA0
  893. Tensor blk.52.attn_v_b.weight buffer type overriden to CUDA0
  894. Tensor blk.52.attn_output.weight buffer type overriden to CUDA0
  895. Tensor blk.52.ffn_gate_exps.weight buffer type overriden to CPU
  896. Tensor blk.52.ffn_down_exps.weight buffer type overriden to CPU
  897. Tensor blk.52.ffn_up_exps.weight buffer type overriden to CPU
  898. Tensor blk.52.ffn_gate_shexp.weight buffer type overriden to CUDA0
  899. Tensor blk.52.ffn_down_shexp.weight buffer type overriden to CUDA0
  900. Tensor blk.52.ffn_up_shexp.weight buffer type overriden to CUDA0
  901. Tensor blk.53.attn_norm.weight buffer type overriden to CUDA0
  902. Tensor blk.53.attn_q_a_norm.weight buffer type overriden to CUDA0
  903. Tensor blk.53.attn_kv_a_norm.weight buffer type overriden to CUDA0
  904. Tensor blk.53.attn_q_a.weight buffer type overriden to CUDA0
  905. Tensor blk.53.attn_q_b.weight buffer type overriden to CUDA0
  906. Tensor blk.53.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  907. Tensor blk.53.attn_kv_b.weight buffer type overriden to CUDA0
  908. Tensor blk.53.attn_k_b.weight buffer type overriden to CUDA0
  909. Tensor blk.53.attn_v_b.weight buffer type overriden to CUDA0
  910. Tensor blk.53.attn_output.weight buffer type overriden to CUDA0
  911. Tensor blk.53.ffn_gate_exps.weight buffer type overriden to CPU
  912. Tensor blk.53.ffn_down_exps.weight buffer type overriden to CPU
  913. Tensor blk.53.ffn_up_exps.weight buffer type overriden to CPU
  914. Tensor blk.53.ffn_gate_shexp.weight buffer type overriden to CUDA0
  915. Tensor blk.53.ffn_down_shexp.weight buffer type overriden to CUDA0
  916. Tensor blk.53.ffn_up_shexp.weight buffer type overriden to CUDA0
  917. Tensor blk.54.attn_norm.weight buffer type overriden to CUDA0
  918. Tensor blk.54.attn_q_a_norm.weight buffer type overriden to CUDA0
  919. Tensor blk.54.attn_kv_a_norm.weight buffer type overriden to CUDA0
  920. Tensor blk.54.attn_q_a.weight buffer type overriden to CUDA0
  921. Tensor blk.54.attn_q_b.weight buffer type overriden to CUDA0
  922. Tensor blk.54.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  923. Tensor blk.54.attn_kv_b.weight buffer type overriden to CUDA0
  924. Tensor blk.54.attn_k_b.weight buffer type overriden to CUDA0
  925. Tensor blk.54.attn_v_b.weight buffer type overriden to CUDA0
  926. Tensor blk.54.attn_output.weight buffer type overriden to CUDA0
  927. Tensor blk.54.ffn_gate_exps.weight buffer type overriden to CPU
  928. Tensor blk.54.ffn_down_exps.weight buffer type overriden to CPU
  929. Tensor blk.54.ffn_up_exps.weight buffer type overriden to CPU
  930. Tensor blk.54.ffn_gate_shexp.weight buffer type overriden to CUDA0
  931. Tensor blk.54.ffn_down_shexp.weight buffer type overriden to CUDA0
  932. Tensor blk.54.ffn_up_shexp.weight buffer type overriden to CUDA0
  933. Tensor blk.55.attn_norm.weight buffer type overriden to CUDA0
  934. Tensor blk.55.attn_q_a_norm.weight buffer type overriden to CUDA0
  935. Tensor blk.55.attn_kv_a_norm.weight buffer type overriden to CUDA0
  936. Tensor blk.55.attn_q_a.weight buffer type overriden to CUDA0
  937. Tensor blk.55.attn_q_b.weight buffer type overriden to CUDA0
  938. Tensor blk.55.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  939. Tensor blk.55.attn_kv_b.weight buffer type overriden to CUDA0
  940. Tensor blk.55.attn_k_b.weight buffer type overriden to CUDA0
  941. Tensor blk.55.attn_v_b.weight buffer type overriden to CUDA0
  942. Tensor blk.55.attn_output.weight buffer type overriden to CUDA0
  943. Tensor blk.55.ffn_gate_exps.weight buffer type overriden to CPU
  944. Tensor blk.55.ffn_down_exps.weight buffer type overriden to CPU
  945. Tensor blk.55.ffn_up_exps.weight buffer type overriden to CPU
  946. Tensor blk.55.ffn_gate_shexp.weight buffer type overriden to CUDA0
  947. Tensor blk.55.ffn_down_shexp.weight buffer type overriden to CUDA0
  948. Tensor blk.55.ffn_up_shexp.weight buffer type overriden to CUDA0
  949. Tensor blk.56.attn_norm.weight buffer type overriden to CUDA0
  950. Tensor blk.56.attn_q_a_norm.weight buffer type overriden to CUDA0
  951. Tensor blk.56.attn_kv_a_norm.weight buffer type overriden to CUDA0
  952. Tensor blk.56.attn_q_a.weight buffer type overriden to CUDA0
  953. Tensor blk.56.attn_q_b.weight buffer type overriden to CUDA0
  954. Tensor blk.56.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  955. Tensor blk.56.attn_kv_b.weight buffer type overriden to CUDA0
  956. Tensor blk.56.attn_k_b.weight buffer type overriden to CUDA0
  957. Tensor blk.56.attn_v_b.weight buffer type overriden to CUDA0
  958. Tensor blk.56.attn_output.weight buffer type overriden to CUDA0
  959. Tensor blk.56.ffn_gate_exps.weight buffer type overriden to CPU
  960. Tensor blk.56.ffn_down_exps.weight buffer type overriden to CPU
  961. Tensor blk.56.ffn_up_exps.weight buffer type overriden to CPU
  962. Tensor blk.56.ffn_gate_shexp.weight buffer type overriden to CUDA0
  963. Tensor blk.56.ffn_down_shexp.weight buffer type overriden to CUDA0
  964. Tensor blk.56.ffn_up_shexp.weight buffer type overriden to CUDA0
  965. Tensor blk.57.attn_norm.weight buffer type overriden to CUDA0
  966. Tensor blk.57.attn_q_a_norm.weight buffer type overriden to CUDA0
  967. Tensor blk.57.attn_kv_a_norm.weight buffer type overriden to CUDA0
  968. Tensor blk.57.attn_q_a.weight buffer type overriden to CUDA0
  969. Tensor blk.57.attn_q_b.weight buffer type overriden to CUDA0
  970. Tensor blk.57.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  971. Tensor blk.57.attn_kv_b.weight buffer type overriden to CUDA0
  972. Tensor blk.57.attn_k_b.weight buffer type overriden to CUDA0
  973. Tensor blk.57.attn_v_b.weight buffer type overriden to CUDA0
  974. Tensor blk.57.attn_output.weight buffer type overriden to CUDA0
  975. Tensor blk.57.ffn_gate_exps.weight buffer type overriden to CPU
  976. Tensor blk.57.ffn_down_exps.weight buffer type overriden to CPU
  977. Tensor blk.57.ffn_up_exps.weight buffer type overriden to CPU
  978. Tensor blk.57.ffn_gate_shexp.weight buffer type overriden to CUDA0
  979. Tensor blk.57.ffn_down_shexp.weight buffer type overriden to CUDA0
  980. Tensor blk.57.ffn_up_shexp.weight buffer type overriden to CUDA0
  981. Tensor blk.58.attn_norm.weight buffer type overriden to CUDA0
  982. Tensor blk.58.attn_q_a_norm.weight buffer type overriden to CUDA0
  983. Tensor blk.58.attn_kv_a_norm.weight buffer type overriden to CUDA0
  984. Tensor blk.58.attn_q_a.weight buffer type overriden to CUDA0
  985. Tensor blk.58.attn_q_b.weight buffer type overriden to CUDA0
  986. Tensor blk.58.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  987. Tensor blk.58.attn_kv_b.weight buffer type overriden to CUDA0
  988. Tensor blk.58.attn_k_b.weight buffer type overriden to CUDA0
  989. Tensor blk.58.attn_v_b.weight buffer type overriden to CUDA0
  990. Tensor blk.58.attn_output.weight buffer type overriden to CUDA0
  991. Tensor blk.58.ffn_gate_exps.weight buffer type overriden to CPU
  992. Tensor blk.58.ffn_down_exps.weight buffer type overriden to CPU
  993. Tensor blk.58.ffn_up_exps.weight buffer type overriden to CPU
  994. Tensor blk.58.ffn_gate_shexp.weight buffer type overriden to CUDA0
  995. Tensor blk.58.ffn_down_shexp.weight buffer type overriden to CUDA0
  996. Tensor blk.58.ffn_up_shexp.weight buffer type overriden to CUDA0
  997. Tensor blk.59.attn_norm.weight buffer type overriden to CUDA0
  998. Tensor blk.59.attn_q_a_norm.weight buffer type overriden to CUDA0
  999. Tensor blk.59.attn_kv_a_norm.weight buffer type overriden to CUDA0
  1000. Tensor blk.59.attn_q_a.weight buffer type overriden to CUDA0
  1001. Tensor blk.59.attn_q_b.weight buffer type overriden to CUDA0
  1002. Tensor blk.59.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  1003. Tensor blk.59.attn_kv_b.weight buffer type overriden to CUDA0
  1004. Tensor blk.59.attn_k_b.weight buffer type overriden to CUDA0
  1005. Tensor blk.59.attn_v_b.weight buffer type overriden to CUDA0
  1006. Tensor blk.59.attn_output.weight buffer type overriden to CUDA0
  1007. Tensor blk.59.ffn_gate_exps.weight buffer type overriden to CPU
  1008. Tensor blk.59.ffn_down_exps.weight buffer type overriden to CPU
  1009. Tensor blk.59.ffn_up_exps.weight buffer type overriden to CPU
  1010. Tensor blk.59.ffn_gate_shexp.weight buffer type overriden to CUDA0
  1011. Tensor blk.59.ffn_down_shexp.weight buffer type overriden to CUDA0
  1012. Tensor blk.59.ffn_up_shexp.weight buffer type overriden to CUDA0
  1013. Tensor blk.60.attn_norm.weight buffer type overriden to CUDA0
  1014. Tensor blk.60.attn_q_a_norm.weight buffer type overriden to CUDA0
  1015. Tensor blk.60.attn_kv_a_norm.weight buffer type overriden to CUDA0
  1016. Tensor blk.60.attn_q_a.weight buffer type overriden to CUDA0
  1017. Tensor blk.60.attn_q_b.weight buffer type overriden to CUDA0
  1018. Tensor blk.60.attn_kv_a_mqa.weight buffer type overriden to CUDA0
  1019. Tensor blk.60.attn_kv_b.weight buffer type overriden to CUDA0
  1020. Tensor blk.60.attn_k_b.weight buffer type overriden to CUDA0
  1021. Tensor blk.60.attn_v_b.weight buffer type overriden to CUDA0
  1022. Tensor blk.60.attn_output.weight buffer type overriden to CUDA0
  1023. Tensor blk.60.ffn_gate_exps.weight buffer type overriden to CPU
  1024. Tensor blk.60.ffn_down_exps.weight buffer type overriden to CPU
  1025. Tensor blk.60.ffn_up_exps.weight buffer type overriden to CPU
  1026. Tensor blk.60.ffn_gate_shexp.weight buffer type overriden to CUDA0
  1027. Tensor blk.60.ffn_down_shexp.weight buffer type overriden to CUDA0
  1028. Tensor blk.60.ffn_up_shexp.weight buffer type overriden to CUDA0
  1029. llm_load_tensors: offloading 61 repeating layers to GPU
  1030. llm_load_tensors: offloading non-repeating layers to GPU
  1031. llm_load_tensors: offloaded 62/62 layers to GPU
  1032. llm_load_tensors: CPU buffer size = 138301.00 MiB
  1033. llm_load_tensors: CUDA_Host buffer size = 607.58 MiB
  1034. llm_load_tensors: CUDA0 buffer size = 24196.79 MiB
  1035. llm_load_tensors: CUDA1 buffer size = 19104.12 MiB
  1036. llm_load_tensors: CUDA2 buffer size = 19104.12 MiB
  1037. llm_load_tensors: CUDA3 buffer size = 23887.17 MiB
  1038. llm_load_tensors: CUDA4 buffer size = 14384.31 MiB
  1039. llm_load_tensors: CUDA5 buffer size = 14377.28 MiB
  1040. llm_load_tensors: CUDA6 buffer size = 34255.44 MiB
  1041. ....................................................................................................
  1042. llama_new_context_with_model: n_ctx = 16384
  1043. llama_new_context_with_model: n_batch = 2048
  1044. llama_new_context_with_model: n_ubatch = 2048
  1045. llama_new_context_with_model: flash_attn = 1
  1046. llama_new_context_with_model: mla_attn = 3
  1047. llama_new_context_with_model: attn_max_b = 256
  1048. llama_new_context_with_model: fused_moe = 1
  1049. llama_new_context_with_model: ser = -1, 0
  1050. llama_new_context_with_model: freq_base = 10000.0
  1051. llama_new_context_with_model: freq_scale = 0.025
  1052. llama_kv_cache_init: CUDA0 KV buffer size = 180.00 MiB
  1053. llama_kv_cache_init: CUDA1 KV buffer size = 126.00 MiB
  1054. llama_kv_cache_init: CUDA2 KV buffer size = 126.00 MiB
  1055. llama_kv_cache_init: CUDA3 KV buffer size = 162.00 MiB
  1056. llama_kv_cache_init: CUDA4 KV buffer size = 144.00 MiB
  1057. llama_kv_cache_init: CUDA5 KV buffer size = 126.00 MiB
  1058. llama_kv_cache_init: CUDA6 KV buffer size = 234.00 MiB
  1059. llama_new_context_with_model: KV self size = 1098.00 MiB, c^KV (f16): 1098.00 MiB, kv^T: not used
  1060. llama_new_context_with_model: CUDA_Host output buffer size = 0.99 MiB
  1061. llama_new_context_with_model: pipeline parallelism enabled (n_copies=1)
  1062. llama_new_context_with_model: CUDA0 compute buffer size = 3566.01 MiB
  1063. llama_new_context_with_model: CUDA1 compute buffer size = 688.00 MiB
  1064. llama_new_context_with_model: CUDA2 compute buffer size = 688.00 MiB
  1065. llama_new_context_with_model: CUDA3 compute buffer size = 688.00 MiB
  1066. llama_new_context_with_model: CUDA4 compute buffer size = 688.00 MiB
  1067. llama_new_context_with_model: CUDA5 compute buffer size = 688.00 MiB
  1068. llama_new_context_with_model: CUDA6 compute buffer size = 1122.00 MiB
  1069. llama_new_context_with_model: CUDA_Host compute buffer size = 184.02 MiB
  1070. llama_new_context_with_model: graph nodes = 8184
  1071. llama_new_context_with_model: graph splits = 296