rx6900xt_navi2_amdgpu_5.10.20_ppc64le_4kpages

a guest
Mar 9th, 2021
  1. [ 263.680735] [drm] amdgpu kernel modesetting enabled.
  2. [ 263.682186] CRAT table error: (null)
  3. [ 263.682187] DSDT table not found for OEM information
  4. [ 263.682189] IO link not available for non x86 platforms
  5. [ 263.682190] Virtual CRAT table created for CPU
  6. [ 263.682199] amdgpu: Topology: Add CPU node
  7. [ 263.683458] amdgpu 0001:03:00.0: enabling device (0140 -> 0142)
  8. [ 263.683472] [drm] initializing kernel modesetting (SIENNA_CICHLID 0x1002:0x73BF 0x1DA2:0xE438 0xC0).
  9. [ 263.683476] amdgpu 0001:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
  10. [ 263.683489] [drm] register mmio base: 0x80000000
  11. [ 263.683491] [drm] register mmio size: 1048576
  12. [ 263.683493] [drm] PCI I/O BAR is not found.
  13. [ 263.683505] [drm] PCIE atomic ops is not supported
  14. [ 263.685953] [drm] add ip block number 0 <nv_common>
  15. [ 263.685955] [drm] add ip block number 1 <gmc_v10_0>
  16. [ 263.685957] [drm] add ip block number 2 <navi10_ih>
  17. [ 263.685958] [drm] add ip block number 3 <psp>
  18. [ 263.685960] [drm] add ip block number 4 <smu>
  19. [ 263.685962] [drm] add ip block number 5 <gfx_v10_0>
  20. [ 263.685963] [drm] add ip block number 6 <sdma_v5_2>
  21. [ 263.685965] [drm] add ip block number 7 <vcn_v3_0>
  22. [ 263.685966] [drm] add ip block number 8 <jpeg_v3_0>
  23. [ 263.717433] amdgpu 0001:03:00.0: amdgpu: Fetched VBIOS from ROM BAR
  24. [ 263.717437] amdgpu: ATOM BIOS: 113-E438XTX-UO2
  25. [ 263.717449] [drm] VCN(0) decode is enabled in VM mode
  26. [ 263.717450] [drm] VCN(1) decode is enabled in VM mode
  27. [ 263.717452] [drm] VCN(0) encode is enabled in VM mode
  28. [ 263.717453] [drm] VCN(1) encode is enabled in VM mode
  29. [ 263.717456] [drm] JPEG decode is enabled in VM mode
  30. [ 263.717463] [drm] GPU posting now...
  31. [ 263.717519] amdgpu 0001:03:00.0: amdgpu: HBM ECC is not presented.
  32. [ 263.717523] amdgpu 0001:03:00.0: amdgpu: SRAM ECC is not presented.
  33. [ 263.717530] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
  34. [ 263.717575] amdgpu 0001:03:00.0: BAR 2: releasing [mem 0x6004010000000-0x60040101fffff 64bit pref]
  35. [ 263.717580] amdgpu 0001:03:00.0: BAR 0: releasing [mem 0x6004000000000-0x600400fffffff 64bit pref]
  36. [ 263.717615] pci 0001:02:00.0: BAR 15: releasing [mem 0x6004000000000-0x600403fffffff 64bit pref]
  37. [ 263.717620] pci 0001:01:00.0: BAR 15: releasing [mem 0x6004000000000-0x6007f7ff0ffff 64bit pref]
  38. [ 263.717624] pci 0001:00:00.0: BAR 15: releasing [mem 0x6004000000000-0x6007f7ff0ffff 64bit pref]
  39. [ 263.717638] pci 0001:00:00.0: BAR 15: assigned [mem 0x6004000000000-0x60045ffffffff 64bit pref]
  40. [ 263.717645] pci 0001:01:00.0: BAR 15: assigned [mem 0x6004000000000-0x60045ffffffff 64bit pref]
  41. [ 263.717649] pci 0001:02:00.0: BAR 15: assigned [mem 0x6004000000000-0x60045ffffffff 64bit pref]
  42. [ 263.717655] amdgpu 0001:03:00.0: BAR 0: assigned [mem 0x6004000000000-0x60043ffffffff 64bit pref]
  43. [ 263.717667] amdgpu 0001:03:00.0: BAR 2: assigned [mem 0x6004400000000-0x60044001fffff 64bit pref]
  44. [ 263.717680] pci 0001:00:00.0: PCI bridge to [bus 01-03]
  45. [ 263.717687] pci 0001:00:00.0: bridge window [mem 0x600c080000000-0x600c0ffefffff]
  46. [ 263.717692] pci 0001:00:00.0: bridge window [mem 0x6004000000000-0x6007f7ff0ffff 64bit pref]
  47. [ 263.717699] pci 0001:01:00.0: PCI bridge to [bus 02-03]
  48. [ 263.717708] pci 0001:01:00.0: bridge window [mem 0x600c080000000-0x600c0ffefffff]
  49. [ 263.717713] pci 0001:01:00.0: bridge window [mem 0x6004000000000-0x6007f7ff0ffff 64bit pref]
  50. [ 263.717720] pci 0001:02:00.0: PCI bridge to [bus 03]
  51. [ 263.717727] pci 0001:02:00.0: bridge window [mem 0x600c080000000-0x600c0807fffff]
  52. [ 263.717732] pci 0001:02:00.0: bridge window [mem 0x6004000000000-0x60045ffffffff 64bit pref]
  53. [ 263.717747] amdgpu 0001:03:00.0: amdgpu: VRAM: 16368M 0x0000008000000000 - 0x00000083FEFFFFFF (16368M used)
  54. [ 263.717751] amdgpu 0001:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
  55. [ 263.717755] [drm] Detected VRAM RAM=16368M, BAR=16384M
  56. [ 263.717757] [drm] RAM width 256bits GDDR6
  57. [ 263.717820] [drm] amdgpu: 16368M of VRAM memory ready
  58. [ 263.717827] [drm] amdgpu: 16368M of GTT memory ready.
  59. [ 263.717838] [drm] GART: num cpu pages 8192, num gpu pages 131072
  60. [ 263.717950] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
  61. [ 272.048495] [drm] use_doorbell being set to: [true]
  62. [ 272.048552] [drm] use_doorbell being set to: [true]
  63. [ 272.048605] [drm] use_doorbell being set to: [true]
  64. [ 272.048662] [drm] use_doorbell being set to: [true]
  65. [ 272.048976] [drm] Found VCN firmware Version ENC: 1.3 DEC: 2 VEP: 0 Revision: 17
  66. [ 272.048986] [drm] PSP loading VCN firmware
  67. [ 272.273424] [drm] reserve 0xa00000 from 0x83fe000000 for PSP TMR
  68. [ 272.943503] amdgpu 0001:03:00.0: amdgpu: smu driver if version = 0x00000039, smu fw if version = 0x0000003b, smu fw version = 0x003a3100 (58.49.0)
  69. [ 272.943507] amdgpu 0001:03:00.0: amdgpu: SMU driver if version not matched
  70. [ 272.943517] amdgpu 0001:03:00.0: amdgpu: use vbios provided pptable
  71. [ 273.018737] amdgpu 0001:03:00.0: amdgpu: SMU is initialized successfully!
  72. [ 273.023894] [drm] kiq ring mec 2 pipe 1 q 0
  73. [ 273.085574] [drm] VCN decode and encode initialized successfully(under DPG Mode).
  74. [ 273.085784] [drm] JPEG decode initialized successfully.
  75. [ 273.086032] kfd kfd: Allocated 3969056 bytes on gart
  76. [ 273.086334] Virtual CRAT table created for GPU
  77. [ 273.086837] amdgpu: Topology: Add dGPU node [0x73bf:0x1002]
  78. [ 273.086845] kfd kfd: added device 1002:73bf
  79. [ 273.086850] amdgpu 0001:03:00.0: amdgpu: SE 4, SH per SE 2, CU per SH 10, active_cu_number 80
  80. [ 273.087044] amdgpu 0001:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
  81. [ 273.087048] amdgpu 0001:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
  82. [ 273.087051] amdgpu 0001:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
  83. [ 273.087055] amdgpu 0001:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
  84. [ 273.087058] amdgpu 0001:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
  85. [ 273.087062] amdgpu 0001:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
  86. [ 273.087065] amdgpu 0001:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
  87. [ 273.087069] amdgpu 0001:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
  88. [ 273.087072] amdgpu 0001:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
  89. [ 273.087076] amdgpu 0001:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
  90. [ 273.087079] amdgpu 0001:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
  91. [ 273.087083] amdgpu 0001:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
  92. [ 273.087086] amdgpu 0001:03:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
  93. [ 273.087089] amdgpu 0001:03:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
  94. [ 273.087093] amdgpu 0001:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
  95. [ 273.087096] amdgpu 0001:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
  96. [ 273.087100] amdgpu 0001:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
  97. [ 273.087103] amdgpu 0001:03:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
  98. [ 273.087106] amdgpu 0001:03:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
  99. [ 273.087110] amdgpu 0001:03:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
  100. [ 273.087113] amdgpu 0001:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
  101. [ 273.094373] EEH: Recovering PHB#1-PE#0
  102. [ 273.094380] EEH: PE location: UOPWR.D100020-Node0-SLOT1 PCIE 4.0 X16, PHB location: N/A
  103. [ 273.094385] EEH: Frozen PHB#1-PE#0 detected
  104. [ 273.094386] EEH: Call Trace:
  105. [ 273.094393] EEH: [0000000088d68852] __eeh_send_failure_event+0x7c/0x160
  106. [ 273.094396] EEH: [0000000053433783] eeh_dev_check_failure.part.0+0x254/0x5e0
  107. [ 273.094499] EEH: [000000000f3ba7f6] amdgpu_device_rreg+0x180/0x210 [amdgpu]
  108. [ 273.094627] EEH: [0000000069e7642c] mmhub_v2_0_set_clockgating+0x1f8/0x320 [amdgpu]
  109. [ 273.094738] EEH: [00000000a554a501] gmc_v10_0_set_clockgating_state+0x44/0xb0 [amdgpu]
  110. [ 273.094841] EEH: [0000000063a011e7] amdgpu_device_ip_late_init+0x150/0x7d0 [amdgpu]
  111. [ 273.094947] EEH: [00000000294ed418] amdgpu_device_init+0x19a8/0x1fc0 [amdgpu]
  112. [ 273.095051] EEH: [00000000273acd85] amdgpu_driver_load_kms+0x30/0x520 [amdgpu]
  113. [ 273.095153] EEH: [00000000f91deff0] amdgpu_pci_probe+0x18c/0x340 [amdgpu]
  114. [ 273.095158] EEH: [0000000028f6d7d4] local_pci_probe+0x68/0x110
  115. [ 273.095161] EEH: [00000000b5bc188e] work_for_cpu_fn+0x38/0x60
  116. [ 273.095163] EEH: [00000000bf267e16] process_one_work+0x300/0x5d0
  117. [ 273.095166] EEH: [00000000ac280537] worker_thread+0x360/0x780
  118. [ 273.095170] EEH: [00000000409ee3ee] kthread+0x1e4/0x1f0
  119. [ 273.095176] EEH: [000000001c930e8a] ret_from_kernel_thread+0x5c/0x6c
  120. [ 273.095178] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
  121. [ 273.095180] EEH: Notify device drivers to shutdown
  122. [ 273.095185] EEH: Beginning: 'error_detected(IO frozen)'
  123. [ 273.356962] [drm] Initialized amdgpu 3.40.0 20150101 for 0001:03:00.0 on minor 1
  124. [ 273.357162] PCI 0001:03:00.0#0000: EEH: Invoking amdgpu->error_detected(IO frozen)
  125. [ 273.357165] [drm] PCI error: detected callback, state(2)!!
  126. [ 273.357588] PCI 0001:03:00.0#0000: EEH: amdgpu driver reports: 'need reset'
  127. [ 273.357593] PCI 0001:03:00.1#0000: EEH: driver not EEH aware
  128. [ 273.357595] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset'
  129. [ 273.357601] EEH: Collect temporary log
  130. [ 273.357639] EEH: of node=0001:03:00.0
  131. [ 273.357642] EEH: PCI device/vendor: 73bf1002
  132. [ 273.357644] EEH: PCI cmd/status register: 00100546
  133. [ 273.357646] EEH: PCI-E capabilities and status follow:
  134. [ 273.357656] EEH: PCI-E 00: 0012a010 00008fa1 00002930 00440d04
  135. [ 273.357664] EEH: PCI-E 10: 11040040 00000000 00000000 00000000
  136. [ 273.357665] EEH: PCI-E 20: 00000000
  137. [ 273.357667] EEH: PCI-E AER capability register set follows:
  138. [ 273.357676] EEH: PCI-E AER 00: 20020001 00000000 00000000 00462030
  139. [ 273.357684] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
  140. [ 273.357691] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
  141. [ 273.357695] EEH: PCI-E AER 30: 00000000 00000000
  142. [ 273.357697] EEH: of node=0001:03:00.1
  143. [ 273.357700] EEH: PCI device/vendor: ab281002
  144. [ 273.357703] EEH: PCI cmd/status register: 00100546
  145. [ 273.357704] EEH: PCI-E capabilities and status follow:
  146. [ 273.357713] EEH: PCI-E 00: 0012a010 00008fa1 00002930 00440d04
  147. [ 273.357721] EEH: PCI-E 10: 11040040 00000000 00000000 00000000
  148. [ 273.357722] EEH: PCI-E 20: 00000000
  149. [ 273.357724] EEH: PCI-E AER capability register set follows:
  150. [ 273.357733] EEH: PCI-E AER 00: 2a020001 00000000 00000000 00462030
  151. [ 273.357740] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
  152. [ 273.357748] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
  153. [ 273.357751] EEH: PCI-E AER 30: 00000000 00000000
  154. [ 273.357754] PHB4 PHB#1 Diag-data (Version: 1)
  155. [ 273.357755] brdgCtl: 00000002
  156. [ 273.357757] RootSts: 00000020 00402000 a0440008 00100107 00001000
  157. [ 273.357759] RootErrSts: 00000000 00008000 00000000
  158. [ 273.357761] PhbSts: 0000001c00000000 0000001c00000000
  159. [ 273.357762] Lem: 0000000100280000 0000000000000000 0000000100000000
  160. [ 273.357764] PhbErr: 0000088000000000 0000008000000000 2148000098000240 a008400000000000
  161. [ 273.357766] RxeArbErr: 8000200000000000 0000200000000000 00009fde30000000 0000000000000000
  162. [ 273.357768] PblErr: 0000000008000000 0000000008000000 0000000000000000 0000000000000000
  163. [ 273.357770] PcieDlp: 0000000000000000 0000000000000000 b000000000000000
  164. [ 273.357771] RegbErr: 0000004000000000 0000004000000000 4800003c00000000 0000000000000200
  165. [ 273.357773] PE[000] A/B: a480002a03000000 8000000000000000
  166. [ 273.357776] EEH: Reset without hotplug activity
  167. [ 273.357779] EEH: Removing 0001:03:00.1 without EEH sensitive driver
  168. [ 273.463561] amdgpu 0001:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  169. [ 273.463564] amdgpu 0001:03:00.0: amdgpu: Failed to enable gfxoff!
  170. [ 273.488713] snd_hda_intel 0001:03:00.1: CORB reset timeout#2, CORBRP = 65535
  171. [ 273.948759] snd_hda_intel 0001:03:00.1: CORB reset timeout#2, CORBRP = 65535
  172. [ 274.353721] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
  173. [ 274.353738] snd_hda_codec_hdmi hdaudioC0D0: HDMI ATI/AMD: no speaker allocation for ELD
  174. [ 274.353755] snd_hda_codec_hdmi hdaudioC0D0: HDMI ATI/AMD: no speaker allocation for ELD
  175. [ 274.353769] snd_hda_codec_hdmi hdaudioC0D0: HDMI ATI/AMD: no speaker allocation for ELD
  176. [ 274.353782] snd_hda_codec_hdmi hdaudioC0D0: HDMI ATI/AMD: no speaker allocation for ELD
  177. [ 274.353795] snd_hda_codec_hdmi hdaudioC0D0: HDMI ATI/AMD: no speaker allocation for ELD
  178. [ 274.353807] snd_hda_codec_hdmi hdaudioC0D0: HDMI ATI/AMD: no speaker allocation for ELD
  179. [ 274.389593] [drm] Register(0) [mmUVD_PGFSM_STATUS] failed to reach value 0x00800000 != 0x00c00000
  180. [ 274.389649] [drm:jpeg_v3_0_set_powergating_state [amdgpu]] *ERROR* amdgpu: JPEG enable power gating failed
  181. [ 274.389694] [drm:amdgpu_device_ip_set_powergating_state [amdgpu]] *ERROR* set_powergating_state of IP block <jpeg_v3_0> failed -110
  182. [ 274.403707] amdgpu 0001:03:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx_0.0.0 (-110).
  183. [ 274.403771] [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).
  184. [ 274.625435] [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000003
  185. [ 274.861011] [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x7fffffff != 0xffffffff
  186. [ 275.097223] [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000003
  187. [ 275.332748] [drm] Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000003
  188. [ 275.568688] [drm] Register(1) [mmUVD_RBC_RB_RPTR] failed to reach value 0x7fffffff != 0xffffffff
  189. [ 275.804270] [drm] Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000003
  190. [ 275.804277] amdgpu 0001:03:00.0: amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
  191. [ 275.804279] amdgpu 0001:03:00.0: amdgpu: Failed to power gate VCN!
  192. [ 275.804336] [drm:amdgpu_dpm_enable_uvd [amdgpu]] *ERROR* Dpm disable uvd failed, ret = -5.
  193. [ 276.244073] pci 0001:03:00.1: Removing from iommu group 1
  194. [ 278.395265] amdgpu 0001:03:00.0: enabling device (0140 -> 0142)
  195. [ 278.401960] EEH: Sleep 5s ahead of partial hotplug
  196. [ 283.434989] pci 0001:03:00.1: [1002:ab28] type 00 class 0x040300
  197. [ 283.435009] pci 0001:03:00.1: reg 0x10: [mem 0x600c080120000-0x600c080123fff]
  198. [ 283.435067] pci 0001:03:00.1: BAR0 [mem size 0x00004000]: requesting alignment to 0x10000
  199. [ 283.435131] pci 0001:03:00.1: PME# supported from D1 D2 D3hot D3cold
  200. [ 283.435698] pci 0001:03:00.1: can't claim BAR 0 [mem size 0x00004000]: no address assigned
  201. [ 283.435706] pci 0001:03:00.1: BAR 0: assigned [mem 0x600c080120000-0x600c080123fff]
  202. [ 283.435711] pci 0001:02:00.0: PCI bridge to [bus 03]
  203. [ 283.435716] pci 0001:02:00.0: bridge window [mem 0x600c080000000-0x600c0807fffff]
  204. [ 283.435720] pci 0001:02:00.0: bridge window [mem 0x6004000000000-0x60045ffffffff 64bit pref]
  205. [ 283.435731] pci 0001:03:00.1: Added to existing PE#0
  206. [ 283.435738] pci 0001:03:00.1: Adding to iommu group 1
  207. [ 283.435833] pci 0001:03:00.1: D0 power state depends on 0001:03:00.0
  208. [ 283.435903] snd_hda_intel 0001:03:00.1: enabling device (0140 -> 0142)
  209. [ 283.435912] snd_hda_intel 0001:03:00.1: Force to snoop mode by module option
  210. [ 283.435956] EEH: Beginning: 'slot_reset'
  211. [ 283.435961] PCI 0001:03:00.0#0000: EEH: Invoking amdgpu->slot_reset()
  212. [ 283.435963] [drm] PCI error: slot reset callback!!
  213. [ 283.442319] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:00.0/0001:03:00.1/sound/card0/input11
  214. [ 283.442436] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:00.0/0001:03:00.1/sound/card0/input12
  215. [ 283.442513] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:00.0/0001:03:00.1/sound/card0/input13
  216. [ 283.442587] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:00.0/0001:03:00.1/sound/card0/input14
  217. [ 283.442658] input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:00.0/0001:03:00.1/sound/card0/input15
  218. [ 283.442730] input: HDA ATI HDMI HDMI/DP,pcm=11 as /devices/pci0001:00/0001:00:00.0/0001:01:00.0/0001:02:00.0/0001:03:00.1/sound/card0/input16
  219. [ 284.283468] [drm] free PSP TMR buffer
  220. [ 284.304489] amdgpu 0001:03:00.0: amdgpu: GPU reset succeeded, trying to resume
  221. [ 284.304576] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
  222. [ 284.304600] [drm] VRAM is lost due to GPU reset!
  223. [ 284.305078] [drm] PSP is resuming...
  224. [ 284.544795] [drm] reserve 0xa00000 from 0x83fe000000 for PSP TMR
  225. [ 285.204874] amdgpu 0001:03:00.0: amdgpu: SMU is resuming...
  226. [ 285.204882] amdgpu 0001:03:00.0: amdgpu: smu driver if version = 0x00000039, smu fw if version = 0x0000003b, smu fw version = 0x003a3100 (58.49.0)
  227. [ 285.204885] amdgpu 0001:03:00.0: amdgpu: SMU driver if version not matched
  228. [ 285.275239] amdgpu 0001:03:00.0: amdgpu: failed send message: GetDpmFreqByIndex (31) param: 0x000500ff response 0xfffffffb
  229. [ 285.275242] amdgpu 0001:03:00.0: amdgpu: [smu_v11_0_set_single_dpm_table] failed to get dpm levels!
  230. [ 285.275244] amdgpu 0001:03:00.0: amdgpu: Failed to setup default dpm clock tables!
  231. [ 285.275246] amdgpu 0001:03:00.0: amdgpu: Failed to setup default dpm clock tables!
  232. [ 285.275248] amdgpu 0001:03:00.0: amdgpu: Failed to setup smc hw!
  233. [ 285.275315] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -5
  234. [ 285.275397] [drm:amdgpu_pci_slot_reset [amdgpu]] *ERROR* PCIe error recovery failed, err:-5
  235. [ 285.275401] PCI 0001:03:00.0#0000: EEH: amdgpu driver reports: 'disconnect'
  236. [ 285.275406] PCI 0001:03:00.1#0000: EEH: driver not EEH aware
  237. [ 285.275408] EEH: Finished:'slot_reset' with aggregate recovery state:'disconnect'
  238. [ 285.275410] EEH: Unable to recover from failure from PHB#1-PE#0.
  239. Please try reseating or replacing it
  240. [ 285.275455] EEH: of node=0001:03:00.0
  241. [ 285.275458] EEH: PCI device/vendor: 73bf1002
  242. [ 285.275461] EEH: PCI cmd/status register: 00100546
  243. [ 285.275463] EEH: PCI-E capabilities and status follow:
  244. [ 285.275474] EEH: PCI-E 00: 0012a010 00008fa1 00002930 00440d04
  245. [ 285.275483] EEH: PCI-E 10: 11040040 00000000 00000000 00000000
  246. [ 285.275484] EEH: PCI-E 20: 00000000
  247. [ 285.275486] EEH: PCI-E AER capability register set follows:
  248. [ 285.275496] EEH: PCI-E AER 00: 20020001 00000000 00000000 00462030
  249. [ 285.275505] EEH: PCI-E AER 10: 00000000 00002000 000001f4 60008002
  250. [ 285.275513] EEH: PCI-E AER 20: 000000ff 00060044 00000458 00000000
  251. [ 285.275517] EEH: PCI-E AER 30: 00000000 00000000
  252. [ 285.275520] EEH: of node=0001:03:00.1
  253. [ 285.275522] EEH: PCI device/vendor: ab281002
  254. [ 285.275525] EEH: PCI cmd/status register: 00100546
  255. [ 285.275527] EEH: PCI-E capabilities and status follow:
  256. [ 285.275537] EEH: PCI-E 00: 0012a010 00008fa1 00002930 00440d04
  257. [ 285.275545] EEH: PCI-E 10: 11040000 00000000 00000000 00000000
  258. [ 285.275547] EEH: PCI-E 20: 00000000
  259. [ 285.275548] EEH: PCI-E AER capability register set follows:
  260. [ 285.275558] EEH: PCI-E AER 00: 2a020001 00000000 00000000 00462030
  261. [ 285.275567] EEH: PCI-E AER 10: 00000000 00002000 000001f4 60008002
  262. [ 285.275575] EEH: PCI-E AER 20: 000000ff 00060044 00000458 00000000
  263. [ 285.275579] EEH: PCI-E AER 30: 00000000 00000000
  264. [ 285.275581] PHB4 PHB#1 Diag-data (Version: 1)
  265. [ 285.275582] brdgCtl: 00000002
  266. [ 285.275585] RootSts: 00000020 00402000 a0440008 00100107 00005000
  267. [ 285.275587] RootErrSts: 00000024 00008000 00000000
  268. [ 285.275588] sourceId: 03010000
  269. [ 285.275590] PhbSts: 0000001c00000000 0000001c00000000
  270. [ 285.275592] Lem: 0000000104280000 0000000000000000 0000000100000000
  271. [ 285.275594] PhbErr: 0000088000000000 0000008000000000 2148000098000240 a008400000000000
  272. [ 285.275596] RxeArbErr: 8000200000000020 0000200000000000 00009fde30000000 0000000000000000
  273. [ 285.275598] PblErr: 0000000008000000 0000000008000000 0000000000000000 0000000000000000
  274. [ 285.275600] PcieDlp: 0000000000000000 0000000000000000 b000000000000000
  275. [ 285.275602] RegbErr: 0000004000000000 0000004000000000 4800003c00000000 0000000000000200
  276. [ 285.275604] PE[000] A/B: a480002a03000000 8000000000000000
  277. [ 285.275607] EEH: Beginning: 'error_detected(permanent failure)'
  278. [ 285.275610] PCI 0001:03:00.0#0000: EEH: not actionable (1,1,1)
  279. [ 285.275613] PCI 0001:03:00.1#0000: EEH: not actionable (1,1,1)
  280. [ 285.275615] EEH: Finished:'error_detected(permanent failure)'
  281. [ 286.001810] pci 0001:03:00.1: Removing from iommu group 1
  282. [ 286.001983] [drm:amdgpu_pci_remove [amdgpu]] *ERROR* Hotplug removal is not supported
  283. [ 286.002383] amdgpu 0001:03:00.0: amdgpu: amdgpu: finishing device.
  284. [ 290.430911] amdgpu: cp queue pipe 4 queue 0 preemption failed
  285. [ 290.871333] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 00000000d8d7cfd5; ring_buffer_end = 000000004bc2dd70; write_frame = 00000000415de82c
  286. [ 290.871376] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
  287. [ 291.201813] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 00000000d8d7cfd5; ring_buffer_end = 000000004bc2dd70; write_frame = 00000000415de82c
  288. [ 291.201876] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
  289. [ 292.408325] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 00000000d8d7cfd5; ring_buffer_end = 000000004bc2dd70; write_frame = 00000000415de82c
  290. [ 292.408380] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
  291. [ 292.848782] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 00000000d8d7cfd5; ring_buffer_end = 000000004bc2dd70; write_frame = 00000000415de82c
  292. [ 292.848846] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
  293. [ 293.179174] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 00000000d8d7cfd5; ring_buffer_end = 000000004bc2dd70; write_frame = 00000000415de82c
  294. [ 293.179217] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
  295. [ 293.179225] [drm] free PSP TMR buffer
  296. [ 293.513528] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 00000000d8d7cfd5; ring_buffer_end = 000000004bc2dd70; write_frame = 00000000415de82c
  297. [ 293.513593] [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
  298. [ 297.650869] BUG: Unable to handle kernel data access on read at 0xf0a803030303a898
  299. [ 297.650872] Faulting instruction address: 0xc000000000cc8298
  300. [ 297.650875] Oops: Kernel access of bad area, sig: 11 [#1]
  301. [ 297.650877] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  302. [ 297.650879] Modules linked in: amdgpu mfd_core gpu_sched xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp tun bridge stp llc nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter rfkill ip6_tables iptable_filter snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_usb_audio snd_hda_codec at24 regmap_i2c snd_hda_core snd_usbmidi_lib snd_rawmidi snd_hwdep snd_seq joydev snd_seq_device crct10dif_vpmsum snd_pcm mc ofpart ipmi_powernv ipmi_devintf ipmi_msghandler powernv_flash snd_timer mtd rtc_opal snd opal_prd i2c_opal soundcore zram ip_tables ast drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea
  303. [ 297.650935] sysfillrect sysimgblt fb_sys_fops cec drm tg3 vmx_crypto i2c_core crc32c_vpmsum drm_panel_orientation_quirks nvme nvme_core sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi fuse scsi_transport_iscsi
  304. [ 297.650959] CPU: 23 PID: 177 Comm: eehd Not tainted 5.10.20-200.fc33.ppc64le #1
  305. [ 297.650961] NIP: c000000000cc8298 LR: c000000000cc8bb0 CTR: c000000000cc8b30
  306. [ 297.650963] REGS: c000000010e67630 TRAP: 0380 Not tainted (5.10.20-200.fc33.ppc64le)
  307. [ 297.650965] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84002822 XER: 00000000
  308. [ 297.650973] CFAR: c000000000cc8bac IRQMASK: 0
  309. GPR00: c000000000cc8bb0 c000000010e678c0 c0000000023dc800 f0a803030303a880
  310. GPR04: 00000000000000c0 00000000c0000000 c00000000303a830 c00000000171f338
  311. GPR08: 003ffff800000201 c00000000171f338 c008000004190000 c008000005f28338
  312. GPR12: c000000000cc8b30 c000000fff6e7000 c0000000001af288 c000000010c704c0
  313. GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  314. GPR20: 0000000000000000 c00000001ee96d90 c00000001ee85b70 c00000001ee85b90
  315. GPR24: c00000001ee85b98 c00000001ee85b88 0000000000000000 c0080000060c8dc8
  316. GPR28: 0000000000000003 0000000000000000 c00000001ee80000 f0a803030303a880
  317. [ 297.651005] NIP [c000000000cc8298] free_fw_priv+0x28/0x280
  318. [ 297.651007] LR [c000000000cc8bb0] release_firmware+0x80/0xe0
  319. [ 297.651009] Call Trace:
  320. [ 297.651011] [c000000010e67930] [c000000000cc8bb0] release_firmware+0x80/0xe0
  321. [ 297.651062] [c000000010e67960] [c008000005b96b48] psp_sw_fini+0x90/0x120 [amdgpu]
  322. [ 297.651116] [c000000010e679a0] [c008000005f1fe48] amdgpu_device_fini+0x3d0/0x630 [amdgpu]
  323. [ 297.651151] [c000000010e67a60] [c008000005acce70] amdgpu_driver_unload_kms+0x1c8/0x330 [amdgpu]
  324. [ 297.651185] [c000000010e67aa0] [c008000005ac08bc] amdgpu_pci_remove+0x64/0xa0 [amdgpu]
  325. [ 297.651189] [c000000010e67b10] [c000000000b3c158] pci_device_remove+0x68/0x120
  326. [ 297.651192] [c000000010e67b50] [c000000000c93688] device_release_driver_internal+0x2f8/0x410
  327. [ 297.651195] [c000000010e67ba0] [c000000000b26668] pci_stop_and_remove_bus_device+0xb8/0x110
  328. [ 297.651198] [c000000010e67be0] [c0000000000732f0] pci_hp_remove_devices+0x90/0x130
  329. [ 297.651201] [c000000010e67c70] [c00000000004e9c0] eeh_handle_normal_event+0x510/0xa40
  330. [ 297.651203] [c000000010e67d50] [c00000000004fdd8] eeh_event_handler+0x118/0x1a0
  331. [ 297.651206] [c000000010e67db0] [c0000000001af464] kthread+0x1e4/0x1f0
  332. [ 297.651208] [c000000010e67e20] [c00000000000d4f0] ret_from_kernel_thread+0x5c/0x6c
  333. [ 297.651210] Instruction dump:
  334. [ 297.651212] 60000000 4bffffd8 3c4c0171 38424590 7c0802a6 60000000 7c0802a6 fbe1fff8
  335. [ 297.651218] fbc1fff0 7c7f1b78 f8010010 f821ff91 <ebc30018> 7fc3f378 48601309 60000000
  336. [ 297.651226] ---[ end trace 87a3804e7d686ea3 ]---