To: FreeBSD-gnats-submit@freebsd.org
From: swell.k@gmail.com
X-send-pr-version: 3.113
X-GNATS-Notify:


>Submitter-Id: current-users
>Originator: <swell.k@gmail.com>
>Organization: n/a
>Confidential: no
>Synopsis: [zfs] panic on concurrent writing & rollback
>Severity: non-critical
>Priority: low
>Category: kern
>Class: sw-bug
>Release: FreeBSD 8.0-CURRENT amd64
>Environment:
System: FreeBSD 8.0-CURRENT FreeBSD 8.0-CURRENT #2 r185244: Mon Nov 24 16:29:02 UTC 2008 luser@qemu:/usr/obj/usr/src/sys/TEST amd64

qemu-devel cmdline:
qemu-system-x86_64 -no-kqemu -m 512 -net nic,model=rtl8139 \
-net tap,ifname=tap0 -nographic -s -echr 0x03 scrap/freebsd-generic-amd64.qcow2

zpool upgrade shows version 13
zfs upgrade shows version 3

The box boots from gptzfsboot. There are no UFS partitions on it.

kernel config:
include GENERIC
options BREAK_TO_DEBUGGER
options DIAGNOSTIC
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
nooption WITNESS_SKIPSPIN

loader.conf:
autoboot_delay=0
beastie_disable=YES
zfs_load=YES
vfs.root.mountfrom="zfs:q"
kern.hz=100
hint.uart.0.flags=0x90

No kmem_size or prefetch_disable tunings are set here.

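For reference, those two knobs are the usual ZFS workaround tunables of this era and would look like the lines below in loader.conf; the values are illustrative only, not settings from this machine:

%%%
# not used in this report -- shown only for reference
vm.kmem_size="512M"
vfs.zfs.prefetch_disable=1
%%%
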
boot.config: -h -S115200

The entire system was built with __MAKE_CONF=/dev/null on the host machine.
No local patches were applied.

The host runs 8-CURRENT r185232M amd64, where `M' stands for a slightly
updated ZFS. It experiences a similar problem, as does another box on i386.

>Description:

Rolling back to a snapshot multiple times while files are concurrently
being written has a good chance of triggering a panic (see the script
under >How-To-Repeat below).
%%%
# sh crash.sh
lock order reversal:
1st 0xffffff0002888638 vnode interlock (vnode interlock) @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:3699
2nd 0xffffff0002429710 struct mount mtx (struct mount mtx) @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1050
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
_mtx_lock_flags() at _mtx_lock_flags+0x78
zfs_znode_free() at zfs_znode_free+0x84
zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a
VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0xb5
vinactive() at vinactive+0x90
vput() at vput+0x25c
vn_close() at vn_close+0xb9
vn_closefile() at vn_closefile+0x7d
_fdrop() at _fdrop+0x23
closef() at closef+0x4d
do_dup() at do_dup+0x351
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (90, FreeBSD ELF64, dup2), rip = 0x80093b08c, rsp = 0x7fffffffe328, rbp = 0x800b0d0a0 ---
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
vfs_badlock() at vfs_badlock+0x95
VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0xc8
vinactive() at vinactive+0x90
vput() at vput+0x25c
vn_close() at vn_close+0xb9
vn_closefile() at vn_closefile+0x7d
_fdrop() at _fdrop+0x23
closef() at closef+0x4d
do_dup() at do_dup+0x351
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (90, FreeBSD ELF64, dup2), rip = 0x80093b08c, rsp = 0x7fffffffe328, rbp = 0x800b0d0a0 ---
VOP_INACTIVE: 0xffffff00028884e0 interlock is locked but should not be
KDB: enter: lock violation
[thread pid 85 tid 100056 ]
Stopped at kdb_enter+0x3d: movq $0,0x65c598(%rip)

db> show all locks
Process 85 (sh) thread 0xffffff0002427390 (100056)
exclusive sleep mutex vnode interlock (vnode interlock) r = 0 (0xffffff0002888638) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:3699
exclusive lockmgr zfs (zfs) r = 0 (0xffffff0002888578) locked @ /usr/src/sys/kern/vfs_vnops.c:293

db> show lockedvnods
Locked vnodes

0xffffff00028884e0: tag zfs, type VREG
usecount 0, writecount 0, refcount 1 mountedhere 0
flags (VI_DOINGINACT)
VI_LOCKed v_object 0xffffff0002886960 ref 0 pages 0
lock type zfs: EXCL by thread 0xffffff0002427390 (pid 85)
#0 0xffffffff804dfc78 at __lockmgr_args+0x758
#1 0xffffffff8056de19 at vop_stdlock+0x39
#2 0xffffffff8080d77b at VOP_LOCK1_APV+0x9b
#3 0xffffffff805894a7 at _vn_lock+0x57
#4 0xffffffff8058a58e at vn_close+0x6e
#5 0xffffffff8058a6bd at vn_closefile+0x7d
#6 0xffffffff804c7443 at _fdrop+0x23
#7 0xffffffff804c8a6d at closef+0x4d
#8 0xffffffff804c9ed1 at do_dup+0x351
#9 0xffffffff807c9d27 at syscall+0x1e7
#10 0xffffffff807ac85b at Xfast_syscall+0xab

db> show all pcpu
Current CPU: 0

cpuid = 0
curthread = 0xffffff0002427390: pid 85 "sh"
curpcb = 0xfffffffe40180d50
fpcurthread = none
idlethread = 0xffffff00021cc720: pid 11 "idle: cpu0"
spin locks held:
%%%

The complete msgbuf with ps and alltrace output is here:
http://pastebin.com/f44ad88b3

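For reference, the extra information in that paste was gathered at the same ddb prompt; a sketch of the commands, with their (lengthy) output omitted:

%%%
db> ps
db> alltrace
%%%
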
The panic can also occur with a slightly different message:

%%%
# sh crash.sh
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x70
fault code = supervisor read data, page not present
instruction pointer = 0x8:0xffffffff804fb57a
stack pointer = 0x10:0xfffffffe401997a0
frame pointer = 0x10:0xfffffffe401997e0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, IOPL = 0
current process = 179 (zfs)
[thread pid 179 tid 100061 ]
Stopped at _sx_xlock+0x3a: movq 0x18(%rdi),%rax

db> bt
Tracing pid 179 tid 100061 td 0xffffff000245dab0
_sx_xlock() at _sx_xlock+0x3a
dmu_buf_update_user() at dmu_buf_update_user+0x47
zfs_znode_dmu_fini() at zfs_znode_dmu_fini+0x38
zfs_freebsd_reclaim() at zfs_freebsd_reclaim+0xbe
VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0xb5
vgonel() at vgonel+0x119
vflush() at vflush+0x284
zfs_umount() at zfs_umount+0x105
dounmount() at dounmount+0x2ed
unmount() at unmount+0x24b
syscall() at syscall+0x1e7
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (22, FreeBSD ELF64, unmount), rip = 0x800f401cc, rsp = 0x7fffffffe478, rbp = 0x801202300 ---

db> show all locks
Process 179 (zfs) thread 0xffffff000245dab0 (100061)
exclusive lockmgr zfs (zfs) r = 0 (0xffffff0002693098) locked @ /usr/src/sys/kern/vfs_subr.c:2358
exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff80b5eea0) locked @ /usr/src/sys/kern/vfs_mount.c:1139
exclusive lockmgr zfs (zfs) r = 0 (0xffffff0002693a58) locked @ /usr/src/sys/kern/vfs_mount.c:1207

db> show lockedvnods
Locked vnodes

0xffffff00026939c0: tag zfs, type VDIR
usecount 1, writecount 0, refcount 1 mountedhere 0xffffff0002432710
flags ()
lock type zfs: EXCL by thread 0xffffff000245dab0 (pid 179)
#0 0xffffffff804dfc78 at __lockmgr_args+0x758
#1 0xffffffff8056de19 at vop_stdlock+0x39
#2 0xffffffff8080d77b at VOP_LOCK1_APV+0x9b
#3 0xffffffff805894a7 at _vn_lock+0x57
#4 0xffffffff80577303 at dounmount+0x93
#5 0xffffffff80577adb at unmount+0x24b
#6 0xffffffff807c9d27 at syscall+0x1e7
#7 0xffffffff807ac85b at Xfast_syscall+0xab


0xffffff0002693000: tag zfs, type VREG
usecount 0, writecount 0, refcount 1 mountedhere 0
flags (VI_DOOMED)
lock type zfs: EXCL by thread 0xffffff000245dab0 (pid 179)
#0 0xffffffff804dfc78 at __lockmgr_args+0x758
#1 0xffffffff8056de19 at vop_stdlock+0x39
#2 0xffffffff8080d77b at VOP_LOCK1_APV+0x9b
#3 0xffffffff805894a7 at _vn_lock+0x57
#4 0xffffffff8057fecf at vflush+0x20f
#5 0xffffffff80f5f175 at zfs_umount+0x105
#6 0xffffffff8057755d at dounmount+0x2ed
#7 0xffffffff80577adb at unmount+0x24b
#8 0xffffffff807c9d27 at syscall+0x1e7
#9 0xffffffff807ac85b at Xfast_syscall+0xab

db> show all pcpu
Current CPU: 0

cpuid = 0
curthread = 0xffffff000245dab0: pid 179 "zfs"
curpcb = 0xfffffffe40199d50
fpcurthread = none
idlethread = 0xffffff00021cc720: pid 11 "idle: cpu0"
spin locks held:
%%%

Again, the full session with ps and alltrace included is here:
http://pastebin.com/f21e46723

BTW, here is a backup of this message in case it's mangled:
you're already looking at it ;)

>How-To-Repeat:

It's not entirely reliable, but the following script triggers the panic
quite often. If the panic doesn't occur within a minute, there is a
chance it will after you interrupt and restart the script.

%%%
#! /bin/sh
# crash.sh

PATH=/sbin:/bin

pool=q
dataset=test
snapshot=last
prefix=foo_
cycles=999999999

# start from a fresh dataset with one snapshot to roll back to
zfs destroy -r $pool/$dataset
zfs create $pool/$dataset
zfs snapshot $pool/$dataset@$snapshot

mountpoint=$(zfs get -Ho value mountpoint $pool/$dataset)

# run the given command in a background loop and remember its pid
loop() {
    local i=0
    while [ $((i+=1)) -lt $cycles ]; do
        eval "$@"
    done &
    pids="$pids $!"
}
trap 'kill $pids' int term exit

# juggle these: one loop creates files, the other rolls back
loop : \>$mountpoint/$prefix\${i}
loop zfs rollback $pool/$dataset@$snapshot

wait
%%%
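
If you'd rather not risk a real pool, a disposable file-backed pool can
stand in. This is only a sketch: it assumes no pool named `q' exists yet
(otherwise adjust pool= in the script) and that md0 is the first free
md(4) unit:

%%%
# throwaway file-backed pool for testing
truncate -s 256m /tmp/zcrash.img
mdconfig -a -t vnode -f /tmp/zcrash.img
zpool create q /dev/md0
sh crash.sh
%%%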

>Fix:
