- **Edit**
- Ha. I downgraded the kernel to:
- ```
- > uname -a
- Linux ren 6.14.2 #1-NixOS SMP PREEMPT_DYNAMIC Thu Apr 10 12:44:49 UTC 2025 x86_64 GNU/Linux
- ```
- and evacuation works:
- ```
- > sudo bcachefs device evacuate /dev/nvme0n1p2
- Setting /dev/nvme0n1p2 readonly
- 0% complete: current position btree extents:25828954:26160
- ```
- Ooops. But this does not look OK:
- ```
- [ 63.966285] bcachefs (a933c02c-19d2-40d7-b5d7-42892bd5e154): Error setting device state: device_state_not_allowed
- [ 67.870661] bcachefs (nvme0n1p2): ro
- [ 77.215213] ------------[ cut here ]------------
- [ 77.215217] kernel BUG at fs/bcachefs/btree_update_interior.c:1785!
- [ 77.215226] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
- [ 77.215230] CPU: 30 UID: 0 PID: 4637 Comm: bcachefs Not tainted 6.14.2 #1-NixOS
- [ 77.215233] Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, BIOS 1809 09/28/2023
- [ 77.215235] RIP: 0010:bch2_btree_insert_node+0x50f/0x6c0 [bcachefs]
- [ 77.215270] Code: c8 49 8b 7f 08 41 0f b7 47 3a eb 82 48 8b 5d c8 49 8b 7f 08 4d 8b 84 24 98 00 00 00 41 0f b7 47 3a e9 68 ff ff ff 90 0f 0b 90 <0f> 0b 90 0f 0b 31 c9 4c 89 e2 48 89 de 4c 89 ff e8 2c d8 fe ff 89
- [ 77.215272] RSP: 0018:ffffafe748823b40 EFLAGS: 00010293
- [ 77.215275] RAX: 0000000000000000 RBX: ffff8ea82b4d41f8 RCX: 0000000000000002
- [ 77.215277] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8ea885846000
- [ 77.215278] RBP: ffffafe748823b90 R08: ffff8ea885846d50 R09: 0000000000000000
- [ 77.215279] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8ea602757200
- [ 77.215280] R13: ffff8ea885846000 R14: 0000000000000001 R15: ffff8ea82b4d4000
- [ 77.215282] FS: 0000000000000000(0000) GS:ffff8eb51e700000(0000) knlGS:0000000000000000
- [ 77.215283] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
- [ 77.215285] CR2: 000000c001b64000 CR3: 000000015ce22000 CR4: 0000000000f50ef0
- [ 77.215286] PKRU: 55555554
- [ 77.215287] Call Trace:
- [ 77.215291] <TASK>
- [ 77.215295] ? srso_alias_return_thunk+0x5/0xfbef5
- [ 77.215301] bch2_btree_node_rewrite+0x1b3/0x370 [bcachefs]
- [ 77.215323] bch2_move_btree.isra.0+0x30d/0x490 [bcachefs]
- [ 77.215355] ? __pfx_migrate_btree_pred+0x10/0x10 [bcachefs]
- [ 77.215378] ? bch2_move_btree.isra.0+0x106/0x490 [bcachefs]
- [ 77.215402] ? __pfx_bch2_data_thread+0x10/0x10 [bcachefs]
- [ 77.215426] bch2_data_job+0x10a/0x2f0 [bcachefs]
- [ 77.215450] bch2_data_thread+0x4a/0x70 [bcachefs]
- [ 77.215472] kthread+0xeb/0x250
- ```
- **Original post**
- My one and only NVMe drive started reporting SMART errors. Great, time for my choice of bcachefs to save me now! I ordered another one, added it to the file system (thanks to two M.2 slots), and set metadata replicas to 2; I thought I could live with the possibility of some data loss, so I just kept it that way. But after a few days of seeing even more smartd errors, I decided to just replace the failing drive with a new one.
- I ordered another one, and now I want to remove the failing one from the fs so I can swap the drives in the NVMe slot.
- My understanding is that I should `device evacuate`, then `device remove`, and then I'm OK to swap. But I can't:
- ```
- > sudo bcachefs device evacuate /dev/nvme0n1p2
- Setting /dev/nvme0n1p2 readonly
- BCH_IOCTL_DISK_SET_STATE ioctl error: Invalid argument
- > sudo dmesg | tail -n 3
- [ 241.528859] bcachefs (a933c02c-19d2-40d7-b5d7-42892bd5e154): Error setting device state: device_state_not_allowed
- [ 361.951314] block nvme0n1: No UUID available providing old NGUID
- [ 498.032801] bcachefs (a933c02c-19d2-40d7-b5d7-42892bd5e154): Error setting device state: device_state_not_allowed
- ```
- ```
- > sudo bcachefs device remove /dev/nvme0n1p2
- BCH_IOCTL_DISK_REMOVE ioctl error: Invalid argument
- > sudo dmesg | tail -n 3
- [ 361.951314] block nvme0n1: No UUID available providing old NGUID
- [ 498.032801] bcachefs (a933c02c-19d2-40d7-b5d7-42892bd5e154): Error setting device state: device_state_not_allowed
- [ 585.233829] bcachefs (nvme0n1p2): Cannot remove without losing data
- ```
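The sequence I had in mind (evacuate, then remove, then physically swap) can be sketched as a small shell helper. This is only a sketch under assumptions: the function name `plan_device_swap` and the `DRY_RUN` variable are made up for illustration, and it uses only the two subcommands shown in this post. By default it just prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of bcachefs-tools): evacuate a failing
# bcachefs member device, then remove it from the filesystem.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them.
plan_device_swap() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: plan_device_swap <device>" >&2
        return 2
    fi
    for sub in evacuate remove; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo bcachefs device $sub $dev"
        else
            # Stop at the first failure so 'remove' never runs
            # after a failed 'evacuate'.
            sudo bcachefs device "$sub" "$dev" || return 1
        fi
    done
}
```

Calling `plan_device_swap /dev/nvme0n1p2` in dry-run mode prints the evacuate command followed by the remove command, which is the order I expected to work.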
- I tried:
- ```
- > sudo bcachefs data rereplicate /
- ```
- It completed, but did not change anything.
- I also tried `set-state failed`, and possibly some other things, with no result.
- ```
- > sudo bcachefs show-super /dev/nvme1n1p2
- Device: (unknown device)
- External UUID: a933c02c-19d2-40d7-b5d7-42892bd5e154
- Internal UUID: 61d26938-b11f-42f0-8968-372a21e8b739
- Magic number: c68573f6-66ce-90a9-d96a-60cf803df7ef
- Device index: 1
- Label: (none)
- Version: 1.25: (unknown version)
- Version upgrade complete: 1.25: (unknown version)
- Oldest version on disk: 1.3: rebalance_work
- Created: Sun Jan 28 21:07:10 2024
- Sequence number: 383
- Time of last write: Mon May 5 16:48:37 2025
- Superblock size: 5.30 KiB/1.00 MiB
- Clean: 0
- Devices: 2
- Sections: members_v1,crypt,replicas_v0,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
- Features: journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
- Compat features: alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done
- Options:
- block_size: 512 B
- btree_node_size: 256 KiB
- errors: continue [fix_safe] panic ro
- metadata_replicas: 2
- data_replicas: 1
- metadata_replicas_required: 1
- data_replicas_required: 1
- encoded_extent_max: 64.0 KiB
- metadata_checksum: none [crc32c] crc64 xxhash
- data_checksum: none [crc32c] crc64 xxhash
- compression: none
- background_compression: none
- str_hash: crc32c crc64 [siphash]
- metadata_target: none
- foreground_target: none
- background_target: none
- promote_target: none
- erasure_code: 0
- inodes_32bit: 1
- shard_inode_numbers: 1
- inodes_use_key_cache: 1
- gc_reserve_percent: 8
- gc_reserve_bytes: 0 B
- root_reserve_percent: 0
- wide_macs: 0
- promote_whole_extents: 0
- acl: 1
- usrquota: 0
- grpquota: 0
- prjquota: 0
- journal_flush_delay: 1000
- journal_flush_disabled: 0
- journal_reclaim_delay: 100
- journal_transaction_names: 1
- allocator_stuck_timeout: 30
- version_upgrade: [compatible] incompatible none
- nocow: 0
- members_v2 (size 304):
- Device: 0
- Label: (none)
- UUID: 8e6a97e3-33c6-4aad-ac45-6122ea1eb394
- Size: 3.64 TiB
- read errors: 1067
- write errors: 0
- checksum errors: 0
- seqread iops: 0
- seqwrite iops: 0
- randread iops: 0
- randwrite iops: 0
- Bucket size: 512 KiB
- First bucket: 0
- Buckets: 7629918
- Last mount: Mon May 5 16:48:37 2025
- Last superblock write: 383
- State: rw
- Data allowed: journal,btree,user
- Has data: journal,btree,user
- Btree allocated bitmap blocksize: 128 MiB
- Btree allocated bitmap: 0000000000011111111111111111111111111111111111111111111111111111
- Durability: 1
- Discard: 0
- Freespace initialized: 1
- Device: 1
- Label: (none)
- UUID: 4bd08f3b-030e-4cd1-8b1e-1f3c8662b455
- Size: 3.72 TiB
- read errors: 0
- write errors: 0
- checksum errors: 0
- seqread iops: 0
- seqwrite iops: 0
- randread iops: 0
- randwrite iops: 0
- Bucket size: 1.00 MiB
- First bucket: 0
- Buckets: 3906505
- Last mount: Mon May 5 16:48:37 2025
- Last superblock write: 383
- State: rw
- Data allowed: journal,btree,user
- Has data: journal,btree,user
- Btree allocated bitmap blocksize: 32.0 MiB
- Btree allocated bitmap: 0000010000000000000000000000000000000000000000100000000000101111
- Durability: 1
- Discard: 0
- Freespace initialized: 1
- errors (size 184):
- btree_node_bset_older_than_sb_min 1 Sat Apr 27 17:18:02 2024
- fs_usage_data_wrong 1 Sat Apr 27 17:20:43 2024
- fs_usage_replicas_wrong 1 Sat Apr 27 17:20:48 2024
- dev_usage_sectors_wrong 1 Sat Apr 27 17:20:36 2024
- dev_usage_fragmented_wrong 1 Sat Apr 27 17:20:39 2024
- alloc_key_dirty_sectors_wrong 3 Sat Apr 27 17:20:35 2024
- bucket_sector_count_overflow 1 Sat Apr 27 16:42:51 2024
- backpointer_to_missing_ptr 5 Sat Apr 27 17:21:53 2024
- ptr_to_missing_backpointer 2 Sat Apr 27 17:21:57 2024
- key_in_missing_inode 5 Sat Apr 27 17:22:48 2024
- accounting_key_version_0 8 Fri Oct 25 19:00:01 2024
- ```
- Am I hitting a bug, or just confused about something?
- `nvme0` is the failing drive, `nvme1` is the new one I just added. Another drive waits in the box to replace `nvme0`.
- ```
- > bcachefs version
- 1.13.0
- > uname -a
- Linux ren 6.15.0-rc1 #1-NixOS SMP PREEMPT_DYNAMIC Tue Jan 1 00:00:00 UTC 1980 x86_64 GNU/Linux
- ```
- Upgraded:
- ```
- > bcachefs version
- 1.25.1
- ```
- but that does not seem to change anything.
- Did the scrub:
- ```
- > sudo bcachefs data scrub /
- Starting scrub on 2 devices: nvme0n1p2 nvme1n1p2
- device      checked   corrected  uncorrected  total
- nvme0n1p2   1.93 TiB  0 B        192 KiB      34.6 GiB  5721% complete
- nvme1n1p2   175 GiB   0 B        0 B          34.6 GiB  505% complete
- ```
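The percentages in the scrub output look odd, but they appear to be consistent with plain `checked / total`, which would suggest that the `total` column (34.6 GiB) is the suspicious value rather than the progress calculation. A quick check of that arithmetic, assuming binary units (1 TiB = 1024 GiB); the helper name `scrub_pct` is made up for this illustration:

```shell
#!/bin/sh
# Hypothetical helper: percentage of "checked" relative to "total",
# both given in GiB, truncated to an integer.
scrub_pct() {
    awk -v c="$1" -v t="$2" 'BEGIN { printf "%d\n", c / t * 100 }'
}

# nvme1n1p2: 175 GiB checked out of a reported 34.6 GiB total
scrub_pct 175 34.6

# nvme0n1p2: 1.93 TiB checked, converted to GiB first
scrub_pct "$(awk 'BEGIN { print 1.93 * 1024 }')" 34.6
```

The first call gives 505, matching the reported 505%. The second gives 5711, close to the reported 5721%; the small gap is plausible given that the sizes shown in the output are rounded.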