Bug #48036
closedbluefs corrupted in a OSD
0%
Description
I hit a problem that is very similar to the following issue/PR in v15.2.4.
upgrade/nautilus-x-master: bluefs mount failed to replay log: (14) Bad address during upgrade
https://tracker.ceph.com/issues/46886
os/bluestore: get rid of obsolete stuff in bluefs. #36745 (it's not backported to Octopus)
https://github.com/ceph/ceph/pull/36745
In addition, This PR is not backported to Octopus. If my problem is the same as issue#46886, please backport PR#36754 to Octopus.
Here is the detail.
"ceph-bluestore-tool bluefs-bdev-expand --path <osdDataPath>" failed with the following log (the full log is attached as "osd-9-bluefs.txt").
```
2020-10-28T02:07:55.499+0000 7f0aa7f1e0c0 -1 bluefs _check_new_allocations invalid extent 1: 0x7ae18a0000~10000: duplicate reference, ino 26
2020-10-28T02:07:55.500+0000 7f0aa7f1e0c0 -1 bluefs mount failed to replay log: (14) Bad address
2020-10-28T02:07:55.500+0000 7f0aa7f1e0c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) _open_bluefs failed bluefs mount: (14) Bad address
```
The tail of the bluefs's journal log is as follows (the full log is attached as "bluefs-dump.zip").
```
...
2020-10-28T08:39:58.788+0000 7f79675c70c0 10 bluefs replay 0x263000: txn(seq 51924 len 0xc8 crc 0x4314d6b)
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_dir_unlink db/CURRENT
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_file_remove 25
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_dir_link db/CURRENT to 30
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 30 = 0x56375176dad0
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_dir_unlink db/000020.dbtmp
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_file_update file(ino 31 size 0x0 mtime 2020-10-27T23:10:42.718117+0000 allocated 0 extents [])
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 31 = 0x56375176c0d0 (new)
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_dir_link db.wal/000021.log to 31
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 31 = 0x56375176c0d0
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_dir_unlink db.wal/000015.log
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_file_remove 26 ... (1)
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_dir_unlink db/MANIFEST-000014
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x263000: op_file_remove 24
2020-10-28T08:39:58.788+0000 7f79675c70c0 10 bluefs _read h 0x563751752d80 0x264000~1000 from file(ino 1 size 0x264000 mtime 0.000000 allocated 410000 extents [1:0x7ae18b0000~10000,1:0x7ae1470000~400000])
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _read left 0xac000 len 0x1000
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _read result chunk (0x1000 bytes):
00000000 01 01 80 00 00 00 1c 67 9e 3f 31 cb 40 5b bc f4 |.......g.?1.@[..|
00000010 49 4f 77 68 e7 24 d5 ca 00 00 00 00 00 00 60 00 |IOwh.$........`.|
00000020 00 00 08 01 01 0f 00 00 00 20 00 f2 a8 98 5f 7c |......... ....||
00000030 5b ec 36 00 00 00 00 00 04 02 00 00 00 64 62 14 |[.6..........db.|
00000040 00 00 00 4f 50 54 49 4f 4e 53 2d 30 30 30 30 32 |...OPTIONS-00002|
00000050 32 2e 64 62 74 6d 70 20 00 00 00 00 00 00 00 08 |2.dbtmp ........|
00000060 01 01 1c 00 00 00 20 d6 26 f2 a8 98 5f 30 83 f2 |...... .&..._0..|
00000070 3a 00 01 00 00 00 01 01 06 00 00 00 29 86 eb 01 |:...........)...|
00000080 43 01 29 e6 5d fa 00 00 00 00 00 00 00 00 00 00 |C.).]...........|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001000
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _read got 4096
2020-10-28T08:39:58.788+0000 7f79675c70c0 10 bluefs _replay 0x264000: txn(seq 51925 len 0x60 crc 0xfa5de629)
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x264000: op_file_update file(ino 32 size 0x0 mtime 2020-10-27T23:10:42.921459+0000 allocated 0 extents [])
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 32 = 0x56375176cd00 (new)
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x264000: op_dir_link db/OPTIONS-000022.dbtmp to 32
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 32 = 0x56375176cd00
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x264000: op_file_update file(ino 32 size 0x1356 mtime 2020-10-27T23:10:42.988971+0000 allocated 10000 extents [1:0x7ae18a0000~10000]) ... (2)
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 32 = 0x56375176cd00
2020-10-28T08:39:58.788+0000 7f79675c70c0 10 bluefs _read h 0x563751752d80 0x265000~1000 from file(ino 1 size 0x265000 mtime 0.000000 allocated 410000 extents [1:0x7ae18b0000~10000,1:0x7ae1470000~400000])
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _read left 0xab000 len 0x1000
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _read result chunk (0x1000 bytes):
00000000 01 01 82 00 00 00 1c 67 9e 3f 31 cb 40 5b bc f4 |.......g.?1.@[..|
00000010 49 4f 77 68 e7 24 d6 ca 00 00 00 00 00 00 62 00 |IOwh.$........b.|
00000020 00 00 04 02 00 00 00 64 62 0e 00 00 00 4f 50 54 |.......db....OPT|
00000030 49 4f 4e 53 2d 30 30 30 30 32 33 20 00 00 00 00 |IONS-000023 ....|
00000040 00 00 00 05 02 00 00 00 64 62 14 00 00 00 4f 50 |........db....OP|
00000050 54 49 4f 4e 53 2d 30 30 30 30 32 32 2e 64 62 74 |TIONS-000022.dbt|
00000060 6d 70 05 02 00 00 00 64 62 0e 00 00 00 4f 50 54 |mp.....db....OPT|
00000070 49 4f 4e 53 2d 30 30 30 30 31 34 09 16 00 00 00 |IONS-000014.....|
00000080 00 00 00 00 69 19 3e 68 00 00 00 00 00 00 00 00 |....i.>h........|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001000
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _read got 4096
2020-10-28T08:39:58.788+0000 7f79675c70c0 10 bluefs _replay 0x266000: txn(seq 51927 len 0x4b crc 0x9ae4c5e9)
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x266000: op_file_update file(ino 26 size 0x24e934 mtime 2020-10-23T08:29:44.981029+0000 allocated 250000 extents [1:0x7ae18a0000~10000,1:0x7ae18c0000~
10000,1:0x7ae1870000~20000,1:0x7ae1e10000~210000])
2020-10-28T08:39:58.788+0000 7f79675c70c0 30 bluefs _get_file ino 26 = 0x56375176cea0 (new)
2020-10-28T08:39:58.788+0000 7f79675c70c0 -1 bluefs _check_new_allocations invalid extent 1: 0x7ae18a0000~10000: duplicate reference, ino 26 ... (3)
2020-10-28T08:39:58.788+0000 7f79675c70c0 -1 bluefs mount failed to replay log: (14) Bad address
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _stop_alloc
```
It looks this bluefs corrupted after the following steps.
(1) delete ino 26
(2) allocate "0x7ae18a0000~10000" to ino 32
(3) allocate the same extent to already-removed ino 26 and detect duplicate reference.
In addition, the mtime of the last op_file_update of ino 26 is "2020-10-23T08:29:44.981029+0000".
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x266000: op_file_update file(ino 26 size 0x24e934 mtime 2020-10-23T08:29:44.981029+0000 allocated 250000 extents [1:0x7ae18a0000~10000,1:0x7ae18c0000~
However, it's far older than the previous ones.
2020-10-28T08:39:58.788+0000 7f79675c70c0 20 bluefs _replay 0x264000: op_file_update file(ino 32 size 0x1356 mtime 2020-10-27T23:10:42.988971+0000 allocated 10000 extents [1:0x7ae18a0000~10000])
Files