Bug #23840
closed
Bluestore OSD hit assert((log_reader->buf.pos & ~super.block_mask()) == 0)
Added by Rafal Wadolowski about 6 years ago.
Updated almost 6 years ago.
Description
We're using Ceph 12.2.4, with db and wal on separate NVMe devices.
After a restart, some OSDs hit the following error:
-3> 2018-04-24 16:18:33.940657 7f95ee81de00 10 bluefs _replay 0xffffe000: txn(seq 1550851 len 0xa6a crc 0x8cdd70c6)
-2> 2018-04-24 16:18:33.940663 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xfffff000~1000 from file(ino 1 size 0xfffff000 mtime 0.000000 bdev 0 extents
-1> 2018-04-24 16:18:33.940884 7f95ee81de00 10 bluefs _replay 0xfffff000: txn(seq 1550852 len 0xa6a crc 0x5e1c1b4f)
0> 2018-04-24 16:18:33.943750 7f95ee81de00 -1 /build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_replay(bool)' thread 7f95ee81de00 time 2018-04-24 16:18:33.940909
/build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: 551: FAILED assert((log_reader->buf.pos & ~super.block_mask()) == 0)
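For context, the failed assert is an alignment check: BlueFS replays its journal one block at a time and expects the reader's position to land exactly on a block boundary. A minimal sketch of that check, with hypothetical values (the real check lives in BlueFS::_replay, BlueFS.cc:551):

#include <cassert>
#include <cstdint>

// Minimal sketch of the alignment check behind the assert; values are
// hypothetical, not taken from the actual BlueFS code.
int main() {
  const uint64_t block_size = 0x1000;            // 4 KiB BlueFS block
  const uint64_t block_mask = ~(block_size - 1); // analogous to super.block_mask()
  uint64_t pos = 0xfffff000;                     // a block-aligned reader position
  assert((pos & ~block_mask) == 0);              // passes: low 12 bits are zero
  pos = 0xfffff000 + 0xa6a;                      // mid-transaction, not aligned
  // assert((pos & ~block_mask) == 0);           // would fire, like the crash above
  return 0;
}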
Right now some PGs are inactive.
The full log is attached.
What is interesting: it looks like reads of the file are capped at 0xffffffff.
-11> 2018-04-24 16:18:33.939739 7f95ee81de00 10 bluefs _replay 0xffffa000: txn(seq 1550847 len 0xa6a crc 0x95281bbd)
-10> 2018-04-24 16:18:33.939745 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffb000~1000 from file(ino 1 size 0xffffb000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
-9> 2018-04-24 16:18:33.939970 7f95ee81de00 10 bluefs _replay 0xffffb000: txn(seq 1550848 len 0xa6a crc 0x86579e18)
-8> 2018-04-24 16:18:33.939976 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffc000~1000 from file(ino 1 size 0xffffc000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
-7> 2018-04-24 16:18:33.940196 7f95ee81de00 10 bluefs _replay 0xffffc000: txn(seq 1550849 len 0xa6a crc 0x61eaa8e2)
-6> 2018-04-24 16:18:33.940203 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffd000~1000 from file(ino 1 size 0xffffd000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
-5> 2018-04-24 16:18:33.940427 7f95ee81de00 10 bluefs _replay 0xffffd000: txn(seq 1550850 len 0xa6a crc 0xbcc8bda3)
-4> 2018-04-24 16:18:33.940433 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffe000~1000 from file(ino 1 size 0xffffe000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
-3> 2018-04-24 16:18:33.940657 7f95ee81de00 10 bluefs _replay 0xffffe000: txn(seq 1550851 len 0xa6a crc 0x8cdd70c6)
-2> 2018-04-24 16:18:33.940663 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xfffff000~1000 from file(ino 1 size 0xfffff000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
-1> 2018-04-24 16:18:33.940884 7f95ee81de00 10 bluefs _replay 0xfffff000: txn(seq 1550852 len 0xa6a crc 0x5e1c1b4f)
0> 2018-04-24 16:18:33.943750 7f95ee81de00 -1 /build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_replay(bool)' thread 7f95ee81de00 time 2018-04-24 16:18:33.940909
/build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: 551: FAILED assert((log_reader->buf.pos & ~super.block_mask()) == 0)
In each bluefs _read line we can see the read offset into the log file advancing, and it breaks at 0xfffff000~1000.
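A cap at 0xffffffff is exactly what a 32-bit truncation of a 64-bit offset would produce once the log grows past 4 GiB. A minimal sketch of that failure mode (an assumption about the cause, not the actual BlueFS read path):

#include <cstdint>
#include <cstdio>

// Sketch of a 4 GiB wrap: keeping a 64-bit file offset in a 32-bit
// variable silently truncates once it passes 0xffffffff. Hypothetical
// illustration of the symptom, not the actual BlueFS code.
int main() {
  uint64_t off = 0xfffff000;   // last offset that replays cleanly above
  uint64_t len = 0x1000;       // replay reads one 4 KiB block at a time

  uint64_t next64 = off + len;                        // 0x100000000
  uint32_t next32 = static_cast<uint32_t>(off + len); // wraps to 0x0

  std::printf("64-bit next offset: 0x%llx\n", (unsigned long long)next64);
  std::printf("32-bit next offset: 0x%x (wrapped)\n", next32);

  // A reader repositioned with the wrapped value is no longer where
  // replay expects it to be, so the block-alignment assert can fire.
  return 0;
}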
This was at the moment of a planned shutdown.
- Status changed from New to 7
- Project changed from Ceph to bluestore
- Category deleted (OSD)
- Priority changed from Normal to High
- Backport set to luminous
This change is working; I think it could be merged into master.
- Status changed from 7 to Pending Backport
I have run into the same issue on Luminous.
- Copied to Backport #23881: luminous: Bluestore OSD hit assert((log_reader->buf.pos & ~super.block_mask()) == 0) added
- Status changed from Pending Backport to Resolved