Bug #23840

closed

Bluestore OSD hit assert((log_reader->buf.pos & ~super.block_mask()) == 0)

Added by Rafal Wadolowski almost 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We're using Ceph 12.2.4, with the DB and WAL on separate NVMe devices.

After a restart, some OSDs fail with the following error:

-3> 2018-04-24 16:18:33.940657 7f95ee81de00 10 bluefs _replay 0xffffe000: txn(seq 1550851 len 0xa6a crc 0x8cdd70c6)
    -2> 2018-04-24 16:18:33.940663 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xfffff000~1000 from file(ino 1 size 0xfffff000 mtime 0.000000 bdev 0 extents 
    -1> 2018-04-24 16:18:33.940884 7f95ee81de00 10 bluefs _replay 0xfffff000: txn(seq 1550852 len 0xa6a crc 0x5e1c1b4f)
     0> 2018-04-24 16:18:33.943750 7f95ee81de00 -1 /build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_replay(bool)' thread 7f95ee81de00 time 2018-04-24 16:18:33.940909
/build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: 551: FAILED assert((log_reader->buf.pos & ~super.block_mask()) == 0)
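
For reference, the failed assertion checks that the log reader's buffer position is aligned to the BlueFS block size. Below is a minimal Python sketch of that check. It assumes a block size of 0x1000 (consistent with the `~1000` read length in the log) and assumes that `block_mask()` is the bitwise complement of `block_size - 1`. The names `BLOCK_SIZE`, `block_mask` and `is_block_aligned` are illustrative only and are not taken from the Ceph source.

```python
BLOCK_SIZE = 0x1000   # assumption: 4 KiB block, matching the 0x...~1000 reads in the log
U64 = (1 << 64) - 1   # used to emulate 64-bit unsigned arithmetic in Python


def block_mask(block_size: int = BLOCK_SIZE) -> int:
    """Assumed definition: every bit set except the in-block offset bits."""
    return ~(block_size - 1) & U64


def is_block_aligned(pos: int, block_size: int = BLOCK_SIZE) -> bool:
    """Same shape as the asserted expression: (pos & ~block_mask) == 0."""
    return (pos & ~block_mask(block_size) & U64) == 0


# An offset on a block boundary passes the check.
print(is_block_aligned(0xfffff000))
# An offset in the middle of a block (boundary + txn len 0xa6a) fails it.
print(is_block_aligned(0xfffff000 + 0xa6a))
```

Under these assumptions the assert fires whenever the reader position is not a multiple of the block size at the point where replay expects to start a new block.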

As a result, some PGs are currently inactive.

The full log is attached.


Files

osd.log (19.1 KB) osd.log Rafal Wadolowski, 04/24/2018 04:33 PM
shutdown.log (8.15 KB) shutdown.log Rafal Wadolowski, 04/24/2018 05:16 PM

Related issues 1 (0 open, 1 closed)

Copied to bluestore - Backport #23881: luminous: Bluestore OSD hit assert((log_reader->buf.pos & ~super.block_mask()) == 0) (Resolved, Prashant D)
Actions #1

Updated by Rafal Wadolowski almost 6 years ago

Interestingly, it looks like the maximum read offset within the file is limited to 0xffffffff.

   -11> 2018-04-24 16:18:33.939739 7f95ee81de00 10 bluefs _replay 0xffffa000: txn(seq 1550847 len 0xa6a crc 0x95281bbd)
   -10> 2018-04-24 16:18:33.939745 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffb000~1000 from file(ino 1 size 0xffffb000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
    -9> 2018-04-24 16:18:33.939970 7f95ee81de00 10 bluefs _replay 0xffffb000: txn(seq 1550848 len 0xa6a crc 0x86579e18)
    -8> 2018-04-24 16:18:33.939976 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffc000~1000 from file(ino 1 size 0xffffc000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
    -7> 2018-04-24 16:18:33.940196 7f95ee81de00 10 bluefs _replay 0xffffc000: txn(seq 1550849 len 0xa6a crc 0x61eaa8e2)
    -6> 2018-04-24 16:18:33.940203 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffd000~1000 from file(ino 1 size 0xffffd000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
    -5> 2018-04-24 16:18:33.940427 7f95ee81de00 10 bluefs _replay 0xffffd000: txn(seq 1550850 len 0xa6a crc 0xbcc8bda3)
    -4> 2018-04-24 16:18:33.940433 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xffffe000~1000 from file(ino 1 size 0xffffe000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
    -3> 2018-04-24 16:18:33.940657 7f95ee81de00 10 bluefs _replay 0xffffe000: txn(seq 1550851 len 0xa6a crc 0x8cdd70c6)
    -2> 2018-04-24 16:18:33.940663 7f95ee81de00 10 bluefs _read h 0x55a7876b6f00 0xfffff000~1000 from file(ino 1 size 0xfffff000 mtime 0.000000 bdev 0 extents [1:0x15300000+100000,0:0x3d800000+c00000,
    -1> 2018-04-24 16:18:33.940884 7f95ee81de00 10 bluefs _replay 0xfffff000: txn(seq 1550852 len 0xa6a crc 0x5e1c1b4f)
     0> 2018-04-24 16:18:33.943750 7f95ee81de00 -1 /build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_replay(bool)' thread 7f95ee81de00 time 2018-04-24 16:18:33.940909
/build/ceph-12.2.4/src/os/bluestore/BlueFS.cc: 551: FAILED assert((log_reader->buf.pos & ~super.block_mask()) == 0)

Each bluefs _read line shows the read offset into the log file advancing by 0x1000, and replay breaks at 0xfffff000~1000. The next offset would be 0x100000000, which no longer fits in 32 bits.
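
The offset arithmetic behind this observation can be checked with a short Python sketch. The step and the last offset are taken from the log above; `truncate_u32` is a hypothetical helper that shows what a 32-bit unsigned field would hold. Whether such a truncation actually happens inside BlueFS is an assumption here, not something this report establishes.

```python
STEP = 0x1000             # read length seen in the log (0x...~1000)
LAST_OFFSET = 0xfffff000  # offset of the last _read before the assert
U32_MAX = 0xffffffff


def next_offset(offset: int, step: int = STEP) -> int:
    """Offset of the read that would follow the one at `offset`."""
    return offset + step


def truncate_u32(value: int) -> int:
    """Hypothetical: the value a 32-bit unsigned variable would keep."""
    return value & U32_MAX


nxt = next_offset(LAST_OFFSET)
print(hex(nxt))                # the next log offset
print(nxt > U32_MAX)           # whether it exceeds the 32-bit range
print(hex(truncate_u32(nxt)))  # what remains after 32-bit truncation
```

If any position or length along the read path were held in 32 bits, the log would stop being readable exactly at the 4 GiB boundary, which matches where the replay stops in the log above.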

Actions #2

Updated by Rafal Wadolowski almost 6 years ago

Attached shutdown.log, which covers the moment of the planned shutdown.

Actions #3

Updated by Sage Weil almost 6 years ago

  • Status changed from New to 7
Actions #4

Updated by Sage Weil almost 6 years ago

  • Project changed from Ceph to bluestore
  • Category deleted (OSD)
  • Priority changed from Normal to High
  • Backport set to luminous
Actions #5

Updated by Rafal Wadolowski almost 6 years ago

This change works; I think it could be merged into master.

Actions #6

Updated by Kefu Chai almost 6 years ago

  • Status changed from 7 to Pending Backport
Actions #7

Updated by Enming Zhang almost 6 years ago

I have hit the same issue in Luminous.

Actions #8

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #23881: luminous: Bluestore OSD hit assert((log_reader->buf.pos & ~super.block_mask()) == 0) added
Actions #9

Updated by Nathan Cutler almost 6 years ago

  • Status changed from Pending Backport to Resolved