Bug #25098 (closed)

Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)`

Added by benoit hudzia almost 6 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Normal
Target version:
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This occurs intermittently and is hard to catch:

1. we zap the device
2. we run `ceph-volume lvm` prepare/activate with a cache and a DB on an SSD (a separate disk)
3. the OSD activates successfully; we then try to start it
4. we keep getting the stack trace below

Note: we have another OSD being set up in parallel on the same node, and it starts without a hitch.

Could this be related to OSDs being activated in parallel?
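
For reference, the whole sequence boils down to roughly the following commands (a condensed sketch reconstructed from the log below; the device path, OSD ID, and UUID are the ones from this particular run):

ceph-volume lvm zap /dev/sdc --destroy
ceph-volume lvm prepare --bluestore --data /dev/sdc \
    --block.db /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-db \
    --block.wal /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-wal
ceph-volume lvm activate 1 ab3eea63-4e3c-40a8-80ae-426a69fc90e0
systemctl start ceph-osd@1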

Logs:

Zapping device /dev/sdc , All DATA WILL BE LOST
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 6.0558 s, 173 MB/s
Creating new GPT entries.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
The operation has completed successfully.
--> Zapping: /dev/sdc
Running command: /usr/sbin/cryptsetup status /dev/mapper/
 stdout: /dev/mapper/ is inactive.
--> Skipping --destroy because no associated physical volumes are found for /dev/sdc
Running command: wipefs --all /dev/sdc
 stdout: /dev/sdc: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdc: 8 bytes were erased at offset 0x1d1c1115e00 (gpt): 45 46 49 20 50 41 52 54
/dev/sdc: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdc: calling ioclt to re-read partition table: Success
Running command: dd if=/dev/zero of=/dev/sdc bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied
 stderr: , 0.00540907 s, 1.9 GB/s
--> Zapping successful for: /dev/sdc
Preparing cached /dev/sdc for bluestore with 
Zapping cache
-->  KeyError: 'ceph.cluster_name'
--> Zapping: /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-wal
--> Zapping: /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-db
-->  KeyError: 'ceph.cluster_name'
Creating volume with cache
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ab3eea63-4e3c-40a8-80ae-426a69fc90e0
Running command: vgcreate --force --yes ceph-7ecdff2f-917e-480a-9af0-32efb26c5604 /dev/sdc
 stdout: Physical volume "/dev/sdc" successfully created.
 stdout: Volume group "ceph-7ecdff2f-917e-480a-9af0-32efb26c5604" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-block-ab3eea63-4e3c-40a8-80ae-426a69fc90e0 ceph-7ecdff2f-917e-480a-9af0-32efb26c5604
 stdout: Logical volume "osd-block-ab3eea63-4e3c-40a8-80ae-426a69fc90e0" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
Running command: chown -h ceph:ceph /dev/ceph-7ecdff2f-917e-480a-9af0-32efb26c5604/osd-block-ab3eea63-4e3c-40a8-80ae-426a69fc90e0
Running command: chown -R ceph:ceph /dev/mapper/ceph--7ecdff2f--917e--480a--9af0--32efb26c5604-osd--block--ab3eea63--4e3c--40a8--80ae--426a69fc90e0
Running command: ln -s /dev/ceph-7ecdff2f-917e-480a-9af0-32efb26c5604/osd-block-ab3eea63-4e3c-40a8-80ae-426a69fc90e0 /var/lib/ceph/osd/ceph-1/block
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-1/activate.monmap
 stderr: got monmap epoch 3
Running command: ceph-authtool /var/lib/ceph/osd/ceph-1/keyring --create-keyring --name osd.1 --add-key AQBGj1dbWqmuEhAAjGQ2C7SnDP0J4NLZFCG49g==
 stdout: creating /var/lib/ceph/osd/ceph-1/keyring
 stdout: added entity osd.1 auth auth(auid = 18446744073709551615 key=AQBGj1dbWqmuEhAAjGQ2C7SnDP0J4NLZFCG49g== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/keyring
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/
Running command: chown -h ceph:ceph /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-wal
Running command: chown -R ceph:ceph /dev/mapper/inaugurator-ab3eea63--4e3c--40a8--80ae--426a69fc90e0--wal
Running command: chown -h ceph:ceph /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-db
Running command: chown -R ceph:ceph /dev/mapper/inaugurator-ab3eea63--4e3c--40a8--80ae--426a69fc90e0--db
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 1 --monmap /var/lib/ceph/osd/ceph-1/activate.monmap --keyfile - --bluestore-block-wal-path /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-wal --bluestore-block-db-path /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-db --osd-data /var/lib/ceph/osd/ceph-1/ --osd-uuid ab3eea63-4e3c-40a8-80ae-426a69fc90e0 --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: /dev/sdc
Starting OSD service 
Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-7ecdff2f-917e-480a-9af0-32efb26c5604/osd-block-ab3eea63-4e3c-40a8-80ae-426a69fc90e0 --path /var/lib/ceph/osd/ceph-1
Running command: ln -snf /dev/ceph-7ecdff2f-917e-480a-9af0-32efb26c5604/osd-block-ab3eea63-4e3c-40a8-80ae-426a69fc90e0 /var/lib/ceph/osd/ceph-1/block
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
Running command: chown -R ceph:ceph /dev/mapper/ceph--7ecdff2f--917e--480a--9af0--32efb26c5604-osd--block--ab3eea63--4e3c--40a8--80ae--426a69fc90e0
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
Running command: ln -snf /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-wal /var/lib/ceph/osd/ceph-1/block.wal
Running command: chown -h ceph:ceph /dev/inaugurator/ab3eea63-4e3c-40a8-80ae-426a69fc90e0-wal
Running command: chown -R ceph:ceph /dev/mapper/inaugurator-ab3eea63--4e3c--40a8--80ae--426a69fc90e0--wal
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block.wal
Running command: chown -R ceph:ceph /dev/mapper/inaugurator-ab3eea63--4e3c--40a8--80ae--426a69fc90e0--wal
Running command: systemctl enable ceph-volume@lvm-1-ab3eea63-4e3c-40a8-80ae-426a69fc90e0
 stderr: Created symlink /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-1-ab3eea63-4e3c-40a8-80ae-426a69fc90e0.service, pointing to /usr/lib/systemd/system/ceph-volume@.service.
Running command: systemctl start ceph-osd@1
 stderr: Running in chroot, ignoring request.
--> ceph-volume lvm activate successful for osd ID: 1
osd.1 belongs to no class, 
set osd(s) 1 to class '79631a28-8d7f-4ca4-a3f6-b02e0f8de8ec'
OSD ID 1 for device : /dev/sdc , UUID: ab3eea63-4e3c-40a8-80ae-426a69fc90e0
starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.6/rpm/el7/BUILD/ceph-12.2.6/src/os/bluestore/bluefs_types.h: In function 'static void bluefs_fnode_t::_denc_finish(ceph::buffer::ptr::iterator&, __u8*, __u8*, char**, uint32_t*)' thread 7f637a5fed80 time 2018-07-24 20:43:05.884337
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.6/rpm/el7/BUILD/ceph-12.2.6/src/os/bluestore/bluefs_types.h: 54: FAILED assert(pos <= end)
 ceph version 12.2.6 (488df8a1076c4f5fc5b8d18a90463262c438740f) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5570da469bf0]
 2: (bluefs_super_t::decode(ceph::buffer::list::iterator&)+0x776) [0x5570da3ffd66]
 3: (BlueFS::_open_super()+0xfe) [0x5570da3df17e]
 4: (BlueFS::mount()+0xe3) [0x5570da3f7283]
 5: (BlueStore::_open_db(bool)+0x1847) [0x5570da311147]
 6: (BlueStore::_mount(bool)+0x40e) [0x5570da34240e]
 7: (OSD::init()+0x3bd) [0x5570d9ef732d]
 8: (main()+0x2d07) [0x5570d9dfc927]
 9: (__libc_start_main()+0xf5) [0x7f6376abe445]
 10: (()+0x4b8fe3) [0x5570d9e9afe3]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
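
For context: bluefs_super_t is decoded with Ceph's denc framework, in which each struct carries a version/compat header and its encoded length, and _denc_finish asserts that decoding did not consume more bytes than that length advertised. The assert firing while reading the BlueFS superblock therefore means the superblock bytes on the device decode to garbage, i.e. they are stale or corrupt. A minimal standalone C++ sketch of that style of check (illustrative only, not the actual Ceph denc code; the record layout is invented):

#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative only -- a length-prefixed decode with the same style of
// bounds check that fails in bluefs_types.h; the record layout is made up.
struct Cursor {
    const char* pos;      // current read position
    const char* buf_end;  // end of the raw buffer
};

static uint32_t get_u32(Cursor& c) {
    assert(c.pos + 4 <= c.buf_end);  // never read past the raw buffer
    uint32_t v;
    std::memcpy(&v, c.pos, 4);       // little-endian host assumed
    c.pos += 4;
    return v;
}

static void decode_record(Cursor& c) {
    uint32_t struct_len = get_u32(c);      // length the encoder claims
    const char* end = c.pos + struct_len;  // where the struct should stop

    get_u32(c);  // decode the struct's fields, advancing c.pos

    // The equivalent of bluefs_types.h's "FAILED assert(pos <= end)":
    // stale or corrupt bytes make the decoded fields overrun the length
    // that was advertised up front.
    assert(c.pos <= end);
    c.pos = end;  // skip trailing bytes from newer encoders
}

int main() {
    // The length prefix claims 2 bytes, but the decoder consumes a
    // 4-byte field, so the assert above fires -- the same failure mode
    // as decoding a garbage BlueFS superblock.
    const char buf[] = {2, 0, 0, 0, 1, 0, 0, 0};
    Cursor c{buf, buf + sizeof(buf)};
    decode_record(c);
    return 0;
}

If the earlier zap of the -db/-wal volumes did not actually wipe them (note the KeyError lines during "Zapping cache" above), old BlueFS metadata could survive on the SSD and produce exactly this decode failure.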

Related issues: 1 (0 open, 1 closed)

Related to bluestore - Bug #18389: crash when opening bluefs superblock (Can't reproduce, 01/02/2017)
