Bug #36624

open

Ceph asserts when SPDK is enabled with a 64 KB kernel page size

Added by Anonymous over 5 years ago. Updated over 5 years ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (dev)
Tags:
v14.0.0
Backport:
luminous
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When starting a Ceph cluster with SPDK enabled on a kernel with a 64 KB page size, the following assert is hit in bluestore/NVMEDevice.cc:

Starting SPDK v18.04.1 / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: nvme-device-manager -c 0x1 -m 2048 --file-prefix=spdk_pid20837 ]
EAL: Detected 46 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid20837/mp_socket
EAL: Probing VFIO support...
EAL: VFIO support initialized
Unable to unlink shared memory file: /var/run/.spdk_pid20837_hugepage_info. Error code: 2
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:953 spdk_nvme
EAL:   using IOMMU type 1 (Type 1)
/home/ubuntu/ceph/src/os/bluestore/NVMEDevice.cc: In function 'virtual int NVMEDevice::write(uint64_t, ceph::bufferlist&, bool)' thread ffff81c7adf0 time 2018-10-20 09:22:26.229229
/home/ubuntu/ceph/src/os/bluestore/NVMEDevice.cc: 844: FAILED ceph_assert(off % block_size == 0)
 ceph version 14.0.0-4420-g98fc7ebc99 (98fc7ebc99a3639240eef4f745c9bd633446d2b3) nautilus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0xaaaab5c4a8ac]
 2: (()+0x2a0aab0) [0xaaaab5c4aab0]
 3: (NVMEDevice::write(unsigned long, ceph::buffer::list&, bool)+0x1b8) [0xaaaab5bbc6b4]
 4: (BlueFS::_write_super()+0x39c) [0xaaaab5b61768]
 5: (BlueFS::mkfs(uuid_d)+0x590) [0xaaaab5b5fcd8]
 6: (BlueStore::_open_db(bool, bool)+0x1c24) [0xaaaab59db724]
 7: (BlueStore::mkfs()+0x116c) [0xaaaab59e2c2c]
 8: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0x94) [0xaaaab520d514]
 9: (main()+0x1650) [0xaaaab51dad34]
 10: (__libc_start_main()+0xe0) [0xffff812f06e0]
 11: (()+0x1f984e4) [0xaaaab51d84e4]
*** Caught signal (Aborted) **
 in thread ffff81c7adf0 thread_name:ceph-osd
2018-10-30 09:22:26.242 ffff81c7adf0 -1 /home/ubuntu/ceph/src/os/bluestore/NVMEDevice.cc: In function 'virtual int NVMEDevice::write(uint64_t, ceph::bufferlist&, bool)' thread ffff81c7adf0 time 2018-10-30 09:22:26.229229
/home/ubuntu/ceph/src/os/bluestore/NVMEDevice.cc: 844: FAILED ceph_assert(off % block_size == 0)

 ceph version 14.0.0-4420-g98fc7ebc99 (98fc7ebc99a3639240eef4f745c9bd633446d2b3) nautilus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0xaaaab5c4a8ac]
 2: (()+0x2a0aab0) [0xaaaab5c4aab0]
 3: (NVMEDevice::write(unsigned long, ceph::buffer::list&, bool)+0x1b8) [0xaaaab5bbc6b4]
 4: (BlueFS::_write_super()+0x39c) [0xaaaab5b61768]
 5: (BlueFS::mkfs(uuid_d)+0x590) [0xaaaab5b5fcd8]
 6: (BlueStore::_open_db(bool, bool)+0x1c24) [0xaaaab59db724]
 7: (BlueStore::mkfs()+0x116c) [0xaaaab59e2c2c]
 8: (OSD::mkfs(CephContext*, ObjectStore*, uuid_d, int)+0x94) [0xaaaab520d514]
 9: (main()+0x1650) [0xaaaab51dad34]
 10: (__libc_start_main()+0xe0) [0xffff812f06e0]
 11: (()+0x1f984e4) [0xaaaab51d84e4]

 ceph version 14.0.0-4420-g98fc7ebc99 (98fc7ebc99a3639240eef4f745c9bd633446d2b3) nautilus (dev)
 1: (()+0x2990938) [0xaaaab5bd0938]
 2: (__kernel_rt_sigreturn()+0) [0xffff81df066c]
 3: (raise()+0xb0) [0xffff813024d8]
2018-10-30 09:22:26.242 ffff81c7adf0 -1 *** Caught signal (Aborted) **
 in thread ffff81c7adf0 thread_name:ceph-osd

 ceph version 14.0.0-4420-g98fc7ebc99 (98fc7ebc99a3639240eef4f745c9bd633446d2b3) nautilus (dev)
 1: (()+0x2990938) [0xaaaab5bd0938]
 2: (__kernel_rt_sigreturn()+0) [0xffff81df066c]
 3: (raise()+0xb0) [0xffff813024d8]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The same assert is observed after switching to the latest master.
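
For reference, the check that fails in the trace is `off % block_size == 0` in `NVMEDevice::write`. The sketch below is not Ceph code: the helper names and the numeric values are hypothetical, and the report does not state the actual offset or block size involved. It only illustrates, assuming the device block size ends up following the 64 KB page size, how an offset that is aligned to 4 KB stops satisfying the condition.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Ceph): true when `off` is a multiple of
// `block_size`. This mirrors the condition in the failed ceph_assert.
constexpr bool is_block_aligned(uint64_t off, uint64_t block_size) {
  return block_size != 0 && off % block_size == 0;
}

// Hypothetical helper: round `off` down to the start of its block.
// Returns `off` unchanged when `block_size` is zero.
constexpr uint64_t align_down(uint64_t off, uint64_t block_size) {
  return block_size == 0 ? off : off - off % block_size;
}

// Illustrative values only; they are assumptions, not taken from the log.
constexpr uint64_t k4K = 4096;
constexpr uint64_t k64K = 65536;

// An offset of 4 KB is aligned for a 4 KB block size ...
static_assert(is_block_aligned(k4K, k4K), "4K offset, 4K blocks: aligned");
// ... but not for a 64 KB block size, so the assert condition would be false.
static_assert(!is_block_aligned(k4K, k64K), "4K offset, 64K blocks: unaligned");
// Rounding down shows which block such an offset would fall into.
static_assert(align_down(k4K, k64K) == 0, "4K offset lies in the first 64K block");
```

Printing `off` and `block_size` right before the assert would confirm whether this is what happens in the reported run.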

Actions #1

Updated by Anonymous over 5 years ago

  • Assignee set to Anonymous

I'm working on the issue.

Actions #3

Updated by Kefu Chai over 5 years ago

  • Description updated (diff)

Actions #4

Updated by Kefu Chai over 5 years ago

  • Status changed from New to Fix Under Review