Bug #21001

closed

osd crash

Added by Chengguang Xu over 6 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When I restarted the OSDs, one OSD (osd.6) crashed, and restarting it again did not fix the problem.

Detailed OSD log:

-53> 2017-08-15 15:56:58.157920 7f0cf49d97c0  1 - :/0 messenger.start
-52> 2017-08-15 15:56:58.158012 7f0cf49d97c0 2 osd.6 0 mounting /var/lib/ceph/osd/ceph-6 /dev/disk/by-partlabel/osd-journal-6
-51> 2017-08-15 15:56:58.158078 7f0cf49d97c0 0 filestore(/var/lib/ceph/osd/ceph-6) backend xfs (magic 0x58465342)
-50> 2017-08-15 15:56:58.158568 7f0cf49d97c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-6) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
-49> 2017-08-15 15:56:58.158574 7f0cf49d97c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-6) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
-48> 2017-08-15 15:56:58.158591 7f0cf49d97c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-6) detect_features: splice is supported
-47> 2017-08-15 15:56:58.159968 7f0cf49d97c0 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-6) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
-46> 2017-08-15 15:56:58.160058 7f0cf49d97c0 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-6) detect_feature: extsize is disabled by conf
-45> 2017-08-15 15:56:58.161025 7f0cf49d97c0 1 leveldb: Recovering log #135
-44> 2017-08-15 15:56:58.162025 7f0cf49d97c0 1 leveldb: Delete type=0 #135
-43> 2017-08-15 15:56:58.162058 7f0cf49d97c0  1 leveldb: Delete type=3 #134
-42> 2017-08-15 15:56:58.163288 7f0cf49d97c0  0 filestore(/var/lib/ceph/osd/ceph-6) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
-41> 2017-08-15 15:56:58.164499 7f0cf49d97c0 2 journal open /dev/disk/by-partlabel/osd-journal-6 fsid b0507013-7c27-4587-bb6d-c793333f95b1 fs_op_seq 76752
-40> 2017-08-15 15:56:58.164543 7f0cf49d97c0 1 journal _open /dev/disk/by-partlabel/osd-journal-6 fd 21: 20999831552 bytes, block size 4096 bytes, directio = 1, aio = 1
-39> 2017-08-15 15:56:58.164903 7f0cf49d97c0 2 journal No further valid entries found, journal is most likely valid
-38> 2017-08-15 15:56:58.164916 7f0cf49d97c0 2 journal No further valid entries found, journal is most likely valid
-37> 2017-08-15 15:56:58.164919 7f0cf49d97c0 3 journal journal_replay: end of journal, done.
-36> 2017-08-15 15:56:58.164993 7f0cf49d97c0 1 journal _open /dev/disk/by-partlabel/osd-journal-6 fd 21: 20999831552 bytes, block size 4096 bytes, directio = 1, aio = 1
-35> 2017-08-15 15:56:58.166705 7f0cf49d97c0 1 filestore(/var/lib/ceph/osd/ceph-6) upgrade
-34> 2017-08-15 15:56:58.166734 7f0cf49d97c0 2 osd.6 0 boot
-33> 2017-08-15 15:56:58.167651 7f0cf49d97c0 0 <cls> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
-32> 2017-08-15 15:56:58.167801 7f0cf49d97c0 0 <cls> cls/hello/cls_hello.cc:305: loading cls_hello
-31> 2017-08-15 15:56:58.168428 7f0cf49d97c0 1 <cls> cls/log/cls_log.cc:317: Loaded log class!
-30> 2017-08-15 15:56:58.171231 7f0cf49d97c0 1 <cls> cls/refcount/cls_refcount.cc:232: Loaded refcount class!
-29> 2017-08-15 15:56:58.171345 7f0cf49d97c0 1 <cls> cls/replica_log/cls_replica_log.cc:141: Loaded replica log class!
-28> 2017-08-15 15:56:58.173406 7f0cf49d97c0 1 <cls> cls/rgw/cls_rgw.cc:3362: Loaded rgw class!
-27> 2017-08-15 15:56:58.173535 7f0cf49d97c0 1 <cls> cls/statelog/cls_statelog.cc:306: Loaded log class!
-26> 2017-08-15 15:56:58.173640 7f0cf49d97c0 1 <cls> cls/timeindex/cls_timeindex.cc:259: Loaded timeindex class!
-25> 2017-08-15 15:56:58.173743 7f0cf49d97c0 1 <cls> cls/user/cls_user.cc:375: Loaded user class!
-24> 2017-08-15 15:56:58.173840 7f0cf49d97c0 1 <cls> cls/version/cls_version.cc:228: Loaded version class!
-23> 2017-08-15 15:56:58.174107 7f0cf49d97c0 0 osd.6 664 crush map has features 2200130813952, adjusting msgr requires for clients
-22> 2017-08-15 15:56:58.174116 7f0cf49d97c0 0 osd.6 664 crush map has features 2200130813952 was 8705, adjusting msgr requires for mons
-21> 2017-08-15 15:56:58.174122 7f0cf49d97c0 0 osd.6 664 crush map has features 2200130813952, adjusting msgr requires for osds
-20> 2017-08-15 15:56:58.185928 7f0cf49d97c0 0 osd.6 664 load_pgs
-19> 2017-08-15 15:56:58.186721 7f0cf49d97c0 5 osd.6 pg_epoch: 653 pg[3.dc(unlocked)] enter Initial
-18> 2017-08-15 15:56:58.197808 7f0cf49d97c0 5 osd.6 pg_epoch: 653 pg[3.dc( v 390'3950 (390'900,390'3950] local-les=502 n=68 ec=387 les/c/f 502/502/0 501/501/410) [6,19,36] r=0 lpr=0 crt=390'3950 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] exit Initial 0.011087 0 0.000000
-17> 2017-08-15 15:56:58.197838 7f0cf49d97c0 5 osd.6 pg_epoch: 653 pg[3.dc( v 390'3950 (390'900,390'3950] local-les=502 n=68 ec=387 les/c/f 502/502/0 501/501/410) [6,19,36] r=0 lpr=0 crt=390'3950 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] enter Reset
-16> 2017-08-15 15:56:58.198256 7f0cf49d97c0 5 osd.6 pg_epoch: 528 pg[3.44(unlocked)] enter Initial
-15> 2017-08-15 15:56:58.208512 7f0cf49d97c0 5 osd.6 pg_epoch: 528 pg[3.44( v 390'3475 (390'400,390'3475] local-les=528 n=65 ec=387 les/c/f 528/528/0 526/527/480) [29,6,44] r=1 lpr=0 pi=476-526/4 crt=390'3475 lcod 0'0 inactive NOTIFY NIBBLEWISE] exit Initial 0.010255 0 0.000000
-14> 2017-08-15 15:56:58.208528 7f0cf49d97c0 5 osd.6 pg_epoch: 528 pg[3.44( v 390'3475 (390'400,390'3475] local-les=528 n=65 ec=387 les/c/f 528/528/0 526/527/480) [29,6,44] r=1 lpr=0 pi=476-526/4 crt=390'3475 lcod 0'0 inactive NOTIFY NIBBLEWISE] enter Reset
-13> 2017-08-15 15:56:58.208969 7f0cf49d97c0 5 osd.6 pg_epoch: 522 pg[3.86(unlocked)] enter Initial
-12> 2017-08-15 15:56:58.219641 7f0cf49d97c0 5 osd.6 pg_epoch: 522 pg[3.86( v 390'4059 (390'1000,390'4059] local-les=522 n=59 ec=387 les/c/f 522/522/0 520/521/466) [24,6,42] r=1 lpr=0 pi=461-520/5 crt=390'4059 lcod 0'0 inactive NOTIFY NIBBLEWISE] exit Initial 0.010672 0 0.000000
-11> 2017-08-15 15:56:58.219659 7f0cf49d97c0 5 osd.6 pg_epoch: 522 pg[3.86( v 390'4059 (390'1000,390'4059] local-les=522 n=59 ec=387 les/c/f 522/522/0 520/521/466) [24,6,42] r=1 lpr=0 pi=461-520/5 crt=390'4059 lcod 0'0 inactive NOTIFY NIBBLEWISE] enter Reset
-10> 2017-08-15 15:56:58.220056 7f0cf49d97c0 5 osd.6 pg_epoch: 515 pg[0.33(unlocked)] enter Initial
-9> 2017-08-15 15:56:58.220214 7f0cf49d97c0 5 osd.6 pg_epoch: 515 pg[0.33( empty local-les=515 n=0 ec=1 les/c/f 515/515/0 513/514/514) [40,35,6] r=2 lpr=0 pi=66-513/25 crt=0'0 inactive NOTIFY NIBBLEWISE] exit Initial 0.000158 0 0.000000
-8> 2017-08-15 15:56:58.220228 7f0cf49d97c0 5 osd.6 pg_epoch: 515 pg[0.33( empty local-les=515 n=0 ec=1 les/c/f 515/515/0 513/514/514) [40,35,6] r=2 lpr=0 pi=66-513/25 crt=0'0 inactive NOTIFY NIBBLEWISE] enter Reset
-7> 2017-08-15 15:56:58.220364 7f0cf49d97c0 5 osd.6 pg_epoch: 522 pg[0.16(unlocked)] enter Initial
-6> 2017-08-15 15:56:58.220511 7f0cf49d97c0 5 osd.6 pg_epoch: 522 pg[0.16( empty local-les=522 n=0 ec=1 les/c/f 522/522/0 520/521/521) [42,16,6] r=2 lpr=0 pi=66-520/25 crt=0'0 inactive NOTIFY NIBBLEWISE] exit Initial 0.000147 0 0.000000
-5> 2017-08-15 15:56:58.220524 7f0cf49d97c0 5 osd.6 pg_epoch: 522 pg[0.16( empty local-les=522 n=0 ec=1 les/c/f 522/522/0 520/521/521) [42,16,6] r=2 lpr=0 pi=66-520/25 crt=0'0 inactive NOTIFY NIBBLEWISE] enter Reset
-4> 2017-08-15 15:56:58.220856 7f0cf49d97c0 5 osd.6 pg_epoch: 545 pg[0.1d(unlocked)] enter Initial
-3> 2017-08-15 15:56:58.220989 7f0cf49d97c0 5 osd.6 pg_epoch: 545 pg[0.1d( empty local-les=545 n=0 ec=1 les/c/f 545/545/0 544/544/544) [50,6,28] r=1 lpr=0 pi=540-543/1 crt=0'0 inactive NOTIFY NIBBLEWISE] exit Initial 0.000132 0 0.000000
-2> 2017-08-15 15:56:58.221002 7f0cf49d97c0 5 osd.6 pg_epoch: 545 pg[0.1d( empty local-les=545 n=0 ec=1 les/c/f 545/545/0 544/544/544) [50,6,28] r=1 lpr=0 pi=540-543/1 crt=0'0 inactive NOTIFY NIBBLEWISE] enter Reset
-1> 2017-08-15 15:56:58.221134 7f0cf49d97c0 -1 osd.6 664 load_pgs: have pgid 1.23 at epoch 644, but missing map. Crashing.
0> 2017-08-15 15:56:58.222707 7f0cf49d97c0 -1 osd/OSD.cc: In function 'void OSD::load_pgs()' thread 7f0cf49d97c0 time 2017-08-15 15:56:58.221144
osd/OSD.cc: 3215: FAILED assert(0 == "Missing map in load_pgs")
ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f0cf5418aa5]
2: (OSD::load_pgs()+0x1ff6) [0x7f0cf4d90536]
3: (OSD::init()+0x2086) [0x7f0cf4da1386]
4: (main()+0x2c01) [0x7f0cf4d04ee1]
5: (__libc_start_main()+0xf5) [0x7f0cf1b5db15]
6: (()+0x35d289) [0x7f0cf4d4f289]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- end dump of recent events ---
2017-08-15 15:56:58.230600 7f0cf49d97c0 -1 *** Caught signal (Aborted) **
in thread 7f0cf49d97c0 thread_name:ceph-osd
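
For anyone triaging a similar dump, the PG and epoch that trigger the assert can be pulled out of the log with a short script. This is only a minimal sketch: the function name `find_missing_maps` and the regular expression are illustrative assumptions based on the message format seen in the log above. They are not part of Ceph.

```python
import re

# Assumed pattern, derived only from the fatal line in the log above:
#   "osd.6 664 load_pgs: have pgid 1.23 at epoch 644, but missing map. Crashing."
MISSING_MAP_RE = re.compile(
    r"osd\.(?P<osd>\d+) \d+ load_pgs: have pgid (?P<pgid>\S+) "
    r"at epoch (?P<epoch>\d+), but missing map"
)


def find_missing_maps(lines):
    """Return (osd_id, pgid, epoch) for every 'missing map' line found."""
    hits = []
    for line in lines:
        match = MISSING_MAP_RE.search(line)
        if match:
            hits.append(
                (int(match.group("osd")), match.group("pgid"), int(match.group("epoch")))
            )
    return hits


# Two lines copied from the dump: the start of load_pgs and the fatal message.
sample = [
    "-20> 2017-08-15 15:56:58.185928 7f0cf49d97c0 0 osd.6 664 load_pgs",
    "-1> 2017-08-15 15:56:58.221134 7f0cf49d97c0 -1 osd.6 664 load_pgs: "
    "have pgid 1.23 at epoch 644, but missing map. Crashing.",
]

print(find_missing_maps(sample))  # [(6, '1.23', 644)]
```

For this log the script reports PG 1.23 at epoch 644 on osd.6, which is the PG named in the `load_pgs` failure.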
Actions #1

Updated by Sage Weil almost 3 years ago

  • Status changed from New to Closed