Bug #10670
[closed] osd segfault
Status: Rejected
Priority: Urgent
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Hi!
I have a cluster with 5 nodes. About 12 hours after updating glibc (CVE-2015-0235), 2 nodes died. After a reset, ceph-osd segfaults on startup. If I start this OSD manually with debug output enabled, I get the following messages:
root@virt-master:~# ceph-osd -f -d --debug_ms=10 -c /etc/ceph/ceph.conf --name=osd.2
2015-01-28 14:35:46.995375 7f92c1adc840 0 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 16275
starting osd.2 at :/0 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
2015-01-28 14:35:46.996114 7f92c1adc840 10 -- :/0 rank.bind 10.100.23.2:0/0
2015-01-28 14:35:46.996125 7f92c1adc840 10 accepter.accepter.bind
2015-01-28 14:35:46.996139 7f92c1adc840 10 accepter.accepter.bind bound on random port 10.100.23.2:6801/0
2015-01-28 14:35:46.996143 7f92c1adc840 10 accepter.accepter.bind bound to 10.100.23.2:6801/0
2015-01-28 14:35:46.996178 7f92c1adc840 1 -- 10.100.23.2:0/0 learned my addr 10.100.23.2:0/0
2015-01-28 14:35:46.996183 7f92c1adc840 1 accepter.accepter.bind my_inst.addr is 10.100.23.2:6801/16275 need_addr=0
2015-01-28 14:35:46.996187 7f92c1adc840 10 -- :/0 rank.bind :/0
2015-01-28 14:35:46.996188 7f92c1adc840 10 accepter.accepter.bind
2015-01-28 14:35:46.996193 7f92c1adc840 10 accepter.accepter.bind bound on random port 0.0.0.0:6802/0
2015-01-28 14:35:46.996195 7f92c1adc840 10 accepter.accepter.bind bound to 0.0.0.0:6802/0
2015-01-28 14:35:46.996198 7f92c1adc840 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6802/16275 need_addr=1
2015-01-28 14:35:46.996200 7f92c1adc840 10 -- :/0 rank.bind :/0
2015-01-28 14:35:46.996201 7f92c1adc840 10 accepter.accepter.bind
2015-01-28 14:35:46.996205 7f92c1adc840 10 accepter.accepter.bind bound on random port 0.0.0.0:6803/0
2015-01-28 14:35:46.996207 7f92c1adc840 10 accepter.accepter.bind bound to 0.0.0.0:6803/0
2015-01-28 14:35:46.996210 7f92c1adc840 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6803/16275 need_addr=1
2015-01-28 14:35:46.996212 7f92c1adc840 10 -- :/0 rank.bind 10.100.23.2:0/0
2015-01-28 14:35:46.996213 7f92c1adc840 10 accepter.accepter.bind
2015-01-28 14:35:46.996218 7f92c1adc840 10 accepter.accepter.bind bound on random port 10.100.23.2:6804/0
2015-01-28 14:35:46.996220 7f92c1adc840 10 accepter.accepter.bind bound to 10.100.23.2:6804/0
2015-01-28 14:35:46.996229 7f92c1adc840 1 -- 10.100.23.2:0/0 learned my addr 10.100.23.2:0/0
2015-01-28 14:35:46.996232 7f92c1adc840 1 accepter.accepter.bind my_inst.addr is 10.100.23.2:6804/16275 need_addr=0
2015-01-28 14:35:46.996233 7f92c1adc840 10 -- :/0 rank.bind 10.100.23.2:0/0
2015-01-28 14:35:46.996235 7f92c1adc840 10 accepter.accepter.bind
2015-01-28 14:35:46.996240 7f92c1adc840 10 accepter.accepter.bind bound on random port 10.100.23.2:6805/0
2015-01-28 14:35:46.996242 7f92c1adc840 10 accepter.accepter.bind bound to 10.100.23.2:6805/0
2015-01-28 14:35:46.996250 7f92c1adc840 1 -- 10.100.23.2:0/0 learned my addr 10.100.23.2:0/0
2015-01-28 14:35:46.996253 7f92c1adc840 1 accepter.accepter.bind my_inst.addr is 10.100.23.2:6805/16275 need_addr=0
2015-01-28 14:35:46.998332 7f92c1adc840 0 filestore(/var/lib/ceph/osd/ceph-2) backend xfs (magic 0x58465342)
2015-01-28 14:35:46.998342 7f92c1adc840 1 filestore(/var/lib/ceph/osd/ceph-2) disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-01-28 14:35:47.107992 7f92c1adc840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: FIEMAP ioctl is supported and appears to work
2015-01-28 14:35:47.108003 7f92c1adc840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-01-28 14:35:47.116258 7f92c1adc840 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: syscall(SYS_syncfs, fd) fully supported
2015-01-28 14:35:47.126209 7f92c1adc840 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_feature: extsize is disabled by conf
2015-01-28 14:35:47.296360 7f92c1adc840 0 filestore(/var/lib/ceph/osd/ceph-2) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-01-28 14:35:48.105737 7f92c1adc840 1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 19: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
*** Caught signal (Segmentation fault) **
 in thread 7f92c1adc840
Segmentation Fault
How can this be fixed?
Thanks!