Bug #8733

closed

OSD crashed at void ECBackend::handle_sub_read

Added by Jingjing Zhao almost 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: Firefly
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After one OSD (out of 219 in total) was taken out to trigger recovery, 30 OSDs crashed about 20 minutes later. All of them crashed with the same error:

2014-07-02 07:09:48.846158 7fdb620f9700 10 osd.60 1430 dequeue_op 0x34a5ab40
prio 10 cost 0 latency 0.000216 MOSDECSubOpRead(3.e29s2 1430
ECSubRead(tid=2587872,
to_read={64280e29/default.5470.106_gw02c902.com_636c00157cb41cee8345a98979397e6c/head//3=0,
1056768,a5280e29/default.17216.315_gw01c902.com_95356b91b9f5fd455749a6e55fd1c965/head//3=0,
1056768,e6280e29/default.5722.321_osd187.ceph.com_4b7d05ee419528ad931a84c8dfa88b46/head//3=0,
1056768,77280e29/default.5470.719_gw01c902.com_ed268e9b5a26ffd759a48358d0b67018/head//3=0,
1056768,d8280e29/default.5470.508_gw04c902.com_9308381b2ca7c3c379167dcb7302e165/head//3=0,1056768},
attrs_to_read=)) v1 pg pg[3.e29s2( v 1430'24852 (1385'21848,1430'24852]
local-les=1399 n=24852 ec=315 les/c 1399/1394 1396/1398/315)
[26,44,60,203,19,151,211,90,81,128,198]/[26,44,60,2147483647,19,151,211,90,81,128,198]
r=2 lpr=1398 pi=352-1397/6 luod=0'0 crt=633'13712 active+remapped]
2014-07-02 07:09:48.846196 7fdb620f9700 10 osd.60 pg_epoch: 1430 pg[3.e29s2( v
1430'24852 (1385'21848,1430'24852] local-les=1399 n=24852 ec=315 les/c
1399/1394 1396/1398/315)
[26,44,60,203,19,151,211,90,81,128,198]/[26,44,60,2147483647,19,151,211,90,81,128,198]
r=2 lpr=1398 pi=352-1397/6 luod=0'0 crt=633'13712 active+remapped]
handle_message: MOSDECSubOpRead(3.e29s2 1430 ECSubRead(tid=2587872,
to_read={64280e29/default.5470.106_gw02c902.com_636c00157cb41cee8345a98979397e6c/head//3=0,
1056768,a5280e29/default.17216.315_gw01c902.com_95356b91b9f5fd455749a6e55fd1c965/head//3=0,
1056768,e6280e29/default.5722.321_osd187.ceph.com_4b7d05ee419528ad931a84c8dfa88b46/head//3=0,
1056768,77280e29/default.5470.719_gw01c902.com_ed268e9b5a26ffd759a48358d0b67018/head//3=0,
1056768,d8280e29/default.5470.508_gw04c902.com_9308381b2ca7c3c379167dcb7302e165/head//3=0,1056768},
attrs_to_read=)) v1
2014-07-02 07:09:48.906100 7fdb620f9700 -1 osd/ECBackend.cc: In function 'void
ECBackend::handle_sub_read(pg_shard_t, ECSubRead&, ECSubReadReply*)' thread
7fdb620f9700 time 2014-07-02 07:09:48.895522
osd/ECBackend.cc: 875: FAILED assert(0)
ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (ECBackend::handle_sub_read(pg_shard_t, ECSubRead&, ECSubReadReply*)+0xca6)
[0x94d7e6]
 2: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x452)
[0x95d062]
 3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x250) [0x7eca30]
 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x37c) [0x60e63c]
 5: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>,
ThreadPool::TPHandle&)+0x63d) [0x63e97d]
 6: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG>
>::_void_process(void*, ThreadPool::TPHandle&)+0xae) [0x67649e]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0xa8c301]
 8: (ThreadPool::WorkThread::entry()+0x10) [0xa8f340]
 9: /lib64/libpthread.so.0() [0x3087407851]
 10: (clone()+0x6d) [0x30870e890d]

More information:
1. The pool uses erasure coding (EC).
2. Ceph version: ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
3. Restarting the OSD did not help; it crashed again later.
4. The cluster holds a lot of objects: 320TB used, 477TB/797TB avail.
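
For readers going through the log above, the `[...]/[...]` pair on the PG status line can be pulled apart with a short script. This is only a reading aid, a minimal sketch and not part of Ceph. It assumes that the first list is the up set and the second the acting set, and that 2147483647 (0x7fffffff) is the placeholder Ceph prints for a shard position with no OSD assigned. The helper names `parse_up_acting` and `unfilled_shards` are made up for this illustration.

```python
import re

# Assumption: 2147483647 (0x7fffffff) marks a shard position with no OSD assigned.
NO_OSD = 2147483647


def parse_up_acting(line):
    """Return (up, acting) as lists of OSD ids from a '[...]/[...]' pair, or None."""
    m = re.search(r"\[([\d,]+)\]/\[([\d,]+)\]", line)
    if m is None:
        return None
    up = [int(x) for x in m.group(1).split(",")]
    acting = [int(x) for x in m.group(2).split(",")]
    return up, acting


def unfilled_shards(acting):
    """Return the shard positions whose acting entry is the placeholder."""
    return [i for i, osd in enumerate(acting) if osd == NO_OSD]


# The up/acting pair copied from the PG status line in the log above.
line = ("[26,44,60,203,19,151,211,90,81,128,198]/"
        "[26,44,60,2147483647,19,151,211,90,81,128,198]")

up, acting = parse_up_acting(line)
for shard in unfilled_shards(acting):
    print(f"shard {shard}: up osd.{up[shard]}, no acting OSD")
print(f"osd.60 holds shard {acting.index(60)}")
```

Under these assumptions the script reports shard 3 (up osd.203) as having no acting OSD, and places osd.60 at shard 2, which matches the `s2` suffix and `r=2` on the logged PG.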


Files

part_osd.log (90.7 KB) Zhi Zhang, 07/08/2014 05:33 AM

Related issues 1 (0 open, 1 closed)

Has duplicate: Ceph - Bug #8694: OSD crashed (assertion failure) at FileStore::_collection_move_rename (Duplicate, 06/29/2014)
