Bug #36372 (closed)
OSD: Segmentation fault thread_name:tp_osd_tp--10.2.10
Status: Duplicate
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I have a radosgw cluster that uses SSDs for the index pool.
This is the second time that three SSD OSDs have gone down with the error: Caught signal (Segmentation fault) ** in thread 7f40a31ed700 thread_name:tp_osd_tp
Based on the log, these OSDs appear to have been running GC and a leveldb compaction when they crashed.
The omap on every OSD is below 1 GB.
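One way to check this reading of the log is to look at what the crashing thread itself logged just before the signal. The following is a minimal sketch, assuming the line layout seen in the excerpt below (timestamp, thread ID, message); the function name `last_messages_before_crash` and the regular expressions are illustrative and not part of any Ceph tool. The sample lines are copied from this report.

```python
import re

# Assumed layout: "<date> <time> <thread id> <message>", as in the excerpt.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) ([0-9a-f]+) (.*)$"
)
CRASH_RE = re.compile(r"\*\*\* Caught signal \((.+?)\) \*\*")


def last_messages_before_crash(lines, keep=3):
    """Return (thread_id, signal, last `keep` earlier messages of that thread).

    Returns None when no crash line is found.
    """
    history = {}  # thread ID -> messages seen so far, in order
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip backtrace frames and other unstructured lines
        _, thread, msg = m.groups()
        crash = CRASH_RE.search(msg)
        if crash:
            return thread, crash.group(1), history.get(thread, [])[-keep:]
        history.setdefault(thread, []).append(msg)
    return None


sample = [
    "2018-10-09 15:24:11.802039 7f40a31ed700 0 <cls> cls/rgw/cls_rgw.cc:3223: gc_iterate_entries end_key=1_01539069851.802036688",
    "2018-10-09 15:24:11.939506 7f40b2415700 1 leveldb: Compacting 1@0 + 8@1 files",
    "2018-10-09 15:24:12.248750 7f40b2415700 1 leveldb: compacted to: files[ 0 8 63 285 0 0 0 ]",
    "2018-10-09 15:24:12.568564 7f40a31ed700 -1 *** Caught signal (Segmentation fault) **",
]

thread, signal, msgs = last_messages_before_crash(sample)
print(thread, signal)
for msg in msgs:
    print("  ", msg)
```

On these lines, the only earlier message from the crashing thread 7f40a31ed700 is the `gc_iterate_entries` one, while the leveldb compaction messages come from a different thread (7f40b2415700).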
2018-10-09 15:24:10.928644 7f40a49f0700 0 <cls> cls/rgw/cls_rgw.cc:962: rgw_bucket_complete_op(): entry.name=_multipart_qyygvlnzsysbivasdazt5q/backup/_realm_data.tar.gz.0.2~xhamSQArn7VklJbokArSW8Juz-Y4Ne9.6 entry.instance= entry.meta.category=1
2018-10-09 15:24:11.802039 7f40a31ed700 0 <cls> cls/rgw/cls_rgw.cc:3223: gc_iterate_entries end_key=1_01539069851.802036688
2018-10-09 15:24:11.939506 7f40b2415700 1 leveldb: Compacting 1@0 + 8@1 files
2018-10-09 15:24:12.000742 7f40b2415700 1 leveldb: Generated table #293566: 66090 keys, 1570112 bytes
2018-10-09 15:24:12.095546 7f40b2415700 1 leveldb: Generated table #293567: 110772 keys, 2138845 bytes
2018-10-09 15:24:12.119825 7f40b2415700 1 leveldb: Generated table #293568: 19291 keys, 773711 bytes
2018-10-09 15:24:12.183109 7f40b2415700 1 leveldb: Generated table #293569: 10398 keys, 2123221 bytes
2018-10-09 15:24:12.185634 7f40b2415700 1 leveldb: Generated table #293570: 3080 keys, 54454 bytes
2018-10-09 15:24:12.188960 7f40b2415700 1 leveldb: Generated table #293571: 4822 keys, 85400 bytes
2018-10-09 15:24:12.244189 7f40b2415700 1 leveldb: Generated table #293572: 26542 keys, 2133394 bytes
2018-10-09 15:24:12.248450 7f40b2415700 1 leveldb: Generated table #293573: 4820 keys, 72908 bytes
2018-10-09 15:24:12.248462 7f40b2415700 1 leveldb: Compacted 1@0 + 8@1 files => 8952045 bytes
2018-10-09 15:24:12.248750 7f40b2415700 1 leveldb: compacted to: files[ 0 8 63 285 0 0 0 ]
2018-10-09 15:24:12.568564 7f40a31ed700 -1 *** Caught signal (Segmentation fault) **
in thread 7f40a31ed700 thread_name:tp_osd_tp
ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
1: (()+0x961ee7) [0x5579b753eee7]
2: (()+0xf890) [0x7f40c88e7890]
3: (std::string::assign(std::string const&)+0x14) [0x7f40c727b2e4]
4: (()+0xacb8c) [0x7f40b3ebbb8c]
5: (ClassHandler::ClassMethod::exec(void*, ceph::buffer::list&, ceph::buffer::list&)+0x34) [0x5579b6fdd414]
6: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x1df8) [0x5579b70dc808]
7: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x61) [0x5579b70ec851]
8: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x966) [0x5579b70f4996]
9: (ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)+0x323a) [0x5579b70f927a]
10: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x727) [0x5579b70b13a7]
11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x420) [0x5579b6f57b10]
12: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>&)+0x6a) [0x5579b6f57d6a]
13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x787) [0x5579b6f72a57]
14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8b6) [0x5579b76349a6]
15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5579b7636960]
16: (()+0x8064) [0x7f40c88e0064]
17: (clone()+0x6d) [0x7f40c69e162d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
-10000> 2018-10-09 15:23:28.334705 7f4098fad700 1 -- 10.204.22.37:6859/142746 <== osd.155 10.204.22.23:0/29542 5316124 ==== osd_ping(ping e16677 stamp 2018-10-09 15:23:28.330696) v3 ==== 2004+0+0 (3938300591 0 0) 0x5579d6a94600 con 0x5579d036d080
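The leveldb lines in the log above can be cross-checked: the sizes of the generated tables should add up to the total on the "Compacted" line. The following is a small sketch, assuming the message formats shown in this report; `summarize_compaction` is a hypothetical helper, not a Ceph or leveldb utility, and the sample messages are copied from the log with the timestamp and thread ID left out.

```python
import re

GENERATED_RE = re.compile(
    r"leveldb: Generated table #(\d+): (\d+) keys, (\d+) bytes"
)
COMPACTED_RE = re.compile(r"leveldb: Compacted .* => (\d+) bytes")


def summarize_compaction(lines):
    """Sum the generated tables and pick up the total leveldb reported."""
    summary = {"tables": 0, "keys": 0, "bytes": 0, "reported_bytes": None}
    for line in lines:
        gen = GENERATED_RE.search(line)
        if gen:
            summary["tables"] += 1
            summary["keys"] += int(gen.group(2))
            summary["bytes"] += int(gen.group(3))
            continue
        done = COMPACTED_RE.search(line)
        if done:
            summary["reported_bytes"] = int(done.group(1))
    return summary


sample = [
    "leveldb: Compacting 1@0 + 8@1 files",
    "leveldb: Generated table #293566: 66090 keys, 1570112 bytes",
    "leveldb: Generated table #293567: 110772 keys, 2138845 bytes",
    "leveldb: Generated table #293568: 19291 keys, 773711 bytes",
    "leveldb: Generated table #293569: 10398 keys, 2123221 bytes",
    "leveldb: Generated table #293570: 3080 keys, 54454 bytes",
    "leveldb: Generated table #293571: 4822 keys, 85400 bytes",
    "leveldb: Generated table #293572: 26542 keys, 2133394 bytes",
    "leveldb: Generated table #293573: 4820 keys, 72908 bytes",
    "leveldb: Compacted 1@0 + 8@1 files => 8952045 bytes",
]

result = summarize_compaction(sample)
print(result)
```

For this excerpt the eight tables sum to 8952045 bytes, which matches the reported total, so the compaction itself rewrote roughly 9 MB and completed about 0.3 seconds before the crash line.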
Files
Updated by lin zhou over 5 years ago
- File ceph-osd.323.log.gz added
Updated by John Spray over 5 years ago
- Project changed from Ceph to RADOS
- Category deleted (OSD)
Updated by Casey Bodley over 5 years ago
- Related to Bug #26882: jewel: cls_rgw: avoid undefined iterator access added
Updated by Casey Bodley over 5 years ago
- Status changed from New to Duplicate
This bug only showed up in jewel, and a fix is staged at https://github.com/ceph/ceph/pull/23495