Bug #9339

closed

ReplicatedPG crash in hitset_create

Added by Samuel Just over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version 0.84-376-g970d983 (970d9830a3a6e8568337c660fb8b4c4a60a2b3bf)
1: ceph-osd() [0x9a676a]
2: (()+0xfcb0) [0x7fd9bac61cb0]
3: (gsignal()+0x35) [0x7fd9b954d0d5]
4: (abort()+0x17b) [0x7fd9b955083b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd9b9e9f69d]
6: (()+0xb5846) [0x7fd9b9e9d846]
7: (()+0xb5873) [0x7fd9b9e9d873]
8: (()+0xb596e) [0x7fd9b9e9d96e]
9: (operator new[](unsigned long)+0x47e) [0x7fd9baea0b1e]
10: (HitSet::HitSet(HitSet::Params const&)+0x6fd) [0x8743bd]
11: (ReplicatedPG::hit_set_create()+0x129) [0x7d7229]
12: (ReplicatedPG::hit_set_setup()+0x6c) [0x7d77dc]
13: (ReplicatedPG::on_pool_change()+0x122) [0x821d62]
14: (PG::handle_advance_map(std::tr1::shared_ptr<OSDMap const>, std::tr1::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, int, std::vector<int, std::allocator<int> >&, int, PG::RecoveryCtx*)+0x554) [0x781f44]
15: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >)+0x2d0) [0x655830]
16: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x235) [0x6563e5]
17: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x12) [0x6a89d2]
18: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa7a246]
19: (ThreadPool::WorkThread::entry()+0x10) [0xa7d2f0]
20: (()+0x7e9a) [0x7fd9bac59e9a]
21: (clone()+0x6d) [0x7fd9b960b31d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2014-09-03 21:35:42.990988 7fd9a2818700 20 PGPool::update cached_removed_snaps [1~1d2,1d4~e,1e3~7,1eb~8,1f4~2,1f7~a,202~2,205~6,20c~b,218~14,22d~d,23b~2,241~1,243~2] newly_removed_snaps [23c~1,244~1] snapc 244=[242,240,23f,23e,23d,23a,22c,217,20b,204,201,1f6,1f3,1ea,1e2,1d3] (updated)
2014-09-03 21:35:42.990994 7fd9a2818700 10 osd.1 pg_epoch: 854 pg[2.2( v 846'5494 (332'2481,846'5494] local-les=821 n=50 ec=7 les/c 821/821 801/820/783) [1,0] r=0 lpr=820 crt=842'5487 lcod 846'5492 mlcod 846'5492 active+clean snaptrimq=[236~1,243~1]] on_pool_change
2014-09-03 21:35:42.991003 7fd9a2818700 20 osd.1 pg_epoch: 854 pg[2.2( v 846'5494 (332'2481,846'5494] local-les=821 n=50 ec=7 les/c 821/821 801/820/783) [1,0] r=0 lpr=820 crt=842'5487 lcod 846'5492 mlcod 846'5492 active+clean snaptrimq=[236~1,243~1]] hit_set_create bloom{false_positive_probability: 0.05, target_size: 0, seed: 0}
2014-09-03 21:35:42.991054 7fd9a2818700 20 osd.1 pg_epoch: 854 pg[2.2( v 846'5494 (332'2481,846'5494] local-les=821 n=50 ec=7 les/c 821/821 801/820/783) [1,0] r=0 lpr=820 crt=842'5487 lcod 846'5492 mlcod 846'5492 active+clean snaptrimq=[236~1,243~1]] hit_set_create previous set had approx 74 unique items over 0.000633 seconds
2014-09-03 21:35:42.991062 7fd9a2818700 10 osd.1 pg_epoch: 854 pg[2.2( v 846'5494 (332'2481,846'5494] local-les=821 n=50 ec=7 les/c 821/821 801/820/783) [1,0] r=0 lpr=820 crt=842'5487 lcod 846'5492 mlcod 846'5492 active+clean snaptrimq=[236~1,243~1]] hit_set_create target_size 420853080 fpp 0.00625

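If the interval covered by the previous hit set is small enough (0.000633 seconds in the log above), using it as the denominator when extrapolating target_size yields a huge value (420853080 here), which causes HitSet to allocate something monstrous and abort in operator new[].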
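
The arithmetic behind the logged numbers can be sketched as follows. This is an illustration, not the Ceph code. The 3600 second period is an assumption, chosen because it reproduces the logged target_size of 420853080 from 74 items over 0.000633 seconds. The Bloom filter size uses the textbook formula, which may differ from the sizing the OSD actually performs. The clamp and its bounds are hypothetical, shown only as one possible guard against a tiny denominator.

```python
import math


def extrapolated_target_size(unique_items, elapsed_seconds, period_seconds):
    """Scale the unique-item count seen over elapsed_seconds up to a full period."""
    return int(unique_items * (period_seconds / elapsed_seconds))


def bloom_filter_bytes(target_size, fpp):
    """Textbook Bloom filter size in bytes: m = -n * ln(p) / (ln 2)^2 bits."""
    bits = -target_size * math.log(fpp) / (math.log(2) ** 2)
    return math.ceil(bits / 8)


def clamped_target_size(unique_items, elapsed_seconds, period_seconds,
                        min_size, max_size):
    """Hypothetical guard: bound the extrapolated size to [min_size, max_size]."""
    if elapsed_seconds <= 0:
        return min_size  # no usable rate estimate; fall back to the floor
    raw = extrapolated_target_size(unique_items, elapsed_seconds, period_seconds)
    return max(min_size, min(raw, max_size))


# Values from the log: 74 unique items over 0.000633 seconds, fpp 0.00625.
# ASSUMPTION: period of 3600 seconds (inferred, not stated in the report).
target = extrapolated_target_size(74, 0.000633, 3600)
print(target)
print(bloom_filter_bytes(target, 0.00625))

# ASSUMPTION: the bounds 1000 and 100000 are made up for illustration.
print(clamped_target_size(74, 0.000633, 3600, 1000, 100000))
```

Under these assumptions, the unclamped estimate asks for a filter on the order of half a gigabyte for a set that previously held about 74 items, while the clamped variant stays at the chosen upper bound.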

Actions #1

Updated by Samuel Just over 9 years ago

  • Status changed from 7 to Pending Backport
Actions #2

Updated by Samuel Just over 9 years ago

wip-sam-testing-firefly

Actions #3

Updated by Samuel Just over 9 years ago

  • Status changed from Pending Backport to Resolved