Bug #612: OSD: Crash during auto scrub - Ceph - Ceph

Actions

Copy link

Bug #612

closed

OSD: Crash during auto scrub

Added by Wido den Hollander over 13 years ago. Updated over 13 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Sage Weil

Category:

OSD

Target version:

v0.24

% Done:

Spent time:

0:06 h

Source:

Tags:

Backport:

Regression:

Severity:

Reviewed:

Affected Versions:

ceph-qa-suite:

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

After I saw #611 my cluster started to crash. One after the other, the OSD's started to go down, all with a message about the PG scrub.

osd3

2010-11-24 13:40:30.219184 7f1ceb598710 osd3 7496 pg[3.73( v 3632'261 (3632'257,3632'261] n=5 ec=2 les=7105 7496/7496/7496) [4,3] r=1 lcod 0'0 inactive] on_change
2010-11-24 13:40:30.219200 7f1ceb598710 osd3 7496 dec_scrubs_pending 0 -> -1 (max 1, active 0)
2010-11-24 13:40:30.267352 7f1cea596710 -- [2001:16f8:10:2::c3c3:2e3a]:6802/3084 <== osd0 [2001:16f8:10:2::c3c3:8f6b]:6802/22472 6513 ==== osd_ping(e0 as_of 7493) v1 ==== 61+0+0 (1292880165 0 0) 0x986cc40
2010-11-24 13:40:30.267380 7f1cea596710 osd3 7496 heartbeat_dispatch 0x986cc40
2010-11-24 13:40:30.267393 7f1cea596710 osd3 7496 handle_osd_ping from osd0 got stat stat(2010-11-24 13:40:30.617188 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
2010-11-24 13:40:30.368203 7f1cea596710 -- [2001:16f8:10:2::c3c3:2e3a]:6802/3084 <== osd7 [2001:16f8:10:2::c3c3:2f6c]:6802/18600 6409 ==== osd_ping(e0 as_of 7491) v1 ==== 61+0+0 (3351412111 0 0) 0x2470a80
2010-11-24 13:40:30.368244 7f1cea596710 osd3 7496 heartbeat_dispatch 0x2470a80
2010-11-24 13:40:30.368254 7f1cea596710 osd3 7496 handle_osd_ping from osd7 got stat stat(2010-11-24 13:40:30.380154 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
osd/OSD.cc: In function 'void OSD::dec_scrubs_pending()':
osd/OSD.cc:2378: FAILED assert(scrubs_pending >= 0)
 ceph version 0.24~rc (commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d)
 1: /usr/bin/cosd() [0x4c1f75]
 2: (ReplicatedPG::on_change()+0x5f) [0x47b49f]
 3: (OSD::advance_map(ObjectStore::Transaction&)+0x159b) [0x4e775b]
 4: (OSD::handle_osd_map(MOSDMap*)+0x1d66) [0x4eb046]
 5: (OSD::_dispatch(Message*)+0x2c8) [0x4f5848]
 6: (OSD::ms_dispatch(Message*)+0x39) [0x4f62d9]
 7: (SimpleMessenger::dispatch_entry()+0x759) [0x463b29]
 8: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45a99c]
 9: (Thread::_entry_func(void*)+0xa) [0x47015a]
 10: (()+0x69ca) [0x7f1cf4d589ca]
 11: (clone()+0x6d) [0x7f1cf3abe70d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
*** Caught signal (ABRT) ***
 ceph version 0.24~rc (commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d)
 1: (sigabrt_handler(int)+0x7d) [0x5dec4d]
 2: (()+0x33af0) [0x7f1cf3a0baf0]
 3: (gsignal()+0x35) [0x7f1cf3a0ba75]
 4: (abort()+0x180) [0x7f1cf3a0f5c0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f1cf42c18e5]
 6: (()+0xcad16) [0x7f1cf42bfd16]
 7: (()+0xcad43) [0x7f1cf42bfd43]
 8: (()+0xcae3e) [0x7f1cf42bfe3e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x448) [0x5cbc88]
 10: /usr/bin/cosd() [0x4c1f75]
 11: (ReplicatedPG::on_change()+0x5f) [0x47b49f]
 12: (OSD::advance_map(ObjectStore::Transaction&)+0x159b) [0x4e775b]
 13: (OSD::handle_osd_map(MOSDMap*)+0x1d66) [0x4eb046]
 14: (OSD::_dispatch(Message*)+0x2c8) [0x4f5848]
 15: (OSD::ms_dispatch(Message*)+0x39) [0x4f62d9]
 16: (SimpleMessenger::dispatch_entry()+0x759) [0x463b29]
 17: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45a99c]
 18: (Thread::_entry_func(void*)+0xa) [0x47015a]
 19: (()+0x69ca) [0x7f1cf4d589ca]
 20: (clone()+0x6d) [0x7f1cf3abe70d]

osd0 & osd1

2010-11-24 13:41:57.831888 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] scrub  /0 1000004c77c.00000000/head(1330'3 mds0.6:4072 wrlock_by=unknown0.0:0)
2010-11-24 13:41:57.831902 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] scrub got 5/5 objects, 0/0 clones, 593242/593242 bytes, 581/581 kb.
2010-11-24 13:41:57.831916 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] _scrub (scrub) finish
2010-11-24 13:41:57.831934 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] write_info info 309
2010-11-24 13:41:57.831950 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] write_info bigbl 54
2010-11-24 13:41:57.831974 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] share_pg_info
2010-11-24 13:41:57.831995 7ffde11bb710 -- [2001:16f8:10:2::c3c3:8f6b]:6801/22472 --> osd2 [2001:16f8:10:2::c3c3:4a8c]:6801/9520 -- pg_info(1 pgs e7527) v1 -- ?+0 0x5cff380
2010-11-24 13:41:57.832016 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+inconsistent] update_stats 6967'3637
2010-11-24 13:41:57.832030 7ffde11bb710 osd0 7527 dec_scrubs_active 1 -> 0 (max 1, pending 0)
2010-11-24 13:41:57.832037 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+inconsistent] scrub requesting unreserve from osd2
2010-11-24 13:41:57.832052 7ffde11bb710 -- [2001:16f8:10:2::c3c3:8f6b]:6801/22472 --> osd2 [2001:16f8:10:2::c3c3:4a8c]:6801/9520 -- osd_sub_op(unknown0.0:0 1.24 /0 [scrub-unreserve] v 0'0 snapset=0=[]:[] snapc=0=[]) v1 -- ?+0 0xb8fc580
2010-11-24 13:41:57.832081 7ffde11bb710 osd0 7527 pg[1.24( v 7116'14 (1534'4,7116'14] n=5 ec=2 les=7504 7496/7496/6967) [0,2] r=0 mlcod 0'0 active+clean+degraded+inconsistent] scrub done
2010-11-24 13:41:57.832119 7ffde11bb710 osd0 7527 pg[0.25( v 1486'438 (1486'436,1486'438] n=435 ec=2 les=7504 7496/7496/6967) [0,2] r=0 lcod 0'0 mlcod 0'0 active+clean+degraded+inconsistent] scrub start
2010-11-24 13:41:57.832132 7ffde11bb710 osd0 7527 pg[0.25( v 1486'438 (1486'436,1486'438] n=435 ec=2 les=7504 7496/7496/6967) [0,2] r=0 lcod 0'0 mlcod 0'0 active+clean+degraded+scrubbing+inconsistent] update_stats 6967'4384
2010-11-24 13:41:57.835255 7ffde43c3710 -- [2001:16f8:10:2::c3c3:8f6b]:6801/22472 <== osd7 [2001:16f8:10:2::c3c3:2f6c]:6801/18600 9813 ==== pg_info(1 pgs e7527) v1 ==== 316+0+0 (198917258 0 0) 0x24951c0
2010-11-24 13:41:57.835277 7ffde43c3710 osd0 7527 _dispatch 0x24951c0 pg_info(1 pgs e7527) v1
2010-11-24 13:41:57.835284 7ffde43c3710 osd0 7527 handle_pg_info pg_info(1 pgs e7527) v1 from osd7
2010-11-24 13:41:57.835291 7ffde43c3710 osd0 7527 require_same_or_newer_map 7527 (i am 7527) 0x24951c0
2010-11-24 13:41:57.835299 7ffde43c3710 osd0 7527 pg[1.21f( v 7116'437 (1330'433,7116'437] n=4 ec=2 les=7057 7050/7050/7050) [7,0,4] r=1 luod=0'0 lcod 6736'436 active]: _process_pg_info info: 1.21f( v 7116'437 (1330'433,7116'437] n=4 ec=2 les=7057 7050/7050/7050), (log omitted), (missing omitted)
2010-11-24 13:41:57.842995 7ffde3bc2710 -- [2001:16f8:10:2::c3c3:8f6b]:6802/22472 <== osd5 [2001:16f8:10:2::c3c3:3b6a]:6802/15476 133 ==== osd_ping(e0 as_of 7525) v1 ==== 61+0+0 (3586543513 0 0) 0x6c3f000
2010-11-24 13:41:57.843018 7ffde3bc2710 osd0 7527 heartbeat_dispatch 0x6c3f000
2010-11-24 13:41:57.843024 7ffde3bc2710 osd0 7527 handle_osd_ping from osd5 got stat stat(2010-11-24 13:41:57.476449 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
2010-11-24 13:41:57.843041 7ffde3bc2710 osd0 7527 take_peer_stat peer osd5 stat(2010-11-24 13:41:57.476449 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
osd/PG.cc: In function 'void PG::scrub()':
osd/PG.cc:2971: FAILED assert(osd->scrubs_pending >= 0)
 ceph version 0.24~rc (commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d)
 1: (PG::scrub()+0x24a7) [0x542667]
 2: (ThreadPool::worker()+0x28f) [0x5cc57f]
 3: (ThreadPool::WorkThread::entry()+0xd) [0x4fe19d]
 4: (Thread::_entry_func(void*)+0xa) [0x47015a]
 5: (()+0x69ca) [0x7ffdee3849ca]
 6: (clone()+0x6d) [0x7ffded0ea70d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
*** Caught signal (ABRT) ***
 ceph version 0.24~rc (commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d)
 1: (sigabrt_handler(int)+0x7d) [0x5dec4d]
 2: (()+0x33af0) [0x7ffded037af0]
 3: (gsignal()+0x35) [0x7ffded037a75]
 4: (abort()+0x180) [0x7ffded03b5c0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7ffded8ed8e5]
 6: (()+0xcad16) [0x7ffded8ebd16]
 7: (()+0xcad43) [0x7ffded8ebd43]
 8: (()+0xcae3e) [0x7ffded8ebe3e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x448) [0x5cbc88]
 10: (PG::scrub()+0x24a7) [0x542667]
 11: (ThreadPool::worker()+0x28f) [0x5cc57f]
 12: (ThreadPool::WorkThread::entry()+0xd) [0x4fe19d]
 13: (Thread::_entry_func(void*)+0xa) [0x47015a]
 14: (()+0x69ca) [0x7ffdee3849ca]
 15: (clone()+0x6d) [0x7ffded0ea70d]

What I noticed in 'ceph -w', is:

2010-11-24 13:39:22.427341   log 2010-11-24 13:39:21.143950 osd1 [2001:16f8:10:2::c3c3:a24f]:6800/3750 222 : [INF] 3.11d scrub ok
2010-11-24 13:39:22.918378   log 2010-11-24 13:39:22.020493 osd11 [2001:16f8:10:2::c3c3:ab76]:6800/11274 368 : [ERR] 0.38 scrub osd3 /0 size 22890 != 4194304
2010-11-24 13:39:24.626067   log 2010-11-24 13:39:23.654725 osd11 [2001:16f8:10:2::c3c3:ab76]:6800/11274 369 : [INF] 0.7e scrub ok
2010-11-24 13:39:25.187218    pg v97833: 3216 pgs: 1 active, 2483 active+clean, 565 peering, 33 crashed+peering, 12 down+peering, 89 crashed+down+peering, 28 active+clean+inconsistent, 5 inconsistent+peering; 429 GB data, 1390 GB used, 3299 GB / 4713 GB avail; 1/1410396 degraded (0.000%)
....
....

2010-11-24 13:39:29.149193   log 2010-11-24 13:39:27.966334 osd1 [2001:16f8:10:2::c3c3:a24f]:6800/3750 224 : [INF] 3.170 scrub ok
2010-11-24 13:39:31.216851   log 2010-11-24 13:39:30.163439 osd1 [2001:16f8:10:2::c3c3:a24f]:6800/3750 225 : [INF] 3.175 scrub ok
2010-11-24 13:39:31.216851   log 2010-11-24 13:39:31.011049 mon0 [2001:16f8:10:2::c3c3:af78]:6789/0 11 : [INF] osd5 [2001:16f8:10:2::c3c3:3b6a]:6800/15476 failed (by osd0 [2001:16f8:10:2::c3c3:8f6b]:6800/22472)
2010-11-24 13:39:31.216851   log 2010-11-24 13:39:31.022855 mon0 [2001:16f8:10:2::c3c3:af78]:6789/0 12 : [INF] osd5 [2001:16f8:10:2::c3c3:3b6a]:6800/15476 failed (by osd1 [2001:16f8:10:2::c3c3:a24f]:6800/3750)
2010-11-24 13:39:31.556649    pg v97837: 3216 pgs: 1 active, 2482 active+clean, 565 peering, 33 crashed+peering, 12 down+peering, 89 crashed+down+peering, 29 active+clean+inconsistent, 5 inconsistent+peering; 429 GB data, 1390 GB used, 3299 GB / 4713 GB avail; 1/1410396 degraded (0.000%)
2010-11-24 13:39:31.872471   osd e7474: 12 osds: 9 up, 10 in

But osd5 never went down, it just got busy, since it had been down after #611

I've ran cdebugpack on 4 OSD's and uploaded the data to logger.ceph.widodh.nl:/srv/ceph/issues/osd_crash_pg_auto_scrub

Related issues 1 (0 open — 1 closed)

Actions

Copy link

Updated by Greg Farnum over 13 years ago

Assignee set to Colin McCabe

This looks like the OSD is assembling a list of missing queries and then sending them out without bothering to check if the recipient OSD is up. It needs to filter the list somewhere, but I'm not sure where's appropriate and this is your code, so it's yours! :)

Actions

Copy link

Updated by Greg Farnum over 13 years ago

Assignee changed from Colin McCabe to Samuel Just

Dunno how, but somehow commented/assigned this and another bug backwards. Meant to say:
Sam said he'd look at this since it's in the background scrubbing bits that he and Josh did.

Actions

Copy link