Bug #12410: OSDMonitor::preprocess_get_osdmap: must send the last map as well - Ceph - Ceph

Actions

Copy link

Bug #12410

closed

OSDMonitor::preprocess_get_osdmap: must send the last map as well

Added by Yuri Weinstein almost 9 years ago. Updated over 8 years ago.

Status:

Resolved

Priority:

Urgent

Assignee:

Samuel Just

Category:

Monitor

Target version:

% Done:

Source:

Q/A

Tags:

Backport:

hammer

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

upgrade/hammer-x

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2015-07-19_17:10:02-upgrade:hammer-x-next-distro-basic-vps/
Jobs: ['979942', '979943', '979944', '979946', '979947', '979948', '979949', '979952', '979954']
Logs for one: http://qa-proxy.ceph.com/teuthology/teuthology-2015-07-19_17:10:02-upgrade:hammer-x-next-distro-basic-vps/979948/teuthology.log

2015-07-19T17:59:50.881 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 10 pgs peering; 10 pgs stuck inactive; 10 pgs stuck unclean; 7 requests are blocked > 32 sec
2015-07-19T17:59:57.882 INFO:teuthology.orchestra.run.vpm113:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-07-19T17:59:58.097 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 10 pgs peering; 10 pgs stuck inactive; 10 pgs stuck unclean; 7 requests are blocked > 32 sec
2015-07-19T17:59:59.096 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 56, in run_tasks
    manager.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph.py", line 1136, in restart
    healthy(ctx=ctx, config=None)
  File "/var/lib/teuthworker/src/ceph-qa-suite_next/tasks/ceph.py", line 1042, in healthy
    remote=mon0_remote,
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 875, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 134, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy' reached maximum tries (150) after waiting for 900 seconds

Related issues 2 (0 open — 2 closed)

Actions

Copy link

Updated by Yuri Weinstein almost 9 years ago

Run: http://pulpito.ceph.com/teuthology-2015-07-21_17:10:25-upgrade:hammer-x-next-distro-basic-vps/
Jobs: ['982806', '982807', '982811', '982812', '982813', '982816', '982817', '982818']

Actions

Copy link

Updated by Samuel Just almost 9 years ago

Probably similar: http://pulpito.ceph.com/teuthology-2015-07-23_09:51:45-upgrade:hammer-x:point-to-point-x-next-distro-basic-vps/

PGs get stuck peering apparently because they are waiting on a map. Mons are running current hammer at that point 45beb86423c3bd74dbafd36c6822e71ad9680e17, OSDs are running something close to next/master 29ea23c3ff997edbccf04bdf4d983c6630f8ff18.

Mon appears to be ignoring requests for upthru as dups, but osd didn't get the map.

Actions

Copy link

Updated by Yuri Weinstein almost 9 years ago

Run: http://pulpito.ceph.com/teuthology-2015-07-21_17:02:02-upgrade:hammer-x-next-distro-basic-multi/
Jobs: ['982771', '982773', '982775', '982777', '982779', '982781', '982783']

Actions

Copy link

Updated by Yuri Weinstein over 8 years ago

Subject changed from "Failure: 'wait_until_healthy'" stuck pgs peering in upgrade:hammer-x-next-distro-basic-vpsu to "Failure: 'wait_until_healthy'" stuck pgs peering in upgrade:hammer-x-next-distro-basic-vps

Run: http://pulpito.ceph.com/teuthology-2015-07-28_17:10:15-upgrade:hammer-x-next-distro-basic-vps/
Jobs: ['990367', '990368', '990370', '990371', '990372', '990373', '990375', '990378', '990383', '990384']

Actions

Copy link

Updated by Samuel Just over 8 years ago

Looking at /a/teuthology-2015-07-28_17:10:15-upgrade:hammer-x-next-distro-basic-vps/990370/remote

Looks like 2/3 mons are on 0.94.2-156-g8355bda, one is on 9.0.2-680-ge957872. Once again, pgs seem to be stuck peering.

The problem seems to be that those pgs are waiting on upthru. osd.0 and osd.1 never get map 72, which presumably has the right upthru value. osd.2 and osd.3 do.

Actions

Copy link

Updated by Samuel Just over 8 years ago

OSD:

2015-07-29 03:46:08.591454 7f1fc844a700 1 -- 10.214.130.161:6801/16764 <== mon.0 10.214.130.105:6789/0 61 ==== osd_map(71..72 src has 1..72) v3 ==== 428+0+0 (3579990867 0 0) 0x551b680 con 0x5124f20
2015-07-29 03:46:08.591468 7f1fc844a700 10 osd.0 71 do_waiters -- start
2015-07-29 03:46:08.591470 7f1fc844a700 10 osd.0 71 do_waiters -- finish
2015-07-29 03:46:08.591472 7f1fc844a700 20 osd.0 71 _dispatch 0x551b680 osd_map(71..72 src has 1..72) v3
2015-07-29 03:46:08.591480 7f1fc844a700 3 osd.0 71 handle_osd_map epochs [71,72], i have 71, src has [1,72]
2015-07-29 03:46:08.591484 7f1fc844a700 10 osd.0 71 handle_osd_map got inc map for epoch 72
2015-07-29 03:46:08.591609 7f1fc844a700 2 osd.0 71 got incremental 72 but failed to encode full with correct crc; requesting
2015-07-29 03:46:08.591619 7f1fc844a700 0 log_channel(cluster) log [WRN] : failed to encode map e72 with expected crc
2015-07-29 03:46:08.591642 7f1fc844a700 1 -- 10.214.130.161:6801/16764 --> 10.214.130.105:6789/0 -- mon_get_osdmap(full 72-72) v1 -- ?+0 0x47b1e00 con 0x5124f20
2015-07-29 03:46:08.591828 7f1fc844a700 10 osd.0 71 no new maps here, dropping
2015-07-29 03:46:08.591837 7f1fc844a700 10 osd.0 71 do_waiters -- start
2015-07-29 03:46:08.591839 7f1fc844a700 10 osd.0 71 do_waiters -- finish
2015-07-29 03:46:08.591847 7f1fc844a700 1 -- 10.214.130.161:6801/16764 <== mon.0 10.214.130.105:6789/0 62 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1443808793 0 0) 0x54a85a0 con 0x5124f20
2015-07-29 03:46:08.592243 7f1fc844a700 1 -- 10.214.130.161:6801/16764 <== mon.0 10.214.130.105:6789/0 63 ==== osd_map(72..72 src has 1..72) v3 ==== 230+0+0 (1500960621 0 0) 0x551b440 con 0x5124f20
2015-07-29 03:46:08.592265 7f1fc844a700 10 osd.0 71 do_waiters -- start
2015-07-29 03:46:08.592268 7f1fc844a700 10 osd.0 71 do_waiters -- finish
2015-07-29 03:46:08.592270 7f1fc844a700 20 osd.0 71 _dispatch 0x551b440 osd_map(72..72 src has 1..72) v3
2015-07-29 03:46:08.592277 7f1fc844a700 3 osd.0 71 handle_osd_map epochs [72,72], i have 71, src has [1,72]
2015-07-29 03:46:08.592280 7f1fc844a700 10 osd.0 71 handle_osd_map got inc map for epoch 72
2015-07-29 03:46:08.592386 7f1fc844a700 2 osd.0 71 got incremental 72 but failed to encode full with correct crc; requesting
2015-07-29 03:46:08.592396 7f1fc844a700 0 log_channel(cluster) log [WRN] : failed to encode map e72 with expected crc
2015-07-29 03:46:08.592415 7f1fc844a700 1 -- 10.214.130.161:6801/16764 --> 10.214.130.105:6789/0 -- mon_get_osdmap(full 72-72) v1 -- ?+0 0x47b0000 con 0x5124f20
2015-07-29 03:46:08.592471 7f1fc844a700 10 osd.0 71 no new maps here, dropping
2015-07-29 03:46:08.592479 7f1fc844a700 10 osd.0 71 do_waiters -- start
2015-07-29 03:46:08.592481 7f1fc844a700 10 osd.0 71 do_waiters -- finish
2015-07-29 03:46:08.592486 7f1fc844a700 1 -- 10.214.130.161:6801/16764 <== mon.0 10.214.130.105:6789/0 64 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1443808793 0 0) 0x3f7e3c0 con 0x5124f20
2015-07-29 03:46:08.592505 7f1fc844a700 1 -- 10.214.130.161:6801/16764 <== mon.0 10.214.130.105:6789/0 65 ==== osd_map(0..0 src has 1..72) v3 ==== 32+0+0 (2065093375 0 0) 0x551b200 con 0x5124f20
2015-07-29 03:46:08.592515 7f1fc844a700 10 osd.0 71 do_waiters -- start
2015-07-29 03:46:08.592518 7f1fc844a700 10 osd.0 71 do_waiters -- finish
2015-07-29 03:46:08.592520 7f1fc844a700 20 osd.0 71 _dispatch 0x551b200 osd_map(0..0 src has 1..72) v3
2015-07-29 03:46:08.592526 7f1fc844a700 3 osd.0 71 handle_osd_map epochs [0,0], i have 71, src has [1,72]
2015-07-29 03:46:08.592530 7f1fc844a700 10 osd.0 71 no new maps here, dropping
2015-07-29 03:46:08.592532 7f1fc844a700 10 osd.0 71 do_waiters -- start
2015-07-29 03:46:08.592534 7f1fc844a700 10 osd.0 71 do_waiters -- finish
2015-07-29 03:46:08.593065 7f1fc844a700 1 -- 10.214.130.161:6801/16764 <== mon.0 10.214.130.105:6789/0 66 ==== osd_map(0..0 src has 1..72) v3 ==== 32+0+0 (2065093375 0 0) 0x551afc0 con 0x5124f20

MON:

2015-07-29 03:48:12.987426 7f1ae5f93700 1 -- 10.214.130.105:6789/0 <== osd.0 10.214.130.161:6801/16764 80 ==== mon_get_osdmap(full 72-72) v1 ==== 34+0+0 (1226373273 0 0) 0x43ce000 con 0x4192000
2015-07-29 03:48:12.987442 7f1ae5f93700 20 mon.b@0(leader) e1 have connection
2015-07-29 03:48:12.987444 7f1ae5f93700 20 mon.b@0(leader) e1 ms_dispatch existing session MonSession: osd.0 10.214.130.161:6801/16764 is openallow * for osd.0 10.214.130.161:6801/16764
2015-07-29 03:48:12.987449 7f1ae5f93700 20 mon.b@0(leader) e1 caps allow *
2015-07-29 03:48:12.987451 7f1ae5f93700 10 mon.b@0(leader).paxosservice(osdmap 1..72) dispatch mon_get_osdmap(full 72-72) v1 from osd.0 10.214.130.161:6801/16764
2015-07-29 03:48:12.987457 7f1ae5f93700 5 mon.b@0(leader).paxos(paxos active c 1..396) is_readable = 1 - now=2015-07-29 03:48:12.987458 lease_expire=2015-07-29 03:48:27.643940 has v0 lc 396
2015-07-29 03:48:12.987463 7f1ae5f93700 10 mon.b@0(leader).osd e72 preprocess_query mon_get_osdmap(full 72-72) v1 from osd.0 10.214.130.161:6801/16764
2015-07-29 03:48:12.987467 7f1ae5f93700 10 mon.b@0(leader).osd e72 preprocess_get_osdmap mon_get_osdmap(full 72-72) v1
2015-07-29 03:48:12.987471 7f1ae5f93700 1 -- 10.214.130.105:6789/0 --> 10.214.130.161:6801/16764 -- osd_map(0..0 src has 1..72) v3 -- ?+0 0x3ae3b00 con 0x4192000
2015-07-29 03:48:13.248741 7fc64f8ae700 11 mon.a@1(peon) e1 tick
2015-07-29 03:48:13.248772 7fc64f8ae700 10 mon.a@1(peon).pg v163 v163: 32 pgs: 11 peering, 21 active+clean; 127 GB data, 850 MB used, 798 GB / 799 GB avail
2015-07-29 03:48:13.248801 7fc64f8ae700 10 mon.a@1(peon).mds e8 e8: 1/1/1 up {0=a=up:active}
2015-07-29 03:48:13.248814 7fc64f8ae700 10 mon.a@1(peon).osd e72 e72: 4 osds: 4 up, 4 in
2015-07-29 03:48:13.248819 7fc64f8ae700 10 mon.a@1(peon).log v175 log
2015-07-29 03:48:13.248823 7fc64f8ae700 10 mon.a@1(peon).auth v3 auth
2015-07-29 03:48:13.248827 7fc64f8ae700 20 mon.a@1(peon) e1 sync_trim_providers
2015-07-29 03:48:13.481910 7f1ae5f93700 1 -- 10.214.130.105:6789/0 <== osd.0 10.214.130.161:6801/16764 81 ==== mon_get_osdmap(full 72-72) v1 ==== 34+0+0 (1226373273 0 0) 0x3999c00 con 0x4192000
2015-07-29 03:48:13.481940 7f1ae5f93700 20 mon.b@0(leader) e1 have connection
2015-07-29 03:48:13.481944 7f1ae5f93700 20 mon.b@0(leader) e1 ms_dispatch existing session MonSession: osd.0 10.214.130.161:6801/16764 is openallow * for osd.0 10.214.130.161:6801/16764
2015-07-29 03:48:13.481951 7f1ae5f93700 20 mon.b@0(leader) e1 caps allow *
2015-07-29 03:48:13.481954 7f1ae5f93700 10 mon.b@0(leader).paxosservice(osdmap 1..72) dispatch mon_get_osdmap(full 72-72) v1 from osd.0 10.214.130.161:6801/16764
2015-07-29 03:48:13.481961 7f1ae5f93700 5 mon.b@0(leader).paxos(paxos active c 1..396) is_readable = 1 - now=2015-07-29 03:48:13.481961 lease_expire=2015-07-29 03:48:27.643940 has v0 lc 396
2015-07-29 03:48:13.481969 7f1ae5f93700 10 mon.b@0(leader).osd e72 preprocess_query mon_get_osdmap(full 72-72) v1 from osd.0 10.214.130.161:6801/16764
2015-07-29 03:48:13.481974 7f1ae5f93700 10 mon.b@0(leader).osd e72 preprocess_get_osdmap mon_get_osdmap(full 72-72) v1
2015-07-29 03:48:13.481977 7f1ae5f93700 1 -- 10.214.130.105:6789/0 --> 10.214.130.161:6801/16764 -- osd_map(0..0 src has 1..72) v3 -- ?+0 0x44d5b00 con 0x4192000

Mon seems to be sending maps 0..0 when 72 is requested. Odd.

Actions

Copy link

Updated by Samuel Just over 8 years ago

Part of the bug seems to be that preprocess_get_osdmap only sends up to m->get_full_last()-1. That's been true for a long time though, not sure why it's starting to be a problem now. Might be that this test upgrades osds before mons, and normally we don't do that?

...
  for (epoch_t e = MAX(first, m->get_full_first());
       e < MIN(last, m->get_full_last()) && max > 0;
       ++e, --max) {
    int r = get_version_full(e, reply->maps[e]);
    assert(r >= 0);
  }
  for (epoch_t e = MAX(first, m->get_inc_first());
       e < MIN(last, m->get_inc_last()) && max > 0;
       ++e, --max) {
    int r = get_version(e, reply->incremental_maps[e]);
    assert(r >= 0);
  }
...

Actions

Copy link

Updated by Samuel Just over 8 years ago

Yuri: this test upgrades the osds before the mons. Is that new? Do the existing tests upgrade the mons first? If this test is the first to upgrade the osds before the mons, then I think https://github.com/athanatos/ceph/commit/42c47b7cfb52d2d5f197cbf9ec4b4537782891b3 is the fix. If this test previously worked, though, there must be something else since that bug has been around for a year.

Actions

Copy link

Updated by Kefu Chai over 8 years ago

the fix of e <= MIN looks good to me, coz the osd was off the last epoch by only 1. and we don't reply the OSD with any osdmap or incremental in that case.

and we do have existing tests which restart the osd after monitor after upgrading to hammer, see:

and the test previously worked, for example: http://pulpito.ceph.com/teuthology-2015-07-20_16:05:09-upgrade:hammer-hammer-distro-basic-vps/980922/

i guess what triggered this issue is "osdmap crc mismatch":

2015-07-29 03:46:08.591472 7f1fc844a700 20 osd.0 71 _dispatch 0x551b680 osd_map(71..72 src has 1..72) v3
2015-07-29 03:46:08.591480 7f1fc844a700 3 osd.0 71 handle_osd_map epochs [71,72], i have 71, src has [1,72]
2015-07-29 03:46:08.591484 7f1fc844a700 10 osd.0 71 handle_osd_map got inc map for epoch 72
2015-07-29 03:46:08.591609 7f1fc844a700 2 osd.0 71 got incremental 72 but failed to encode full with correct crc; requesting
2015-07-29 03:46:08.591619 7f1fc844a700 0 log_channel(cluster) log [WRN] : failed to encode map e72 with expected crc
2015-07-29 03:46:08.591642 7f1fc844a700 1 -- 10.214.130.161:6801/16764 --> 10.214.130.105:6789/0 -- mon_get_osdmap(full 72-72) v1 -- ?+0 0x47b1e00 con 0x5124f20

that's why the osd tried to ask for a single full osdmap when it's one epoch behind.

Actions

Copy link

#10

Updated by Loïc Dachary over 8 years ago

Backport set to hammer

Actions

Copy link

#11

Updated by Loïc Dachary over 8 years ago

Project changed from rgw to Ceph
Subject changed from "Failure: 'wait_until_healthy'" stuck pgs peering in upgrade:hammer-x-next-distro-basic-vps to OSDMonitor::preprocess_get_osdmap: must send the last map as well
Category set to Monitor
Status changed from New to 7
Assignee set to Samuel Just