Project

General

Profile

Actions

Bug #38219

closed

rebuild-mondb hangs

Added by Sage Weil about 5 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
nautilus, octopus, pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://pulpito.ceph.com/sage-2019-02-06_23:33:50-rados-master-distro-basic-smithi/

2019-02-06T23:50:53.045 INFO:tasks.ceph.mon.a.smithi097.stderr:/build/ceph-14.0.1-3328-g44ff972/src/mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7f47cb9d9700 time 2019-02-06 23:50:53.044652
2019-02-06T23:50:53.045 INFO:tasks.ceph.mon.a.smithi097.stderr:/build/ceph-14.0.1-3328-g44ff972/src/mon/OSDMonitor.cc: 384: FAILED ceph_assert(version > osdmap.epoch)
2019-02-06T23:50:53.046 INFO:tasks.ceph.mon.a.smithi097.stderr: ceph version 14.0.1-3328-g44ff972 (44ff9720582cae805c727d210b665786cb7519a0) nautilus (dev)
2019-02-06T23:50:53.047 INFO:tasks.ceph.mon.a.smithi097.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f47d870259a]
2019-02-06T23:50:53.047 INFO:tasks.ceph.mon.a.smithi097.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f47d8702775]
2019-02-06T23:50:53.047 INFO:tasks.ceph.mon.a.smithi097.stderr: 3: (OSDMonitor::update_from_paxos(bool*)+0x3858) [0x56220e1fcfb8]
2019-02-06T23:50:53.047 INFO:tasks.ceph.mon.a.smithi097.stderr: 4: (PaxosService::refresh(bool*)+0x25a) [0x56220e17c8aa]
2019-02-06T23:50:53.047 INFO:tasks.ceph.mon.a.smithi097.stderr: 5: (Monitor::refresh_from_paxos(bool*)+0x10c) [0x56220e00516c]
2019-02-06T23:50:53.047 INFO:tasks.ceph.mon.a.smithi097.stderr: 6: (Paxos::do_refresh()+0x4f) [0x56220e16b65f]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 7: (Paxos::handle_last(boost::intrusive_ptr<MonOpRequest>)+0x8c0) [0x56220e1761c0]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 8: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x243) [0x56220e176f43]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x15c5) [0x56220e036c85]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 10: (Monitor::_ms_dispatch(Message*)+0x4d2) [0x56220e0372d2]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 11: (Monitor::ms_dispatch(Message*)+0x23) [0x56220e065263]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 12: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x56220e061716]
2019-02-06T23:50:53.048 INFO:tasks.ceph.mon.a.smithi097.stderr: 13: (DispatchQueue::entry()+0x1569) [0x7f47d8912089]

/a/sage-2019-02-06_23:33:50-rados-master-distro-basic-smithi/3557785


Related issues 3 (0 open3 closed)

Copied to RADOS - Backport #51237: nautilus: rebuild-mondb hangsResolvedKefu ChaiActions
Copied to RADOS - Backport #51552: octopus: rebuild-mondb hangsResolvedCory SnyderActions
Copied to RADOS - Backport #51553: pacific: rebuild-mondb hangsResolvedKefu ChaiActions
Actions #1

Updated by Sage Weil about 5 years ago

/a/sage-2019-03-03_23:01:07-rados-wip-sage3-testing-2019-03-03-1043-distro-basic-smithi/3664297

Actions #2

Updated by Neha Ojha about 5 years ago

rados:singleton/{all/rebuild-mondb.yaml msgr-failures/many.yaml msgr/async.yaml objectstore/bluestore-bitmap.yaml rados.yaml supported-random-distro$/{ubuntu_latest.yaml}}

Failed in 2 out 10 runs. http://pulpito.ceph.com/nojha-2019-03-06_01:06:50-rados:singleton-master-distro-basic-smithi/

Actions #3

Updated by Kefu Chai about 5 years ago

  • Assignee set to Kefu Chai
Actions #4

Updated by Sage Weil about 5 years ago

/a/sage-2019-03-26_03:52:56-rados-wip-sage-testing-2019-03-25-1934-distro-basic-smithi/3774206

Actions #5

Updated by Kefu Chai about 5 years ago

i am using following script to reproduce this issue locally, so far no luck

MDS=0 MGR=1 OSD=3 MON=3 ../src/vstart.sh -n --without-dashboard

sleep 5
monmap_fn=/tmp/monmap
bin/ceph mon getmap -o $monmap_fn
bin/init-ceph stop mon
for name in a b c; do
    if [ "$name" == "a" ]; then
        rm -rf dev/mon.$name/store.db
    else
        rm -rf dev/mon.$name
    fi
done

mon_store_dir=dev/mon.a
for osd in `seq 0 2`; do
    bin/init-ceph stop osd.$osd
    bin/ceph-objectstore-tool --no-mon-config \
                              --op update-mon-db \
                              --mon-store-path $mon_store_dir \
                              --data-path dev/osd$osd
done

key_fn=keyring
bin/ceph-authtool $key_fn -n 'mon.' --cap mon 'allow *'
bin/ceph-authtool $key_fn -n 'client.admin' \
                  '--cap' 'mon' 'allow *' \
                  '--cap' 'osd' 'allow *' \
                  '--cap' 'mds' 'allow *' \
                  '--cap' 'mgr' 'allow *'
bin/ceph-monstore-tool $mon_store_dir \
                       rebuild -- \
                       --keyring $key_fn \
                       --monmap $monmap_fn

for name in a b c; do
    bin/ceph-mon --cluster ceph --mkfs \
             -i $name \
             --keyring $key_fn \
             --monmap $monmap_fn
    bin/init-ceph start mon.$name
    sleep 5
done

for osd in `seq 0 2`; do
    bin/init-ceph start osd.$osd
done

for name in x; do
    bin/init-ceph stop mgr.$name
    key_fn=dev/mgr.$name/keyring
    bin/ceph -i $key_fn auth add mgr.$name \
         mon 'allow profile mgr' \
         mds 'allow *' \
         osd 'allow *'
    bin/init-ceph start mgr.$name
done

will rerun the test in sepia with more verbose logging messages,

Actions #8

Updated by Kefu Chai about 5 years ago

### on osd host

+ sudo adjust-ulimits ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --no-mon-config --op update-mon-db --mon-store-path /home/ubuntu/cephtest/mon-store
2019-04-04T09:10:09.073 INFO:teuthology.orchestra.run.smithi116.stdout:adding auth for osd.0
2019-04-04T09:10:09.141 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#1
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#1
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 1
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 1
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#2
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#2
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 2
2019-04-04T09:10:09.142 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 2
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#3
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#3
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 3
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 3
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#4
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#4
2019-04-04T09:10:09.143 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 4
2019-04-04T09:10:09.144 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 4
2019-04-04T09:10:09.144 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#5
2019-04-04T09:10:09.144 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#5
2019-04-04T09:10:09.144 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 5
2019-04-04T09:10:09.144 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 5
2019-04-04T09:10:09.145 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#6
2019-04-04T09:10:09.145 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#6
2019-04-04T09:10:09.145 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 6
2019-04-04T09:10:09.145 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 6
2019-04-04T09:10:09.145 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#7
2019-04-04T09:10:09.146 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#7
2019-04-04T09:10:09.146 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 7
2019-04-04T09:10:09.146 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 7
2019-04-04T09:10:09.146 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#8
2019-04-04T09:10:09.147 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#8
2019-04-04T09:10:09.147 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 8
2019-04-04T09:10:09.147 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 8
2019-04-04T09:10:09.179 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#9
2019-04-04T09:10:09.179 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#9
2019-04-04T09:10:09.179 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 9
2019-04-04T09:10:09.179 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 9
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#10
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#10
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 10
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 10
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#11
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#11
2019-04-04T09:10:09.180 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 11
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 11
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#12
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#12
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 12
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 12
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#13
2019-04-04T09:10:09.181 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#13
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 13
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 13
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#14
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#14
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 14
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 14
2019-04-04T09:10:09.182 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/inc#15
2019-04-04T09:10:09.183 INFO:teuthology.orchestra.run.smithi116.stdout:adding osdmap/full#15
2019-04-04T09:10:09.183 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/last_committed = 15
2019-04-04T09:10:09.183 INFO:teuthology.orchestra.run.smithi116.stdout:set osdmap/full_latest = 15
2019-04-04T09:10:09.189 INFO:teuthology.orchestra.run.smithi116.stdout:osd.0   : 0 osdmaps trimmed, 15 osdmaps added.

### on the monitor's host
# ceph-monstore-tool --debug-rocksdb=0 /var/lib/ceph/mon/ceph-a/ dump-keys
2019-04-04 09:43:31.531 7fcada309280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:43:31.531 7fcada309280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:43:31.531 7fcada309280  0  set rocksdb option write_buffer_size = 33554432
2019-04-04 09:43:31.531 7fcada309280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:43:31.531 7fcada309280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:43:31.531 7fcada309280  0  set rocksdb option write_buffer_size = 33554432
auth / 1
auth / 2
auth / 3
auth / 4
auth / first_committed
auth / format_version
auth / last_committed
config / first_committed
config / last_committed
health / auth
health / first_committed
health / health
health / last_committed
health / leader
health / mdsmap
health / mgr
health / mgrstat
health / osdmap
health / quorum
logm / 1
logm / first_committed
logm / full_0
logm / full_latest
logm / last_committed
mdsmap / 1
mdsmap / first_committed
mdsmap / last_committed
mgr / 1
mgr / first_committed
mgr / last_committed
mgr_command_desc /
mgr_command_descs /
mgrstat / 1
mgrstat / first_committed
mgrstat / last_committed
mkfs / monmap
mon_config_key / config-history/1/
monitor / cluster_uuid
monitor / election_epoch
monitor / election_writeable_test
monitor / feature_set
monitor / magic
monitor_store / last_metadata
osd_pg_creating / creating
osdmap / 1
osdmap / 10
osdmap / 11
osdmap / 12
osdmap / 13
osdmap / 14
osdmap / 15
osdmap / 2
osdmap / 3
osdmap / 4
osdmap / 5
osdmap / 6
osdmap / 7
osdmap / 8
osdmap / 9
osdmap / first_committed
osdmap / full_1
osdmap / full_10
osdmap / full_11
osdmap / full_12
osdmap / full_13
osdmap / full_14
osdmap / full_15
osdmap / full_2
osdmap / full_3
osdmap / full_4
osdmap / full_5
osdmap / full_6
osdmap / full_7
osdmap / full_8
osdmap / full_9
osdmap / full_latest
osdmap / last_committed
paxos / 1
paxos / 2
paxos / first_committed
paxos / last_committed
paxos / last_pn
paxos / pending_pn
paxos / pending_v

# ceph-monstore-tool --debug-rocksdb=0 /var/lib/ceph/mon/ceph-a/ show-versions -- --map-type osdmap
2019-04-04 09:43:09.084 7f004491d280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:43:09.084 7f004491d280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:43:09.084 7f004491d280  0  set rocksdb option write_buffer_size = 33554432
2019-04-04 09:43:09.084 7f004491d280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:43:09.084 7f004491d280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:43:09.084 7f004491d280  0  set rocksdb option write_buffer_size = 33554432
first committed:    1
last  committed:    1

# ceph-monstore-tool --debug-rocksdb=0 /var/lib/ceph/mon/ceph-a/ show-versions -- --map-type paxos
2019-04-04 09:44:20.202 7f5eb4f23280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:44:20.202 7f5eb4f23280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:44:20.202 7f5eb4f23280  0  set rocksdb option write_buffer_size = 33554432
2019-04-04 09:44:20.202 7f5eb4f23280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:44:20.202 7f5eb4f23280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:44:20.202 7f5eb4f23280  0  set rocksdb option write_buffer_size = 33554432
first committed:    1
last  committed:    2

# ceph-monstore-tool --debug-rocksdb=0 /var/lib/ceph/mon/ceph-a/ show-versions -- --map-type auth
2019-04-04 09:44:51.982 7feb443cd280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:44:51.982 7feb443cd280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:44:51.982 7feb443cd280  0  set rocksdb option write_buffer_size = 33554432
2019-04-04 09:44:51.982 7feb443cd280  0  set rocksdb option compression = kNoCompression
2019-04-04 09:44:51.982 7feb443cd280  0  set rocksdb option level_compaction_dynamic_level_bytes = true
2019-04-04 09:44:51.982 7feb443cd280  0  set rocksdb option write_buffer_size = 33554432
first committed:    1
last  committed:    1
Actions #9

Updated by Kefu Chai about 5 years ago

  • Status changed from 12 to Fix Under Review
  • Backport set to luminous, mimic, nautilus
  • Pull request ID set to 27465
Actions #10

Updated by huang jun almost 5 years ago

@Kefu Chai Chai i don't understand why the osdmap/first_committed and osdmap/last_committed = 1?
it already set osdmap/last_committed=15 i think.

Actions #11

Updated by Greg Farnum over 4 years ago

  • Status changed from Fix Under Review to In Progress
  • Priority changed from Urgent to Normal

Demoting as if you're running this you already need manual intervention anyway.

Actions #12

Updated by Christoffer Anselm over 4 years ago

Just a quick note as this might be relevant for the decision whether or not to integrate this PR:

Running mimic 13.2.6 we had a total loss of all mon data. For recovery we attempted steps similar to those stated in https://tracker.ceph.com/issues/38219#note-5

We had to rebuild the ceph-13.2.6 RPMs with this issue's PR and inject them into a ceph/ceph docker container to prevent the recovered mon from crashing once other mons got added to the cluster.

This PR saved our day. Please consider merging it for the next release if it does not introduce new regressions as it seems necessary for successful recovery in total loss of mon data scenarios.

Actions #13

Updated by Neha Ojha about 4 years ago

Seen in nautilus: /a/yuriw-2020-02-15_16:49:25-rados-nautilus-distro-basic-smithi/4767419/

Actions #14

Updated by Deepika Upadhyay over 3 years ago

/a/yuriw-2020-09-16_01:27:14-rados-wip-yuri3-testing-2020-09-16-0014-nautilus-distro-basic-smithi/5437537/teuthology.log

Actions #15

Updated by Deepika Upadhyay over 3 years ago

description:

description: rados/singleton/{all/rebuild-mondb msgr-failures/many msgr/async-v1only
    objectstore/bluestore-stupid rados supported-random-distro$/{ubuntu_latest}}

http://qa-proxy.ceph.com/teuthology/yuriw-2020-11-11_16:17:30-rados-wip-yuri-testing-2020-11-09-0849-nautilus-distro-basic-smithi/5613354/teuthology.log

Actions #16

Updated by Deepika Upadhyay about 3 years ago

2021-03-23T23:43:26.440 INFO:tasks.ceph.mon.a.smithi071.stderr: ceph version 14.2.18-34-gd66d3f9494 (d66d3f949470f0f2acefe86ad416656aa6478c64) nautilus (stable)
2021-03-23T23:43:26.440 INFO:tasks.ceph.mon.a.smithi071.stderr: 1: (()+0x12980) [0x7f5ec914d980]
2021-03-23T23:43:26.440 INFO:tasks.ceph.mon.a.smithi071.stderr: 2: (gsignal()+0xc7) [0x7f5ec8245fb7]
2021-03-23T23:43:26.440 INFO:tasks.ceph.mon.a.smithi071.stderr: 3: (abort()+0x141) [0x7f5ec8247921]
2021-03-23T23:43:26.440 INFO:tasks.ceph.mon.a.smithi071.stderr: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x7f5eca2d3f43]
2021-03-23T23:43:26.441 INFO:tasks.ceph.mon.a.smithi071.stderr: 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f5eca2d40cd]
2021-03-23T23:43:26.441 INFO:tasks.ceph.mon.a.smithi071.stderr: 6: (OSDMonitor::update_from_paxos(bool*)+0x3c52) [0x5565c8222ea2]
2021-03-23T23:43:26.441 INFO:tasks.ceph.mon.a.smithi071.stderr: 7: (PaxosService::refresh(bool*)+0x25a) [0x5565c819ed9a]
2021-03-23T23:43:26.441 INFO:tasks.ceph.mon.a.smithi071.stderr: 8: (Monitor::refresh_from_paxos(bool*)+0x10c) [0x5565c807cc6c]
2021-03-23T23:43:26.441 INFO:tasks.ceph.mon.a.smithi071.stderr: 9: (Paxos::do_refresh()+0x4f) [0x5565c818ddcf]
2021-03-23T23:43:26.442 INFO:tasks.ceph.mon.a.smithi071.stderr: 10: (Paxos::handle_last(boost::intrusive_ptr<MonOpRequest>)+0x8d8) [0x5565c8198498]
2021-03-23T23:43:26.442 INFO:tasks.ceph.mon.a.smithi071.stderr: 11: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x24b) [0x5565c819924b]
2021-03-23T23:43:26.442 INFO:tasks.ceph.mon.a.smithi071.stderr: 12: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x15c5) [0x5565c80b10e5]
2021-03-23T23:43:26.442 INFO:tasks.ceph.mon.a.smithi071.stderr: 13: (Monitor::_ms_dispatch(Message*)+0x4e2) [0x5565c80b1742]
2021-03-23T23:43:26.443 INFO:tasks.ceph.mon.a.smithi071.stderr: 14: (Monitor::ms_dispatch(Message*)+0x26) [0x5565c80e2786]
2021-03-23T23:43:26.443 INFO:tasks.ceph.mon.a.smithi071.stderr: 15: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5565c80de846]
2021-03-23T23:43:26.443 INFO:tasks.ceph.mon.a.smithi071.stderr: 16: (DispatchQueue::entry()+0x1219) [0x7f5eca5037e9]
2021-03-23T23:43:26.443 INFO:tasks.ceph.mon.a.smithi071.stderr: 17: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f5eca5b469d]
2021-03-23T23:43:26.443 INFO:tasks.ceph.mon.a.smithi071.stderr: 18: (()+0x76db) [0x7f5ec91426db]
2021-03-23T23:43:26.444 INFO:tasks.ceph.mon.a.smithi071.stderr: 19: (clone()+0x3f) [0x7f5ec832871f]

/ceph/teuthology-archive/yuriw-2021-03-23_21:23:36-rados-wip-yuri7-testing-2021-03-23-1015-nautilus-distro-basic-smithi/5990910/teuthology.log

  description: rados/singleton/{all/rebuild-mondb msgr-failures/few msgr/async-v2only
    objectstore/bluestore-comp-snappy rados supported-random-distro$/{ubuntu_latest}}

Actions #17

Updated by Deepika Upadhyay about 3 years ago

  • Priority changed from Normal to High
 1: (()+0x12980) [0x7f5ec914d980]
 2: (gsignal()+0xc7) [0x7f5ec8245fb7]
 3: (abort()+0x141) [0x7f5ec8247921]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x7f5eca2d3f43]
 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f5eca2d40cd]
 6: (OSDMonitor::update_from_paxos(bool*)+0x3c52) [0x5565c8222ea2]
 7: (PaxosService::refresh(bool*)+0x25a) [0x5565c819ed9a]
 8: (Monitor::refresh_from_paxos(bool*)+0x10c) [0x5565c807cc6c]
 9: (Paxos::do_refresh()+0x4f) [0x5565c818ddcf]
 10: (Paxos::handle_last(boost::intrusive_ptr<MonOpRequest>)+0x8d8) [0x5565c8198498]
 11: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x24b) [0x5565c819924b]
 12: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x15c5) [0x5565c80b10e5]
 13: (Monitor::_ms_dispatch(Message*)+0x4e2) [0x5565c80b1742]
 14: (Monitor::ms_dispatch(Message*)+0x26) [0x5565c80e2786]
 15: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5565c80de846]
 16: (DispatchQueue::entry()+0x1219) [0x7f5eca5037e9]
 17: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f5eca5b469d]
 18: (()+0x76db) [0x7f5ec91426db]
 19: (clone()+0x3f) [0x7f5ec832871f]
1 jobs: ['5990910']
suites: ['msgr-failures/few', 'msgr/async-v2only', 'objectstore/bluestore-comp-snappy', 'rados', 'rados/singleton/{all/rebuild-mondb', 'supported-random-distro$/{ubuntu_latest}}']

http://qa-proxy.ceph.com/teuthology/yuriw-2021-03-23_21:23:36-rados-wip-yuri7-testing-2021-03-23-1015-nautilus-distro-basic-smithi/scrape.log

Actions #18

Updated by Deepika Upadhyay almost 3 years ago

2021-06-04T23:05:38.775 INFO:tasks.ceph.mon.a.smithi071.stderr:/build/ceph-14.2.21-305-gac8fcfa6/src/mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7f0af546c700 time 2021-06-04 23:05:38.774545
2021-06-04T23:05:38.775 INFO:tasks.ceph.mon.a.smithi071.stderr:/build/ceph-14.2.21-305-gac8fcfa6/src/mon/OSDMonitor.cc: 649: FAILED ceph_assert(version > osdmap.epoch)
2021-06-04T23:05:38.777 INFO:tasks.ceph.mon.a.smithi071.stderr: ceph version 14.2.21-305-gac8fcfa6c86 (ac8fcfa6c86d93ec53d63f626ba388b6914bdef7) nautilus (stable)
2021-06-04T23:05:38.777 INFO:tasks.ceph.mon.a.smithi071.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f0b04a05432]
2021-06-04T23:05:38.778 INFO:tasks.ceph.mon.a.smithi071.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f0b04a0560d]
2021-06-04T23:05:38.778 INFO:tasks.ceph.mon.a.smithi071.stderr: 3: (OSDMonitor::update_from_paxos(bool*)+0x3c52) [0x55a95c5a7be2]
2021-06-04T23:05:38.778 INFO:tasks.ceph.mon.a.smithi071.stderr: 4: (PaxosService::refresh(bool*)+0x25a) [0x55a95c52354a]
2021-06-04T23:05:38.778 INFO:tasks.ceph.mon.a.smithi071.stderr: 5: (Monitor::refresh_from_paxos(bool*)+0x10c) [0x55a95c4000fc]
2021-06-04T23:05:38.779 INFO:tasks.ceph.mon.a.smithi071.stderr: 6: (Paxos::do_refresh()+0x4f) [0x55a95c51259f]
2021-06-04T23:05:38.779 INFO:tasks.ceph.mon.a.smithi071.stderr: 7: (Paxos::commit_finish()+0x48c) [0x55a95c51e1ac]
2021-06-04T23:05:38.779 INFO:tasks.ceph.mon.a.smithi071.stderr: 8: (C_Committed::finish(int)+0x34) [0x55a95c521f64]
2021-06-04T23:05:38.779 INFO:tasks.ceph.mon.a.smithi071.stderr: 9: (Context::complete(int)+0x9) [0x55a95c43a839]
2021-06-04T23:05:38.780 INFO:tasks.ceph.mon.a.smithi071.stderr: 10: (MonitorDBStore::C_DoTransaction::finish(int)+0x8f) [0x55a95c521ccf]
2021-06-04T23:05:38.780 INFO:tasks.ceph.mon.a.smithi071.stderr: 11: (Context::complete(int)+0x9) [0x55a95c43a839]
2021-06-04T23:05:38.780 INFO:tasks.ceph.mon.a.smithi071.stderr: 12: (Finisher::finisher_thread_entry()+0x15e) [0x7f0b04a5074e]
2021-06-04T23:05:38.780 INFO:tasks.ceph.mon.a.smithi071.stderr: 13: (()+0x76db) [0x7f0b038736db]


might be worth having it in nautilus I guess, seeing it as frequent popping issue
2021-06-04T23:05:38.780 INFO:tasks.ceph.mon.a.smithi071.stderr: 14: (clone()+0x3f) [0x7f0b02a5971f]

Actions #20

Updated by Kefu Chai almost 3 years ago

  • Status changed from In Progress to Pending Backport
Actions #22

Updated by Backport Bot almost 3 years ago

Actions #24

Updated by Kefu Chai almost 3 years ago

  • Backport changed from luminous, mimic, nautilus to nautilus, octopus, pacific
Actions #25

Updated by Kefu Chai almost 3 years ago

Actions #26

Updated by Kefu Chai almost 3 years ago

Actions #27

Updated by Loïc Dachary over 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

Actions

Also available in: Atom PDF