Bug #10629
osd/PG.cc: 2775: FAILED assert(values.size() == 2)
Status: Closed
Description
2015-01-23T20:28:29.264 INFO:tasks.thrashosds.thrasher:Use ceph_objectstore_tool to remove past intervals
2015-01-23T20:28:29.264 INFO:teuthology.orchestra.run.mira002:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op list-pgs'
2015-01-23T20:28:31.577 INFO:teuthology.orchestra.run.mira002:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op rm-past-intervals --pgid 2.1'
2015-01-23T20:28:32.183 INFO:teuthology.orchestra.run.mira002.stderr:osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7f1c29671900 time 2015-01-23 20:28:32.181137
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr:osd/PG.cc: 2775: FAILED assert(values.size() == 2)
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr: ceph version 0.91-716-gb782030 (b782030e872254f2175196f966863a26b20f7296)
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc4ccb]
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr: 2: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x865) [0x6b42c5]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: 3: (main()+0x45ea) [0x65990a]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: 4: (__libc_start_main()+0xf5) [0x7f1c23f2fec5]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: 5: ceph-objectstore-tool() [0x672e17]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-01-23T20:28:32.186 INFO:teuthology.orchestra.run.mira002.stderr:terminate called after throwing an instance of 'ceph::FailedAssertion'
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr:*** Caught signal (Aborted) **
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr: in thread 7f1c29671900
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr: ceph version 0.91-716-gb782030 (b782030e872254f2175196f966863a26b20f7296)
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr: 1: ceph-objectstore-tool() [0xae3e6a]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 2: (()+0x10340) [0x7f1c28118340]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 3: (gsignal()+0x39) [0x7f1c23f44bb9]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 4: (abort()+0x148) [0x7f1c23f47fc8]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f1c248506b5]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 6: (()+0x5e836) [0x7f1c2484e836]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 7: (()+0x5e863) [0x7f1c2484e863]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 8: (()+0x5eaa2) [0x7f1c2484eaa2]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0xbc4eb8]
2015-01-23T20:28:32.190 INFO:teuthology.orchestra.run.mira002.stderr: 10: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x865) [0x6b42c5]
2015-01-23T20:28:32.190 INFO:teuthology.orchestra.run.mira002.stderr: 11: (main()+0x45ea) [0x65990a]
2015-01-23T20:28:32.190 INFO:teuthology.orchestra.run.mira002.stderr: 12: (__libc_start_main()+0xf5) [0x7f1c23f2fec5]
2015-01-23T20:28:32.191 INFO:teuthology.orchestra.run.mira002.stderr: 13: ceph-objectstore-tool() [0x672e17]
ubuntu@teuthology:/a/samuelj-2015-01-23_06:57:45-rados-master-wip-testing-vanilla-fixes-basic-multi/719677
pg 2.1_head appears on disk, but never in the osd log. It was imported a bit earlier:
2015-01-23T20:28:04.706 INFO:tasks.thrashosds.thrasher:Moving pg 2.1 from osd.2 to osd.0
2015-01-23T20:28:04.706 INFO:teuthology.orchestra.run.mira002:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op import --file /home/ubuntu/cephtest/data/exp.2.1.2'
2015-01-23T20:28:04.829 INFO:teuthology.orchestra.run.mira002.stdout:Importing pgid 2.1
I tarred up the osd.0 data directories for reference. The job is still running too, if you get to it before it gets cleaned up.
Updated by David Zafman about 9 years ago
I can't use the tool to operate on the extracted tars because they are incomplete (no journal or superblock, for example).
Updated by David Zafman about 9 years ago
Interesting that both leveldb stores are empty in the two data directories. As far as we knew, osd.2 should not have been broken as well. We successfully did an export of pg 2.1 and then removed it from osd.2.
dzafman$ ./ceph-kvstore-tool ~/bug10629/var/lib/ceph/osd/ceph-0/current/omap list
dzafman$ ./ceph-kvstore-tool ~/bug10629/var/lib/ceph/osd/ceph-2/current/omap list
dzafman$ ./ceph-kvstore-tool ~/bug10629/var/lib/ceph/osd/ceph-0/current/omap get-size
log - 0
misc - 73268
sst - 0
total - 73268
total: 73268
estimated store size: 73268
dzafman$ ./ceph-kvstore-tool ~/bug10629/var/lib/ceph/osd/ceph-2/current/omap get-size
log - 0
misc - 96325
sst - 3965
total - 100290
total: 100290
estimated store size: 100290
Updated by Yuri Weinstein about 9 years ago
Seeing also in run: http://pulpito.ceph.com/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/
Job: ['728068']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797289-2015-01-28 22:57:27.269314 7f30fadea780 -1 *** Caught signal (Aborted) **
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797363- in thread 7f30fadea780
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797387-
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797388: ceph version 0.91-388-g5064787 (50647876971a2fe65a02e4de3c0bc62fec4887c4)
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797463- 1: ceph-osd() [0x9e8295]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797489- 2: (()+0xfcb0) [0x7f30fa316cb0]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797522- 3: (gsignal()+0x35) [0x7f30f89f9425]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797560- 4: (abort()+0x17b) [0x7f30f89fcb8b]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797597- 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f30f934c69d]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797667- 6: (()+0xb5846) [0x7f30f934a846]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797701- 7: (()+0xb5873) [0x7f30f934a873]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797735- 8: (()+0xb596e) [0x7f30f934a96e]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797769- 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x259) [0xadb5b9]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797861- 10: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0xaea) [0x783aca]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797946- 11: (OSD::load_pgs()+0x1624) [0x6766b4]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797987- 12: (OSD::init()+0x1448) [0x6858e8]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798024- 13: (main()+0x266e) [0x62341e]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798056- 14: (__libc_start_main()+0xed) [0x7f30f89e476d]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798105- 15: ceph-osd() [0x628989]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798132- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by David Zafman about 9 years ago
From the teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068 logs I see that osd.0 was killed and revived 3 times. The final time, it hit
the assert while trying to load pg 0.22. The interesting thing is that ceph-objectstore-tool was never run in the test case up to that point.
I suspect that write_info() was never called upon pg creation, and subsequent handle_activate_map() calls didn't bother because the pg wasn't more than 200 epochs
behind. Later epochs think that the info was persisted in epoch 649. Here are the pg messages during pg creation in epoch 649.
I propose the following change. What isn't clear is why we haven't seen this before now.
diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index 4d6a01d..179abab 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -7038,6 +7038,8 @@ void OSD::handle_pg_create(OpRequestRef op)
     pg->info.last_epoch_started = pg->info.history.last_epoch_started;
     creating_pgs.erase(pgid);
     pg->handle_create(&rctx);
+    pg->dirty_info = true;
+    pg->dirty_big_info = true;
     pg->write_if_dirty(*rctx.transaction);
     pg->publish_stats_to_osd();
     pg->unlock();
2015-01-28 22:49:01.301575 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22(unlocked)] enter Initial
2015-01-28 22:49:01.301588 7f9ce059a700 20 osd.0 pg_epoch: 649 pg[0.22(unlocked)] enter NotTrimming
2015-01-28 22:49:01.301614 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0) [] r=0 lpr=0 crt=0'0 inactive] init role 0 up [0,10] acting [0,10] history ec=1 les/c 642/642 649/649/649 1 past_intervals
2015-01-28 22:49:01.301672 7f9ce059a700 7 osd.0 649 _create_lock_pg pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive]
2015-01-28 22:49:01.301681 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] handle_create
2015-01-28 22:49:01.301694 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000119 0 0.000000
2015-01-28 22:49:01.301708 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Reset
2015-01-28 22:49:01.301718 7f9ce059a700 20 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] set_last_peering_reset 649
2015-01-28 22:49:01.301728 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] Clearing blocked outgoing recovery messages
2015-01-28 22:49:01.301741 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] Not blocking outgoing recovery messages
2015-01-28 22:49:01.301764 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] update_heartbeat_peers -> 0,10
2015-01-28 22:49:01.301790 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] take_waiters
2015-01-28 22:49:01.301805 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] exit Reset 0.000097 1 0.000068
2015-01-28 22:49:01.301819 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Started
2015-01-28 22:49:01.301829 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Start
2015-01-28 22:49:01.301838 7f9ce059a700 1 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] state<Start>: transitioning to Primary
2015-01-28 22:49:01.301851 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] exit Start 0.000021 0 0.000000
2015-01-28 22:49:01.301864 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Started/Primary
2015-01-28 22:49:01.301875 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Started/Primary/Peering
2015-01-28 22:49:01.301892 7f9ce059a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetInfo
2015-01-28 22:49:01.301913 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] _calc_past_interval_range: already have past intervals back to 642
2015-01-28 22:49:01.301953 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] PriorSet: build_prior interval(641-648 up [10](10) acting [10](10) maybe_went_rw)
2015-01-28 22:49:01.301966 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] PriorSet: build_prior final: probe 0,10 down blocked_by {}
2015-01-28 22:49:01.301992 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] up_thru 627 < same_since 649, must notify monitor
2015-01-28 22:49:01.302005 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: querying info from osd.10
2015-01-28 22:49:01.302024 7f9ce059a700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] publish_stats_to_osd 649:1
2015-01-28 22:49:01.302156 7f9ce059a700 10 osd.0 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] is new
2015-01-28 22:49:01.529688 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 649 epoch_requested: 649 MNotifyRec from 10 notify: (query_epoch:649, epoch_sent:649, info:0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)) features: 0x3fffffffffff
2015-01-28 22:49:01.529709 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] got osd.10 0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529734 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] update_heartbeat_peers 0,10 unchanged
2015-01-28 22:49:01.529747 7f9cd858a700 20 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 10 features: 3fffffffffff
2015-01-28 22:49:01.529758 7f9cd858a700 20 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common features: 3fffffffffff
2015-01-28 22:49:01.529771 7f9cd858a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetInfo 0.227878 1 0.000280
2015-01-28 22:49:01.529789 7f9cd858a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetLog
2015-01-28 22:49:01.529805 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] calc_acting osd.0 0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529818 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] calc_acting osd.10 0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529849 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529864 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] actingbackfill is 0,10
2015-01-28 22:49:01.529874 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] choose_acting want [0,10] (== acting) backfill_targets
2015-01-28 22:49:01.529891 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetLog>: leaving GetLog
2015-01-28 22:49:01.529909 7f9cd858a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetLog 0.000120 0 0.000000
2015-01-28 22:49:01.529923 7f9cd858a700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] publish_stats_to_osd 649:2
2015-01-28 22:49:01.538026 7f9cd858a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetMissing
2015-01-28 22:49:01.538049 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetMissing>: still need up_thru update before going active
2015-01-28 22:49:01.538069 7f9cd858a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetMissing 0.000044 0 0.000000
2015-01-28 22:49:01.538085 7f9cd858a700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] publish_stats_to_osd 649:3
2015-01-28 22:49:01.538097 7f9cd858a700 5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/WaitUpThru
2015-01-28 22:49:01.554565 7f9ce485c700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] flushed
2015-01-28 22:49:01.625987 7f9cd8d8b700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 649 epoch_requested: 649 MNotifyRec from 10 notify: (query_epoch:649, epoch_sent:649, info:0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)) features: 0x3fffffffffff
2015-01-28 22:49:01.626003 7f9cd8d8b700 7 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary>: handle_pg_notify from osd.10
2015-01-28 22:49:01.626015 7f9cd8d8b700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary>: pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] got dup osd.10 info 0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649), identical to ours
2015-01-28 22:49:01.705367 7f9cd8d8b700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 649 epoch_requested: 649 FlushedEvt
2015-01-28 22:49:01.705392 7f9cd8d8b700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] requeue_ops
2015-01-28 22:49:02.851155 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] null
2015-01-28 22:49:02.861866 7f9cd858a700 20 osd.0 pg_epoch: 650 pg[0.22( empty local-les=650 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] handle_activate_map: Not dirtying info: last_persisted is 649 while current is 650
2015-01-28 22:49:04.697427 7f9cd8d8b700 20 osd.0 pg_epoch: 651 pg[0.22( empty local-les=650 n=0 ec=1 les/c 650/650 649/649/649) [0,10] r=0 lpr=649 crt=0'0 mlcod 0'0 active+clean] handle_activate_map: Not dirtying info: last_persisted is 650 while current is 651
...
2015-01-28 22:55:33.006105 7f9cd858a700 20 osd.0 pg_epoch: 804 pg[0.22( empty local-les=650 n=0 ec=1 les/c 650/650 649/649/649) [0,10] r=0 lpr=649 crt=0'0 mlcod 0'0 active+clean] handle_activate_map: Not dirtying info: last_persisted is 650 while current is 804
osd.0 taken down
Updated by Samuel Just about 9 years ago
dirty_info and dirty_big_info get set in PG::init, called from _create_lock_pg(
Updated by David Zafman about 9 years ago
- Subject changed from ceph-objectstore-tool: osd/PG.cc: 2775: FAILED assert(values.size() == 2) to osd/PG.cc: 2775: FAILED assert(values.size() == 2)