Bug #10629


osd/PG.cc: 2775: FAILED assert(values.size() == 2)

Added by Sage Weil about 9 years ago. Updated about 9 years ago.

Status:
Duplicate
Priority:
Urgent
Assignee:
David Zafman
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2015-01-23T20:28:29.264 INFO:tasks.thrashosds.thrasher:Use ceph_objectstore_tool to remove past intervals
2015-01-23T20:28:29.264 INFO:teuthology.orchestra.run.mira002:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op list-pgs'
2015-01-23T20:28:31.577 INFO:teuthology.orchestra.run.mira002:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op rm-past-intervals --pgid 2.1'
2015-01-23T20:28:32.183 INFO:teuthology.orchestra.run.mira002.stderr:osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, spg_t, ceph::bufferlist*)' thread 7f1c29671900 time 2015-01-23 20:28:32.181137
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr:osd/PG.cc: 2775: FAILED assert(values.size() == 2)
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr: ceph version 0.91-716-gb782030 (b782030e872254f2175196f966863a26b20f7296)
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc4ccb]
2015-01-23T20:28:32.184 INFO:teuthology.orchestra.run.mira002.stderr: 2: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x865) [0x6b42c5]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: 3: (main()+0x45ea) [0x65990a]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: 4: (__libc_start_main()+0xf5) [0x7f1c23f2fec5]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: 5: ceph-objectstore-tool() [0x672e17]
2015-01-23T20:28:32.185 INFO:teuthology.orchestra.run.mira002.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-01-23T20:28:32.186 INFO:teuthology.orchestra.run.mira002.stderr:terminate called after throwing an instance of 'ceph::FailedAssertion'
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr:*** Caught signal (Aborted) **
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr: in thread 7f1c29671900
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr: ceph version 0.91-716-gb782030 (b782030e872254f2175196f966863a26b20f7296)
2015-01-23T20:28:32.187 INFO:teuthology.orchestra.run.mira002.stderr: 1: ceph-objectstore-tool() [0xae3e6a]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 2: (()+0x10340) [0x7f1c28118340]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 3: (gsignal()+0x39) [0x7f1c23f44bb9]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 4: (abort()+0x148) [0x7f1c23f47fc8]
2015-01-23T20:28:32.188 INFO:teuthology.orchestra.run.mira002.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f1c248506b5]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 6: (()+0x5e836) [0x7f1c2484e836]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 7: (()+0x5e863) [0x7f1c2484e863]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 8: (()+0x5eaa2) [0x7f1c2484eaa2]
2015-01-23T20:28:32.189 INFO:teuthology.orchestra.run.mira002.stderr: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0xbc4eb8]
2015-01-23T20:28:32.190 INFO:teuthology.orchestra.run.mira002.stderr: 10: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0x865) [0x6b42c5]
2015-01-23T20:28:32.190 INFO:teuthology.orchestra.run.mira002.stderr: 11: (main()+0x45ea) [0x65990a]
2015-01-23T20:28:32.190 INFO:teuthology.orchestra.run.mira002.stderr: 12: (__libc_start_main()+0xf5) [0x7f1c23f2fec5]
2015-01-23T20:28:32.191 INFO:teuthology.orchestra.run.mira002.stderr: 13: ceph-objectstore-tool() [0x672e17]


ubuntu@teuthology:/a/samuelj-2015-01-23_06:57:45-rados-master-wip-testing-vanilla-fixes-basic-multi/719677

pg 2.1_head appears on disk, but the pg never appears in the osd log. It was imported a bit earlier:


2015-01-23T20:28:04.706 INFO:tasks.thrashosds.thrasher:Moving pg 2.1 from osd.2 to osd.0
2015-01-23T20:28:04.706 INFO:teuthology.orchestra.run.mira002:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op import --file /home/ubuntu/cephtest/data/exp.2.1.2'
2015-01-23T20:28:04.829 INFO:teuthology.orchestra.run.mira002.stdout:Importing pgid 2.1

I tarred up the osd.0 data directories for reference. The job is still running too, if you get to it before it gets cleaned up.
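The failing check in PG::peek_map_epoch looks up the pg's metadata entries in the omap and requires that both values come back. A minimal model of that shape (the key names and container here are illustrative stand-ins, not the actual on-disk keys) shows how a pg whose info was never persisted trips the assert:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for the pg metadata omap; the real code reads these
// entries from the ObjectStore. Key names are illustrative only.
using Omap = std::map<std::string, std::string>;

// Mimics the shape of the failing check: request two keys and
// require that both are present before decoding.
int peek_map_epoch_model(const Omap& omap) {
  std::vector<std::string> values;
  for (const std::string& key : {"_epoch", "_info"}) {
    auto it = omap.find(key);
    if (it != omap.end())
      values.push_back(it->second);
  }
  assert(values.size() == 2);  // analogous to osd/PG.cc:2775
  return std::stoi(values[0]);
}
```

If write_info() never ran for the imported pg, the entries are simply absent, the vector comes back short, and the assert aborts the tool.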


Related issues: 1 (0 open, 1 closed)

Is duplicate of Ceph - Bug #6003: journal Unable to read past sequence 406 ... (Resolved, Samuel Just, 08/15/2013)

Actions #1

Updated by David Zafman about 9 years ago

I can't use the tool to operate on the extracted tars because they are incomplete (no journal or superblock, for example).

Actions #2

Updated by David Zafman about 9 years ago

Interestingly, the leveldb stores are empty in both data directories. As far as we knew, osd.2 should not also have been broken; we successfully exported pg 2.1 and then removed it from osd.2.

dzafman$ ./ceph-kvstore-tool  ~/bug10629/var/lib/ceph/osd/ceph-0/current/omap list
dzafman$ ./ceph-kvstore-tool  ~/bug10629/var/lib/ceph/osd/ceph-2/current/omap list
dzafman$ ./ceph-kvstore-tool  ~/bug10629/var/lib/ceph/osd/ceph-0/current/omap get-size
log - 0
misc - 73268
sst - 0
total - 73268
total: 73268
estimated store size: 73268
dzafman$ ./ceph-kvstore-tool  ~/bug10629/var/lib/ceph/osd/ceph-2/current/omap get-size
log - 0
misc - 96325
sst - 3965
total - 100290
total: 100290
estimated store size: 100290

Actions #3

Updated by Yuri Weinstein about 9 years ago

Seeing also in run: http://pulpito.ceph.com/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/
Job: ['728068']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/

/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797289-2015-01-28 22:57:27.269314 7f30fadea780 -1 *** Caught signal (Aborted) **
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797363- in thread 7f30fadea780
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797387-
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797388: ceph version 0.91-388-g5064787 (50647876971a2fe65a02e4de3c0bc62fec4887c4)
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797463- 1: ceph-osd() [0x9e8295]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797489- 2: (()+0xfcb0) [0x7f30fa316cb0]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797522- 3: (gsignal()+0x35) [0x7f30f89f9425]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797560- 4: (abort()+0x17b) [0x7f30f89fcb8b]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797597- 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f30f934c69d]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797667- 6: (()+0xb5846) [0x7f30f934a846]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797701- 7: (()+0xb5873) [0x7f30f934a873]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797735- 8: (()+0xb596e) [0x7f30f934a96e]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797769- 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x259) [0xadb5b9]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797861- 10: (PG::peek_map_epoch(ObjectStore*, spg_t, ceph::buffer::list*)+0xaea) [0x783aca]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797946- 11: (OSD::load_pgs()+0x1624) [0x6766b4]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290797987- 12: (OSD::init()+0x1448) [0x6858e8]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798024- 13: (main()+0x266e) [0x62341e]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798056- 14: (__libc_start_main()+0xed) [0x7f30f89e476d]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798105- 15: ceph-osd() [0x628989]
/a/teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068/remote/plana69/log/ceph-osd.0.log.gz:290798132- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Actions #4

Updated by David Zafman about 9 years ago

From the teuthology-2015-01-28_17:05:01-upgrade:giant-x-next-distro-basic-multi/728068 logs I see that osd.0 was killed and revived 3 times. The final time, it hit the assert while trying to load pg 0.22. The interesting thing is that ceph-objectstore-tool was never run in the test case up to that point.

I suspect write_info() was never called upon pg creation, and subsequent handle_activate_map() calls didn't bother to persist because the info wasn't more than 200 epochs stale. Later epochs think the info was persisted in epoch 649. Below are the pg messages during pg creation in epoch 649.

I propose the following change. What isn't clear is why we haven't seen this before now.


diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index 4d6a01d..179abab 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -7038,6 +7038,8 @@ void OSD::handle_pg_create(OpRequestRef op)
       pg->info.last_epoch_started = pg->info.history.last_epoch_started;
       creating_pgs.erase(pgid);
       pg->handle_create(&rctx);
+      pg->dirty_info = true;
+      pg->dirty_big_info = true;
       pg->write_if_dirty(*rctx.transaction);
       pg->publish_stats_to_osd();
       pg->unlock();

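The two added lines matter because write_if_dirty() is a no-op unless one of the dirty flags is set. A toy model of that gate (field and method names mirror the diff, but the types and transaction representation are invented for illustration):

```cpp
#include <string>
#include <vector>

// Toy model of the dirty-flag gate; the real logic lives in
// PG::write_if_dirty and writes into an ObjectStore::Transaction.
struct ToyPG {
  bool dirty_info = false;
  bool dirty_big_info = false;
  std::vector<std::string> transaction;  // stand-in for the txn contents

  void write_if_dirty() {
    if (dirty_info || dirty_big_info) {
      transaction.push_back("persist pg info");  // would call write_info()
      dirty_info = dirty_big_info = false;
    }
    // else: nothing is queued, and the on-disk info stays absent/stale
  }
};
```

Without the two added lines, the create path calls write_if_dirty() with both flags false, so no info ever reaches disk for the newly created pg.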

2015-01-28 22:49:01.301575 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22(unlocked)] enter Initial
2015-01-28 22:49:01.301588 7f9ce059a700 20 osd.0 pg_epoch: 649 pg[0.22(unlocked)] enter NotTrimming
2015-01-28 22:49:01.301614 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( DNE empty local-les=0 n=0 ec=0 les/c 0/0 0/0/0) [] r=0 lpr=0 crt=0'0 inactive] init role 0 up [0,10] acting [0,10] history ec=1 les/c 642/642 649/649/649 1 past_intervals
2015-01-28 22:49:01.301672 7f9ce059a700  7 osd.0 649 _create_lock_pg pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive]
2015-01-28 22:49:01.301681 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] handle_create
2015-01-28 22:49:01.301694 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] exit Initial 0.000119 0 0.000000
2015-01-28 22:49:01.301708 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Reset
2015-01-28 22:49:01.301718 7f9ce059a700 20 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=0 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] set_last_peering_reset 649
2015-01-28 22:49:01.301728 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] Clearing blocked outgoing recovery messages
2015-01-28 22:49:01.301741 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] Not blocking outgoing recovery messages
2015-01-28 22:49:01.301764 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] update_heartbeat_peers  -> 0,10
2015-01-28 22:49:01.301790 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] take_waiters
2015-01-28 22:49:01.301805 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] exit Reset 0.000097 1 0.000068
2015-01-28 22:49:01.301819 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Started
2015-01-28 22:49:01.301829 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Start
2015-01-28 22:49:01.301838 7f9ce059a700  1 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] state<Start>: transitioning to Primary
2015-01-28 22:49:01.301851 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] exit Start 0.000021 0 0.000000
2015-01-28 22:49:01.301864 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Started/Primary
2015-01-28 22:49:01.301875 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] enter Started/Primary/Peering
2015-01-28 22:49:01.301892 7f9ce059a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetInfo
2015-01-28 22:49:01.301913 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] _calc_past_interval_range: already have past intervals back to 642
2015-01-28 22:49:01.301953 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(641-648 up [10](10) acting [10](10) maybe_went_rw)
2015-01-28 22:49:01.301966 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering]  PriorSet: build_prior final: probe 0,10 down  blocked_by {}
2015-01-28 22:49:01.301992 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] up_thru 627 < same_since 649, must notify monitor
2015-01-28 22:49:01.302005 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>:  querying info from osd.10
2015-01-28 22:49:01.302024 7f9ce059a700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] publish_stats_to_osd 649:1
2015-01-28 22:49:01.302156 7f9ce059a700 10 osd.0 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] is new
2015-01-28 22:49:01.529688 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 649 epoch_requested: 649 MNotifyRec from 10 notify: (query_epoch:649, epoch_sent:649, info:0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)) features: 0x3fffffffffff
2015-01-28 22:49:01.529709 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering]  got osd.10 0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529734 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] update_heartbeat_peers 0,10 unchanged
2015-01-28 22:49:01.529747 7f9cd858a700 20 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 10 features: 3fffffffffff
2015-01-28 22:49:01.529758 7f9cd858a700 20 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common features: 3fffffffffff
2015-01-28 22:49:01.529771 7f9cd858a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetInfo 0.227878 1 0.000280
2015-01-28 22:49:01.529789 7f9cd858a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetLog
2015-01-28 22:49:01.529805 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] calc_acting osd.0 0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529818 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] calc_acting osd.10 0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529849 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] calc_acting newest update on osd.0 with 0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649)
2015-01-28 22:49:01.529864 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] actingbackfill is 0,10
2015-01-28 22:49:01.529874 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] choose_acting want [0,10] (== acting) backfill_targets
2015-01-28 22:49:01.529891 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetLog>: leaving GetLog
2015-01-28 22:49:01.529909 7f9cd858a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetLog 0.000120 0 0.000000
2015-01-28 22:49:01.529923 7f9cd858a700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] publish_stats_to_osd 649:2
2015-01-28 22:49:01.538026 7f9cd858a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetMissing
2015-01-28 22:49:01.538049 7f9cd858a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetMissing>:  still need up_thru update before going active
2015-01-28 22:49:01.538069 7f9cd858a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetMissing 0.000044 0 0.000000
2015-01-28 22:49:01.538085 7f9cd858a700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] publish_stats_to_osd 649:3
2015-01-28 22:49:01.538097 7f9cd858a700  5 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] enter Started/Primary/Peering/WaitUpThru
2015-01-28 22:49:01.554565 7f9ce485c700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] flushed
2015-01-28 22:49:01.625987 7f9cd8d8b700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 649 epoch_requested: 649 MNotifyRec from 10 notify: (query_epoch:649, epoch_sent:649, info:0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649)) features: 0x3fffffffffff
2015-01-28 22:49:01.626003 7f9cd8d8b700  7 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary>: handle_pg_notify from osd.10
2015-01-28 22:49:01.626015 7f9cd8d8b700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] state<Started/Primary>: pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] got dup osd.10 info 0.22( empty local-les=642 n=0 ec=1 les/c 642/642 649/649/649), identical to ours
2015-01-28 22:49:01.705367 7f9cd8d8b700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 649 epoch_requested: 649 FlushedEvt
2015-01-28 22:49:01.705392 7f9cd8d8b700 15 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering]  requeue_ops
2015-01-28 22:49:02.851155 7f9ce059a700 10 osd.0 pg_epoch: 649 pg[0.22( empty local-les=0 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 peering] null

2015-01-28 22:49:02.861866 7f9cd858a700 20 osd.0 pg_epoch: 650 pg[0.22( empty local-les=650 n=0 ec=1 les/c 642/642 649/649/649) [0,10] r=0 lpr=649 pi=641-648/1 crt=0'0 mlcod 0'0 inactive] handle_activate_map: Not dirtying info: last_persisted is 649 while current is 650
2015-01-28 22:49:04.697427 7f9cd8d8b700 20 osd.0 pg_epoch: 651 pg[0.22( empty local-les=650 n=0 ec=1 les/c 650/650 649/649/649) [0,10] r=0 lpr=649 crt=0'0 mlcod 0'0 active+clean] handle_activate_map: Not dirtying info: last_persisted is 650 while current is 651
...
2015-01-28 22:55:33.006105 7f9cd858a700 20 osd.0 pg_epoch: 804 pg[0.22( empty local-les=650 n=0 ec=1 les/c 650/650 649/649/649) [0,10] r=0 lpr=649 crt=0'0 mlcod 0'0 active+clean] handle_activate_map: Not dirtying info: last_persisted is 650 while current is 804
osd.0 taken down
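The "Not dirtying info" messages above come from a staleness gate: info is only re-persisted once it has fallen far enough behind the current epoch. A sketch of that condition, assuming the 200-epoch threshold mentioned earlier (the function and constant names here are invented):

```cpp
#include <cstdint>

using epoch_t = uint32_t;

// Assumed threshold: the ticket says handle_activate_map() skips
// persisting unless the info is more than 200 epochs stale.
constexpr epoch_t kMaxStaleEpochs = 200;

// Sketch of the gate suggested by the "Not dirtying info:
// last_persisted is X while current is Y" log lines.
bool should_dirty_info(epoch_t last_persisted, epoch_t current) {
  return current > last_persisted + kMaxStaleEpochs;
}
```

This matches the log: at last_persisted 650 and current 804 the gate is still false, so the pg's info, never actually written in epoch 649, stays unpersisted until osd.0 is taken down and the restart hits the assert.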

Actions #5

Updated by Samuel Just about 9 years ago

dirty_info and dirty_big_info get set in PG::init in _create_lock_pg().

Actions #6

Updated by David Zafman about 9 years ago

  • Subject changed from ceph-objectstore-tool: osd/PG.cc: 2775: FAILED assert(values.size() == 2) to osd/PG.cc: 2775: FAILED assert(values.size() == 2)
Actions #7

Updated by Samuel Just about 9 years ago

  • Status changed from New to 12

This may also be 6003.

Actions #8

Updated by David Zafman about 9 years ago

  • Status changed from 12 to Duplicate