Backport #14592
Updated by Kefu Chai about 8 years ago
See http://tracker.ceph.com/issues/13990#note-39.

Steps to reproduce:

# The monitor sends pg-create messages, so @pg_create.created@ is @pool.last_change@, and the newly created pool's @pool.last_change@ is the @OSDMonitor.pending_inc.epoch@ at that moment. But somehow these PGs fail to be created, because
* some osd is down but not out, or
* some osd's @osd_debug_drop_pg_create_probability@ is 1.0 (this option is only available in hammer).
# Other changes happen in the meantime and update the osdmap. Once the number of osdmap epochs reaches the threshold, the monitor starts to trim them.
# When the OSDs are back in business, they start to process the pg-create messages. These messages reference old osdmaps that were already trimmed by the mon and the osd, so when the osd tries to build the prior set, those maps are missing. The result is an assert failure.

A possible fix: the monitor should not trim the osdmaps until the pg-create that references them has been processed by the osd.
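The proposed fix can be sketched as a bound on the trim point. This is only an illustration of the idea, not Ceph's monitor code: the function @safe_trim_floor@, its parameters, and the notion of a list of pending pg-create epochs are all hypothetical names made up for this example. It assumes the monitor can enumerate the @created@ epoch of every pg-create that the OSDs have not yet processed.

```python
def safe_trim_floor(first_committed, last_committed, keep_epochs,
                    pending_pg_create_epochs):
    """Return the lowest osdmap epoch that must be kept (hypothetical helper).

    Epochs strictly below the returned value may be trimmed.
    All names here are assumptions for illustration, not Ceph APIs.
    """
    # Usual policy in this sketch: keep only the most recent `keep_epochs` maps.
    usual_floor = max(first_committed, last_committed - keep_epochs)

    if not pending_pg_create_epochs:
        return usual_floor

    # Do not trim past the oldest epoch still referenced by an
    # unprocessed pg-create, so the OSD can build the prior set later.
    floor = min(usual_floor, min(pending_pg_create_epochs))

    # Maps older than first_committed are already gone; never go below it.
    return max(first_committed, floor)


if __name__ == "__main__":
    # No pending pg-creates: trimming proceeds as usual.
    print(safe_trim_floor(1, 1000, 500, []))
    # A pg-create created at epoch 120 is still pending: hold the floor there.
    print(safe_trim_floor(1, 1000, 500, [120, 300]))
```

With this bound, step 2 above would stop trimming at the oldest pending pg-create epoch, so the maps needed in step 3 would still exist. The trade-off is that a PG that never gets created keeps old osdmaps around indefinitely.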