https://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2018-07-30T17:08:15ZCeph RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1180292018-07-30T17:08:15ZPatrick Donnellypdonnell@redhat.com
<ul><li><strong>Project</strong> changed from <i>Ceph</i> to <i>RADOS</i></li><li><strong>Component(RADOS)</strong> <i>Monitor</i> added</li></ul> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1182172018-08-01T06:28:41ZKefu Chaitchaikov@gmail.com
<ul></ul><p>I created a vstart.sh cluster from the mimic branch, and ceph-monstore-tool from master is able to open it just fine.</p>
<pre>
$ bin/ceph-monstore-tool ~/dev/ceph/build/dev/mon.a/ dump-keys
...
2018-08-01 14:27:22.935 7f9e16c6df80 1 rocksdb: do_open column families: [default]
2018-08-01 14:27:22.935 7f9e16c6df80 4 rocksdb: RocksDB version: 5.14.0
2018-08-01 14:27:22.935 7f9e16c6df80 4 rocksdb: Git sha rocksdb_build_git_sha:@9090ae3ecfbf9b50a398a5d8b178f14b88dc047e@
2018-08-01 14:27:22.935 7f9e16c6df80 4 rocksdb: Compile date Jul 27 2018
...
2018-08-01 14:27:22.935 7f9e16c6df80 4 rocksdb: SST files in /home/kefu/dev/ceph/build/dev/mon.a/store.db dir, Total Num: 3, files: 000004.sst 000007.sst 000013.sst
...
2018-08-01 14:27:23.067 7f9e16c6df80 4 rocksdb: [/var/ssd/ceph/src/rocksdb/db/db_impl_open.cc:1219] DB pointer 0x559081bc1000
auth / 1
auth / 2
auth / 3
auth / 4
auth / 5
auth / 6
auth / 7
auth / 8
auth / first_committed
auth / format_version
...
</pre>
<p>rerunning the test at <a class="external" href="http://pulpito.ceph.com/kchai-2018-08-01_06:21:53-upgrade:mimic-x:parallel-master-distro-basic-smithi/">http://pulpito.ceph.com/kchai-2018-08-01_06:21:53-upgrade:mimic-x:parallel-master-distro-basic-smithi/</a></p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1182272018-08-01T12:45:58ZKefu Chaitchaikov@gmail.com
<ul></ul><p>It's a regression in rocksdb. The rocksdb in mimic (eaee6d3beab3429232ceb188377a3f94e844fca7) is at f4a857da0b720691effc524469f6db895ad00d8e, which contains <a class="external" href="https://github.com/facebook/rocksdb/commit/73f21a7b2177aeb82b9f518222e2b9ea8fbb7c4f">https://github.com/facebook/rocksdb/commit/73f21a7b2177aeb82b9f518222e2b9ea8fbb7c4f</a>. This commit is fine per se, but an older rocksdb that does not contain it will not be able to open a db created with it.</p>
<p>That is because 73f21a7b2177aeb82b9f518222e2b9ea8fbb7c4f creates a dummy entry in the manifest to record the deleted WALs, expecting that recovery will skip it. So this change is <strong>not</strong> forward compatible, i.e. an old rocksdb is not necessarily able to open a store created by a new rocksdb.</p>
<p>the dummy entry looks like:<br /><pre>
$20 = {fd = {table_reader = 0x0, packed_number_and_path_id = 0, file_size = 0}, smallest = {rep_ = "dummy_key\001\000\000\000\000\000\000"},
largest = {rep_ = "dummy_key\001\000\000\000\000\000\000"}, smallest_seqno = 0, largest_seqno = 0, table_reader_handle = 0x0, stats = {
num_reads_sampled = {<std::__atomic_base<unsigned long>> = {static _S_alignment = 8, _M_i = 0}, <No data fields>}},
compensated_file_size = 0, num_entries = 0, num_deletions = 0, raw_key_size = 0, raw_value_size = 0, refs = 1, being_compacted = false,
init_stats_from_file = false, marked_for_compaction = false}
</pre><br />In the context of Ceph's use case, that is <strong>sort of</strong> fine, because we do not support downgrades.</p>
<p>But when rocksdb identified that this change is not forward compatible, they decided to revert it in <a class="external" href="https://github.com/facebook/rocksdb/pull/3762">https://github.com/facebook/rocksdb/pull/3762</a>. So the revert makes rocksdb <strong>not backward</strong> compatible. In other words, a new rocksdb is not able to open a store created by an old rocksdb that still had the change.</p>
<p>And we do have the revert in master! That's why we are suffering from a fix made by rocksdb upstream.</p>
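<p>A toy sketch of the failure described above, not rocksdb code. It assumes the dummy entry can be recognised by file number 0 and size 0 (as in the gdb dump), while the real reader works on binary VersionEdit records. <code>FileEntry</code>, <code>is_dummy</code>, <code>recover</code> and <code>understands_dummy</code> are hypothetical names, the file numbers 4, 7 and 13 are taken from the log above, and the sizes are made up.</p>

```python
from dataclasses import dataclass


class Corruption(Exception):
    """Stand-in for the rocksdb Corruption status."""


@dataclass(frozen=True)
class FileEntry:
    number: int  # SST file number
    size: int    # file size in bytes (made-up values below)


def is_dummy(entry):
    # Assumption: number 0 and size 0 identify the dummy entry in this model.
    return entry.number == 0 and entry.size == 0


def recover(manifest, files_on_disk, understands_dummy):
    """Return the live SST file numbers, or raise Corruption."""
    live = []
    for entry in manifest:
        if understands_dummy and is_dummy(entry):
            continue  # a reader that has the change skips the marker
        if entry.number not in files_on_disk:
            # a reader without the change treats the marker as a real file
            raise Corruption("Can't access /%06d.sst" % entry.number)
        live.append(entry.number)
    return live


files_on_disk = {4, 7, 13}
written_with_change = [
    FileEntry(4, 100),
    FileEntry(0, 0),  # the dummy entry
    FileEntry(7, 200),
    FileEntry(13, 300),
]
```

<p>In this model, <code>recover(written_with_change, files_on_disk, True)</code> succeeds, while the same call with <code>False</code> raises <code>Corruption</code> for file number 0, which mirrors a reader without the change opening a store written with it.</p>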
<p>This is embarrassing, because mimic is already released. I don't really want our users to rebuild their OSDs or monitors to be forward compatible with master. We probably have to bite the bullet and keep the reverted change in our fork and maintain it, unless we can persuade upstream to revert <a class="external" href="https://github.com/facebook/rocksdb/pull/3762">https://github.com/facebook/rocksdb/pull/3762</a>.</p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1182382018-08-01T14:03:10ZKefu Chaitchaikov@gmail.com
<ul></ul><p>An alternative option is to whip up a tool that rebuilds the manifest and removes the dummy File4 with the kDeletedLogNumberHack custom_tag. See also rocksdb/db/repair.cc.</p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1182412018-08-01T14:23:54ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Subject</strong> changed from <i>"rocksdb: Corruption: Can't access /000000.sst" in pgrade:mimic-x:parallel-master-distro-basic-smithi</i> to <i>"rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithi</i></li></ul> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1183192018-08-01T22:18:21ZSage Weilsage@newdream.net
<ul></ul><p>Another option would be to revert only partially and keep just the bits that ignore the older deleted log files.</p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1187192018-08-08T21:42:43ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>12</i></li><li><strong>Assignee</strong> deleted (<del><i>Sage Weil</i></del>)</li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Urgent</i></li></ul><p>I think we need to fix this sooner rather than later. My suggestion is to incorporate enough of the original rocksdb changes to interpret the newer MANIFEST entries, but not to generate new ones, so that we can silently "upgrade" mimic store.dbs written with the patch back to the traditional format.</p>
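<p>A minimal sketch of that "read it, but never write it" idea, using plain dictionaries instead of real MANIFEST records. Everything here is hypothetical: <code>LEGACY_MARKER</code>, <code>read_manifest</code>, <code>write_manifest</code> and <code>reopen</code> are illustrative names, and the sketch assumes the manifest is rewritten at some point after the store is opened.</p>

```python
# Hypothetical label for the dummy record written by the mimic-era rocksdb.
LEGACY_MARKER = "deleted_log_marker"


def read_manifest(records):
    """Tolerant reader: recognise marker records and drop them."""
    return [r for r in records if r["kind"] != LEGACY_MARKER]


def write_manifest(files):
    """Writer: emits traditional file records only, never a marker."""
    return [{"kind": "file", "number": f["number"]} for f in files]


def reopen(records):
    """Read a manifest and write it back out in the traditional format."""
    return write_manifest(read_manifest(records))


mimic_store = [
    {"kind": "file", "number": 4},
    {"kind": LEGACY_MARKER, "number": 0},
    {"kind": "file", "number": 7},
]
upgraded = reopen(mimic_store)
```

<p>After one pass the marker is gone, and a second pass leaves the result unchanged, so a reader that never understood the marker could open the rewritten store in this model.</p>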
<p>My guess is that rocksdb would take such a patch upstream, too?</p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1187592018-08-09T14:55:33ZSage Weilsage@newdream.net
<ul><li><strong>Assignee</strong> set to <i>Radoslaw Zarzynski</i></li></ul> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1192222018-08-21T20:22:41ZRadoslaw Zarzynskirzarzyns@redhat.com
<ul><li><strong>Status</strong> changed from <i>12</i> to <i>In Progress</i></li></ul><p>Very early fix: <a class="external" href="https://github.com/rzarzynski/rocksdb/tree/wip-bug-25146">https://github.com/rzarzynski/rocksdb/tree/wip-bug-25146</a>.</p>
<p>The case appears more complicated, as the change has been reintroduced to RocksDB (see <a class="external" href="https://github.com/facebook/rocksdb/pull/3765">https://github.com/facebook/rocksdb/pull/3765</a>), but in a modified form. <strong>VersionEdit</strong> now uses a different dencoding that doesn't understand the original format of <a class="external" href="https://github.com/facebook/rocksdb/pull/3488">https://github.com/facebook/rocksdb/pull/3488</a>. The value <strong>0x03</strong> of <strong>CustomTag</strong> has been reused. All of these variants were/are in our master: <a class="external" href="https://gist.github.com/rzarzynski/24a753176f7cf2d2c1fc173d8da763dc">https://gist.github.com/rzarzynski/24a753176f7cf2d2c1fc173d8da763dc</a>.</p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1215922018-09-27T00:17:28ZRadoslaw Zarzynskirzarzyns@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Fix Under Review</i></li></ul><p>ceph/rocksdb: <a class="external" href="https://github.com/ceph/rocksdb/pull/40">https://github.com/ceph/rocksdb/pull/40</a></p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1244682018-11-12T07:05:54ZBrad Hubbardbhubbard@redhat.com
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-6 priority-high2 closed" href="/issues/36758">Bug #36758</a>: aborts in rocksdb::TableFileName() in mimic-x upgrade test suite</i> added</li></ul> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1244862018-11-13T04:55:48ZKefu Chaitchaikov@gmail.com
<ul></ul><p><a class="external" href="https://github.com/ceph/ceph/pull/25070">https://github.com/ceph/ceph/pull/25070</a></p> RADOS - Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-distro-basic-smithihttps://tracker.ceph.com/issues/25146?journal_id=1246242018-11-15T13:16:52ZKefu Chaitchaikov@gmail.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Resolved</i></li></ul>