Bug #17562
closedbacktrace check fails when scrubbing directory created by fsstress
0%
Description
when scrubbing created by fsstress, there are lots of false backtrace errors. The bug is in inode_backtrace_t::compare. but the code is so confusing. I can't understand Greg's intention
One example of failure
[root@zhyan-desk1 mds]# ceph daemon mds.a scrub_path /fsstress/pf/d8/d62/d764/dc6d/d14b4/d20d6/d2831/d3a7f/d488/d511/d1df5/d1d00/d4821/d50b7/d1146/dc59/ repair { "performed_validation": true, "passed_validation": false, "backtrace": { "checked": true, "passed": false, "read_ret_val": 0, "ondisk_value": "(269)1000000cd38:[<10000008937\/dc59 v1218>,<10000007705\/d54c v170>,<1000000723d\/d3cf v2426>,<100000051de\/d366 v2068>,<10000004b63\/db0 v4387>,<10000004b60\/d25 v3759>,<10000004a79\/d21 v4482>,<1000000002a\/d8 v2286>,<10000000000\/pf v16768>,<1\/fsstress v2348>]\/\/", "memoryvalue": "(269)1000000cd38:[<10000008937\/dc59 v19537>,<1000000af34\/d1146 v3524>,<10000039adc\/d50b7 v1321>,<10000019d91\/d4821 v288>,<1000000b3cc\/d1d00 v4928>,<100000086d2\/d1df5 v34596>,<100000081cd\/d511 v2529>,<10000005d8d\/d488 v28540>,<1000000947f\/d3a7f v28103>,<1000001cb01\/d2831 v19700>,<1000000a413\/d20d6 v28872>,<1000000ce1e\/d14b4 v13328>,<10000005552\/dc6d v23056>,<10000004dfd\/d764 v21852>,<10000004a79\/d62 v57237>,<1000000002a\/d8 v24004>,<10000000000\/pf v285463>,<1\/fsstress v42491>]\/\/", "error_str": "" }, "raw_stats": { "checked": true, "passed": true, "read_ret_val": 0, "ondisk_value.dirstat": "f(v0 m2016-10-13 17:01:16.208620 8=6+2)", "ondisk_value.rstat": "n(v0 rc2016-10-13 17:01:16.208620 b4457089 19=15+4)", "memory_value.dirrstat": "f(v0 m2016-10-13 17:01:16.208620 8=6+2)", "memory_value.rstat": "n(v7 rc2016-10-13 17:01:16.208620 b4457089 19=15+4)", "error_str": "" }, "return_code": 0 }
Breakpoint 1, inode_backtrace_t::compare (this=this@entry=0x55f3dd638c80, other=..., equivalent=equivalent@entry=0x7f08ee22690e, divergent=divergent@entry=0x7f08ee22690f) at /home/zhyan/Ceph/ceph-2/src/mds/inode_backtrace.cc:124 124 int min_size = MIN(ancestors.size(),other.ancestors.size()); (gdb) n 125 *divergent = false; (gdb) 126 if (min_size == 0) (gdb) 123 { (gdb) 129 if (ancestors[0].version > other.ancestors[0].version) (gdb) 130 comparator = 1; (gdb) 133 for (int i = 1; i < min_size; ++i) { (gdb) 134 if (*divergent) { (gdb) 141 if (ancestors[i].dirino != other.ancestors[i].dirino) { (gdb) 142 *equivalent = false; (gdb) 143 if (ancestors[i-1].version < other.ancestors[i-1].version) { (gdb) 147 } else if (ancestors[i-1].version > other.ancestors[i-1].version) { (gdb) 148 if (comparator < 0) (gdb) 150 return 1; (gdb) 168 } (gdb) CInode::ValidationContinuation::_backtrace (this=0x55f3dd417ac0, rval=0) at /home/zhyan/Ceph/ceph-2/src/mds/CInode.cc:3843 3843 if (equivalent) { (gdb) 3846 if (divergent || memory_newer <= 0) { (gdb) p equivalent $1 = false (gdb) p divergent $2 = false
Updated by Greg Farnum over 7 years ago
Well, in inode_backtrace_t::compare() we're trying to determine and return three different things, as described in the header comments.
1) Which version is newer (if either); that's the comparator and return value.
2) If the two backtraces both have the same dentries (regardless of versions on each); that's the "equivalent" pointer. (So if one says "a1/b1/c1" and the other says "a2/b2/c1" they would be equivalent.)
3) If the two backtraces have different dentries but the versions stuck on the dentries mean they shouldn't. (So if one says "a1/b1/c1" and the other says "a1/b1/d1" they would be divergent, but if the other said "a1/b2/d1" they would not.)
Does that make sense? The code is a little annoying but it's basically just walking through the dentry vectors of each backtrace and comparing their versions (until they differ) and making sure any difference is acceptable given the associated versions. Apparently it's not quite correct, but that's the idea.
Updated by Greg Farnum over 7 years ago
- Assignee changed from Greg Farnum to Zheng Yan
It looks to me like compare() is doing the right thing, but that the scrubbing code is declaring an error if the on-disk backtrace is out-of-date. (results->backtrace.passed = true; only if (equivalent)) I can't recall off-hand what the behavior we want there is, but probably it should be adjusted. (We have or had some cases where we wanted failures and flushed the journal to backing objects before scrubbing; that isn't the only case we need to handle for online scrub though!)
Updated by Zheng Yan over 7 years ago
- Status changed from New to Fix Under Review
Updated by Zheng Yan over 7 years ago
- Status changed from Fix Under Review to Resolved