Bug #39705
qa: Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205
Description
2019-05-09T23:33:08.964 INFO:tasks.workunit.client.0.smithi173.stdout:[ RUN ] LibCephFS.SnapXattrs
2019-05-09T23:33:09.014 INFO:tasks.workunit.client.0.smithi173.stdout:/build/ceph-15.0.0-1031-gc04a3cc/src/test/libcephfs/test.cc:2258: Failure
2019-05-09T23:33:09.015 INFO:tasks.workunit.client.0.smithi173.stdout:Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205
2019-05-09T23:33:09.016 INFO:tasks.workunit.client.0.smithi173.stdout:[ FAILED ] LibCephFS.SnapXattrs (47 ms)
2019-05-09T23:33:09.016 INFO:tasks.workunit.client.0.smithi173.stdout:[----------] 55 tests from LibCephFS (72641 ms total)
...
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout:[ FAILED ] 1 test, listed below:
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout:[ FAILED ] LibCephFS.SnapXattrs
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout:
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout: 1 FAILED TEST
2019-05-09T23:33:11.594 DEBUG:teuthology.orchestra.run:got remote process result: 1
2019-05-09T23:33:11.594 INFO:tasks.workunit:Stopping ['libcephfs/test.sh'] on client.0...
2019-05-09T23:33:11.594 INFO:teuthology.orchestra.run.smithi173:Running:
2019-05-09T23:33:11.594 INFO:teuthology.orchestra.run.smithi173:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2019-05-09T23:33:11.756 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 86, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 65, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-parial-recovery-2019-05-09/qa/tasks/workunit.py", line 123, in task
    timeout=timeout,cleanup=cleanup)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 85, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 99, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 22, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-parial-recovery-2019-05-09/qa/tasks/workunit.py", line 409, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 205, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 437, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 162, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 184, in _raise_for_status
    node=self.hostname, label=self.label
CommandFailedError: Command failed (workunit test libcephfs/test.sh) on smithi173 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=c04a3ccdd2e25c2075c7680ed3203151b99f2333 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/libcephfs/test.sh'
From: /ceph/teuthology-archive/nojha-2019-05-09_22:58:42-fs:basic_workload-wip-parial-recovery-2019-05-09-distro-basic-smithi/3942897/teuthology.log
David, any idea?
Updated by Patrick Donnelly almost 5 years ago
- Related to Feature #38838: Expose CephFS snapshot creation time to clients added
Updated by David Disseldorp almost 5 years ago
Patrick Donnelly wrote:
[...]
From: /ceph/teuthology-archive/nojha-2019-05-09_22:58:42-fs:basic_workload-wip-parial-recovery-2019-05-09-distro-basic-smithi/3942897/teuthology.log
David, any idea?
Thanks for the ping.
For the test we have a single client creating two sequential snapshots and confirming that the second snapshot's btime (MDS SnapInfo.stamp) is greater than that of the first. In the failure the assertion sees btime=2019-05-09 23:33:09.400554 and new_btime=2019-05-09 23:33:09.094205, so we have a small delta in the wrong direction.
Without knowing too much about how snapshots are processed by the MDS, could it be that a snapshot's SnapInfo.stamp value only gets committed after the lazy asynchronous client flush has completed? If so, out-of-order stamps might not be unexpected for sequential snapshots.
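For reference, the test flow boils down to roughly the following sketch (libcephfs API; the paths, the ceph.snap.btime vxattr name and the error handling are simplified assumptions, not the verbatim test.cc code):

#include <cephfs/libcephfs.h>
#include <stdio.h>

/* Sketch of the LibCephFS.SnapXattrs flow: create two snapshots back to back
 * and read the snapshot birth-time vxattr of each; cmount is assumed to be
 * already mounted. */
static int snap_btime_sketch(struct ceph_mount_info *cmount)
{
  char btime[64] = "", new_btime[64] = "";

  /* first snapshot */
  if (ceph_mkdir(cmount, "/dir/.snap/snap1", 0755) < 0)
    return -1;
  ceph_getxattr(cmount, "/dir/.snap/snap1", "ceph.snap.btime",
                btime, sizeof(btime) - 1);

  /* second snapshot, created strictly after the first */
  if (ceph_mkdir(cmount, "/dir/.snap/snap2", 0755) < 0)
    return -1;
  ceph_getxattr(cmount, "/dir/.snap/snap2", "ceph.snap.btime",
                new_btime, sizeof(new_btime) - 1);

  /* the test then parses both "sec.nsec" strings and asserts
   * btime < new_btime, which is the check failing here */
  printf("btime=%s new_btime=%s\n", btime, new_btime);
  return 0;
}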
Updated by Patrick Donnelly almost 5 years ago
- Assignee changed from David Disseldorp to Zheng Yan
Updated by Zheng Yan almost 5 years ago
2019-05-09 23:33:09.010 7f3f71911700 15 mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280) ? snap(2 0x1 'test_snap_xattr_snap_15872' 2019-05-09 23:31:42.199137)
2019-05-09 23:33:09.010 7f3f71911700 15 mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280) ? snap(3 0x1 'test_snap_xattr_snap2_15872' 2019-05-09 23:31:42.208800)
2019-05-09 23:33:09.010 7f3f71911700 15 mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280) ? snap(4 0x1 'test_snap_xattr_snap_91533' 2019-05-09 23:33:08.990488)
2019-05-09 23:33:09.010 7f3f71911700 15 mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280) ? snap(5 0x1 'test_snap_xattr_snap2_91533' 2019-05-09 23:33:09.004205)
These are the btimes of the snapshots; it looks like the test case does not parse the timestamp correctly.
Updated by Zheng Yan almost 5 years ago
- Assignee changed from Zheng Yan to David Disseldorp
Updated by David Disseldorp almost 5 years ago
This bug is due to incorrect placement of the pad/width specifier in:
11809 size_t Client::_vxattrcb_snap_btime(Inode *in, char *val, size_t size)
11810 {
11811 return snprintf(val, size, "%llu.09%lu",
The "09" string is incorrectly provided as a prefix for the nanoseconds value, instead of setting zero-padding width. The same bug is present in the
(not yet merged) snap.btime kernel patchset.
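To see why the literal "09" breaks the comparison rather than just the cosmetics, here is a tiny standalone demo (timestamps are illustrative, loosely modelled on the values in the MDS log above):

#include <stdio.h>

int main(void)
{
  /* two consecutive snapshot stamps, loosely modelled on the MDS log:
   * snap A at xx:xx:08.990488, snap B at xx:xx:09.004205 */
  unsigned long long sec_a = 1557445988ULL, sec_b = 1557445989ULL;
  unsigned long nsec_a = 990488000UL, nsec_b = 4205000UL;

  char a[64], b[64];
  /* the current format string: "09" is literal text, and the nanoseconds
   * are printed with however many digits they happen to have */
  snprintf(a, sizeof(a), "%llu.09%lu", sec_a, nsec_a);
  snprintf(b, sizeof(b), "%llu.09%lu", sec_b, nsec_b);

  printf("A: %s\n", a);  /* A: 1557445988.09990488000  (11 fractional digits) */
  printf("B: %s\n", b);  /* B: 1557445989.094205000    (9 fractional digits)  */

  /* a reader that treats the fractional field as nanoseconds gets ~9.99s
   * for A and ~0.094s for B, so A can easily end up "after" B */
  return 0;
}

The fractional part is both wrong in value and variable in width, so the ordering of the resulting strings no longer follows the ordering of the underlying timestamps.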
This is a cut-and-paste of the same bug from:
11790 size_t Client::_vxattrcb_dir_rctime(Inode *in, char *val, size_t size)
11791 {
11792 return snprintf(val, size, "%ld.09%ld", (long)in->rstat.rctime.sec(),
11793 (long)in->rstat.rctime.nsec());
...I'll raise a separate bug for "dir.rctime", which also affects the kernel client.
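Presumably the fix in both callbacks is the same small change: make the zero padding part of the conversion specification so the fractional part is always exactly nine digits. A sketch (hypothetical helper, not the actual Client.cc patch):

#include <stdio.h>

/* hypothetical helper standing in for _vxattrcb_snap_btime /
 * _vxattrcb_dir_rctime: always emit nine zero-padded fractional digits */
static size_t format_sec_nsec(char *val, size_t size,
                              unsigned long long sec, unsigned long nsec)
{
  return snprintf(val, size, "%llu.%09lu", sec, nsec);
}

int main(void)
{
  char a[64], b[64];
  format_sec_nsec(a, sizeof(a), 1557445988ULL, 990488000UL);
  format_sec_nsec(b, sizeof(b), 1557445989ULL, 4205000UL);
  printf("A: %s\n", a);  /* A: 1557445988.990488000 */
  printf("B: %s\n", b);  /* B: 1557445989.004205000 */
  return 0;
}

With the fixed-width fractional field the formatted strings sort the same way as the timestamps, so the btime < new_btime check holds for sequential snapshots.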
Updated by David Disseldorp almost 5 years ago
- Copied to Bug #39943: client: ceph.dir.rctime xattr value incorrectly prefixes "09" to the nanoseconds component added
Updated by Patrick Donnelly almost 5 years ago
- Status changed from New to Pending Backport
Updated by Nathan Cutler almost 5 years ago
- Copied to Backport #40169: nautilus: qa: Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205 added
Updated by Nathan Cutler almost 5 years ago
- Status changed from Pending Backport to Resolved