Bug #39705

qa: Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205

Added by Patrick Donnelly almost 5 years ago. Updated almost 5 years ago.

Status:
Resolved
Priority:
Normal
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
qa-suite
Labels (FS):
snapshots
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2019-05-09T23:33:08.964 INFO:tasks.workunit.client.0.smithi173.stdout:[ RUN      ] LibCephFS.SnapXattrs
2019-05-09T23:33:09.014 INFO:tasks.workunit.client.0.smithi173.stdout:/build/ceph-15.0.0-1031-gc04a3cc/src/test/libcephfs/test.cc:2258: Failure
2019-05-09T23:33:09.015 INFO:tasks.workunit.client.0.smithi173.stdout:Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205
2019-05-09T23:33:09.016 INFO:tasks.workunit.client.0.smithi173.stdout:[  FAILED  ] LibCephFS.SnapXattrs (47 ms)
2019-05-09T23:33:09.016 INFO:tasks.workunit.client.0.smithi173.stdout:[----------] 55 tests from LibCephFS (72641 ms total)
...
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout:[  FAILED  ] 1 test, listed below:
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout:[  FAILED  ] LibCephFS.SnapXattrs
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout:
2019-05-09T23:33:11.586 INFO:tasks.workunit.client.0.smithi173.stdout: 1 FAILED TEST
2019-05-09T23:33:11.594 DEBUG:teuthology.orchestra.run:got remote process result: 1
2019-05-09T23:33:11.594 INFO:tasks.workunit:Stopping ['libcephfs/test.sh'] on client.0...
2019-05-09T23:33:11.594 INFO:teuthology.orchestra.run.smithi173:Running:
2019-05-09T23:33:11.594 INFO:teuthology.orchestra.run.smithi173:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2019-05-09T23:33:11.756 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 86, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 65, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-parial-recovery-2019-05-09/qa/tasks/workunit.py", line 123, in task
    timeout=timeout,cleanup=cleanup)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 85, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 99, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 22, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-parial-recovery-2019-05-09/qa/tasks/workunit.py", line 409, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 205, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 437, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 162, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 184, in _raise_for_status
    node=self.hostname, label=self.label
CommandFailedError: Command failed (workunit test libcephfs/test.sh) on smithi173 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=c04a3ccdd2e25c2075c7680ed3203151b99f2333 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/libcephfs/test.sh'

From: /ceph/teuthology-archive/nojha-2019-05-09_22:58:42-fs:basic_workload-wip-parial-recovery-2019-05-09-distro-basic-smithi/3942897/teuthology.log

David, any idea?


Related issues

Related to CephFS - Feature #38838: Expose CephFS snapshot creation time to clients (Resolved)
Copied to CephFS - Bug #39943: client: ceph.dir.rctime xattr value incorrectly prefixes "09" to the nanoseconds component (Resolved)
Copied to CephFS - Backport #40169: nautilus: qa: Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205 (Resolved)

History

#1 Updated by Patrick Donnelly almost 5 years ago

  • Related to Feature #38838: Expose CephFS snapshot creation time to clients added

#2 Updated by David Disseldorp almost 5 years ago

Patrick Donnelly wrote:

[...]

From: /ceph/teuthology-archive/nojha-2019-05-09_22:58:42-fs:basic_workload-wip-parial-recovery-2019-05-09-distro-basic-smithi/3942897/teuthology.log

David, any idea?

Thanks for the ping.

For the test we have a single client creating two sequential snapshots and confirming that the second snapshot has a btime (MDS SnapInfo.stamp) later than that of the first. In the failure, the first snapshot's btime (2019-05-09 23:33:09.400554) comes out later than the second's new_btime (2019-05-09 23:33:09.094205), so we have a small delta in the wrong direction.
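
For reference, here is a minimal sketch of what the test exercises, using the libcephfs C API (a sketch only: the directory and snapshot names are illustrative, error handling is omitted, it assumes the "ceph.snap.btime" vxattr is read on the snapshot path, and it is not the actual LibCephFS.SnapXattrs code):

#include <cephfs/libcephfs.h>
#include <cstdio>

int main()
{
  struct ceph_mount_info *cmount;
  ceph_create(&cmount, nullptr);
  ceph_conf_read_file(cmount, nullptr);
  ceph_mount(cmount, "/");

  ceph_mkdir(cmount, "/snap_xattr_dir", 0755);

  // A CephFS snapshot is created by mkdir under the directory's .snap dir.
  ceph_mkdir(cmount, "/snap_xattr_dir/.snap/snap1", 0755);
  char btime[64] = {0};
  int n = ceph_getxattr(cmount, "/snap_xattr_dir/.snap/snap1",
                        "ceph.snap.btime", btime, sizeof(btime) - 1);
  if (n > 0)
    btime[n] = '\0';

  ceph_mkdir(cmount, "/snap_xattr_dir/.snap/snap2", 0755);
  char new_btime[64] = {0};
  n = ceph_getxattr(cmount, "/snap_xattr_dir/.snap/snap2",
                    "ceph.snap.btime", new_btime, sizeof(new_btime) - 1);
  if (n > 0)
    new_btime[n] = '\0';

  // The assertion that failed: the second snapshot's creation time is
  // expected to parse as strictly later than the first's.
  printf("btime=%s new_btime=%s\n", btime, new_btime);

  ceph_unmount(cmount);
  ceph_release(cmount);
  return 0;
}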

Without knowing too much about how snapshots are processed by the MDS, could it be that a snapshot's SnapInfo.stamp value only gets committed after the lazy asynchronous client flush has completed? If so, out-of-order completion may not be unexpected for sequential snapshots.

#3 Updated by Patrick Donnelly almost 5 years ago

  • Assignee changed from David Disseldorp to Zheng Yan

#4 Updated by Zheng Yan almost 5 years ago


2019-05-09 23:33:09.010 7f3f71911700 15  mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280)  ? snap(2 0x1 'test_snap_xattr_snap_15872' 2019-05-09 23:31:42.199137)
2019-05-09 23:33:09.010 7f3f71911700 15  mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280)  ? snap(3 0x1 'test_snap_xattr_snap2_15872' 2019-05-09 23:31:42.208800)
2019-05-09 23:33:09.010 7f3f71911700 15  mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280)  ? snap(4 0x1 'test_snap_xattr_snap_91533' 2019-05-09 23:33:08.990488)
2019-05-09 23:33:09.010 7f3f71911700 15  mds.0.cache.snaprealm(0x1 seq 5 0x55d157888280)  ? snap(5 0x1 'test_snap_xattr_snap2_91533' 2019-05-09 23:33:09.004205)

Here are the btimes of the snapshots; it looks like the test case does not parse the timestamp correctly.

#5 Updated by Zheng Yan almost 5 years ago

  • Assignee changed from Zheng Yan to David Disseldorp

#6 Updated by David Disseldorp almost 5 years ago

This bug is due to incorrect placement of the pad/width specifier in:

11809 size_t Client::_vxattrcb_snap_btime(Inode *in, char *val, size_t size)
11810 {
11811   return snprintf(val, size, "%llu.09%lu",

The "09" string is incorrectly provided as a prefix for the nanoseconds value, instead of setting zero-padding width. The same bug is present in the
(not yet merged) snap.btime kernel patchset.
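
To make the effect concrete, here is a small standalone sketch (not the Client.cc code itself) comparing the buggy format string with a zero-padded width specifier; the seconds value is illustrative and the nanoseconds value is chosen to be consistent with the 23:33:09.004205 btime of 'test_snap_xattr_snap2_91533' in the MDS log above:

#include <cstdio>

int main()
{
  // Illustrative values: epoch seconds around 2019-05-09 23:33:09, and a
  // nanoseconds value consistent with the .004205 btime shown in the MDS
  // log (the exact trailing digits are unknown).
  unsigned long long sec = 1557444789ULL;
  unsigned long nsec = 4205000UL;

  char buggy[64], fixed[64];

  // Buggy: "09" is a literal prefix and %lu applies no zero padding.
  snprintf(buggy, sizeof(buggy), "%llu.09%lu", sec, nsec);
  // Presumably intended: zero-pad the nanoseconds field to 9 digits.
  snprintf(fixed, sizeof(fixed), "%llu.%09lu", sec, nsec);

  printf("buggy: %s\n", buggy);   // 1557444789.094205000
  printf("fixed: %s\n", fixed);   // 1557444789.004205000
  return 0;
}

The buggy string lines up with the 23:33:09.094205 new_btime in the failure message, and for a nine-digit nanoseconds value (e.g. the .990488 btime of the first snapshot) the literal "09" turns the fractional part into eleven digits, so any seconds.nanoseconds parse of the xattr value comes out skewed and the btime comparison can go the wrong way.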

This bug was cut-and-pasted from:

11790 size_t Client::_vxattrcb_dir_rctime(Inode *in, char *val, size_t size)
11791 {
11792   return snprintf(val, size, "%ld.09%ld", (long)in->rstat.rctime.sec(),
11793                   (long)in->rstat.rctime.nsec());

...I'll raise a separate bug for "dir.rctime", which also affects the kernel client.

#7 Updated by David Disseldorp almost 5 years ago

  • Copied to Bug #39943: client: ceph.dir.rctime xattr value incorrectly prefixes "09" to the nanoseconds component added

#8 Updated by David Disseldorp almost 5 years ago

  • Pull request ID set to 28116

#9 Updated by Patrick Donnelly almost 5 years ago

  • Status changed from New to Pending Backport

#10 Updated by Nathan Cutler almost 5 years ago

  • Copied to Backport #40169: nautilus: qa: Expected: (btime) < (new_btime), actual: 2019-05-09 23:33:09.400554 vs 2019-05-09 23:33:09.094205 added

#11 Updated by Nathan Cutler almost 5 years ago

  • Status changed from Pending Backport to Resolved
