Bug #854 (closed): unsynchronized clocks between kernel-client/cmds cause PJD fstest failures

Added by Brian Chrisman about 13 years ago. Updated about 10 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%


Description

I'm seeing a varying number (generally 5-8) of POSIX tests in the PJD fstest suite fail when the tests are run on a node (atop the ceph kernel client) whose clock is not synchronized with the node hosting the active MDS.
Synchronizing the clocks across the cluster with ntpdate/xntpd returns the PJD fstests to a full pass.
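
For illustration, here's a minimal Python sketch (hypothetical mount path; the real PJD tests are shell/TAP scripts, not this code) of the kind of ctime-window check that breaks when the node stamping the metadata operation runs on a different clock than the test node:

import os
import time

path = "/mnt/ceph/pjd-testfile"        # hypothetical cephfs mount path
link = path + ".symlink"
os.close(os.open(path, os.O_CREAT | os.O_WRONLY, 0o644))
os.symlink(path, link)

before = time.time()                   # local clock on the test node
os.lchown(link, os.getuid(), os.getgid())
after = time.time()

ctime = os.lstat(link).st_ctime        # stamped by whichever node handled the op
# With unsynchronized clocks this window check fails even though lchown worked.
assert before - 1 <= ctime <= after + 1, (
    "ctime %.2f outside [%.2f, %.2f]; clocks likely skewed" % (ctime, before, after))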

This is not likely critical because:
- the failures are small corner cases (e.g. an unexpected ctime after an operation like lchown of a symlink)
- the workaround of keeping the clocks synchronized is reasonable
- it may be a known design issue (the MDS generating its own timestamp instead of using the client's?)

Here's a histogram of the test failures (out of 21 repeated runs), in the format:
(count) (test number):(status):[(line number) (test filename)]

19 102:fail:[218 /opt/scale/lib/pjdfstests/tests/chown/00.t]
21 112:fail:[236 /opt/scale/lib/pjdfstests/tests/chown/00.t]
13 141:fail:[287 /opt/scale/lib/pjdfstests/tests/chown/00.t]
17 145:fail:[302 /opt/scale/lib/pjdfstests/tests/chown/00.t]
21 153:fail:[332 /opt/scale/lib/pjdfstests/tests/chown/00.t]
13 27:fail:[70 /opt/scale/lib/pjdfstests/tests/chmod/00.t]
18 31:fail:[78 /opt/scale/lib/pjdfstests/tests/chmod/00.t]
11 97:fail:[209 /opt/scale/lib/pjdfstests/tests/chown/00.t]

Different tests fail in different runs; a few (21 out of 21) fail consistently.

Actions #1

Updated by Greg Farnum about 13 years ago

Ah, that makes sense. This is something we're unlikely to fix -- currently a lot of operations occur "on" the MDS (renames, creates, etc.), so sending a client time along for those wouldn't make much sense. But we need to refer to both the kernel client's time and the MDS's time, since other operations (like inode data changes) occur "on" the kernel client and are reported to the MDS in batches.
But maybe some brilliant idea will come up in the future!
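
To make that split concrete, here is a hedged sketch (illustrative names only, not Ceph internals) of how a lagging MDS clock can make an MDS-stamped metadata operation appear to pre-date a client-stamped change that actually happened earlier:

import time

MDS_SKEW = -90.0                 # hypothetical: the MDS clock lags the client by 90s

def client_time():
    return time.time()

def mds_time():
    return time.time() + MDS_SKEW

# A buffered change made "on" the client is stamped with the client's clock...
write_mtime = client_time()      # e.g. an inode data change, flushed to the MDS later

time.sleep(1)

# ...while an operation handled "on" the MDS (rename, create, ...) is stamped
# with the MDS clock, so it can appear to pre-date the earlier client change.
rename_ctime = mds_time()

print("write at %.3f, later rename at %.3f" % (write_mtime, rename_ctime))
print("rename appears %s the write" % ("after" if rename_ctime > write_mtime else "before"))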

Actions #2

Updated by Sage Weil about 13 years ago

The only reasonably sane idea I have here is for the client/mds to compare clocks to estimate skew and have some sort of auto-adjustment going on. It's hard to say what that adjustment should be, though. Maybe just periodically spamming the console when the skew is significant is the thing to do.
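
As a rough illustration of that idea (a sketch only, not Ceph code; send_request and the threshold are made up), an NTP-style four-timestamp exchange would let the client estimate its offset from the MDS and complain when it gets large:

import logging
import time

WARN_SKEW_SECS = 5.0             # hypothetical threshold for console warnings

def estimate_mds_skew(send_request, clock=time.time):
    """send_request() performs one round trip to the MDS and returns (t1, t2):
    the MDS's receive and reply timestamps, taken on the MDS's clock."""
    t0 = clock()                 # client transmit time
    t1, t2 = send_request()      # MDS receive / MDS reply (remote clock)
    t3 = clock()                 # client receive time
    # Standard NTP offset estimate; positive means the MDS clock runs ahead.
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    if abs(offset) > WARN_SKEW_SECS:
        logging.warning("clock skew vs MDS ~%.1fs; check ntp/chrony", offset)
    return offset

Piggybacking a periodic check like this on existing client/MDS traffic would be enough for the "spam the console when the skew is significant" variant.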

Actions #3

Updated by Sage Weil over 12 years ago

  • Target version set to 52
Actions #4

Updated by Sage Weil over 12 years ago

  • Target version deleted (52)
Actions #5

Updated by Ian Colle about 11 years ago

  • Project changed from Ceph to CephFS
Actions #6

Updated by Greg Farnum about 10 years ago

  • Status changed from New to Duplicate

I'm closing this in favor of ticket #7564.
