Fix #5399
closedtimestamp changes on replayed mds request (pjd link 71)
Added by Sage Weil almost 11 years ago. Updated almost 10 years ago.
0%
Description
teuthology-2013-06-19_10:46:59-fs-cuttlefish-master-basic 40138 40141
Updated by Greg Farnum almost 11 years ago
- Status changed from New to Rejected
It's a time stamp check for things going backwards, and is failing due to out-of-sync clocks (over a network) being handled in ways we know about and have no intention of changing.
Updated by Greg Farnum almost 11 years ago
- Status changed from Rejected to New
Hmm, Sage points out this might be something else; reopening.
Updated by Sage Weil almost 11 years ago
- Subject changed from pjd link test failure on cuttlefish to timestamp changes on replayed mds request (pjd link 71)
- Status changed from New to 12
- we send a create to mds
- get an ack, but it isn't journaled
- pjd stats the mtime/ctime/ec.
- mds restarts
- we replay the request
- new mds replies, this time with a different mtime/ctime/etc
- ... pjd complains
in reality, the directory ctime/mtime are also different, but the client doesn't notice because it only restats the file (and not the dir) on the traceless reply.
from pjd output
2013-06-20T18:34:12.177 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest create fstest_28f94f92cf9776f020b4e141e70ad403 0644 2013-06-20T18:34:12.177 INFO:teuthology.task.workunit.client.0.err:++ tail -1 2013-06-20T18:34:12.186 INFO:teuthology.task.workunit.client.0.err:+ r=0 2013-06-20T18:34:12.186 INFO:teuthology.task.workunit.client.0.err:+ echo 0 2013-06-20T18:34:12.187 INFO:teuthology.task.workunit.client.0.err:+ egrep '^0$' 2013-06-20T18:34:12.188 INFO:teuthology.task.workunit.client.0.err:+ '[' 0 -eq 0 ']' 2013-06-20T18:34:12.188 INFO:teuthology.task.workunit.client.0.err:+ echo 'ok 68' 2013-06-20T18:34:12.189 INFO:teuthology.task.workunit.client.0.err:++ expr 68 + 1 2013-06-20T18:34:12.190 INFO:teuthology.task.workunit.client.0.err:+ ntest=69 2013-06-20T18:34:12.190 INFO:teuthology.task.workunit.client.0.err:+ expect 0 -- chown fstest_28f94f92cf9776f020b4e141e70ad403 65534 -1 2013-06-20T18:34:12.190 INFO:teuthology.task.workunit.client.0.err:+ e=0 2013-06-20T18:34:12.190 INFO:teuthology.task.workunit.client.0.err:+ shift 2013-06-20T18:34:12.191 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest -- chown fstest_28f94f92cf9776f020b4e141e70ad403 65534 -1 2013-06-20T18:34:12.191 INFO:teuthology.task.workunit.client.0.err:++ tail -1 2013-06-20T18:34:12.195 INFO:teuthology.task.workunit.client.0.err:+ r=0 2013-06-20T18:34:12.195 INFO:teuthology.task.workunit.client.0.err:+ echo 0 2013-06-20T18:34:12.195 INFO:teuthology.task.workunit.client.0.err:+ egrep '^0$' 2013-06-20T18:34:12.197 INFO:teuthology.task.workunit.client.0.err:+ '[' 0 -eq 0 ']' 2013-06-20T18:34:12.197 INFO:teuthology.task.workunit.client.0.err:+ echo 'ok 69' 2013-06-20T18:34:12.197 INFO:teuthology.task.workunit.client.0.err:++ expr 69 + 1 2013-06-20T18:34:12.198 INFO:teuthology.task.workunit.client.0.err:+ ntest=70 2013-06-20T18:34:12.199 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest stat fstest_28f94f92cf9776f020b4e141e70ad403 ctime 2013-06-20T18:34:12.202 INFO:teuthology.task.workunit.client.0.err:+ ctime1=1371778452 2013-06-20T18:34:12.202 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest stat . ctime 2013-06-20T18:34:12.205 INFO:teuthology.task.workunit.client.0.err:+ dctime1=1371778452 2013-06-20T18:34:12.205 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest stat . mtime 2013-06-20T18:34:12.208 INFO:teuthology.task.workunit.client.0.err:+ dmtime1=1371778452 2013-06-20T18:34:12.208 INFO:teuthology.task.workunit.client.0.err:+ sleep 1 2013-06-20T18:34:12.407 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:12.416 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:12.416 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:12.597 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:12.608 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:kill mds.a (rank=0) 2013-06-20T18:34:12.608 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:12.608 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:12.789 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:12.800 DEBUG:teuthology.task.ceph.mds.a:waiting for process to exit 2013-06-20T18:34:12.809 INFO:teuthology.task.ceph.mds.a:Stopped 2013-06-20T18:34:12.809 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:12.809 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:12.994 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:13.005 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:13.005 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:13.188 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:13.199 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:waiting till mds map indicates mds.a is laggy/crashed, in failed state, or mds.a is removed from mdsmap 2013-06-20T18:34:13.210 INFO:teuthology.task.workunit.client.0.err:+ expect EACCES -u 65534 link fstest_28f94f92cf9776f020b4e141e70ad403 fstest_eb61070000c32b1cf879744fc2fdbb9d 2013-06-20T18:34:13.210 INFO:teuthology.task.workunit.client.0.err:+ e=EACCES 2013-06-20T18:34:13.210 INFO:teuthology.task.workunit.client.0.err:+ shift 2013-06-20T18:34:13.211 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest -u 65534 link fstest_28f94f92cf9776f020b4e141e70ad403 fstest_eb61070000c32b1cf879744fc2fdbb9d 2013-06-20T18:34:13.211 INFO:teuthology.task.workunit.client.0.err:++ tail -1 2013-06-20T18:34:15.199 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:15.199 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:15.380 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:15.391 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:15.392 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:15.738 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 2013-06-20T18:34:15.749 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:waiting till mds map indicates mds.a is laggy/crashed, in failed state, or mds.a is removed from mdsmap 2013-06-20T18:34:17.748 DEBUG:teuthology.misc:with jobid basedir: 41286 2013-06-20T18:34:17.749 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: '/home/ubuntu/cephtest/41286/enable-coredump ceph-coverage /home/ubuntu/cephtest/41286/archive/coverage ceph mds dump --format=json' 2013-06-20T18:34:17.933 INFO:teuthology.orchestra.run.err:dumped mdsmap epoch 17 ... 2013-06-20T18:34:31.306 INFO:teuthology.task.workunit.client.0.err:+ r=EACCES 2013-06-20T18:34:31.306 INFO:teuthology.task.workunit.client.0.err:+ echo EACCES 2013-06-20T18:34:31.306 INFO:teuthology.task.workunit.client.0.err:+ egrep '^EACCES$' 2013-06-20T18:34:31.308 INFO:teuthology.task.workunit.client.0.err:+ '[' 0 -eq 0 ']' 2013-06-20T18:34:31.308 INFO:teuthology.task.workunit.client.0.err:+ echo 'ok 70' 2013-06-20T18:34:31.308 INFO:teuthology.task.workunit.client.0.err:++ expr 70 + 1 2013-06-20T18:34:31.309 INFO:teuthology.task.workunit.client.0.err:+ ntest=71 2013-06-20T18:34:31.310 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest stat fstest_28f94f92cf9776f020b4e141e70ad403 ctime 2013-06-20T18:34:31.396 INFO:teuthology.task.workunit.client.0.err:+ ctime2=1371778471 2013-06-20T18:34:31.396 INFO:teuthology.task.workunit.client.0.err:+ test_check 1371778452 -eq 1371778471 2013-06-20T18:34:31.396 INFO:teuthology.task.workunit.client.0.err:+ '[' 1371778452 -eq 1371778471 ']' 2013-06-20T18:34:31.396 INFO:teuthology.task.workunit.client.0.err:+ echo 'not ok 71' 2013-06-20T18:34:31.397 INFO:teuthology.task.workunit.client.0.err:++ expr 71 + 1 2013-06-20T18:34:31.397 INFO:teuthology.task.workunit.client.0.err:+ ntest=72 2013-06-20T18:34:31.398 INFO:teuthology.task.workunit.client.0.err:++ /home/ubuntu/cephtest/41286/mnt.0/client.0/tmp/tmp/../pjd-fstest-20090130-RC-open24/tests/link/../../fstest stat . ctime 2013-06-20T18:34:31.401 INFO:teuthology.task.workunit.client.0.err:+ dctime2=1371778452 2013-06-20T18:34:31.402 INFO:teuthology.task.workunit.client.0.err:+ test_check 1371778452 -eq 1371778452
from original mds
2013-06-20 18:34:12.385655 7f73adc35700 1 -- 10.214.132.10:6813/18513 <== client.4129 10.214.132.11:0/19549 368 ==== client_request(client.4129:709 create #1000000014b/fstest_28f94f92cf9776f020b4e141e70ad403) v1 ==== 197+0+0 (1269283097 0 0) 0x322a280 con 0x32da420 2013-06-20 18:34:12.388626 7f73adc35700 1 -- 10.214.132.10:6813/18513 --> 10.214.132.11:0/19549 -- client_reply(???:709 = 0 Success unsafe) v1 -- ?+0 0x2e85b00 con 0x32da420 (then it restarts)
ubuntu@teuthology:/a/sage-2013-06-20_18:13:54-fs-wip-mds-testing-basic/41286
Updated by Sage Weil almost 11 years ago
- Priority changed from Urgent to High
probably need to extend the replayed request message to include the timestamps we got for the inode and dir so that the mds can reuse them.
Updated by Sage Weil almost 10 years ago
- Status changed from 12 to Resolved
commit:f81c53a716f43c8a36d6b2aea88cbebe961502d8