Bug #12971
TestQuotaFull fails
Status: Closed
Description
2015-09-04T08:46:56.854 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2015-09-04T08:46:56.854 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/cephfs/test_full.py", line 202, in test_full_different_file
2015-09-04T08:46:56.854 INFO:tasks.cephfs_test_runner:    self._test_full(True)
2015-09-04T08:46:56.854 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/cephfs/test_full.py", line 187, in _test_full
2015-09-04T08:46:56.855 INFO:tasks.cephfs_test_runner:    timeout=osd_mon_report_interval_max * 5)
2015-09-04T08:46:56.855 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/cephfs/cephfs_test_case.py", line 205, in wait_until_true
2015-09-04T08:46:56.855 INFO:tasks.cephfs_test_runner:    raise RuntimeError("Timed out after {0} seconds".format(elapsed))
2015-09-04T08:46:56.855 INFO:tasks.cephfs_test_runner:RuntimeError: Timed out after 25 seconds
2015-09-04T08:46:56.855 INFO:tasks.cephfs_test_runner:
2015-09-04T08:46:56.856 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2015-09-04T08:46:56.856 INFO:tasks.cephfs_test_runner:Ran 7 tests in 518.185s
2015-09-04T08:46:56.856 INFO:tasks.cephfs_test_runner:
2015-09-04T08:46:56.856 INFO:tasks.cephfs_test_runner:FAILED (errors=1)
2015-09-04T08:46:56.856 INFO:tasks.cephfs_test_runner:
2015-09-04T08:46:56.857 INFO:tasks.cephfs_test_runner:======================================================================
2015-09-04T08:46:56.857 INFO:tasks.cephfs_test_runner:ERROR: test_full_different_file (tasks.cephfs.test_full.TestQuotaFull)
2015-09-04T08:46:56.857 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2015-09-04T08:46:56.857 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2015-09-04T08:46:56.857 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/cephfs/test_full.py", line 202, in test_full_different_file
2015-09-04T08:46:56.857 INFO:tasks.cephfs_test_runner:    self._test_full(True)
2015-09-04T08:46:56.858 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/cephfs/test_full.py", line 187, in _test_full
2015-09-04T08:46:56.858 INFO:tasks.cephfs_test_runner:    timeout=osd_mon_report_interval_max * 5)
2015-09-04T08:46:56.858 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/cephfs/cephfs_test_case.py", line 205, in wait_until_true
2015-09-04T08:46:56.858 INFO:tasks.cephfs_test_runner:    raise RuntimeError("Timed out after {0} seconds".format(elapsed))
2015-09-04T08:46:56.858 INFO:tasks.cephfs_test_runner:RuntimeError: Timed out after 25 seconds
Updated by Zheng Yan over 8 years ago
2015-09-04 11:46:30.042620 7f00e7929700 10 mds.0.cache.strays _consume: allocating allowance 5 to 0 in flight
2015-09-04 11:46:30.042624 7f00e7929700 10 mds.0.cache.strays purge [dentry #100/stray1/10000000000 [2,head] auth (dversion lock) v=23 inode=0x7f00f439c320 state=new | request=0 lock=0 inodepin=1 purging=1 dirty=1 authpin=0 0x7f00f43d12c0] [inode 10000000000 [2,head] ~mds0/stray1/10000000000 auth v23 s=16777216 nl=0 n(v0 b16777216 1=1+0) (iversion lock) | ptrwaiter=0 request=0 lock=0 caps=0 dirtyparent=0 dirty=1 waiter=0 authpin=0 0x7f00f439c320]
2015-09-04 11:46:30.042658 7f00e7929700 10 mds.0.cache.strays realm snaprealm(100 seq 1 lc 0 cr 0 cps 1 snaps={} 0x7f00f421ba80)
2015-09-04 11:46:30.042662 7f00e7929700 10 mds.0.cache.strays purge 0~16777216 objects 0~4 snapc 0x7f00f421bc00 on [inode 10000000000 [2,head] ~mds0/stray1/10000000000 auth v23 s=16777216 nl=0 n(v0 b16777216 1=1+0) (iversion lock) | ptrwaiter=0 request=0 lock=0 caps=0 dirtyparent=0 dirty=1 waiter=0 authpin=0 0x7f00f439c320]
2015-09-04 11:46:30.042696 7f00e7929700  0 mds.0.objecter FULL, paused modify 0x7f00f421e300 tid 55
2015-09-04 11:46:30.042709 7f00e7929700  0 mds.0.objecter FULL, paused modify 0x7f00f421e540 tid 56
2015-09-04 11:46:30.042719 7f00e7929700  0 mds.0.objecter FULL, paused modify 0x7f00f4219f80 tid 57
2015-09-04 11:46:30.042728 7f00e7929700  0 mds.0.objecter FULL, paused modify 0x7f00f421e0c0 tid 58
It looks like the objecter is pausing the OSD requests that delete the objects.
Updated by Zheng Yan over 8 years ago
- Assignee set to John Spray
The objecter pauses all write OSD requests (including delete requests) when the cluster is full or the pool quota is reached. The test case 'test_full_different_file' first writes data to files until the pool quota is reached, then deletes the files and expects the pool's full flag to get cleared. This isn't supposed to work, because the MDS's objecter pauses the very requests that purge the strays.
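For illustration only, a minimal sketch of the pause rule described above; the struct and function names are made up and this is not the actual Objecter code:

```cpp
#include <cstdint>
#include <set>

// Hypothetical sketch of the pause rule described above; the real logic
// lives in the Objecter (src/osdc/Objecter.cc) and is more involved.
struct Op {
  bool is_write;     // deletes count as writes: they modify the pool
  int64_t pool_id;
};

struct ClusterState {
  bool osdmap_full;                // cluster-wide FULL flag
  std::set<int64_t> full_pools;    // pools that have hit their quota
};

// A write (including a delete) is held back while the cluster or its
// target pool is full, which is why deleting files cannot free space here.
bool should_pause(const Op& op, const ClusterState& s) {
  if (!op.is_write)
    return false;                  // reads are unaffected
  return s.osdmap_full || s.full_pools.count(op.pool_id) > 0;
}
```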
Updated by Greg Farnum over 8 years ago
That's not quite the problem. The MDS calls Objecter::unset_honor_osdmap_full(), at which point the Objecter should let write operations pass through. Perhaps we're not setting the flag early enough in startup and this was during a replay or something?
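A sketch of the mechanism Greg is referring to, with assumed member names (only Objecter::unset_honor_osdmap_full() comes from the comment above): the MDS clears an "honor full" flag on its objecter so that its own writes, such as stray purges, are not paused by the FULL flag. If that flag were set too late in startup, ops queued earlier could still end up paused.

```cpp
// Sketch only; honor_osdmap_full_ and should_pause_write() are assumed
// names for illustration, not the exact Objecter internals.
class ObjecterSketch {
  bool honor_osdmap_full_ = true;    // default: respect the osdmap FULL flag
public:
  // The MDS calls this so its purge/journal writes bypass the FULL pause.
  void unset_honor_osdmap_full() { honor_osdmap_full_ = false; }

  bool should_pause_write(bool osdmap_full) const {
    // With the flag cleared, writes pass through even when the map is FULL.
    return honor_osdmap_full_ && osdmap_full;
  }
};
```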
Updated by John Spray over 8 years ago
Greg's theory sounds plausible, will look
Updated by John Spray over 8 years ago
- Status changed from New to In Progress
Looks like this commit might have broken it:
commit dbcf2e40d3e8a92f280879c74b7ca954a902b2d1
Author: xinxin shu <xinxin.shu@intel.com>
Date:   Thu Jun 18 09:44:47 2015 +0800

    Fixes: #12018  resend writes after pool loses full flag

    Signed-off-by: xinxin shu <xinxin.shu@intel.com>
as it introduced an unguarded read of the FULL flag that didn't respect the honor_osdmap_full setting.
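A hypothetical before/after illustration of that kind of regression (not the actual diff): a check of the map's FULL flag that ignores whether the client opted out via unset_honor_osdmap_full().

```cpp
// Hypothetical illustration of an unguarded vs. guarded read of the FULL flag.

// Unguarded: decides purely from the map flag, so an MDS that called
// unset_honor_osdmap_full() still has its ops treated as blocked.
bool blocked_by_full_unguarded(bool osdmap_full, bool /*honor_osdmap_full*/) {
  return osdmap_full;
}

// Guarded: a client that opted out of honoring the FULL flag is never
// considered blocked by fullness.
bool blocked_by_full_guarded(bool osdmap_full, bool honor_osdmap_full) {
  return honor_osdmap_full && osdmap_full;
}
```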
Updated by John Spray over 8 years ago
- Status changed from In Progress to Fix Under Review
Updated by Greg Farnum over 8 years ago
- Status changed from Fix Under Review to Resolved