Bug #52487
qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation)
% Done:
100%
Source:
Q/A
Tags:
Backport:
pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
fs
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
See this in https://pulpito.ceph.com/yuriw-2021-08-31_15:01:19-fs-wip-yuri8-testing-2021-08-30-0930-pacific-distro-basic-smithi/6368798/ when testing https://trello.com/c/2djzuDQp/1324-wip-yuri8-testing-2021-08-30-0930-pacific-old-wip-yuri8-testing-2021-08-26-1210-pacific
2021-08-31T22:10:51.883 INFO:tasks.cephfs_test_runner:test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) ... FAIL
2021-08-31T22:10:51.890 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.891 INFO:tasks.cephfs_test_runner:======================================================================
2021-08-31T22:10:51.891 INFO:tasks.cephfs_test_runner:FAIL: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation)
2021-08-31T22:10:51.892 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-08-31T22:10:51.892 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-08-31T22:10:51.893 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_d209ab89b7d476beb1df3653db8af7dab734e135/qa/tasks/cephfs/test_fragment.py", line 231, in test_deep_split
2021-08-31T22:10:51.894 INFO:tasks.cephfs_test_runner:    self.assertListEqual(frag_objs, [])
2021-08-31T22:10:51.895 INFO:tasks.cephfs_test_runner:AssertionError: Lists differ: ['10000000000.05980000', '10000000000.05a0[745 chars]000'] != []
2021-08-31T22:10:51.895 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:First list contains 33 additional elements.
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:First extra element 0:
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:'10000000000.05980000'
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.897 INFO:tasks.cephfs_test_runner:Diff is 896 characters long. Set self.maxDiff to None to see it.
2021-08-31T22:10:51.897 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.897 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-08-31T22:10:51.898 INFO:tasks.cephfs_test_runner:Ran 1 test in 98.090s
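For context, the assertion at test_fragment.py line 231 compares a listing of dirfrag objects in the metadata pool against the empty list. Below is a minimal standalone sketch of an equivalent check using the `rados` CLI directly; it is an illustration, not the qa framework's actual code — the pool name "cephfs_metadata" and the helper name are assumptions, while the directory inode 0x10000000000 is the one visible in the traceback above.

    import subprocess

    def list_dirfrag_objects(pool, dir_ino):
        # "rados -p <pool> ls" lists every object in the pool; dirfrag
        # objects are named "<inode hex>.<frag id>", e.g.
        # "10000000000.05980000".
        out = subprocess.check_output(["rados", "-p", pool, "ls"], text=True)
        prefix = "%x." % dir_ino
        return [o for o in out.splitlines() if o.startswith(prefix)]

    frag_objs = list_dirfrag_objects("cephfs_metadata", 0x10000000000)
    assert frag_objs == [], "leftover dirfrag objects: %r" % frag_objs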
Related issues
History
#1 Updated by Patrick Donnelly about 2 years ago
- Status changed from New to Triaged
- Assignee set to Venky Shankar
- Target version set to v17.0.0
- Backport set to pacific
#2 Updated by Venky Shankar about 2 years ago
- Status changed from Triaged to In Progress
The check here [0] results in `num_strays` being zero right after the journal was flushed:
2021-08-31T22:10:41.847 DEBUG:teuthology.orchestra.run.smithi096:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 900 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.c.asok --format=json perf dump mds_cache
2021-08-31T22:10:42.035 INFO:teuthology.orchestra.run.smithi096.stdout:{"mds_cache":{"num_strays":0,"num_strays_delayed":0,"num_strays_enqueuing":0,"strays_created":2401,"strays_enqueued":2401,"strays_reintegrated":0,"strays_migrated":0,"num_recovering_processing":0,"num_recovering_enqueued":0,"num_recovering_prioritized":0,"recovery_started":0,"recovery_completed":0,"ireq_enqueue_scrub":0,"ireq_exportdir":0,"ireq_flush":0,"ireq_fragmentdir":32,"ireq_fragstats":0,"ireq_inodestats":0}}
2021-08-31T22:10:42.035 DEBUG:tasks.cephfs.filesystem:_json_asok output
{
    "mds_cache": {
        "ireq_enqueue_scrub": 0,
        "ireq_exportdir": 0,
        "ireq_flush": 0,
        "ireq_fragmentdir": 32,
        "ireq_fragstats": 0,
        "ireq_inodestats": 0,
        "num_recovering_enqueued": 0,
        "num_recovering_prioritized": 0,
        "num_recovering_processing": 0,
        "num_strays": 0,
        "num_strays_delayed": 0,
        "num_strays_enqueuing": 0,
        "recovery_completed": 0,
        "recovery_started": 0,
        "strays_created": 2401,
        "strays_enqueued": 2401,
        "strays_migrated": 0,
        "strays_reintegrated": 0
    }
}
Which means either the check was done too soon (i.e., relying on `num_strays` from `perf dump` can be misleading) or there is a bug in the accounting of `num_strays`.
The objects were still in the midst of being purged while the subsequent object listing check was running (mds.c.log):
...
...
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05b80000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05b00000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05a80000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05a00000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05980000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05900000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05880000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05800000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05780000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05700000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item: remove dirfrag 10000000000.05680000
...
...
[0]: https://github.com/ceph/ceph/blob/master/qa/tasks/cephfs/test_fragment.py#L219
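Given the race described above, one way to make the check robust is to poll until the purge queue has actually removed the dirfrag objects, rather than sampling `num_strays` once right after the journal flush. The sketch below only illustrates that idea, reusing the hypothetical list_dirfrag_objects() helper from the description; it is not the actual change in the referenced pull request, and the timeout values are arbitrary.

    import time

    def wait_for_dirfrags_purged(pool, dir_ino, timeout=120, interval=5):
        # Purging is asynchronous and may lag the journal flush, so poll
        # the metadata pool until every "<ino>.<frag>" object is gone or
        # the deadline expires.
        deadline = time.monotonic() + timeout
        remaining = list_dirfrag_objects(pool, dir_ino)
        while remaining and time.monotonic() < deadline:
            time.sleep(interval)
            remaining = list_dirfrag_objects(pool, dir_ino)
        if remaining:
            raise RuntimeError("dirfrag objects not purged within %ss: %r"
                               % (timeout, remaining))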
#3 Updated by Venky Shankar about 2 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 44063
#4 Updated by Venky Shankar almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
#5 Updated by Backport Bot almost 2 years ago
- Copied to Backport #53458: pacific: pacific: qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) added
#6 Updated by Backport Bot over 1 year ago
- Tags set to backport_processed
#7 Updated by Venky Shankar about 1 year ago
- Subject changed from pacific: qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) to qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation)
- Status changed from Pending Backport to Resolved
#8 Updated by Konstantin Shalygin about 1 year ago
- % Done changed from 0 to 100
- Tags deleted (backport_processed)