Bug #52487

qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation)

Added by Ramana Raja over 1 year ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
% Done: 100%
Source: Q/A
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Seen in https://pulpito.ceph.com/yuriw-2021-08-31_15:01:19-fs-wip-yuri8-testing-2021-08-30-0930-pacific-distro-basic-smithi/6368798/ when testing https://trello.com/c/2djzuDQp/1324-wip-yuri8-testing-2021-08-30-0930-pacific-old-wip-yuri8-testing-2021-08-26-1210-pacific

2021-08-31T22:10:51.883 INFO:tasks.cephfs_test_runner:test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) ... FAIL
2021-08-31T22:10:51.890 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.891 INFO:tasks.cephfs_test_runner:======================================================================
2021-08-31T22:10:51.891 INFO:tasks.cephfs_test_runner:FAIL: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation)
2021-08-31T22:10:51.892 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-08-31T22:10:51.892 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-08-31T22:10:51.893 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_d209ab89b7d476beb1df3653db8af7dab734e135/qa/tasks/cephfs/test_fragment.py", line 231, in test_deep_split
2021-08-31T22:10:51.894 INFO:tasks.cephfs_test_runner:    self.assertListEqual(frag_objs, [])
2021-08-31T22:10:51.895 INFO:tasks.cephfs_test_runner:AssertionError: Lists differ: ['10000000000.05980000', '10000000000.05a0[745 chars]000'] != []
2021-08-31T22:10:51.895 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:First list contains 33 additional elements.
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:First extra element 0:
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:'10000000000.05980000'
2021-08-31T22:10:51.896 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.897 INFO:tasks.cephfs_test_runner:Diff is 896 characters long. Set self.maxDiff to None to see it.
2021-08-31T22:10:51.897 INFO:tasks.cephfs_test_runner:
2021-08-31T22:10:51.897 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-08-31T22:10:51.898 INFO:tasks.cephfs_test_runner:Ran 1 test in 98.090s

Related issues

Copied to CephFS - Backport #53458: pacific: pacific: qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) Resolved

History

#1 Updated by Patrick Donnelly about 1 year ago

  • Status changed from New to Triaged
  • Assignee set to Venky Shankar
  • Target version set to v17.0.0
  • Backport set to pacific

#2 Updated by Venky Shankar about 1 year ago

  • Status changed from Triaged to In Progress

The check here [0] results in `num_strays` being zero right after the journal was flushed::

2021-08-31T22:10:41.847 DEBUG:teuthology.orchestra.run.smithi096:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 900 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.c.asok --format=json perf dump mds_cache
2021-08-31T22:10:42.035 INFO:teuthology.orchestra.run.smithi096.stdout:{"mds_cache":{"num_strays":0,"num_strays_delayed":0,"num_strays_enqueuing":0,"strays_created":2401,"strays_enqueued":2401,"strays_reintegrated":0,"strays_migrated":0,"num_recovering_processing":0,"num_recovering_enqueued":0,"num_recovering_prioritized":0,"recovery_started":0,"recovery_completed":0,"ireq_enqueue_scrub":0,"ireq_exportdir":0,"ireq_flush":0,"ireq_fragmentdir":32,"ireq_fragstats":0,"ireq_inodestats":0}}
2021-08-31T22:10:42.035 DEBUG:tasks.cephfs.filesystem:_json_asok output
{
  "mds_cache": {
    "ireq_enqueue_scrub": 0,
    "ireq_exportdir": 0,
    "ireq_flush": 0,
    "ireq_fragmentdir": 32,
    "ireq_fragstats": 0,
    "ireq_inodestats": 0,
    "num_recovering_enqueued": 0,
    "num_recovering_prioritized": 0,
    "num_recovering_processing": 0,
    "num_strays": 0,
    "num_strays_delayed": 0,
    "num_strays_enqueuing": 0,
    "recovery_completed": 0,
    "recovery_started": 0,
    "strays_created": 2401,
    "strays_enqueued": 2401,
    "strays_migrated": 0,
    "strays_reintegrated": 0
  }
}

This means either the check was done too soon (i.e., relying on `num_strays` from `perf dump` can be misleading) or there is a bug in the accounting of `num_strays`.
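To make the reading of those counters concrete, here is a small sketch that parses an abridged copy of the `perf dump mds_cache` output quoted above. The helper name `strays_handed_off` is hypothetical and not part of the qa suite. It only shows that the counters report nothing pending and that every created stray was enqueued; it says nothing about whether the purge queue has actually removed the objects yet.

```python
import json

# Abridged copy of the `perf dump mds_cache` output from the failed run.
PERF_DUMP = (
    '{"mds_cache":{"num_strays":0,"num_strays_delayed":0,'
    '"num_strays_enqueuing":0,"strays_created":2401,"strays_enqueued":2401}}'
)


def strays_handed_off(perf_dump_json):
    """Hypothetical helper: True if the counters show no pending strays
    and every created stray has been enqueued.

    This is a statement about the counters only. It does not prove that
    the backing objects are already gone from the pool.
    """
    cache = json.loads(perf_dump_json)["mds_cache"]
    nothing_pending = (
        cache["num_strays"] == 0
        and cache["num_strays_delayed"] == 0
        and cache["num_strays_enqueuing"] == 0
    )
    all_enqueued = cache["strays_created"] == cache["strays_enqueued"]
    return nothing_pending and all_enqueued


handed_off = strays_handed_off(PERF_DUMP)
```

For the dump above, `handed_off` is true even though, per the MDS log, the dirfrag objects were still being removed about a second later.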

The objects were still in the midst of being purged, though, while the subsequent object list check was ongoing (mds.c.log)::

...
...
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05b80000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05b00000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05a80000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05a00000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05980000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05900000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05880000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05800000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05780000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05700000
2021-08-31T22:10:43.080+0000 7fdf1a391700 10 mds.0.purge_queue _execute_item:  remove dirfrag 10000000000.05680000
...
...

[0]: https://github.com/ceph/ceph/blob/master/qa/tasks/cephfs/test_fragment.py#L219
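One way to avoid racing with the purge queue, if the check really is just too early, would be to poll the object listing until it is empty instead of asserting once. The following is only a sketch of that idea and is not taken from the actual fix. `wait_until_empty` and its `list_objects` callable are hypothetical names; in the test, `list_objects` would stand in for whatever produces `frag_objs`.

```python
import time


def wait_until_empty(list_objects, timeout=60.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll list_objects() until it returns an empty list.

    list_objects is a hypothetical callable returning the names of the
    dirfrag objects still present. Raises TimeoutError if objects remain
    once the deadline has passed.
    """
    deadline = clock() + timeout
    while True:
        remaining = list_objects()
        if not remaining:
            return
        if clock() >= deadline:
            raise TimeoutError(
                "%d objects still present, first: %s"
                % (len(remaining), remaining[0])
            )
        sleep(interval)
```

Polling the listing directly checks the condition the test cares about (the objects are gone) rather than inferring it from `num_strays`. It would not help if the root cause turns out to be an accounting bug in `num_strays` itself.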

#3 Updated by Venky Shankar about 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 44063

#4 Updated by Venky Shankar about 1 year ago

  • Status changed from Fix Under Review to Pending Backport

#5 Updated by Backport Bot about 1 year ago

  • Copied to Backport #53458: pacific: pacific: qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) added

#6 Updated by Backport Bot 4 months ago

  • Tags set to backport_processed

#7 Updated by Venky Shankar 3 months ago

  • Subject changed from pacific: qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation) to qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragmentation)
  • Status changed from Pending Backport to Resolved

#8 Updated by Konstantin Shalygin 2 months ago

  • % Done changed from 0 to 100
  • Tags deleted (backport_processed)
