
Bug #27053

qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"

Added by Yuri Weinstein over 5 years ago. Updated over 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
kcephfs, rados
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is for 12.2.8.

Run: http://pulpito.ceph.com/yuriw-2018-08-21_16:17:40-rados-luminous-distro-basic-smithi/
Job: 2932441
Logs: http://qa-proxy.ceph.com/teuthology/yuriw-2018-08-21_16:17:40-rados-luminous-distro-basic-smithi/2932441/teuthology.log

2018-08-21T22:49:38.349 INFO:tasks.ceph.osd.3.smithi196.stderr:2018-08-21 22:49:38.351396 7f34b3ccd700 -1 received  signal: Hangup from  PID: 34513 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 3  UID: 0
2018-08-21T22:49:38.361 INFO:teuthology.orchestra.run.smithi196:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok dump_historic_ops'
2018-08-21T22:49:38.455 INFO:teuthology.orchestra.run.smithi143.stderr:2018-08-21 22:49:38.456908 7f0bb1fa1700 -1 WARNING: all dangerous and experimental features are enabled.
2018-08-21T22:49:38.467 INFO:tasks.ceph.osd.4.smithi196.stderr:2018-08-21 22:49:38.469281 7f59672ad700 -1 received  signal: Hangup from  PID: 12522 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 4  UID: 0
2018-08-21T22:49:38.472 INFO:teuthology.orchestra.run.smithi143.stderr:2018-08-21 22:49:38.474731 7f0bb1fa1700 -1 WARNING: all dangerous and experimental features are enabled.
2018-08-21T22:49:38.494 INFO:teuthology.orchestra.run.smithi143.stderr:osd.0: osd_enable_op_tracker = 'true'
2018-08-21T22:49:38.510 INFO:teuthology.orchestra.run.smithi143.stderr:osd.1: osd_enable_op_tracker = 'true'
2018-08-21T22:49:38.521 INFO:teuthology.orchestra.run.smithi143.stderr:osd.2: osd_enable_op_tracker = 'true'
2018-08-21T22:49:38.537 INFO:teuthology.orchestra.run.smithi143.stderr:osd.3: osd_enable_op_tracker = 'true'
2018-08-21T22:49:38.551 INFO:tasks.ceph.osd.3.smithi196.stderr:2018-08-21 22:49:38.553689 7f34b3ccd700 -1 received  signal: Hangup from  PID: 34513 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 3  UID: 0
2018-08-21T22:49:38.553 INFO:teuthology.orchestra.run.smithi143.stderr:osd.4: osd_enable_op_tracker = 'true'
2018-08-21T22:49:38.556 INFO:teuthology.orchestra.run.smithi143:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok dump_ops_in_flight'
2018-08-21T22:49:38.563 INFO:tasks.radosbench.radosbench.0.smithi196.stdout:22266      16    648074    648058   29.1013         0           -   0.0123366
2018-08-21T22:49:38.566 INFO:teuthology.orchestra.run.smithi143.stderr:osd.5: osd_enable_op_tracker = 'true'
2018-08-21T22:49:38.653 INFO:tasks.ceph.osd.4.smithi196.stderr:2018-08-21 22:49:38.654849 7f59672ad700 -1 received  signal: Hangup from  PID: 12522 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 4  UID: 0
2018-08-21T22:49:38.682 INFO:teuthology.orchestra.run.smithi143:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok dump_blocked_ops'
2018-08-21T22:49:38.686 ERROR:teuthology.run_tasks:Manager failed: radosbench
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 159, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/github.com_ceph_ceph_luminous/qa/tasks/radosbench.py", line 132, in task
    run.wait(radosbench.itervalues(), timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 441, in wait
    check_time()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/contextutil.py", line 132, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: reached maximum tries (3650) after waiting for 21900 seconds
2018-08-21T22:49:38.687 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2018-08-21T22:49:38.927 INFO:tasks.thrashosds:joining thrashosds
2018-08-21T22:49:38.928 INFO:tasks.ceph.osd.3.smithi196.stderr:2018-08-21 22:49:38.890808 7f34c1dea700 -1 log_channel(cluster) log [ERR] : 2.0 has 1 objects unfound and apparently lost
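The MaxWhileTries failure in the traceback above comes from teuthology's bounded-wait helper: radosbench waits on its subprocesses via a polling loop that gives up after a fixed number of tries. The numbers in the error are consistent with a 6-second poll interval (3650 × 6 = 21900 seconds). A minimal sketch of that pattern follows; this is a simplified illustration of the mechanism, not teuthology's actual `contextutil` code, and the class and parameter names here are hypothetical:

```python
import time


class MaxWhileTries(Exception):
    """Raised when the bounded wait loop exhausts its tries."""


class SafeWhile:
    """Callable poll guard: sleep `sleep` seconds per call, give up
    after `tries` calls.

    With sleep=6 and tries=3650 the loop gives up after
    6 * 3650 = 21900 seconds, matching the error in the log.
    """

    def __init__(self, sleep=6, tries=3650):
        self.sleep = sleep
        self.tries = tries
        self.counter = 0

    def __call__(self):
        self.counter += 1
        if self.counter > self.tries:
            # Raise before sleeping once the budget is spent.
            raise MaxWhileTries(
                "reached maximum tries (%d) after waiting for %d seconds"
                % (self.tries, self.sleep * self.tries))
        time.sleep(self.sleep)
        return True


# Usage sketch: poll some condition until it holds or the budget runs out.
# check_time = SafeWhile()
# while not job_finished():
#     check_time()
```

In the real failure, the radosbench processes never finished within the budget, so the guard raised and unwound the thrashosds manager, at which point the unfound-objects error surfaced in the OSD log.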


Related issues

Duplicated by CephFS - Bug #46608: qa: thrashosds: log [ERR] : 4.0 has 3 objects unfound and apparently lost Duplicate

History

#1 Updated by Yuri Weinstein over 5 years ago

  • Project changed from CephFS to RADOS

#2 Updated by Neha Ojha over 5 years ago

Similar failure seen in mimic: /a/yuriw-2018-08-21_23:27:39-rados-wip-yuri5-testing-2018-08-21-2033-mimic-distro-basic-smithi/2933806/

Test: rados/singleton/{all/thrash-eio.yaml msgr-failures/many.yaml msgr/random.yaml objectstore/bluestore.yaml rados.yaml}

#3 Updated by Neha Ojha over 5 years ago

  • Priority changed from Urgent to Normal

#4 Updated by Patrick Donnelly over 3 years ago

  • Duplicated by Bug #46608: qa: thrashosds: log [ERR] : 4.0 has 3 objects unfound and apparently lost added

#5 Updated by Patrick Donnelly over 3 years ago

  • Subject changed from "[ERR] : 2.0 has 1 objects unfound and apparently lost" in rados to qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
  • ceph-qa-suite kcephfs added

/ceph/teuthology-archive/pdonnell-2020-07-17_01:54:54-kcephfs-wip-pdonnell-testing-20200717.003135-distro-basic-smithi/5233528/teuthology.log

Seen in the master integration branch.

#6 Updated by Brad Hubbard over 3 years ago

  • Priority changed from Normal to High
  • Backport set to octopus

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224148

#7 Updated by Neha Ojha over 3 years ago

  • Priority changed from High to Normal

#8 Updated by Deepika Upadhyay over 2 years ago

2021-10-19T15:06:11.610 INFO:tasks.ceph.osd.4.smithi157.stderr:2021-10-19T15:06:11.605+0000 7f0129f1f700 -1 received  signal: Hangup from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster ceph -i 4  (PID: 28388) UID: 0
2021-10-19T15:06:11.610 INFO:tasks.radosbench.radosbench.0.smithi157.stdout:22325       1     11689     11688 0.0327173         0           -     2.10788
2021-10-19T15:06:11.648 INFO:teuthology.orchestra.run.smithi157.stdout:ERROR: (22) Invalid argument
2021-10-19T15:06:11.648 INFO:teuthology.orchestra.run.smithi157.stdout:op_tracker tracking is not enabled now, so no ops are tracked currently, even those get stuck. Please enable "osd_enable_op_tracker", and the tracker will start to track new ops received afterwards.
2021-10-19T15:06:11.658 DEBUG:teuthology.orchestra.run.smithi157:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_ops_in_flight
2021-10-19T15:06:11.663 INFO:tasks.ceph.osd.0.smithi061.stderr:2021-10-19T15:06:11.659+0000 7fade0c56700 -1 log_channel(cluster) log [ERR] : 2.e has 1 objects unfound and apparently lost

/ceph/teuthology-archive/yuriw-2021-10-18_19:03:43-rados-wip-yuri5-testing-2021-10-18-0906-octopus-distro-basic-smithi/6449404/teuthology.log

#9 Updated by Neha Ojha over 2 years ago

Deepika Upadhyay wrote:

[...]

/ceph/teuthology-archive/yuriw-2021-10-18_19:03:43-rados-wip-yuri5-testing-2021-10-18-0906-octopus-distro-basic-smithi/6449404/teuthology.log

The issue is https://tracker.ceph.com/issues/49888.

Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_b35344e81ac507b6cad7c1cae575ce08b9c766f2/teuthology/run_tasks.py", line 176, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
    next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_cd530917c8d3341fa3f414f3db51aa7b9cdf2d6a/qa/tasks/radosbench.py", line 141, in task
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_b35344e81ac507b6cad7c1cae575ce08b9c766f2/teuthology/orchestra/run.py", line 473, in wait
    check_time()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_b35344e81ac507b6cad7c1cae575ce08b9c766f2/teuthology/contextutil.py", line 133, in __call__
    raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: reached maximum tries (3650) after waiting for 21900 seconds
