Bug #56097

Timeout on `sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.1 flush_pg_stats`

Added by Laura Flores 3 months ago. Updated about 1 month ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2022-06-16_18:33:18-rados-wip-yuri5-testing-2022-06-16-0649-distro-default-smithi/6882594

2022-06-16T19:51:06.998 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:50:44.256+0000 7f9b141f8700 -1 bdev(0x5613befcfc00 /var/lib/ceph/osd/ceph-1/block) aio_write objectstore_blackhole=true, throwing out IO
2022-06-16T19:51:06.999 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:50:44.256+0000 7f9b141f8700 -1 bdev(0x5613befcfc00 /var/lib/ceph/osd/ceph-1/block) aio_write objectstore_blackhole=true, throwing out IO
2022-06-16T19:51:06.999 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:50:44.256+0000 7f9b141f8700 -1 bdev(0x5613befcfc00 /var/lib/ceph/osd/ceph-1/block) aio_write objectstore_blackhole=true, throwing out IO
2022-06-16T19:51:06.999 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:50:44.256+0000 7f9b141f8700 -1 bdev(0x5613befcfc00 /var/lib/ceph/osd/ceph-1/block) aio_write objectstore_blackhole=true, throwing out IO

...

:43.880388+0000 (oldest deadline 2022-06-16T19:51:09.780411+0000)
2022-06-16T19:51:18.463 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:51:18.453+0000 7f9e45af84c0 -1 bluestore::NCB::__restore_allocator::No Valid allocation info on disk (empty file)
2022-06-16T19:51:18.921 INFO:tasks.ceph.osd.0.smithi160.stderr:2022-06-16T19:51:18.909+0000 7fe03df39700 -1 osd.0 20 heartbeat_check: no reply from 172.21.15.160:6806 osd.1 since back 2022-06-16T19:50:43.880447+0000 front 2022-06-16T19:50:43.880388+0000 (oldest deadline 2022-06-16T19:51:09.780411+0000)
2022-06-16T19:51:19.976 INFO:tasks.ceph.osd.0.smithi160.stderr:2022-06-16T19:51:19.937+0000 7fe03df39700 -1 osd.0 20 heartbeat_check: no reply from 172.21.15.160:6806 osd.1 since back 2022-06-16T19:50:43.880447+0000 front 2022-06-16T19:50:43.880388+0000 (oldest deadline 2022-06-16T19:51:09.780411+0000)
2022-06-16T19:51:20.468 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:51:20.469+0000 7f9e45af84c0 -1 osd.1 20 log_to_monitors true
2022-06-16T19:51:20.946 INFO:tasks.ceph.osd.0.smithi160.stderr:2022-06-16T19:51:20.917+0000 7fe03df39700 -1 osd.0 20 heartbeat_check: no reply from 172.21.15.160:6806 osd.1 since back 2022-06-16T19:50:43.880447+0000 front 2022-06-16T19:50:43.880388+0000 (oldest deadline 2022-06-16T19:51:09.780411+0000)
2022-06-16T19:51:21.396 INFO:tasks.ceph.osd.1.smithi160.stderr:2022-06-16T19:51:21.373+0000 7f9e3b03e700 -1 osd.1 20 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
2022-06-16T19:51:21.919 INFO:tasks.ceph.osd.0.smithi160.stderr:2022-06-16T19:51:21.913+0000 7fe03df39700 -1 osd.0 21 heartbeat_check: no reply from 172.21.15.160:6806 osd.1 since back 2022-06-16T19:50:43.880447+0000 front 2022-06-16T19:50:43.880388+0000 (oldest deadline 2022-06-16T19:51:09.780411+0000)
2022-06-16T19:51:22.830 DEBUG:teuthology.orchestra.run.smithi160:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 0 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok dump_ops_in_flight
2022-06-16T19:51:22.963 INFO:teuthology.orchestra.run.smithi160.stdout:{
2022-06-16T19:51:22.963 INFO:teuthology.orchestra.run.smithi160.stdout:    "ops": [],
2022-06-16T19:51:22.963 INFO:teuthology.orchestra.run.smithi160.stdout:    "num_ops": 0
2022-06-16T19:51:22.965 INFO:teuthology.orchestra.run.smithi160.stdout:}
2022-06-16T19:51:22.965 INFO:tasks.osd_recovery:err is 0
2022-06-16T19:51:22.966 DEBUG:teuthology.orchestra.run.smithi160:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.0 flush_pg_stats
2022-06-16T19:51:22.967 DEBUG:teuthology.orchestra.run.smithi160:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.1 flush_pg_stats
2022-06-16T19:51:23.077 INFO:teuthology.orchestra.run.smithi160.stderr:Error ENXIO: problem getting command descriptions from osd.1
2022-06-16T19:51:23.094 DEBUG:teuthology.orchestra.run:got remote process result: 6
2022-06-16T19:51:23.095 ERROR:teuthology.run_tasks:Saw exception from tasks.

According to the Sentry history, the first sighting of this failure occurred on Jun 4, 2022 3:21:22 PM UTC on a Pacific test branch:
/a/yuriw-2022-06-04_14:50:12-rados-wip-yuri2-testing-2022-06-03-1350-pacific-distro-default-smithi/6863230
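
For triaging similar runs, here is a minimal sketch of how a teuthology log could be scanned for this pattern: a `ceph tell osd.N ...` command followed by `Error ENXIO: problem getting command descriptions from osd.N`. The function name `find_enxio_tells` and the regular expressions are hypothetical helpers written against the line layout in the excerpt above; they are not part of teuthology and may need adjusting for other log layouts.

```python
import re

# Teuthology prefix as it appears in the excerpt: "<timestamp> <LEVEL>:<logger>:<message>".
LINE_RE = re.compile(r"^(\S+) (?:DEBUG|INFO|ERROR):([^:]+):(.*)$")
# A remote `ceph tell osd.N <subcommand>` invocation (assumed layout, taken from the excerpt).
TELL_RE = re.compile(r"ceph --cluster \S+ tell (osd\.\d+) (\S+)")
# The error text reported on stderr in this bug.
ENXIO_RE = re.compile(
    r"Error ENXIO: problem getting command descriptions from (osd\.\d+)"
)


def find_enxio_tells(lines):
    """Return (osd, subcommand, issued_at, failed_at) for each ENXIO error
    that follows a `ceph tell` issued to the same OSD."""
    last_tell = {}  # osd name -> (timestamp, subcommand) of the latest tell
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # truncated or non-teuthology line
        ts, _logger, msg = m.groups()
        tell = TELL_RE.search(msg)
        if tell and msg.startswith("> "):
            # "> " marks a command that teuthology ran on the remote host.
            last_tell[tell.group(1)] = (ts, tell.group(2))
            continue
        err = ENXIO_RE.search(msg)
        if err and err.group(1) in last_tell:
            issued_at, subcommand = last_tell[err.group(1)]
            hits.append((err.group(1), subcommand, issued_at, ts))
    return hits


# Three lines copied from the excerpt above.
SAMPLE = [
    "2022-06-16T19:51:22.966 DEBUG:teuthology.orchestra.run.smithi160:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.0 flush_pg_stats",
    "2022-06-16T19:51:22.967 DEBUG:teuthology.orchestra.run.smithi160:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.1 flush_pg_stats",
    "2022-06-16T19:51:23.077 INFO:teuthology.orchestra.run.smithi160.stderr:Error ENXIO: problem getting command descriptions from osd.1",
]

for hit in find_enxio_tells(SAMPLE):
    print(hit)
```

On the three sample lines this reports a single hit for `osd.1`, with the `flush_pg_stats` command issued at 19:51:22.967 and the ENXIO error logged at 19:51:23.077.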

History

#1 Updated by Laura Flores 3 months ago

  • Backport set to pacific

#2 Updated by Kamoltat Sirivadhna about 2 months ago

  • Related to Bug #57015: bluestore::NCB::__restore_allocator::No Valid allocation info on disk (empty file) added

#3 Updated by Laura Flores about 2 months ago

This one went dead after a while:
/a/yuriw-2022-08-04_20:43:31-rados-wip-yuri6-testing-2022-08-04-0617-pacific-distro-default-smithi/6958619

#4 Updated by Laura Flores about 2 months ago

  • Assignee set to Adam Kupczyk

@Adam, maybe you have an idea of what's going on here?

#5 Updated by Neha Ojha about 1 month ago

  • Related to deleted (Bug #57015: bluestore::NCB::__restore_allocator::No Valid allocation info on disk (empty file))

#6 Updated by Neha Ojha about 1 month ago

NCB was added in Quincy, so I'm breaking the relationship with Bug #57015.
