Bug #43887 (closed): ceph_test_rados_delete_pools_parallel failure

Added by Sage Weil about 4 years ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID: 46099
Crash signature (v1):
Crash signature (v2):

Description

2020-01-29T03:10:09.647 INFO:teuthology.orchestra.run.smithi006:> sudo TESTDIR=/home/ubuntu/cephtest bash -c ceph_test_rados_delete_pools_parallel
2020-01-29T03:10:12.569 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: starting.
2020-01-29T03:10:12.570 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: creating pool ceph_test_rados_delete_pools_parallel.smithi006-26070
2020-01-29T03:10:12.570 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: created object 0...
2020-01-29T03:10:12.570 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: created object 25...
2020-01-29T03:10:12.570 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: created object 49...
2020-01-29T03:10:12.570 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: finishing.
2020-01-29T03:10:12.570 INFO:teuthology.orchestra.run.smithi006.stdout:process_1_[26071]: shutting down.
2020-01-29T03:10:13.367 INFO:tasks.mon_thrash.mon_thrasher:killing mon.d
2020-01-29T03:10:13.367 INFO:tasks.mon_thrash.mon_thrasher:reviving mon.d
2020-01-29T03:10:13.368 INFO:tasks.ceph.mon.d:Restarting daemon
2020-01-29T03:10:13.368 INFO:teuthology.orchestra.run.smithi062:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f --cluster ceph -i d
2020-01-29T03:10:13.371 INFO:tasks.ceph.mon.d:Started
2020-01-29T03:10:13.371 INFO:tasks.mon_thrash:Sending STOP to mon f
2020-01-29T03:10:13.371 INFO:tasks.ceph.mon.f:Sent signal 19
2020-01-29T03:10:13.371 INFO:tasks.mon_thrash.mon_thrasher:waiting for 20.0 secs to unfreeze mons
2020-01-29T03:10:33.372 INFO:tasks.mon_thrash:Sending CONT to mon f
2020-01-29T03:10:33.373 INFO:tasks.ceph.mon.f:Sent signal 18
2020-01-29T03:10:33.373 INFO:tasks.mon_thrash.ceph_manager:waiting for quorum size 9
2020-01-29T03:10:33.373 INFO:teuthology.orchestra.run.smithi062:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph quorum_status
2020-01-29T03:10:38.470 INFO:teuthology.orchestra.run.smithi006.stdout:process_2_[26072]: starting.
2020-01-29T03:10:38.470 INFO:teuthology.orchestra.run.smithi006.stdout:process_2_[26072]: deleting pool ceph_test_rados_delete_pools_parallel.smithi006-26070
2020-01-29T03:10:38.470 INFO:teuthology.orchestra.run.smithi006.stdout:process_2_[26072]: shutting down.
2020-01-29T03:10:38.494 INFO:teuthology.orchestra.run.smithi006.stdout:*******************************
2020-01-29T03:10:38.494 INFO:teuthology.orchestra.run.smithi006.stdout:process_3_[26112]: starting.
2020-01-29T03:10:38.494 INFO:teuthology.orchestra.run.smithi006.stdout:process_3_[26112]: creating pool ceph_test_rados_delete_pools_parallel.smithi006-26070
2020-01-29T03:10:38.495 INFO:teuthology.orchestra.run.smithi006.stdout:process_3_[26112]: rados_write(0.obj) failed with error: -2
2020-01-29T03:10:38.495 INFO:teuthology.orchestra.run.smithi006.stdout:process_3_[26112]: finishing.
2020-01-29T03:10:38.495 INFO:teuthology.orchestra.run.smithi006.stdout:process_3_[26112]: shutting down.
2020-01-29T03:10:38.495 INFO:teuthology.orchestra.run.smithi006.stdout:*******************************
2020-01-29T03:10:38.495 INFO:teuthology.orchestra.run.smithi006.stdout:test2: got error: run_until_finished: runnable process_3: got error: [26070] returned exit_status (254) Unknown error 254

/a/sage-2020-01-28_23:42:06-rados-wip-sage-testing-2020-01-28-1413-distro-basic-smithi/4715529
description: rados/monthrash/{ceph.yaml clusters/9-mons.yaml msgr-failures/mon-delay.yaml
msgr/async.yaml objectstore/bluestore-low-osd-mem-target.yaml rados.yaml supported-random-distro$/{centos_8.yaml}
thrashers/many.yaml workloads/pool-create-delete.yaml}
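
For context, the call sequence in the log maps onto the librados C API roughly as sketched below. This is a minimal illustrative sketch, not the actual test source (the real test is ceph_test_rados_delete_pools_parallel in the Ceph tree); the pool and object names, the hypothetical file name pool_race_sketch.c, and the simplified error handling are assumptions. It only shows where rados_write() can come back with -2 (-ENOENT) when concurrent processes delete and recreate a pool of the same name while a writer still holds an ioctx for it.

/* pool_race_sketch.c (hypothetical name) -- minimal sketch, not the real test. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <rados/librados.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t io;
    /* Illustrative pool name; the real test derives its name from host and pid. */
    const char *pool = "ceph_test_rados_delete_pools_parallel.example";
    char buf[128];
    int ret;

    /* Connect using the default ceph.conf search path and default client id. */
    if (rados_create(&cluster, NULL) < 0 ||
        rados_conf_read_file(cluster, NULL) < 0 ||
        rados_connect(cluster) < 0) {
        fprintf(stderr, "failed to connect to the cluster\n");
        return 1;
    }

    /* "creating pool ..." in the log. */
    ret = rados_pool_create(cluster, pool);
    if (ret < 0 && ret != -EEXIST)
        fprintf(stderr, "rados_pool_create failed: %d\n", ret);

    ret = rados_ioctx_create(cluster, pool, &io);
    if (ret < 0) {
        fprintf(stderr, "rados_ioctx_create failed: %d\n", ret);
        rados_shutdown(cluster);
        return 1;
    }

    /* "created object 0..." in the log. If another process deletes the pool
     * (or deletes and recreates it) between ioctx creation and this write,
     * the write can fail with -ENOENT (-2), which is the error seen above. */
    memset(buf, 'x', sizeof(buf));
    ret = rados_write(io, "0.obj", buf, sizeof(buf), 0);
    if (ret < 0)
        fprintf(stderr, "rados_write(0.obj) failed with error: %d\n", ret);

    rados_ioctx_destroy(io);
    rados_shutdown(cluster);
    return ret < 0 ? 1 : 0;
}

A possible build/run line, assuming librados headers are installed: cc pool_race_sketch.c -lrados -o pool_race_sketch, run against a throwaway test cluster.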

Related issues: 3 (2 open, 1 closed)

Related to RADOS - Bug #51246: error in open_pools_parallel: rados_write(0.obj) failed with error: -2 (New)

Related to RADOS - Bug #46318: mon_recovery: quorum_status times out (Need More Info, assigned to Sage Weil)

Has duplicate RADOS - Bug #45948: ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus (Duplicate)
Actions #1

Updated by Brad Hubbard almost 4 years ago

  • Has duplicate Bug #45948: ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus added
Actions #2

Updated by Deepika Upadhyay over 3 years ago

rados/monthrash/{ceph clusters/3-mons msgr-failures/few msgr/async objectstore/filestore-xfs
rados supported-random-distro$/{centos_latest} thrashers/sync-many workloads/pool-create-delete}

Also seen on octopus:
/a/yuriw-2020-10-05_22:17:06-rados-wip-yuri7-testing-2020-10-05-1338-octopus-distro-basic-smithi/5500559/teuthology.log

Actions #3

Updated by Sage Weil almost 3 years ago

  • Project changed from Ceph to RADOS
Actions #4

Updated by Deepika Upadhyay almost 3 years ago

  • Related to Bug #51246: error in open_pools_parallel: rados_write(0.obj) failed with error: -2 added
Actions #5

Updated by Aishwarya Mathuria over 2 years ago

/a/yuriw-2022-01-13_18:06:52-rados-wip-yuri3-testing-2022-01-13-0809-distro-default-smithi/6614510

Actions #6

Updated by Nitzan Mordechai about 2 years ago

  • Status changed from New to In Progress
  • Assignee set to Nitzan Mordechai
Actions #7

Updated by Radoslaw Zarzynski about 2 years ago

  • Priority changed from High to Normal

Lowering the priority, as the issue is neither:

  • causing data loss, nor
  • a frequent occurrence.
Actions #8

Updated by Nitzan Mordechai almost 2 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 46099
Actions #9

Updated by Kamoltat (Junior) Sirivadhna about 1 year ago

Encountered this error in: yuriw-2023-03-02_00:09:05-rados-wip-yuri11-testing-2023-03-01-1424-distro-default-smithi/7191503 PTAL

Actions #10

Updated by Laura Flores about 1 year ago

Kamoltat (Junior) Sirivadhna wrote:

Encountered this error in: yuriw-2023-03-02_00:09:05-rados-wip-yuri11-testing-2023-03-01-1424-distro-default-smithi/7191503 PTAL

From the teuthology log (31080 got the error):

2023-03-02T06:58:22.634 INFO:teuthology.orchestra.run.smithi150.stdout:*******************************
2023-03-02T06:58:22.635 INFO:teuthology.orchestra.run.smithi150.stdout:process_3_[31117]: starting.
2023-03-02T06:58:22.635 INFO:teuthology.orchestra.run.smithi150.stdout:process_3_[31117]: creating pool ceph_test_rados_delete_pools_parallel.smithi150-31080
2023-03-02T06:58:22.635 INFO:teuthology.orchestra.run.smithi150.stdout:process_3_[31117]: created object 0...
2023-03-02T06:58:22.635 INFO:teuthology.orchestra.run.smithi150.stdout:process_3_[31117]: rados_write(1.obj) failed with error: -2
2023-03-02T06:58:22.635 INFO:teuthology.orchestra.run.smithi150.stdout:process_3_[31117]: finishing.
2023-03-02T06:58:22.635 INFO:teuthology.orchestra.run.smithi150.stdout:process_3_[31117]: shutting down.
2023-03-02T06:58:22.637 INFO:teuthology.orchestra.run.smithi150.stdout:*******************************
2023-03-02T06:58:22.638 INFO:teuthology.orchestra.run.smithi150.stdout:test2: got error: run_until_finished: runnable process_3: got error: [31080] returned exit_status (254) Unknown error 254

From mon.a.log.gz, near the time of the failure and associated with 31080:

2023-03-02T06:58:22.505+0000 7f6458e7e700 20 mon.a@0(leader).mgrstat health checks:
{
    "POOL_APP_NOT_ENABLED": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "1 pool(s) do not have an application enabled",
            "count": 1
        },
        "detail": [
            {
                "message": "application not enabled on pool 'ceph_test_rados_delete_pools_parallel.smithi150-31080'" 
            },
            {
                "message": "use 'ceph osd pool application enable <pool-name> <app-name>', where <app-name> is 'cephfs', 'rbd', 'rgw', or freeform for custom applications." 
            }
        ]
    }
}

Actions #11

Updated by Nitzan Mordechai 9 months ago

  • Status changed from Fix Under Review to Resolved
Actions #12

Updated by Radoslaw Zarzynski 5 months ago

  • Related to Bug #46318: mon_recovery: quorum_status times out added
Actions #13

Updated by Laura Flores 3 months ago

/a/yuriw-2024-02-03_16:26:04-rados-wip-yuri10-testing-2024-02-02-1149-pacific-distro-default-smithi/7545622

The fix does not exist in pacific, but this isn't a blocker as it is just a test issue.
