Bug #44631

ceph pg dump error code 124

Added by Sage Weil over 3 years ago. Updated over 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific, octopus, nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2020-02-25T18:25:09.386 INFO:teuthology.orchestra.run.smithi156:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph --log-early pg dump --format=json
2020-02-25T18:27:09.619 DEBUG:teuthology.orchestra.run:got remote process result: 124
2020-02-25T18:27:09.620 ERROR:teuthology.run_tasks:Saw exception from tasks.

/a/sage-2020-02-25_15:51:04-rados-wip-sage2-testing-2020-02-25-0704-distro-basic-smithi/4801824

No useful log. AFAICS previous pg dump commands hit the mon (get_command_descriptions) and then the mgr (pg dump), but this particular command hit neither.


Then it happened again:

2020-03-16T19:06:26.014 INFO:teuthology.orchestra.run.smithi072:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph --log-early pg dump --format=json
2020-03-16T19:08:26.035 DEBUG:teuthology.orchestra.run:got remote process result: 124
2020-03-16T19:08:26.035 ERROR:teuthology.contextutil:Saw exception from nested tasks

/a/sage-2020-03-16_13:50:19-rados-wip-sage-testing-2020-03-15-1802-distro-basic-smithi/4860316


Related issues

Related to RADOS - Bug #45190: osd dump times out (Status: New)

History

#1 Updated by Sage Weil over 3 years ago

Exit code 124 -> the timeout expired and the process was killed by SIGTERM (according to https://www.howtogeek.com/423286/how-to-use-the-timeout-command-on-linux/)
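To make the exit code concrete, here is a minimal Python sketch of how a caller could tell "the time limit was hit" apart from an ordinary failure. It assumes GNU coreutils `timeout`, which exits with status 124 when the limit expires and `--preserve-status` is not given. The names `TIMEOUT_EXPIRED`, `hit_time_limit`, and `run_with_timeout` are hypothetical and are not part of teuthology.

```python
import subprocess

# Exit status that GNU coreutils `timeout` returns when the time limit
# expires (assumption: --preserve-status is not used).
TIMEOUT_EXPIRED = 124


def hit_time_limit(exit_status):
    """Return True if the exit status means `timeout` killed the command."""
    return exit_status == TIMEOUT_EXPIRED


def run_with_timeout(seconds, argv):
    """Run argv under `timeout <seconds>`; return (exit_status, hit_limit).

    Requires the `timeout` binary to be on PATH.
    """
    proc = subprocess.run(["timeout", str(seconds)] + list(argv))
    return proc.returncode, hit_time_limit(proc.returncode)
```

For example, `run_with_timeout(1, ["sleep", "5"])` should report that the limit was hit on a system with GNU coreutils, while a command that finishes in time returns its own exit status.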

#2 Updated by Sage Weil over 3 years ago

/a/sage-2020-03-27_13:32:58-rados-wip-sage3-testing-2020-03-26-1757-distro-basic-smithi/4895381

#3 Updated by Neha Ojha over 3 years ago

I think the pg dump command is timing out for some reason. The timestamps on the following log lines show that it waited for the full 120 seconds. Unfortunately, the job went dead on both occasions, so we don't have any logs.

2020-02-25T18:25:09.386 INFO:teuthology.orchestra.run.smithi156:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph --log-early pg dump --format=json
2020-02-25T18:27:09.619 DEBUG:teuthology.orchestra.run:got remote process result: 124
2020-02-25T18:27:09.620 ERROR:teuthology.run_tasks:Saw exception from tasks.
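The 120-second wait can be checked directly from the two timestamps quoted above. The following is a small sketch, assuming each teuthology log line starts with a timestamp in the `YYYY-MM-DDTHH:MM:SS.mmm` form shown here; the helper names `parse_timestamp` and `elapsed_seconds` are made up for illustration.

```python
from datetime import datetime


def parse_timestamp(line):
    """Parse the leading timestamp of a log line (text before the first space)."""
    stamp = line.split(" ", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")


def elapsed_seconds(start_line, end_line):
    """Seconds between the timestamps of two log lines."""
    delta = parse_timestamp(end_line) - parse_timestamp(start_line)
    return delta.total_seconds()


# The command line and its result line, copied from the log excerpt above.
start = ("2020-02-25T18:25:09.386 INFO:teuthology.orchestra.run.smithi156:> "
         "sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage "
         "timeout 120 ceph --cluster ceph --log-early pg dump --format=json")
end = ("2020-02-25T18:27:09.619 DEBUG:teuthology.orchestra.run:"
       "got remote process result: 124")

print(elapsed_seconds(start, end))
```

The result is just over 120 seconds, which matches the `timeout 120` wrapper expiring rather than the command failing on its own.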

#4 Updated by Neha Ojha over 3 years ago

  • Status changed from New to Can't reproduce

#5 Updated by Brad Hubbard about 3 years ago

/a/yuriw-2020-08-06_00:31:28-rados-wip-yuri8-testing-octopus-distro-basic-smithi/5290923

#6 Updated by Neha Ojha about 3 years ago

  • Status changed from Can't reproduce to New
  • Priority changed from Urgent to Normal

#7 Updated by Brad Hubbard about 3 years ago

#8 Updated by Neha Ojha over 2 years ago

  • Backport set to pacific, octopus, nautilus

/a/yuriw-2021-03-02_20:59:34-rados-wip-yuri7-testing-2021-03-02-1118-nautilus-distro-basic-smithi/5928174

#9 Updated by Patrick Donnelly over 2 years ago

/ceph/teuthology-archive/pdonnell-2021-03-04_03:51:01-fs-wip-pdonnell-testing-20210303.195715-distro-basic-smithi/5932220/teuthology.log
