Project

General

Profile

Bug #21474

bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart"

Added by Kefu Chai almost 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Tests
Target version:
-
Start date:
09/20/2017
Due date:
% Done:

0%

Source:
Development
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:

Description

/a/kchai-2017-09-19_14:50:44-rados-wip-kefu-testing-2017-09-19-1954-distro-basic-mira/1648644

2017-09-19 21:32:12.622065 7f2c502010c0  1 bluestore(/var/lib/ceph/osd/ceph-1) _fsck finish with 0 errors, 0 repaired, 0 remaining in 224.778802 seconds

osd registers the commands after mounting the store. so:

2017-09-19T21:31:01.475 INFO:teuthology.orchestra.run.mira077:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 0 ceph --cluster ceph --adm
in-daemon /var/run/ceph/ceph-osd.1.asok dump_ops_in_flight'
2017-09-19T21:31:01.590 INFO:teuthology.orchestra.run.mira077.stderr:no valid command found; 10 closest matches:
2017-09-19T21:31:01.591 INFO:teuthology.orchestra.run.mira077.stderr:perf histogram schema
2017-09-19T21:31:01.604 INFO:teuthology.orchestra.run.mira077.stderr:perf histogram dump {<logger>} {<counter>}
2017-09-19T21:31:01.604 INFO:teuthology.orchestra.run.mira077.stderr:log reopen
2017-09-19T21:31:01.604 INFO:teuthology.orchestra.run.mira077.stderr:log flush
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:perf dump {<logger>} {<counter>}
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:objecter_requests
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:git_version
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:get_command_descriptions
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:log dump
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:help
2017-09-19T21:31:01.605 INFO:teuthology.orchestra.run.mira077.stderr:admin_socket: invalid command

hence:
Exception: timed out waiting for admin_socket to appear after osd.1 restart

Related issues

Copied to RADOS - Backport #21543: luminous: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart" Resolved
Copied to RADOS - Backport #22166: luminous: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart" Resolved

History

#1 Updated by Kefu Chai almost 2 years ago

Sage, if you believe that it's normal for bluestore to take around 4 minutes to complete a deep fsck. i will prolong the timeout for waiting for the asok commands in our test.

#2 Updated by Kefu Chai almost 2 years ago

  • Subject changed from fsck takes 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart" to fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart"

#3 Updated by Kefu Chai almost 2 years ago

  • Subject changed from fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart" to bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart"

#4 Updated by Sage Weil almost 2 years ago

Sigh.. yeah. I can't decide if we should stop doing these fsck's entirely, or reduce the debug level just for fsck, or... The other part of the problem is it's a 'deep' fsck, which means it reads all data too (to verify checksums).

#5 Updated by Kefu Chai almost 2 years ago

  • Status changed from New to Verified
  • Assignee changed from Sage Weil to Kefu Chai

cool. will update the test.

#6 Updated by Kefu Chai almost 2 years ago

  • Category set to Tests
  • Status changed from Verified to Need Review

#7 Updated by Kefu Chai almost 2 years ago

2017-09-21 21:48:57.547815 7f90a0a3b1c0  1 bluestore(/var/lib/ceph/osd/ceph-4) _fsck finish with 0 errors, 0 repaired, 0 remaining in 156.998835 seconds

/a/kchai-2017-09-21_06:22:45-rados-wip-kefu-testing-2017-09-21-1013-distro-basic-mira/1654961$ zless remote/mira033/log/ceph-osd.4.log.gz

#8 Updated by Sage Weil almost 2 years ago

  • Status changed from Need Review to Pending Backport
  • Backport set to luminous

#9 Updated by Nathan Cutler almost 2 years ago

  • Copied to Backport #21543: luminous: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart" added

#11 Updated by Kefu Chai almost 2 years ago

  • Copied to Backport #22166: luminous: bluestore fsck took 224.778802 seconds to complete which caused "timed out waiting for admin_socket to appear after osd.1 restart" added

#12 Updated by Kefu Chai almost 2 years ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF