Bug #9389

closed

ec pg stuck peering, did not send query for one shard

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status: Duplicate
Priority: Urgent
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Q/A
Severity: 3 - minor

Description

  "recovery_state": [
        { "name": "Started\/Primary\/Peering\/GetInfo",
          "enter_time": "2014-09-08 08:10:05.258543",
          "requested_info_from": [
                { "osd": "2(0)"}]},
...
of           "probing_osds": [
                "0(1)",
                "1(2)",
                "2(0)",
                "4(0)",
                "5(3)"],

and it tries to send it:

2014-09-08 08:10:05.258639 7f8545562700 10 osd.5 pg_epoch: 825 pg[1.1es3( v 785'235 (0'0,785'235] local-les=812 n=0 ec=11 les/c 812/808 822/822/818) [2,0,1,5] r=3 lpr=825 pi=730-821/8 crt=753'233 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>:  querying info from osd.2(0)

but when the pg_query goes out it only has
2014-09-08 08:10:05.259864 7f8545562700  7 osd.5 825 do_queries querying osd.2 on 2 PGs
2014-09-08 08:10:05.259865 7f8545562700  1 -- 10.214.136.6:6811/49073 --> 10.214.133.10:6801/51165 -- pg_query(1.21s3,1.fds1 epoch 825) v3 -- ?+0 0x7b14d00 con 0x7137a20

ubuntu@teuthology:/a/teuthology-2014-09-08_02:32:01-rados-master-testing-basic-multi/472310
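
A minimal sketch for checking this from the OSD log, assuming the messenger line format shown above. It pulls the pgid list out of a pg_query line and tests whether the pg that logged "querying info from osd.2(0)" is in it. The helper names (pgs_in_query, base_pgid) and the regex are illustrative, not part of Ceph.

```python
import re

# Matches the pgid list and epoch inside a messenger line such as
# "... -- pg_query(1.21s3,1.fds1 epoch 825) v3 -- ..."
# (pattern is an assumption based on the log excerpt in this report)
PG_QUERY_RE = re.compile(r"pg_query\(([^ )]+) epoch (\d+)\)")


def pgs_in_query(line):
    """Return (list of pgid strings, epoch) from a pg_query log line, or None."""
    m = PG_QUERY_RE.search(line)
    if m is None:
        return None
    return m.group(1).split(","), int(m.group(2))


def base_pgid(pgid):
    """Strip a trailing EC shard suffix, e.g. '1.1es3' -> '1.1e'."""
    return re.sub(r"s\d+$", "", pgid)


# The outgoing message copied from the description above.
line = (
    "2014-09-08 08:10:05.259865 7f8545562700  1 -- 10.214.136.6:6811/49073 "
    "--> 10.214.133.10:6801/51165 -- pg_query(1.21s3,1.fds1 epoch 825) v3 "
    "-- ?+0 0x7b14d00 con 0x7137a20"
)

pgs, epoch = pgs_in_query(line)
queried = {base_pgid(p) for p in pgs}

# The pg that claimed to be querying osd.2(0) in the GetInfo log line.
missing = base_pgid("1.1es3") not in queried
print(pgs, epoch, missing)
```

For the line quoted in this report, the check reports 1.1e as missing from the query sent to osd.2, which matches the symptom in the title.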


Related issues 1 (0 open, 1 closed)

Is duplicate of Ceph - Bug #9821: failed to recover before timeout expired (Resolved, Samuel Just, 10/19/2014)
