Bug #12577

Inconsistent PGs that ceph pg repair does not fix

Added by Andras Pataki about 3 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
08/03/2015
Due date:
% Done:

0%

Source:
other
Tags:
Backport:
Hammer
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:

Description

Summary: I am having problems with inconsistent PGs that the 'ceph pg repair' command does not fix. Below are the details. Any help would be appreciated.

# I am using ceph 0.94.2 on all machines:
~# ceph-osd -v
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)

# Find the inconsistent PGs
~# ceph pg dump | grep inconsistent
dumped all in format plain
2.439  4208  0  0  0  0  17279507143  3103  3103  active+clean+inconsistent  2015-08-03 14:49:17.292884  77323'2250145  77480:890566  [78,54]  78  [78,54]  78  77323'2250145  2015-08-03 14:49:17.292538  77323'2250145  2015-08-03 14:49:17.292538
2.8b9  4083  0  0  0  0  16669590823  3051  3051  active+clean+inconsistent  2015-08-03 14:46:05.140063  77323'2249886  77473:897325  [7,72]   7   [7,72]   7   77323'2249886  2015-08-03 14:22:47.834063  77323'2249886  2015-08-03 14:22:47.834063
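The PG IDs can be pulled out of that dump with a one-liner (a sketch; the awk filter simply keeps the first column of any line mentioning "inconsistent", so adjust it if other output happens to contain that word):

```shell
# List only the IDs of PGs currently flagged inconsistent.
ceph pg dump 2>/dev/null | awk '/inconsistent/ {print $1}'
```

The resulting IDs can then be fed to `ceph pg deep-scrub` or `ceph pg repair` one at a time.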

# Look at the first one:
~# ceph pg deep-scrub 2.439
instructing pg 2.439 on osd.78 to deep-scrub

# The logs of osd.78 show:
2015-08-03 15:16:34.409738 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
2015-08-03 15:16:51.364229 7f09ec04a700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 15:16:52.763977 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors

# Finding the object in question:
~# find ~ceph/osd/ceph-78/current/2.439_head -name 10000022d93.00000f0c* -ls
21510412310 4100 -rw-r--r--   1 root     root      4194304 Jun 30 17:09 /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
~# md5sum /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
4e4523244deec051cfe53dd48489a5db  /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2

# The object on the backup osd:
~# find ~ceph/osd/ceph-54/current/2.439_head -name 10000022d93.00000f0c* -ls
6442614367 4100 -rw-r--r--   1 root     root      4194304 Jun 30 17:09 /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
~# md5sum /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
4e4523244deec051cfe53dd48489a5db  /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2
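Comparing md5sums by eye works, but `cmp -s` gives a direct byte-for-byte check of the two replica files (a sketch using the paths found above; it prints nothing and exits 0 when the files are identical):

```shell
# Compare the primary and backup copies byte for byte; rely on the
# exit status of cmp (0 = identical) rather than visual inspection.
cmp -s \
  /var/lib/ceph/osd/ceph-78/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2 \
  /var/lib/ceph/osd/ceph-54/current/2.439_head/DIR_9/DIR_3/DIR_4/DIR_E/10000022d93.00000f0c__head_B029E439__2 \
  && echo "replicas are byte-identical"
```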

# They don't seem to be different.
# When I try repair:
~# ceph pg repair 2.439
instructing pg 2.439 on osd.78 to repair

# The osd.78 logs show:
2015-08-03 15:19:21.775933 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 repair starts
2015-08-03 15:19:38.088673 7f09ec04a700 -1 log_channel(cluster) log [ERR] : repair 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 15:19:39.958019 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 repair 1 errors, 0 fixed
2015-08-03 15:19:39.962406 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
2015-08-03 15:19:56.510874 7f09ec04a700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 15:19:58.348083 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors

The inconsistency is not fixed.
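For completeness, the same repair can be issued for every flagged PG in a loop (a sketch; it greps the dump for "inconsistent" and takes the first column as the PG ID), though on this version each attempt ends with "1 errors, 0 fixed" as above:

```shell
# Attempt a repair of every PG currently flagged inconsistent.
for pg in $(ceph pg dump 2>/dev/null | awk '/inconsistent/ {print $1}'); do
    echo "repairing $pg"
    ceph pg repair "$pg"
done
```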
I have tried a few other things:
 * Stopped the primary OSD, removed the object from the filesystem, restarted the OSD, and issued a repair. It didn't work: repair reported one object missing, but did not copy it back from the backup.
 * Tried the same on the backup (removed the file); it also did not get copied back from the primary during a repair.

Any help would be appreciated.

Thanks,

Andras
apataki@simonsfoundation.org

osd-78.log.gz (982 KB) Andras Pataki, 08/03/2015 10:03 PM


Related issues

Copied to Ceph - Backport #12583: Inconsistent PGs that ceph pg repair does not fix Resolved 08/03/2015

History

#1 Updated by Samuel Just about 3 years ago

  • Priority changed from Normal to Urgent

#2 Updated by Samuel Just about 3 years ago

  • Assignee set to David Zafman

#3 Updated by David Zafman about 3 years ago

Could you attach a portion of the log of osd.78 around the time of a repair?

You probably need to bump the OSD logging to 20; that is the "debug_osd" configuration option.

I'm not sure if I have the correct location of the socket file, but the command would be something like this.

$ ceph --admin-daemon /var/log/ceph/osd.78.asok config get debug_osd
{
    "debug_osd": "5\/5" 
}
$ ceph --admin-daemon /var/log/ceph/osd.78.asok config set debug_osd 20
{
    "success": "" 
}

Then run the repair and capture that part of the log. If your system is in production, set the debugging back to what it was; for example, if it was 5 before:

$ ceph --admin-daemon /var/log/ceph/osd.78.asok config set debug_osd 5
{
    "success": "" 
}
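These steps can be wrapped in one small script (a sketch; the socket path follows the usual /var/run/ceph convention and may differ on your install, and the sed expression assumes the JSON reply format shown above, keeping only the first number of the "N/M" pair):

```shell
# Sketch: raise debug_osd to 20, run the repair, then restore the old level.
SOCK=/var/run/ceph/ceph-osd.78.asok        # adjust for your install
OLD=$(ceph --admin-daemon "$SOCK" config get debug_osd \
      | sed -n 's/.*"debug_osd": "\([0-9]*\).*/\1/p')
ceph --admin-daemon "$SOCK" config set debug_osd 20
ceph pg repair 2.439
# ...wait for the repair to finish and save the relevant part of the log...
ceph --admin-daemon "$SOCK" config set debug_osd "$OLD"
```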

#4 Updated by Andras Pataki about 3 years ago

I've tried this a few times; here are a couple of instances:

2015-08-03 14:34:39.260203 7f962d9a3700  0 -- 10.4.36.107:6876/115675 >> 10.4.36.107:6817/41981 pipe(0x7153000 sd=132 :6876 s=0 pgs=0 cs=0 l=0 c=0x183542c0).accept connect_seq 24 vs existing 23 state standby
2015-08-03 14:34:39.263897 7f96649a4700  0 -- 10.4.36.107:6876/115675 >> 10.4.36.107:6866/115285 pipe(0x3ac06000 sd=139 :6876 s=0 pgs=0 cs=0 l=0 c=0x18357440).accept connect_seq 34 vs existing 33 state standby
2015-08-03 14:34:39.267379 7f965f355700  0 -- 10.4.36.107:6876/115675 >> 10.4.36.107:6809/39540 pipe(0x7be2000 sd=86 :6876 s=0 pgs=0 cs=0 l=0 c=0x19008f20).accept connect_seq 20 vs existing 19 state standby
2015-08-03 14:35:41.963032 7f966c3e6700  0 log_channel(cluster) log [INF] : 2.439 repair starts
2015-08-03 14:36:11.932527 7f966c3e6700 -1 log_channel(cluster) log [ERR] : repair 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 14:36:14.838909 7f966c3e6700 -1 log_channel(cluster) log [ERR] : 2.439 repair 1 errors, 0 fixed
2015-08-03 14:36:14.846786 7f966c3e6700  0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
2015-08-03 14:36:42.511570 7f966c3e6700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 14:36:45.058865 7f966c3e6700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors
2015-08-03 14:37:12.268767 7f9661570700 -1 osd.78 77396 *** Got signal Terminated ***
2015-08-03 14:37:12.268847 7f9661570700  0 osd.78 77396 prepare_to_stop telling mon we are shutting down

2015-08-03 15:04:40.544819 7f09e2df0700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.108:6841/5536 pipe(0x22639000 sd=130 :6861 s=0 pgs=0 cs=0 l=0 c=0x20b82aa0).accept connect_seq 2 vs existing 1 state standby
2015-08-03 15:04:41.739995 7f09e7a3c700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.108:6853/6388 pipe(0x1feec000 sd=156 :6861 s=0 pgs=0 cs=0 l=0 c=0x20b8b2e0).accept connect_seq 2 vs existing 1 state standby
2015-08-03 15:08:31.331873 7f09e6f31700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.108:6845/6054 pipe(0x1fcfb000 sd=39 :51551 s=2 pgs=7689 cs=1 l=0 c=0x1fcef340).fault with nothing to send, going to standby
2015-08-03 15:08:49.051002 7f09e6325700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.106:6957/117556 pipe(0x1fcd9000 sd=60 :6861 s=0 pgs=0 cs=0 l=0 c=0x204927e0).accept connect_seq 2 vs existing 1 state standby
2015-08-03 15:13:30.611031 7f09e6d2f700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.105:6805/124382 pipe(0x1fd22000 sd=41 :56628 s=2 pgs=25551 cs=1 l=0 c=0x1fceffa0).fault with nothing to send, going to standby
2015-08-03 15:14:49.291239 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 scrub starts
2015-08-03 15:15:06.791373 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 scrub ok ( 1 remaining deep scrub error(s) )
2015-08-03 15:16:34.409738 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
2015-08-03 15:16:51.364229 7f09ec04a700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 15:16:52.763977 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors
2015-08-03 15:18:20.902251 7f09c426b700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.108:6821/4448 pipe(0x22634000 sd=71 :6861 s=2 pgs=7271 cs=3 l=0 c=0x1fea3a20).fault with nothing to send, going to standby
2015-08-03 15:19:21.775933 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 repair starts
2015-08-03 15:19:38.088673 7f09ec04a700 -1 log_channel(cluster) log [ERR] : repair 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 15:19:39.958019 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 repair 1 errors, 0 fixed
2015-08-03 15:19:39.962406 7f09ec04a700  0 log_channel(cluster) log [INF] : 2.439 deep-scrub starts
2015-08-03 15:19:41.275711 7f09e2df0700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.108:6841/5536 pipe(0x22639000 sd=130 :6861 s=2 pgs=7875 cs=3 l=0 c=0x66879c0).fault with nothing to send, going to standby
2015-08-03 15:19:56.510874 7f09ec04a700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.439 b029e439/10000022d93.00000f0c/head//2 on disk data digest 0xb3d78a6e != 0xa3944ad0
2015-08-03 15:19:56.892689 7f09e7a3c700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.108:6853/6388 pipe(0x1feec000 sd=156 :6861 s=2 pgs=7320 cs=3 l=0 c=0x1fda0000).fault with nothing to send, going to standby
2015-08-03 15:19:58.348083 7f09ec04a700 -1 log_channel(cluster) log [ERR] : 2.439 deep-scrub 1 errors
2015-08-03 15:23:49.156038 7f09e6325700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.106:6957/117556 pipe(0x1fcd9000 sd=60 :6861 s=2 pgs=52 cs=3 l=0 c=0x1fea3600).fault with nothing to send, going to standby
2015-08-03 15:26:40.391520 7f09e31f4700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.106:6925/115293 pipe(0x1fd81000 sd=98 :6861 s=0 pgs=0 cs=0 l=0 c=0x1ffe06e0).accept connect_seq 2 vs existing 1 state standby
2015-08-03 15:34:58.451081 7f09e1cdf700  0 -- 10.4.36.107:6861/146801 >> 10.4.36.106:6901/113986 pipe(0x1ff4b000 sd=135 :59057 s=2 pgs=55 cs=3 l=0 c=0x1ff255a0).fault with nothing to send, going to standby

#5 Updated by David Zafman about 3 years ago

I need the logs with the increased log level, as described in my updated comment above.

#6 Updated by Andras Pataki about 3 years ago

Sorry - I ran:

~# ceph --admin-daemon /var/run/ceph/ceph-osd.78.asok config get debug_osd
{
    "debug_osd": "0\/5" 
}

~# ceph --admin-daemon /var/run/ceph/ceph-osd.78.asok config set debug_osd 20
{
    "success": "" 
}

~# ceph --admin-daemon /var/run/ceph/ceph-osd.78.asok config get debug_osd
{
    "debug_osd": "20\/20" 
}

A gzipped log of a repair run after this change is attached.

Andras

#7 Updated by David Zafman about 3 years ago

I'm pretty sure we have a fix for this in the development branch already (pull request 4365), but it will need to be back-ported to hammer.

#8 Updated by David Zafman about 3 years ago

  • Status changed from New to Verified

#9 Updated by David Zafman about 3 years ago

  • Status changed from Verified to Pending Backport
  • Backport set to Hammer

#10 Updated by Andras Pataki about 3 years ago

Hi David,

Thanks very much for the quick response/resolution. A couple of questions, if you don't mind, regarding the nature of the problem. We are considering adding more disks to the system, which will cause large-scale rebalancing of data. Should we be worried about this issue making things worse (spreading or causing further corruptions/inconsistencies)? If so, is there anything we could do to mitigate it?

Thanks again,

Andras

#11 Updated by Loic Dachary almost 3 years ago

  • Status changed from Pending Backport to Resolved

#12 Updated by Jérôme Poulin over 2 years ago

I'd need to apply this on Firefly 0.80.11; is there a hash or branch for the backported Hammer version?
