Bug #16567

open

Ceph raises scrub errors on pgs with objects being overwritten

Added by Goncalo Borges almost 8 years ago. Updated almost 7 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Correctness/Safety
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
FileStore, OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I am submitting this bug following the advice of Ceph experts on the user mailing list.

1./ In jewel 10.2.2 I got some scrub errors:

  # ceph health detail
    HEALTH_ERR 2 pgs inconsistent; 2 scrub errors; crush map has legacy tunables (require bobtail, min is firefly)
    pg 6.39c is active+clean+inconsistent, acting [2,60,32]
    pg 6.263 is active+clean+inconsistent, acting [56,39,6]
    2 scrub errors
    crush map has legacy tunables (require bobtail, min is firefly); see http://ceph.com/docs/master/rados/operations/crush-map/#tunables
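
One way to see exactly which object(s) a scrub flagged, without grepping OSD logs, is the inconsistency listing provided by Jewel's rados CLI; a minimal sketch, assuming the 10.2.x tooling is available on an admin node:

  # list the objects that scrub marked inconsistent in pg 6.263, together with
  # the size and errors reported by each shard (needs recent scrub results)
  rados list-inconsistent-obj 6.263 --format=json-pretty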

--- * ---

2./ I started to look at pg 6.263. Errors were only appearing in the osd.56 logs, not in the others. The log seems to report a mismatch between the real on-disk size of the object and the size Ceph thinks it is supposed to have:

  # cat ceph-osd.56.log-20160629 | grep -Hn 'ERR'
    (standard input):8569:2016-06-29 08:09:50.952397 7fd023322700 -1 log_channel(cluster) log [ERR] : scrub 6.263 6:c645f18e:::100002a343d.00000000:head on disk size (1836) does not match object info size (41242) adjusted for ondisk to (41242)
    (standard input):8602:2016-06-29 08:11:11.227865 7fd023322700 -1 log_channel(cluster) log [ERR] : 6.263 scrub 1 errors
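
To see what size the cluster itself currently records for that object (the "object info size" the scrub compares the on-disk file against), rados stat can be used; a rough sketch, where the pool name is a placeholder for whatever pool has id 6 (see ceph osd lspools):

  # prints the mtime and size stored in the object info
  # ("cephfs_data" is a placeholder pool name, not taken from this cluster)
  rados -p cephfs_data stat 100002a343d.00000000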

--- * ---

3./ I performed a 'ceph pg repair 6.263'. Eventually, that pg went back to 'active+clean'

  # ceph pg dump | grep ^6.263
    dumped all in format plain
    6.263 10845 0 0 0 0 39592671010 3037 3037 active+clean 2016-06-30 02:13:00.455293 1005'2126237 1005:2795768 [56,39,6] 56 [56,39,6] 56 1005'2076134 2016-06-30 02:13:00.455256 1005'2076134 2016-06-30 02:13:00.455256

However, in the logs I found:

2016-06-30 02:03:03.992240 osd.56 192.231.127.226:6801/21569 278 : cluster [INF] 6.263 repair starts
2016-06-30 02:13:00.455237 osd.56 192.231.127.226:6801/21569 279 : cluster [INF] 6.263 repair ok, 0 fixed
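
Since the repair reported "0 fixed", a follow-up deep scrub is a reasonable way to confirm the pg really is consistent again; a small sketch with standard commands:

  # force a deep scrub of the pg, then re-check health once it completes
  ceph pg deep-scrub 6.263
  ceph health detail | grep 6.263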

--- * ---

4./ Inspecting a bit more, I found that the object inside the pg is changing size on all the involved OSDs. I think this is because the object is being overwritten by a user application. Note how the size of the object changes between consecutive 'ls -l' commands on osd.56 (the same is true on osd.39 and osd.6):

  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 8602 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  [root@rccephosd8 ceph]# ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 170 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 15436 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 26044 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 14076 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 31110 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 20230 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 23392 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  # ls -l /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
    -rw-r--r-- 1 ceph ceph 41412 Jun 30 02:53 /var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
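
The same observation can be made without re-running ls by hand; a quick polling sketch (the one-second interval and sample count are arbitrary):

  # sample the on-disk size of the object once per second for a minute;
  # a value that keeps jumping around confirms the object is being rewritten
  OBJ=/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
  for i in $(seq 60); do stat -c '%s' "$OBJ"; sleep 1; done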

--- * ---

5./ I am wondering whether the scrub errors I got are actually a consequence of the scrub procedure not properly handling objects in pgs that are changing very quickly (i.e. being overwritten).
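
In case it helps to identify which client file is being overwritten: the object name looks like a CephFS data object, where the prefix is the file's inode number in hex and the suffix is the object index. A sketch, assuming pool 6 really is a CephFS data pool and that /cephfs is the mount point (both of these are assumptions):

  # convert the hex inode number and search the mounted filesystem for it
  # (/cephfs is a placeholder mount point)
  ino=$(printf '%d' 0x100002a343d)
  find /cephfs -inum "$ino" 2>/dev/null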

#1

Updated by Samuel Just almost 8 years ago

Can you reproduce this with

debug osd = 20
debug filestore = 20
debug ms = 1

on the primary and replicas from when the pg is clean until it is detected as inconsistent? We need logging during the IO which causes the inconsistency to determine what is going on. From the description, it could actually just be a disk/xfs issue that got propagated during recovery. If we can't get more information, I'll probably just close it as an xfs/disk issue.
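
For reference, those debug levels can be raised at runtime without restarting the daemons; a sketch using injectargs on the acting set of pg 6.263 (OSD ids taken from the pg dump above):

  # bump logging on the primary and the replicas of pg 6.263
  for osd in 56 39 6; do
      ceph tell osd.$osd injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
  done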

#2

Updated by Goncalo Borges almost 8 years ago

This happened for 3 pgs (all of them had exactly the same symptoms). Once I repaired those 3 pgs, the cluster has been in good shape and the problem has not reappeared (a week has passed). So it is not a systematic thing, and it is difficult to tell if/when it will show up again.

What you are asking is to run everything in debug mode and just wait for the problem to appear again. That is probably only worthwhile if this becomes a recurrent issue.

#3

Updated by Greg Farnum almost 7 years ago

Hmm, similar reports have popped up (although with on-disk size 0) on the mailing list. Those involved cache tiers though; did this one?

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg37512.html
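
One way to check that from the cluster side: cache tiering shows up in the pool entries of the osd map, so something along these lines should print nothing if no tiers are configured:

  # cache pools carry tier_of / cache_mode fields; base pools list their tiers
  ceph osd dump | grep -Ei 'tier|cache_mode'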

#4

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Correctness/Safety
  • Component(RADOS) FileStore, OSD added