Bug #40000
osds do not bound xattrs and/or aggregate xattr data in pg log
Description
Our cluster is currently in a HEALTH_ERR state with 4 PGs inactive (3 of which are "peering" and the 4th "activating+degraded"), as seen below.
32.160 peering [395,172,321,335,152,77] 395 [395,172,321,335,152,77] 395
32.756 peering [197,395,50,65,384,369] 197 [197,395,50,65,384,369] 197
32.dd1 peering [276,306,152,40,214,245] 276 [276,306,152,40,214,245] 276
32.1329 activating+degraded [306,276,129,43,186,241] 306 [306,276,129,43,186,241] 306
The primary OSDs all show a stream of the following errors, which we believe is causing the PGs to remain inactive:
2019-05-22 14:08:46.588585 7f47e9ba1700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
2019-05-22 14:08:48.482542 7f47ea3a2700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
2019-05-22 14:08:50.412120 7f47e9ba1700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
2019-05-22 14:08:52.265173 7f47e9ba1700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
2019-05-22 14:08:54.191465 7f47eaba3700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
2019-05-22 14:08:56.022749 7f47eaba3700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
2019-05-22 14:08:57.742697 7f47ea3a2700 -1 failed to decode message of type 83 v5: buffer::end_of_buffer
We injected "ms_dump_corrupt_message_level = -1" to capture a hexdump of the offending message, which is attached. Could you help identify what might be causing this issue and suggest a possible workaround?
One thing to note is that this issue seems to have started after we saw a lot of "refcount.get" calls hung on certain OSDs. The "objecter_requests" output showed these calls being made from RGWs, and they in turn originated from S3 PutObjectCopy requests issued to those RGWs.
Another thing to note: all of the OSDs currently showing this message are provisioned on FileStore. We have a small number of BlueStore OSDs in the mix, but they are not showing the issue (possibly just because only 4 PGs are stuck in this state). The pool on these OSDs is erasure-coded.
Updated by Sage Weil almost 5 years ago
- Status changed from New to Need More Info
The message dump is 260 MB (once de-hexified), but the pg_log_t inside the message declares itself as 2484154195 bytes (~2.4 GB) long, so the message is incomplete. It was probably truncated somewhere in the messaging layer.
In any case, the underlying bug is something else: probably abuse/misuse of object xattrs inflating the pg log size.
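To make the size mismatch above concrete, here is a minimal sketch (not Ceph's actual decoder, which is C++) of how a length-prefixed decode fails with an end-of-buffer error when a structure declares more bytes than the buffer it arrived in actually holds. The exception class and function names are illustrative assumptions.

```python
# Illustrative sketch: a length-prefixed decode fails the same way Ceph
# reports buffer::end_of_buffer when a structure declares more payload
# than the received message contains.
import struct

class EndOfBuffer(Exception):
    """Stand-in for Ceph's buffer::end_of_buffer exception."""

def decode_blob(buf: bytes, offset: int = 0) -> bytes:
    # A u32 little-endian length prefix, as in Ceph's encoded bufferlists.
    if offset + 4 > len(buf):
        raise EndOfBuffer("truncated length field")
    (length,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    if offset + length > len(buf):
        # The declared payload length overruns the bytes we actually have.
        raise EndOfBuffer(
            f"need {length} bytes, only {len(buf) - offset} remain")
    return buf[offset:offset + length]

# A message claiming a ~2.4 GB payload but carrying far fewer bytes,
# analogous to the 2484154195-byte pg_log_t inside a 260 MB dump:
msg = struct.pack("<I", 2484154195) + b"x" * 64
try:
    decode_blob(msg)
except EndOfBuffer as e:
    print("failed to decode:", e)
    # prints: failed to decode: need 2484154195 bytes, only 64 remain
```

The decoder cannot distinguish a truncated message from a corrupt length field; either way the declared length overruns the buffer, which matches the repeated "failed to decode message of type 83" log lines above.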
Updated by Sage Weil almost 5 years ago
From the mailing list:
Date: Thu, 23 May 2019 19:47:13 -0700
From: Alex Marangone <amarangone@digitalocean.com>
To: Sage Weil <sage@newdream.net>
Cc: Vaibhav Bhembre <vaibhav@digitalocean.com>, Al Sene <asene@digitalocean.com>, Vaibhav Bhembre <vb@digitalocean.com>, nojha@redhat.com, jdurgin@redhat.com
Subject: Re: Bug #40000: OSDs throw failed to decode message of type 83 - RADOS - Ceph

We fixed it last night. We dumped the pg log on one of these OSDs and noticed a lot of setattr entries with refcounts, all for one customer and one bucket. After dumping some of these pg log entries we saw that some were huge, close to 1 MB.

Since that customer caused the initial issue that resulted in the flapping OSDs and stuck peering, we decided to delete all pg log entries referring to this operation on that bucket. We tried it first on an OSD where the PG was safe to remove if things went wrong; after bringing it back, the PG peered and recovered.

Because of this, I'm not sure we can find more info in the xattrs of the object, but I'll try to give it a shot tomorrow.

Alex
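The triage step described in the email (dump the pg log, then find the oversized setattr entries tied to one bucket) can be sketched as a small filter over a JSON dump of the log. This is a hypothetical illustration only: the field layout of a real pg log dump differs, so the sketch just scans each entry's serialized form for the markers of interest.

```python
# Hypothetical sketch of the cleanup triage described above: given a pg
# log dumped as a JSON list of entries, flag the oversized setattr
# entries referring to a given bucket so they can be reviewed before any
# removal. The size threshold and matching strategy are assumptions, not
# the real dump schema.
import json

MAX_ENTRY_BYTES = 64 * 1024  # normal pg log entries are far smaller

def oversized_setattr_entries(dump_path: str, bucket_marker: str) -> list:
    """Return entries that mention setattr and the bucket, and are huge."""
    with open(dump_path) as f:
        entries = json.load(f)
    suspects = []
    for entry in entries:
        raw = json.dumps(entry)  # crude but schema-agnostic
        if ("setattr" in raw and bucket_marker in raw
                and len(raw) > MAX_ENTRY_BYTES):
            suspects.append(entry)
    return suspects
```

In practice one would dump and edit the log offline with the OSD stopped, and (as the email notes) test the procedure first on an OSD whose PG copy is safe to lose.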
Updated by Josh Durgin over 4 years ago
- Status changed from Need More Info to 12
Updated by Sage Weil over 4 years ago
- Subject changed from OSDs throw failed to decode message of type 83 to osds do not bound xattrs and/or aggregate xattr data in pg log
- Assignee deleted (Sage Weil)
- Priority changed from Urgent to High