Bug #9806

Objecter: resend linger ops on split

Added by Samuel Just about 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
10/17/2014
Due date:
% Done:

0%

Source:
other
Tags:
conflict
Backport:
firefly
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Otherwise, we can lose notifies.

cb9262abd7fd5f0a9f583bd34e4c425a049e56ce


Related issues

Copied to Ceph - Backport #11699: Objecter: resend linger ops on split Resolved 10/17/2014

Associated revisions

Revision cb9262ab (diff)
Added by Josh Durgin about 4 years ago

Objecter: resend linger ops on any interval change

Watch/notify ops need to be resent after a pg split occurs, as well as
a few other circumstances that the existing objecter checks did not
catch.

Refactor the check the OSD uses for this to add a version taking the
more basic types instead of the whole OSD map, and stash the needed
info when an op is sent.

Fixes: #9806
Backport: giant, firefly, dumpling
Signed-off-by: Josh Durgin <>
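
The commit message above describes the shape of the fix rather than the code itself. As a minimal sketch of that approach, with hypothetical names (not the actual Objecter/osd_types code): stash the basic interval inputs when a linger op is sent, then compare them against each new map with a check that takes those basic types instead of a whole OSD map.

  // Illustrative sketch only -- all struct and function names are made up.
  #include <cstdint>
  #include <vector>

  struct LingerOpState {
    // Interval inputs stashed when the op was last sent.
    std::vector<int> acting;   // acting set at send time
    int primary = -1;          // primary OSD at send time
    uint32_t pg_num = 0;       // pool pg_num at send time (changes on split)
    bool pool_existed = true;
  };

  // Check built from basic types only, so the caller does not need to keep
  // the previous full OSD map around.
  bool interval_changed(const LingerOpState& old_state,
                        const std::vector<int>& new_acting,
                        int new_primary,
                        uint32_t new_pg_num,
                        bool pool_exists) {
    return old_state.acting != new_acting ||
           old_state.primary != new_primary ||
           old_state.pg_num != new_pg_num ||      // pg split (or merge)
           old_state.pool_existed != pool_exists;
  }

  // On each new map the Objecter would walk its linger (watch/notify) ops and
  // resend any op for which interval_changed() returns true; otherwise the
  // watch stays registered against the old interval and notifies are lost.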

Revision d296120c (diff)
Added by Josh Durgin over 3 years ago

Objecter: resend linger ops on any interval change

Watch/notify ops need to be resent after a pg split occurs, as well as
a few other circumstances that the existing objecter checks did not
catch.

Refactor the check the OSD uses for this to add a version taking the
more basic types instead of the whole OSD map, and stash the needed
info when an op is sent.

Fixes: #9806
Backport: giant, firefly, dumpling
Signed-off-by: Josh Durgin <>
(cherry picked from commit cb9262abd7fd5f0a9f583bd34e4c425a049e56ce)

Conflicts:
src/osd/osd_types.cc
src/osdc/Objecter.cc
Minor differences.

History

#1 Updated by Samuel Just about 4 years ago

  • Description updated (diff)

#2 Updated by Josh Durgin about 4 years ago

  • Backport set to giant, firefly, dumpling

#3 Updated by Josh Durgin about 4 years ago

  • Status changed from New to Testing
  • Assignee set to Josh Durgin

#4 Updated by Sage Weil almost 4 years ago

  • Status changed from Testing to Resolved

#5 Updated by Sage Weil almost 4 years ago

  • Status changed from Resolved to Pending Backport

#6 Updated by Loic Dachary almost 4 years ago

  • Description updated (diff)

Commit cb9262abd7fd5f0a9f583bd34e4c425a049e56ce does not apply cleanly on dumpling, which suggests more should be backported for it to make sense. Should this be backported for v0.67.12, or can it wait?

#7 Updated by Loic Dachary over 3 years ago

It won't be in dumpling v0.67.12 but ... it could be in v0.80.10 ;-) It looks like an important fix.

#8 Updated by Loic Dachary over 3 years ago

  • Backport changed from giant, firefly, dumpling to firefly, dumpling

already in giant

#9 Updated by Loic Dachary over 3 years ago

  • Backport changed from firefly, dumpling to firefly

dumpling is end of life

#10 Updated by Loic Dachary over 3 years ago

  • Tags set to conflict
  • Regression set to No

#11 Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

#12 Updated by Christian Theune over 3 years ago

As far as I understand, this hurts snapshots. I'm on Firefly and getting bitten by this. Is there a workaround to get back to a usable snapshot state once this has kicked in?

#13 Updated by Josh Durgin over 3 years ago

A workaround is to detach and reattach your images. This reopens them and reestablishes the watch.
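
For context, here is a rough librbd sketch of what "detach and reattach" amounts to from a client's point of view: opening an image registers a watch on its header object, so closing and reopening re-establishes it. Names are placeholders and error handling is reduced to return codes; real clients such as qemu do this implicitly when the image is reopened.

  // Sketch: reopening an image re-registers the watch on its header object.
  #include <rados/librados.hpp>
  #include <rbd/librbd.hpp>

  int reopen_image(librados::IoCtx& io_ctx, const char* image_name) {
    librbd::RBD rbd;
    librbd::Image image;

    int r = rbd.open(io_ctx, image, image_name);  // open establishes a fresh watch
    if (r < 0)
      return r;

    // ... use the image; the watch is now valid for the current mapping ...

    return image.close();  // close tears the watch down again
  }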

#14 Updated by Christian Theune over 3 years ago

Ah. So in that case restarting Qemu would be the specific action for that, right?

I'm currently trying this out. I'm a bit unclear on the specifics of the trigger. We have some automation code that causes pg_num and pgp_num to be automatically (slowly) adjusted for growing pools.

However, this was running for a while without the cluster exhibiting the issue in a way that we would notice. The specific point when we noticed was when we updated our tunables to the recommended settings for Firefly (and caused a large CRUSH rearrangement).

Do you think that, for running operations, stopping our automatic pg_num/pgp_num adjustment would be sufficient to avoid this bug?

For further clarification: does this bug apply on a per-pool basis, per-image basis or cluster-wide? My guess would be this applies on a per-pool basis.

Thanks for the hint!

#15 Updated by Christian Theune over 3 years ago

OK, so I restarted one of the VMs by exiting Qemu and starting it afresh. I took a snapshot immediately afterwards, and the mapped rbd device has given a consistent hash multiple times since then.

#16 Updated by Josh Durgin over 3 years ago

Yes, restarting qemu will fix it. The trigger for the issue is pg split, so it would only affect pools where you had increased pg_num and pgp_num. If you avoid splitting, you avoid this bug. Other crush changes like straw2 or new tunables should not cause this issue.
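
To illustrate why a split in particular moves watches, here is a toy example modeled on Ceph's stable-mod placement (treat the numbers as an approximation, not real cluster output): when pg_num grows, some raw placement hashes start mapping to the new PGs, so a watch registered before the split can be left behind on the old PG.

  // Toy illustration of how a pg_num increase moves objects to new PGs.
  // Modeled on Ceph's ceph_stable_mod(); bmask is the next power of two at
  // or above b, minus 1.
  #include <cstdio>

  static int stable_mod(int x, int b, int bmask) {
    return ((x & bmask) < b) ? (x & bmask) : (x & (bmask >> 1));
  }

  int main() {
    int hash = 13;  // raw placement hash of some object
    printf("pg_num=8 : pg %d\n", stable_mod(hash, 8, 7));    // -> pg 5
    printf("pg_num=16: pg %d\n", stable_mod(hash, 16, 15));  // -> pg 13 (split)
    return 0;
  }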

#17 Updated by Christian Theune over 3 years ago

That's a relief! Thanks for the explanation, I hope other people stumbling over this bug will find this helpful, too. :)
