Feature #10098

wanted: command to clear 'incomplete' PGs

Added by c sights about 8 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: hammer
Reviewed:
Affected Versions:
Pull request ID:

Description

Hello,
Please create a command that would clear 'incomplete' PGs.

Perhaps ceph pg force_create_pg could be extended to recreate incomplete PGs. (Here, "recreate" means creating a blank PG.)

This would be helpful for recovering the cluster to working order and with less data loss than entirely deleting the pool with incomplete PGs.

Midway through this November, I see three people on the mailing list wishing for this tool!

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-November/044451.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-November/044540.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-November/044533.html

Thanks for a great project with a lot of potential!
Chad.


Related issues

Copied to Ceph - Backport #14899: hammer: wanted: command to clear 'incomplete' PGs (Resolved)

Associated revisions

Revision 6907778d (diff)
Added by Mykola Golub about 7 years ago

ceph-objectstore-tool: add mark-complete operation

It is supposed to be used as a last resort to fix a cluster that has
PGs in 'incomplete' state, using the following procedure:

1) stop the osd that is primary for the incomplete PG;
2) run:
ceph-objectstore-tool --data-path ... --journal-path ... --pgid $PGID --op mark-complete
3) start the osd.

Fixes: #10098
Signed-off-by: Mykola Golub <>
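
The three-step procedure in the commit message above can be sketched as a small shell helper. This is only an illustration, not part of the patch: it assumes systemd-managed OSDs (units named ceph-osd@ID), and the function name mark_pg_complete, the run wrapper, and the DRY_RUN variable are hypothetical. Only the ceph-objectstore-tool flags come from the commit message. By default the helper prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of the last-resort mark-complete procedure.
# Assumptions (not from the ticket): OSDs are managed by systemd as
# ceph-osd@ID; run, mark_pg_complete, and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: mark_pg_complete OSD_ID PGID DATA_PATH JOURNAL_PATH
mark_pg_complete() {
    if [ "$#" -ne 4 ]; then
        echo "usage: mark_pg_complete OSD_ID PGID DATA_PATH JOURNAL_PATH" >&2
        return 2
    fi
    osd_id=$1
    pgid=$2
    data_path=$3
    journal_path=$4

    # 1) Stop the OSD that is primary for the incomplete PG.
    run systemctl stop "ceph-osd@${osd_id}" || return 1

    # 2) Mark the PG complete on the stopped OSD (flags as in the commit message).
    run ceph-objectstore-tool --data-path "$data_path" \
        --journal-path "$journal_path" \
        --pgid "$pgid" --op mark-complete || return 1

    # 3) Start the OSD again.
    run systemctl start "ceph-osd@${osd_id}"
}
```

Calling mark_pg_complete "$OSD_ID" "$PGID" "$DATA_PATH" "$JOURNAL_PATH" with DRY_RUN unset prints the three commands in order; setting DRY_RUN=0 executes them. The helper stops at the first failing step, so the OSD is not restarted if the tool reports an error.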

Revision 0fe3dfe8 (diff)
Added by Mykola Golub almost 7 years ago

ceph-objectstore-tool: add mark-complete operation

It is supposed to be used as a last resort to fix a cluster that has
PGs in 'incomplete' state, using the following procedure:

1) stop the osd that is primary for the incomplete PG;
2) run:
ceph-objectstore-tool --data-path ... --journal-path ... --pgid $PGID --op mark-complete
3) start the osd.

Fixes: #10098
Signed-off-by: Mykola Golub <>
(cherry picked from commit 6907778d767ba08bb80c495785056ed122b023fe)

Conflicts:
src/test/ceph_objectstore_tool.py (trivial)
src/tools/ceph_objectstore_tool.cc (trivial)

History

#1 Updated by Samuel Just about 8 years ago

  • Tracker changed from Bug to Feature

#2 Updated by Jifeng Yin over 7 years ago

Wait, 'force_create_pg' cannot recreate incomplete PGs? If so, could you please let me know how to deal with incomplete PGs?

#3 Updated by Sage Weil over 7 years ago

  • Target version set to v9.0.4

#4 Updated by Samuel Just over 7 years ago

Probably ceph_objectstore_tool --[unsafe?]-mark-complete <pgid> on a down osd.

Definitely need to set:
info.last_epoch_started to current
info.history.last_epoch_started to current
info.history.last_epoch_clean to current
info.last_backfill = MAX

#5 Updated by Aaron T over 7 years ago

I have this exact problem on my 0.94.1.2 cluster. It's production but the data is mostly-replaceable, so I'm willing to test patches if you need testers. ceph_objectstore_tool --mark-complete <pgid> gets my vote :)

#6 Updated by Samuel Just over 7 years ago

  • Target version deleted (v9.0.4)

#7 Updated by Mykola Golub over 7 years ago

I have a patch for ceph-objectstore-tool, which adds mark-complete operation:

https://github.com/ceph/ceph/pull/5031

Aaron, you might want to try it at your own risk.

#8 Updated by Aaron T over 7 years ago

Mykola Golub wrote:

I have a patch for ceph-objectstore-tool, which adds mark-complete operation:

https://github.com/ceph/ceph/pull/5031

Aaron, you might want to try it at your own risk.

Mykola,

Thank you for the opportunity to help test -- unfortunately, we just finished rebuilding our cluster not 24 hours before your comment. We're running 9.0.1 now and playing with erasure coding :) Hopefully no more incomplete PGs will occur, but if they do, we'll be more than happy to test your patch.

-Aaron

#9 Updated by Loïc Dachary over 7 years ago

  • Status changed from New to Fix Under Review

#10 Updated by David Zafman about 7 years ago

  • Status changed from Fix Under Review to Resolved
  • Assignee set to David Zafman

6907778d

#11 Updated by Loïc Dachary almost 7 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to hammer

#12 Updated by Loïc Dachary almost 7 years ago

  • Copied to Backport #14899: hammer: wanted: command to clear 'incomplete' PGs added

#13 Updated by Loïc Dachary almost 7 years ago

  • Status changed from Pending Backport to Resolved
