Bug #10435
ceph-osd stops with "Caught signal (Aborted)" or "osd/PG.cc: 2683: FAILED assert(values.size() == 1)"
Description
While my production ceph cluster was recovering from a power outage, a few of my OSDs started flapping and eventually went down. In the past, I've simply removed such OSDs completely, re-added them fresh, and allowed the cluster to recover. However, the cluster is currently reporting that a few items are "unfound" (3/939435 unfound (0.000%)), and I'm leery of completely removing OSDs in this state as I don't want to incur any data loss.
Digging through the archives and bug reports, I've found a similar case [1] that includes a request for a reproduction with increased logging levels. I believe I've managed to gather the requested level of detail and will attach it to this report.
[1] - https://www.mail-archive.com/ceph-users@lists.ceph.com/msg01034.html
History
#1 Updated by Jamin Collins over 8 years ago
- File ceph-osd.11.log.lzma added
#2 Updated by Jamin Collins over 8 years ago
- File ceph-locate-unfound added
Near as I can tell, all the unfound objects reside on osd.6:
$ ./ceph-locate-unfound
/var/lib/ceph/osd/ceph-6/current/3.2ba_head/DIR_A/DIR_B/DIR_2/rb.0.1da2e.238e1f29.000000000178__head_F23D22BA__3
/var/lib/ceph/osd/ceph-6/current/3.25f_head/DIR_F/DIR_5/DIR_E/rb.0.1175.2ae8944a.0000000024e0__head_B0B2CE5F__3
/var/lib/ceph/osd/ceph-6/current/3.199_head/DIR_9/DIR_9/DIR_D/rb.0.1da2e.238e1f29.0000000000b3__head_76DA7D99__3
Is there any way to move these objects to a working OSD or get osd.6 back to a point where ceph-osd can start on it?
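For anyone reading the listing in comment #2, here is a minimal Python sketch that splits such a path into its visible parts (OSD number, PG id, object name, snapshot, hash, pool). The pattern is inferred only from the three paths shown in that comment and is an assumption, not the authoritative on-disk naming scheme; `PATH_RE`, `UnfoundObject`, and `parse_path` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple

# Assumption: layout inferred from the three sample paths only. Object names
# that themselves contain "__" or escaped characters are not handled.
PATH_RE = re.compile(
    r"/ceph-(?P<osd>\d+)/current/(?P<pgid>[0-9]+\.[0-9a-f]+)_head/"
    r"(?:DIR_[0-9A-F]/)*"
    r"(?P<name>[^/]+?)__(?P<snap>[^_/]+)_(?P<hash>[0-9A-F]{8})__(?P<pool>\d+)$"
)


class UnfoundObject(NamedTuple):
    osd: int      # number taken from the "ceph-<N>" directory
    pgid: str     # e.g. "3.2ba"
    name: str     # object name, e.g. "rb.0.1da2e.238e1f29.000000000178"
    snap: str     # e.g. "head"
    hash: str     # 8 hex digits as they appear in the file name
    pool: int     # trailing pool number


def parse_path(path: str) -> UnfoundObject:
    """Split one object path into its fields, or raise ValueError."""
    m = PATH_RE.search(path)
    if m is None:
        raise ValueError(f"unrecognised object path: {path!r}")
    return UnfoundObject(
        osd=int(m["osd"]),
        pgid=m["pgid"],
        name=m["name"],
        snap=m["snap"],
        hash=m["hash"],
        pool=int(m["pool"]),
    )


if __name__ == "__main__":
    sample = (
        "/var/lib/ceph/osd/ceph-6/current/3.2ba_head/DIR_A/DIR_B/DIR_2/"
        "rb.0.1da2e.238e1f29.000000000178__head_F23D22BA__3"
    )
    print(parse_path(sample))
```

Feeding every line of the listing through `parse_path` gives the set of affected PGs and object names in one pass, which is handy when checking whether all unfound objects sit on a single OSD.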
#3 Updated by Jamin Collins over 8 years ago
I've removed, erased, and re-added osd.11 to the ceph cluster.
#4 Updated by Jamin Collins over 8 years ago
Having determined which RBD volumes these unfound OIDs belonged to, I've decided to remove osd.6, zero the drive, and re-add it.
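One way to map unfound object names back to RBD volumes, sketched here under assumptions: for these `rb.0.*` names, the part before the last dot is treated as the image's block name prefix (the value `rbd info <image>` reports as `block_name_prefix`), and the trailing hex digits as the object index within the image. The helper names and the `prefix_to_image` mapping are hypothetical; the mapping has to be built by hand from `rbd info` output.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_rbd_object(name: str) -> Tuple[str, int]:
    """Split '<prefix>.<hex index>' into (prefix, index as int)."""
    prefix, _, index_hex = name.rpartition(".")
    return prefix, int(index_hex, 16)


def group_by_image(
    names: Iterable[str], prefix_to_image: Dict[str, str]
) -> Dict[str, List[int]]:
    """Group object indexes by image; unknown prefixes are keyed by the prefix itself."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for name in names:
        prefix, index = split_rbd_object(name)
        grouped[prefix_to_image.get(prefix, prefix)].append(index)
    return dict(grouped)


if __name__ == "__main__":
    unfound = [
        "rb.0.1da2e.238e1f29.000000000178",
        "rb.0.1175.2ae8944a.0000000024e0",
        "rb.0.1da2e.238e1f29.0000000000b3",
    ]
    # An empty mapping simply groups by prefix; fill it with
    # {block_name_prefix: image_name} pairs to get image names instead.
    for key, indexes in group_by_image(unfound, {}).items():
        print(key, indexes)
```

With the three names from comment #2 this yields two distinct prefixes, so at most two volumes are affected. If the images use a fixed object size, multiplying an index by that size gives the approximate byte offset of the damaged region within the volume.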
#5 Updated by Sage Weil over 8 years ago
- Status changed from New to Closed
In certain cases it is possible to move the file, but in general no. We're working on a tool to move entire PGs at a time.