Bug #22847


ceph osd force-create-pg causes all ceph-mon to crash and be unable to come up again

Added by Frank Li over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

During the course of troubleshooting an OSD issue, I ran this command:
ceph osd force-create-pg 1.ace11d67
Then all of the ceph-mon daemons crashed with this error:
--- begin dump of recent events ---
0> 2018-01-31 22:47:22.959665 7fc64350e700 -1 *** Caught signal (Aborted) **
in thread 7fc64350e700 thread_name:cpu_tp

ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
1: (()+0x8eae11) [0x55f1113fae11]
2: (()+0xf5e0) [0x7fc64aafa5e0]
3: (gsignal()+0x37) [0x7fc647fca1f7]
4: (abort()+0x148) [0x7fc647fcb8e8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x55f1110fa4a4]
6: (()+0x2ccc4e) [0x55f110ddcc4e]
7: (OSDMonitor::update_creating_pgs()+0x98b) [0x55f11102232b]
8: (C_UpdateCreatingPGs::finish(int)+0x79) [0x55f1110777b9]
9: (Context::complete(int)+0x9) [0x55f110ed30c9]
10: (ParallelPGMapper::WQ::_process(ParallelPGMapper::Item*, ThreadPool::TPHandle&)+0x7f) [0x55f111204e1f]
11: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x55f111100f1e]
12: (ThreadPool::WorkThread::entry()+0x10) [0x55f111101e00]
13: (()+0x7e25) [0x7fc64aaf2e25]
14: (clone()+0x6d) [0x7fc64808d34d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
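For context, the pgid passed to force-create-pg, 1.ace11d67, names pool 1 and PG seed 0xace11d67 (roughly 2.9 billion), far beyond the 10240 PGs that exist in this cluster, so the monitors appear to be asserting in update_creating_pgs() on a PG that no pool can actually contain. The following is a minimal standalone C++ sketch, not the actual Ceph source (the PgId and PoolMap types and the pgid_is_creatable() helper are illustrative only), of the kind of range check that would reject such a pgid before it is ever queued for creation:

#include <cstdint>
#include <cstdio>
#include <map>

// Illustrative stand-in for a placement group id: pool id plus PG seed.
struct PgId {
    int64_t  pool;  // pool id, e.g. 1
    uint32_t seed;  // PG number within the pool, e.g. 0xace11d67
};

// Illustrative stand-in for the osdmap's pool table: pool id -> pg_num.
using PoolMap = std::map<int64_t, uint32_t>;

// Accept a pgid for creation only if the pool exists and the seed is below
// the pool's pg_num; a monitor that skips this check can accept a bogus pgid
// and abort later while updating its creating-PGs state.
bool pgid_is_creatable(const PgId& pgid, const PoolMap& pools) {
    auto it = pools.find(pgid.pool);
    if (it == pools.end())
        return false;               // pool does not exist
    return pgid.seed < it->second;  // seed must be within [0, pg_num)
}

int main() {
    PoolMap pools{{1, 10240}};   // pg_num value is illustrative
    PgId bogus{1, 0xace11d67};   // the pgid from the crashing command
    std::printf("1.ace11d67 creatable: %s\n",
                pgid_is_creatable(bogus, pools) ? "yes" : "no");
    return 0;
}

The luminous backport referenced below presumably adds validation along these lines, so the monitor can return an error for an out-of-range pgid instead of asserting later.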

Here is the current ceph -s output:
[root@dl1-kaf101 frli]# ceph -s
  cluster:
    id:     021a1428-fea5-4697-bcd0-a45c1c2ca80b
    health: HEALTH_WARN
            nodown flag(s) set
            32 osds down
            4 hosts (32 osds) down
            2 racks (32 osds) down
            1 row (32 osds) down
            Reduced data availability: 10240 pgs inactive, 5 pgs down, 3431 pgs peering, 3643 pgs stale
            Degraded data redundancy: 215798/552339 objects degraded (39.070%), 10240 pgs unclean, 4666 pgs degraded, 4666 pgs undersized

  services:
    mon: 5 daemons, quorum dl1-kaf101,dl1-kaf201,dl1-kaf301,dl1-kaf302,dl1-kaf401
    mgr: dl1-kaf101(active), standbys: dl1-kaf201
    osd: 64 osds: 19 up, 51 in
         flags nodown

  data:
    pools:   3 pools, 10240 pgs
    objects: 179k objects, 712 GB
    usage:   2135 GB used, 461 TB / 463 TB avail
    pgs:     20.879% pgs unknown
             79.121% pgs not active
             215798/552339 objects degraded (39.070%)
             3381 stale+undersized+degraded+peered
             3169 peering
             2138 unknown
             1285 undersized+degraded+peered
             262  stale+peering
             5    down

Files

ceph.versions (518 Bytes) - ceph versions - Frank Li, 02/01/2018 04:28 AM
ceph.osd.tree.log (5.46 KB) - ceph osd tree output - Frank Li, 02/01/2018 04:28 AM
ceph_health_detail.log (13.3 KB) - ceph health details - Frank Li, 02/01/2018 04:28 AM
dl1approd-mon.dl1-kaf101.log.prob.1.gz (960 KB) - Frank Li, 02/06/2018 01:19 AM
dl1approd-mon.dl1-kaf101.log.prob.2.gz (902 KB) - Frank Li, 02/06/2018 01:22 AM

Related issues: 1 (0 open, 1 closed)

Copied to RADOS - Backport #22942: luminous: ceph osd force-create-pg causes all ceph-mon to crash and be unable to come up again (Resolved, Nathan Cutler)
