Bug #47207

open

Mon crashes while adding an OSD

Added by 伟 宋 over 3 years ago. Updated almost 3 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A mon crashes while an OSD is being added.

The OSD was not successfully added to the CRUSH map.
Mon log:

debug 2020-08-26 13:30:07.716406 7fe48528c000 1 mon.b@-1(probing).paxosservice(auth 4251..4410) refresh upgraded, format 0 > 2
debug 2020-08-26 13:30:07.720053 7fe478aba700 0 -- 10.246.12.67:0/1 >> 10.246.12.67:6800/1 conn(0x55a270b16000 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=0).handle_connect_reply connect got BADAUTHORIZER
debug 2020-08-26 13:30:07.720654 7fe48528c000 0 mon.b@-1(probing) e3 my rank is now 1 (was 1)
debug 2020-08-26 13:30:07.722648 7fe478aba700 0 -- 10.246.12.67:0/1 >> 10.246.12.67:6800/1 conn(0x55a270b16000 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=0).handle_connect_reply connect got BADAUTHORIZER
debug 2020-08-26 13:30:07.728686 7fe478aba700 0 -- 10.246.12.67:6789/0 >> 10.246.12.68:6789/0 conn(0x55a270b1a800 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=422196 cs=1 l=0).process missed message? skipped from seq 0 to 1996856113
debug 2020-08-26 13:30:07.728908 7fe47bac0700 0 log_channel(cluster) log [INF] : mon.b calling monitor election
debug 2020-08-26 13:30:07.728991 7fe47bac0700 1 mon.b@1(electing).elector(104) init, last seen epoch 104
debug 2020-08-26 13:30:09.223424 7fe47bac0700 0 log_channel(cluster) log [INF] : mon.b calling monitor election
debug 2020-08-26 13:30:09.223537 7fe47bac0700 1 mon.b@1(electing).elector(109) init, last seen epoch 109, mid-election, bumping
debug 2020-08-26 13:30:14.226478 7fe47e2c5700 0 log_channel(cluster) log [INF] : mon.b is new leader, mons b,c in quorum (ranks 1,2)
debug 2020-08-26 13:30:14.235083 7fe47bac0700 0 log_channel(cluster) log [DBG] : monmap e3: 3 mons at {a=10.246.12.66:6789/0,b=10.246.12.67:6789/0,c=10.246.12.68:6789/0}
debug 2020-08-26 13:30:14.235131 7fe47bac0700 0 log_channel(cluster) log [DBG] : fsmap
debug 2020-08-26 13:30:14.239030 7fe47bac0700 0 log_channel(cluster) log [DBG] : osdmap e9855: 76 total, 67 up, 67 in
debug 2020-08-26 13:30:14.239320 7fe47bac0700 0 log_channel(cluster) log [DBG] : mgrmap e18: a(active)
debug 2020-08-26 13:30:14.239888 7fe47bac0700 0 log_channel(cluster) log [WRN] : Health check update: 1/3 mons down, quorum b,c (MON_DOWN)
debug 2020-08-26 13:30:14.264895 7fe4772b7700 0 log_channel(cluster) log [WRN] : overall HEALTH_WARN noout,nobackfill,norecover flag(s) set; 4024544/10408170 objects misplaced (38.667%); Reduced data availability: 155 pgs inactive, 151 pgs peering; Degraded data redundancy: 4836/10408170 objects degraded (0.046%), 417 pgs degraded; 1497 slow requests are blocked > 32 sec. Implicated osds 0,1,2,3,5,6,7,8,9,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,28,29,30,32,33,34,36,37,38,39; 1/3 mons down, quorum b,c
audit 2020-08-26 13:25:02.157074 mon.c mon.2 10.246.12.68:6789/0 145554 : audit [INF] from='osd.71 10.246.13.8:6800/18091' entity='osd.71' cmd=[{"prefix": "osd crush create-or-move", "id": 71, "weight":1.6374, "args": ["host=storage08", "root=default"]}]: dispatch
audit 2020-08-26 13:25:02.157364 mon.c mon.2 10.246.12.68:6789/0 145555 : audit [INF] from='osd.63 10.246.13.8:6804/18936' entity='osd.63' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["63"]}]: dispatch
audit 2020-08-26 13:25:02.157473 mon.c mon.2 10.246.12.68:6789/0 145556 : audit [INF] from='osd.42 10.246.13.8:6802/18921' entity='osd.42' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["42"]}]: dispatch
audit 2020-08-26 13:25:02.157580 mon.c mon.2 10.246.12.68:6789/0 145557 : audit [INF] from='osd.67 10.246.13.8:6810/18961' entity='osd.67' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["67"]}]: dispatch
cluster 2020-08-26 13:29:40.168133 mon.c mon.2 10.246.12.68:6789/0 145558 : cluster [INF] mon.c calling monitor election
audit 2020-08-26 13:29:45.212443 mon.c mon.2 10.246.12.68:6789/0 145559 : audit [INF] from='osd.47 10.246.13.8:6816/19406' entity='osd.47' cmd=[{"prefix": "osd crush create-or-move", "id": 47, "weight":1.6374, "args": ["host=storage08", "root=default"]}]: dispatch
audit 2020-08-26 13:29:47.212581 mon.c mon.2 10.246.12.68:6789/0 145560 : audit [INF] from='osd.51 10.246.13.8:6812/19310' entity='osd.51' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["51"]}]: dispatch
audit 2020-08-26 13:29:48.234172 mon.c mon.2 10.246.12.68:6789/0 145561 : audit [DBG] from='client.? 10.246.12.66:0/11210713' entity='client.admin' cmd=[{"prefix": "osd tree"}]: dispatch
audit 2020-08-26 13:29:49.680686 mon.c mon.2 10.246.12.68:6789/0 145562 : audit [INF] from='osd.75 10.246.13.8:6814/19404' entity='osd.75' cmd=[{"prefix": "osd crush create-or-move", "id": 75, "weight":1.6374, "args": ["host=storage08", "root=default"]}]: dispatch
cluster 2020-08-26 13:30:07.728918 mon.b mon.1 10.246.12.67:6789/0 1 : cluster [INF] mon.b calling monitor election
cluster 2020-08-26 13:30:09.212353 mon.c mon.2 10.246.12.68:6789/0 145563 : cluster [INF] mon.c calling monitor election
cluster 2020-08-26 13:30:09.223434 mon.b mon.1 10.246.12.67:6789/0 2 : cluster [INF] mon.b calling monitor election
cluster 2020-08-26 13:30:14.226487 mon.b mon.1 10.246.12.67:6789/0 3 : cluster [INF] mon.b is new leader, mons b,c in quorum (ranks 1,2)
cluster 2020-08-26 13:30:14.235085 mon.b mon.1 10.246.12.67:6789/0 4 : cluster [DBG] monmap e3: 3 mons at {a=10.246.12.66:6789/0,b=10.246.12.67:6789/0,c=10.246.12.68:6789/0}
cluster 2020-08-26 13:30:14.235132 mon.b mon.1 10.246.12.67:6789/0 5 : cluster [DBG] fsmap
cluster 2020-08-26 13:30:14.239109 mon.b mon.1 10.246.12.67:6789/0 6 : cluster [DBG] osdmap e9855: 76 total, 67 up, 67 in
cluster 2020-08-26 13:30:14.239374 mon.b mon.1 10.246.12.67:6789/0 7 : cluster [DBG] mgrmap e18: a(active)
cluster 2020-08-26 13:30:14.239953 mon.b mon.1 10.246.12.67:6789/0 8 : cluster [WRN] Health check update: 1/3 mons down, quorum b,c (MON_DOWN)
cluster 2020-08-26 13:30:14.264899 mon.b mon.1 10.246.12.67:6789/0 9 : cluster [WRN] overall HEALTH_WARN noout,nobackfill,norecover flag(s) set; 4024544/10408170 objects misplaced (38.667%); Reduced data availability: 155 pgs inactive, 151 pgs peering; Degraded data redundancy: 4836/10408170 objects degraded (0.046%), 417 pgs degraded; 1497 slow requests are blocked > 32 sec. Implicated osds 0,1,2,3,5,6,7,8,9,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,28,29,30,32,33,34,36,37,38,39; 1/3 mons down, quorum b,c
debug 2020-08-26 13:30:14.356554 7fe47bac0700 0 mon.b@1(leader) e3 handle_command mon_command({"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["51"]} v 0) v1
debug 2020-08-26 13:30:14.356647 7fe47bac0700 0 log_channel(audit) log [INF] : from='osd.51 -' entity='osd.51' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["51"]}]: dispatch
debug 2020-08-26 13:30:14.358055 7fe47bac0700 0 mon.b@1(leader) e3 handle_command mon_command({"prefix": "osd crush create-or-move", "id": 47, "weight":1.6374, "args": ["host=storage08", "root=default"]} v 0) v1
debug 2020-08-26 13:30:14.358160 7fe47bac0700 0 log_channel(audit) log [INF] : from='osd.47 10.246.13.8:6816/19406' entity='osd.47' cmd=[{"prefix": "osd crush create-or-move", "id": 47, "weight":1.6374, "args": ["host=storage08", "root=default"]}]: dispatch
debug 2020-08-26 13:30:14.358536 7fe47bac0700 0 mon.b@1(leader).osd e9855 create-or-move crush item name 'osd.47' initial_weight 1.6374 at location {host=storage08,root=default}
src/tcmalloc.cc:284] Attempt to free invalid pointer 0x3ffddff8eeacfffb

Actions #1

Updated by Greg Farnum almost 3 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (Monitor)