Bug #13067
MDSRank unhealthy on hammer -> infernalis upgrade
Status:
closed
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2015-09-11 21:58:47.474680 7f3230796700 1 mds.0.2 handle_mds_map i am now mds.0.2
2015-09-11 21:58:47.474683 7f3230796700 1 mds.0.2 handle_mds_map state change up:rejoin --> up:active
2015-09-11 21:58:47.474692 7f3230796700 1 mds.0.2 recovery_done -- successful recovery!
2015-09-11 21:58:47.474822 7f3230796700 1 mds.0.2 active_start
2015-09-11 21:58:47.474843 7f3230796700 1 mds.0.2 cluster recovered.
2015-09-11 22:12:26.319204 7f3230796700 0 monclient: hunting for new mon
2015-09-11 22:12:46.434194 7f322d68f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:12:46.434209 7f322d68f700 1 mds.beacon.a _send skipping beacon, heartbeat map not healthy
2015-09-11 22:12:50.119904 7f323279a700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:12:50.434380 7f322d68f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:12:50.434389 7f322d68f700 1 mds.beacon.a _send skipping beacon, heartbeat map not healthy
2015-09-11 22:12:54.434563 7f322d68f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:12:54.434579 7f322d68f700 1 mds.beacon.a _send skipping beacon, heartbeat map not healthy
2015-09-11 22:12:55.120096 7f323279a700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:12:58.434754 7f322d68f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:12:58.434770 7f322d68f700 1 mds.beacon.a _send skipping beacon, heartbeat map not healthy
2015-09-11 22:13:00.120301 7f323279a700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:13:02.434936 7f322d68f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:13:02.434952 7f322d68f700 1 mds.beacon.a _send skipping beacon, heartbeat map not healthy
2015-09-11 22:13:05.120505 7f323279a700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2015-09-11 22:13:06.435116 7f322d68f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
Meanwhile, the mon says:
2015-09-11T15:27:52.700 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN mds a is laggy
2015-09-11T15:27:59.702 INFO:teuthology.orchestra.run.vpm110:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-09-11T15:27:59.928 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN mds a is laggy
2015-09-11T15:28:06.929 INFO:teuthology.orchestra.run.vpm110:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2015-09-11T15:28:07.177 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN mds a is laggy
2015-09-11T15:28:14.177 INFO:teuthology.orchestra.run.vpm110:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'

/a/sage-2015-09-11_14:28:20-upgrade:hammer-x-master---basic-vps/1050825
Updated by John Spray over 8 years ago
Hmm, kinda interesting that it's happening at the point in the log where the mons are restarted. Something in monc blocking perhaps?
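One rough way to check that timing against the MDS log is to measure the gap between the "hunting for new mon" entry and the first heartbeat timeout that follows it. The sketch below is only an illustration: the helper names (`parse`, `gap_after_mon_hunt`) and the regular expression are hypothetical and not part of Ceph, and it assumes every entry has the `date time thread level message` layout seen in the description.

```python
import re
from datetime import datetime

# Assumed entry layout, taken from the log in the description:
#   2015-09-11 22:12:26.319204 7f3230796700 0 monclient: hunting for new mon
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) (\S+) (-?\d+) (.*)$"
)


def parse(line):
    """Return (timestamp, thread_id, message), or None if the line does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group(2), m.group(4)


def gap_after_mon_hunt(lines):
    """Seconds from the first 'hunting for new mon' entry to the first
    heartbeat timeout logged after it, or None if either is missing."""
    hunt = None
    for line in lines:
        entry = parse(line)
        if entry is None:
            continue
        ts, _thread, msg = entry
        if hunt is None and "hunting for new mon" in msg:
            hunt = ts
        elif hunt is not None and "had timed out" in msg:
            return (ts - hunt).total_seconds()
    return None


# Three entries copied from the MDS log in the description.
SAMPLE = [
    "2015-09-11 21:58:47.474843 7f3230796700 1 mds.0.2 cluster recovered.",
    "2015-09-11 22:12:26.319204 7f3230796700 0 monclient: hunting for new mon",
    "2015-09-11 22:12:46.434194 7f322d68f700 1 heartbeat_map is_healthy "
    "'MDSRank' had timed out after 15",
]

if __name__ == "__main__":
    print(gap_after_mon_hunt(SAMPLE))
```

For the excerpt in this report the gap is about 20.1 seconds, which is longer than the 15 quoted in the "had timed out after 15" message. That fits the observation that the trouble starts at the mon restart, but it does not by itself show where the thread is blocked.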
Updated by Zheng Yan over 8 years ago
- Status changed from New to Fix Under Review