the logs of a failed MDS:
starting mds.a at
debug 2020-05-12T10:00:53.931+0000 7f6d6f39b700 -1 mds.0.openfiles _load_finish got (2) No such file or directory
cluster 2020-05-12T10:00:53.626686+0000 mgr.a (mgr.14142) 74 : cluster [DBG] pgmap v72: 17 pgs: 17 active+clean; 2.2 KiB data, 812 KiB used, 265 GiB / 268 GiB avail; 1023 B/s rd, 1 op/s
cluster 2020-05-12T10:00:53.923579+0000 mon.a (mon.0) 425 : cluster [INF] Health check cleared: FS_WITH_FAILED_MDS (was: 1 filesystem has a failed mds daemon)
cluster 2020-05-12T10:00:53.926056+0000 mon.a (mon.0) 426 : cluster [DBG] mds.? [v2:172.21.15.31:6826/981588232,v1:172.21.15.31:6827/981588232] up:boot
cluster 2020-05-12T10:00:53.926118+0000 mon.a (mon.0) 427 : cluster [INF] Standby daemon mds.a assigned to filesystem cephfs as rank 0
cluster 2020-05-12T10:00:53.926216+0000 mon.a (mon.0) 428 : cluster [INF] Health check cleared: MDS_ALL_DOWN (was: 1 filesystem is offline)
cluster 2020-05-12T10:00:53.926428+0000 mon.a (mon.0) 429 : cluster [DBG] fsmap cephfs:0/1 1 up:standby, 1 failed
audit 2020-05-12T10:00:53.926571+0000 mon.a (mon.0) 430 : audit [DBG] from='mgr.14142 172.21.15.31:0/3507604701' entity='mgr.a' cmd=[{"prefix": "mds metadata", "who": "a"}]: dispatch
cluster 2020-05-12T10:00:53.928851+0000 mon.a (mon.0) 431 : cluster [DBG] fsmap cephfs:1/1 {0=a=up:replay}
audit 2020-05-12T10:00:54.457020+0000 mon.a (mon.0) 433 : audit [INF] from='mgr.14142 172.21.15.31:0/3507604701' entity='mgr.a' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/host.smithi031","val":"{\"daemons\": {\"mon.a\": {\"hostname\": \"smithi031\", \"container_id\": \"105bf202ec76\", \"container_image_id\": \"ba1862563a7ec5bee9a9a7b56b0087f68457fcc4dec68b196c2f0023b5d5822f\", \"container_image_name\": \"quay.io/ceph-ci/ceph:5d96f0c9612029b065cea7c34cd161b174878f8c\", \"daemon_id\": \"a\", \"daemon_type\": \"mon\", \"version\": \"16.0.0-1393-g5d96f0c9612\", \"status\": 1, \"status_desc\": \"running\", \"last_refresh\": \"2020-05-12T10:00:54.453939\", \"created\": \"2020-05-12T09:58:29.620917\", \"started\": \"2020-05-12T09:58:34.349610\"}, \"mgr.a\": {\"hostname\": \"smithi031\", \"container_id\": \"ae5a1b713eee\", \"container_image_id\": \"ba1862563a7ec5bee9a9a7b56b0087f68457fcc4dec68b196c2f0023b5d5822f\", \"container_image_name\": \"quay.io/ceph-ci/ceph:5d96f0c9612029b065cea7c34cd161b174878f8c\", \"daemon_id\": \"a\", \"daemon_type\": \"mgr\", \"version\": \"16.0.0-1393-g5d96f0c9612\", \"status\": 1, \"status_desc\": \"running\", \"last_refresh\": \"2020-05-12T10:00:54.454013\", \"created\": \"2020-05-12T09:58:35.932808\", \"started\": \"2020-05-12T09:58:35.989411\"}, \"osd.0\": {\"hostname\": \"smithi031\", \"container_id\": \"1f2d6fcda37e\", \"container_image_id\": \"ba1862563a7ec5bee9a9a7b56b0087f68457fcc4dec68b196c2f0023b5d5822f\", \"container_image_name\": \"quay.io/ceph-ci/ceph:5d96f0c9612029b065cea7c34cd161b174878f8c\", \"daemon_id\": \"0\", \"daemon_type\": \"osd\", \"version\": \"16.0.0-1393-g5d96f0c9612\", \"status\": 1, \"status_desc\": \"running\", \"last_refresh\": \"2020-05-12T10:00:54.454047\", \"created\": \"2020-05-12T09:59:29.312886\", \"started\": \"2020-05-12T09:59:30.872638\"}, \"osd.1\": {\"hostname\": \"smithi031\", \"container_id\": \"50d07b96ccd8\", \"container_image_id\": 
\"ba1862563a7ec5bee9a9a7b56b0087f68457fcc4dec68b196c2f0023b5d5822f\", \"container_image_name\": \"quay.io/ceph-ci/ceph:5d96f0c9612029b065cea7c34cd161b174878f8c\", \"daemon_id\": \"1\", \"daemon_type\": \"osd\", \"version\": \"16.0.0-1393-g5d96f0c9612\", \"status\": 1, \"status_desc\": \"running\", \"last_refresh\": \"2020-05-12T10:00:54.454140\", \"created\": \"2020-05-12T09:59:44.896618\", \"started\": \"2020-05-12T09:59:46.443553\"}, \"osd.2\": {\"hostname\": \"smithi031\", \"container_id\": \"30629d30788e\", \"container_image_id\": \"ba1862563a7ec5bee9a9a7b56b0087f68457fcc4dec68b196c2f0023b5d5822f\", \"container_image_name\": \"quay.io/ceph-ci/ceph:5d96f0c9612029b065cea7c34cd161b174878f8c\", \"daemon_id\": \"2\", \"daemon_type\": \"osd\", \"version\": \"16.0.0-1393-g5d96f0c9612\", \"status\": 1, \"status_desc\": \"running\", \"last_refresh\": \"2020-05-12T10:00:54.454204\", \"created\": \"2020-05-12T10:00:00.193354\", \"started\": \"2020-05-12T10:00:01.739960\"}, \"mds.a\": {\"hostname\": \"smithi031\", \"container_id\": \"a053eff517d8\", \"container_image_id\": \"ba1862563a7ec5bee9a9a7b56b0087f68457fcc4dec68b196c2f0023b5d5822f\", \"container_image_name\": \"quay.io/ceph-ci/ceph:5d96f0c9612029b065cea7c34cd161b174878f8c\", \"daemon_id\": \"a\", \"daemon_type\": \"mds\", \"version\": \"16.0.0-1393-g5d96f0c9612\", \"status\": 1, \"status_desc\": \"running\", \"last_refresh\": \"2020-05-12T10:00:54.454265\", \"created\": \"2020-05-12T10:00:07.083235\", \"started\": \"2020-05-12T10:00:52.996854\"}}, \"devices\": [{\"rejected_reasons\": [\"Insufficient space (<5GB) on vgs\", \"LVM detected\", \"locked\"], \"available\": false, \"path\": \"/dev/nvme0n1\", \"sys_api\": {\"removable\": \"0\", \"ro\": \"0\", \"vendor\": \"\", \"model\": \"INTEL SSDPEDMD400G4\", \"rev\": \"\", \"sas_address\": \"\", \"sas_device_handle\": \"\", \"support_discard\": \"512\", \"rotational\": \"0\", \"nr_requests\": \"1023\", \"scheduler_mode\": \"none\", \"partitions\": {}, \"sectors\": 0, 
\"sectorsize\": \"512\", \"size\": 400088457216.0, \"human_readable_size\": \"372.61 GB\", \"path\": \"/dev/nvme0n1\", \"locked\": 1}, \"lvs\": [{\"name\": \"lv_1\", \"comment\": \"not used by ceph\"}, {\"name\": \"lv_2\", \"osd_id\": \"2\", \"cluster_name\": \"ceph\", \"type\": \"block\", \"osd_fsid\": \"7cef8f05-5030-46eb-9733-7cc96b2329a6\", \"cluster_fsid\": \"09bf22bc-9437-11ea-a069-001a4aab830c\", \"osdspec_affinity\": \"\", \"block_uuid\": \"D2fXzg-9b6C-x9Qk-PvTG-cFEN-pcyi-zGTgCr\"}, {\"name\": \"lv_3\", \"osd_id\": \"1\", \"cluster_name\": \"ceph\", \"type\": \"block\", \"osd_fsid\": \"a856c3cd-7121-42c5-bcce-804b31cf7c33\", \"cluster_fsid\": \"09bf22bc-9437-11ea-a069-001a4aab830c\", \"osdspec_affinity\": \"\", \"block_uuid\": \"hKW7JS-J8Vp-qMu5-RemP-V1eP-EMdD-bVZ2ap\"}, {\"name\": \"lv_4\", \"osd_id\": \"0\", \"cluster_name\": \"ceph\", \"type\": \"block\", \"osd_fsid\": \"02984fcd-ea6e-4bb9-a8ce-7b4165ff17ff\", \"cluster_fsid\": \"09bf22bc-9437-11ea-a069-001a4aab830c\", \"osdspec_affinity\": \"\", \"block_uuid\": \"ltCJoY-aYDA-BgUq-SD2p-2SUO-s8Mo-EVPpxd\"}, {\"name\": \"lv_5\", \"comment\": \"not used by ceph\"}], \"human_readable_type\": \"ssd\", \"device_id\": \"INTEL SSDPEDMD400G4_CVFT53310008400BGN\"}, {\"rejected_reasons\": [\"locked\"], \"available\": false, \"path\": \"/dev/sda\", \"sys_api\": {\"removable\": \"0\", \"ro\": \"0\", \"vendor\": \"ATA\", \"model\": \"ST1000NM0033-9ZM\", \"rev\": \"SN04\", \"sas_address\": \"\", \"sas_device_handle\": \"\", \"support_discard\": \"0\", \"rotational\": \"1\", \"nr_requests\": \"64\", \"scheduler_mode\": \"mq-deadline\", \"partitions\": {\"sda1\": {\"start\": \"2048\", \"sectors\": \"1953522688\", \"sectorsize\": 512, \"size\": 1000203616256.0, \"human_readable_size\": \"931.51 GB\", \"holders\": []}}, \"sectors\": 0, \"sectorsize\": \"512\", \"size\": 1000204886016.0, \"human_readable_size\": \"931.51 GB\", \"path\": \"/dev/sda\", \"locked\": 1}, \"lvs\": [], \"human_readable_type\": \"hdd\", 
\"device_id\": \"ST1000NM0033-9ZM173_Z1W4HQEW\"}], \"daemon_config_deps\": {\"osd.0\": {\"deps\": [], \"last_config\": \"2020-05-12T09:59:27.963479\"}, \"osd.1\": {\"deps\": [], \"last_config\": \"2020-05-12T09:59:43.514126\"}, \"osd.2\": {\"deps\": [], \"last_config\": \"2020-05-12T09:59:58.767400\"}, \"mds.a\": {\"deps\": [], \"last_config\": \"2020-05-12T10:00:48.936080\"}}, \"last_daemon_update\": \"2020-05-12T10:00:54.454344\", \"last_device_update\": \"2020-05-12T10:00:05.549895\", \"networks\": {\"172.21.0.0/20\": [\"172.21.15.31\"]}, \"last_host_check\": \"2020-05-12T09:58:55.496480\"}"}]': finished
audit 2020-05-12T10:00:54.457810+0000 mon.a (mon.0) 434 : audit [INF] from='mgr.14142 172.21.15.31:0/3507604701' entity='mgr.a' cmd=[{"prefix": "config set", "who": "mds.all", "name": "mds_join_fs", "value": "all"}]: dispatch
audit 2020-05-12T10:00:54.458428+0000 mon.a (mon.0) 435 : audit [INF] from='mgr.14142 172.21.15.31:0/3507604701' entity='mgr.a' cmd=[{"prefix": "auth get-or-create", "entity": "mds.a", "caps": ["mon", "profile mds", "osd", "allow rwx", "mds", "allow"]}]: dispatch
audit 2020-05-12T10:00:54.459033+0000 mon.a (mon.0) 436 : audit [DBG] from='mgr.14142 172.21.15.31:0/3507604701' entity='mgr.a' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
audit 2020-05-12T10:00:54.459757+0000 mon.a (mon.0) 437 : audit [DBG] from='mgr.14142 172.21.15.31:0/3507604701' entity='mgr.a' cmd=[{"prefix": "config get", "who": "mds.a", "key": "container_image"}]: dispatch
Stopping Ceph mds.a for 09bf22bc-9437-11ea-a069-001a4aab830c...
debug 2020-05-12T10:00:55.919+0000 7f6d763a9700 -1 received signal: Terminated from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
debug 2020-05-12T10:00:55.919+0000 7f6d763a9700 -1 mds.a *** got signal Terminated ***
cephadm 2020-05-12T10:00:54.459456+0000 mgr.a (mgr.14142) 75 : cephadm [INF] Deploying daemon mds.a on smithi031
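
The entries above are not in strict time order: the cephadm "Deploying daemon mds.a" entry is stamped 10:00:54.459456 but appears after the 10:00:55.919 termination messages. A minimal Python sketch for putting the cluster, audit, and cephadm entries back in timestamp order follows. It assumes the line layout seen in this paste (channel, timestamp, daemon name, rank in parentheses, sequence number, level in brackets); `parse_line` and `sorted_entries` are hypothetical helper names, not part of any Ceph tool. Lines that do not fit that layout, such as the `debug` lines and the wrapped continuation of the long `config-key set` entry, are skipped.

```python
import re
from datetime import datetime

# Layout assumed from the paste above, e.g.:
#   cluster 2020-05-12T10:00:53.926118+0000 mon.a (mon.0) 427 : cluster [INF] ...
LINE_RE = re.compile(
    r"^(?P<channel>cluster|audit|cephadm) "
    r"(?P<stamp>\S+) "
    r"(?P<name>\S+) \((?P<rank>[^)]+)\) "
    r"(?P<seq>\d+) : (?P=channel) "
    r"\[(?P<level>[A-Z]+)\] "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["seq"] = int(entry["seq"])
    # Timestamps look like 2020-05-12T10:00:53.926118+0000
    entry["time"] = datetime.strptime(entry["stamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
    return entry


def sorted_entries(lines):
    """Parse the lines that match and return them ordered by timestamp."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    return sorted(entries, key=lambda e: e["time"])
```

Sorting by the logged timestamp rather than by position in the paste makes it easier to see that the redeploy was dispatched by mgr.a about 1.5 seconds before the MDS received the Terminated signal.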