Bug #48783

Raw OSDs are not started on boot after upgrade from 14.2.11 to 14.2.16; ceph-volume raw activate claims systemd support is not yet implemented

Added by Ronny Aasen 15 days ago. Updated 15 days ago.

Status: New
Priority: Normal
Assignee: -
Category: OSD
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

Hello

After upgrading ceph-osd from 14.2.11 to 14.2.16, the raw OSDs on the node no longer autostart on boot. ceph-volume simple no longer sees the OSDs.
ceph-volume raw list does find all OSDs, but ceph-volume raw activate --device /dev/xx just states "systemd support not yet implemented".

Manually starting the OSD fails, since the partition is not mounted.
If I manually mount the partition, change the ownership of the block device, and start the service, the OSD starts.

# ceph-osd --version
ceph version 14.2.16 (5d5ae817209e503a412040d46b3374855b7efe04) nautilus (stable)
# ceph-volume raw list
{
    "15": {
    "ceph_fsid": "a43417f6-0c7c-4a58-a7ed-a1b9a7922cb5",
    "device": "/dev/nvme3n1p2",
    "osd_id": 15,
    "osd_uuid": "92ba9b4d-bd69-427e-ad12-80cc144d44fe",
    "type": "bluestore"
    },
    "16": {
    "ceph_fsid": "a43417f6-0c7c-4a58-a7ed-a1b9a7922cb5",
    "device": "/dev/mapper/ceph--78dfedd8--9603--4c07--93c5--53c1fb766604-osd--block--4e1aafb1--678e--4d10--95ad--fd2a05c418c4",
    "osd_id": 16,
    "osd_uuid": "4e1aafb1-678e-4d10-95ad-fd2a05c418c4",
    "type": "bluestore"
    },
    "6": {
    "ceph_fsid": "a43417f6-0c7c-4a58-a7ed-a1b9a7922cb5",
    "device": "/dev/nvme1n1p2",
    "osd_id": 6,
    "osd_uuid": "4c70ce0f-9bf7-444c-b643-dae2efb2a514",
    "type": "bluestore"
    },
    "7": {
    "ceph_fsid": "a43417f6-0c7c-4a58-a7ed-a1b9a7922cb5",
    "device": "/dev/sda2",
    "osd_id": 7,
    "osd_uuid": "b9d8c85c-a8b0-426c-9545-f26b78f7e1a6",
    "type": "bluestore"
    },
    "8": {
    "ceph_fsid": "a43417f6-0c7c-4a58-a7ed-a1b9a7922cb5",
    "device": "/dev/nvme2n1p2",
    "osd_id": 8,
    "osd_uuid": "de98bc32-a41e-4e8d-8f38-c37e406e0fe2",
    "type": "bluestore"
    }
    }
# ceph-volume raw activate --device /dev/sda2
--> systemd support not yet implemented

root@node-a:~# systemctl start ceph-osd@7.service
root@node-a:~# systemctl status ceph-osd@7.service
● ceph-osd@7.service - Ceph object storage daemon osd.7
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: enabled)
Drop-In: /lib/systemd/system/ceph-osd@.service.d
└─ceph-after-pve-cluster.conf
Active: failed (Result: exit-code) since Thu 2021-01-07 11:02:36 CET; 5s ago
Process: 12857 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 7 (code=exited, status=0/SUCCESS)
Process: 12861 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 7 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 12861 (code=exited, status=1/FAILURE)

Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Service RestartSec=100ms expired, scheduling restart.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Scheduled restart job, restart counter is at 3.
Jan 07 11:02:36 node-a systemd[1]: Stopped Ceph object storage daemon osd.7.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Start request repeated too quickly.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Failed with result 'exit-code'.
Jan 07 11:02:36 node-a systemd[1]: Failed to start Ceph object storage daemon osd.7.

# journalctl -u ceph-osd@7.service
-- Logs begin at Thu 2021-01-07 10:54:12 CET, end at Thu 2021-01-07 11:03:04 CET. --
Jan 07 11:02:35 node-a systemd[1]: Starting Ceph object storage daemon osd.7...
Jan 07 11:02:35 node-a systemd[1]: Started Ceph object storage daemon osd.7.
Jan 07 11:02:35 node-a ceph-osd[12829]: 2021-01-07 11:02:35.856 7f8d40ac4c80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-7/keyring: (2) No such file or directory
Jan 07 11:02:35 node-a ceph-osd[12829]: 2021-01-07 11:02:35.856 7f8d40ac4c80 -1 AuthRegistry(0x55d45a52a140) no keyring found at /var/lib/ceph/osd/ceph-7/keyring, disabling cephx
Jan 07 11:02:35 node-a ceph-osd[12829]: 2021-01-07 11:02:35.856 7f8d40ac4c80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-7/keyring: (2) No such file or directory
Jan 07 11:02:35 node-a ceph-osd[12829]: 2021-01-07 11:02:35.856 7f8d40ac4c80 -1 AuthRegistry(0x7fffd33ba268) no keyring found at /var/lib/ceph/osd/ceph-7/keyring, disabling cephx
Jan 07 11:02:35 node-a ceph-osd[12829]: failed to fetch mon config (--no-mon-config to skip)
Jan 07 11:02:35 node-a systemd[1]: ceph-osd@7.service: Main process exited, code=exited, status=1/FAILURE
Jan 07 11:02:35 node-a systemd[1]: ceph-osd@7.service: Failed with result 'exit-code'.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Service RestartSec=100ms expired, scheduling restart.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Scheduled restart job, restart counter is at 1.
Jan 07 11:02:36 node-a systemd[1]: Stopped Ceph object storage daemon osd.7.
Jan 07 11:02:36 node-a systemd[1]: Starting Ceph object storage daemon osd.7...
Jan 07 11:02:36 node-a systemd[1]: Started Ceph object storage daemon osd.7.
Jan 07 11:02:36 node-a ceph-osd[12848]: 2021-01-07 11:02:36.132 7fdc448fcc80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-7/keyring: (2) No such file or directory
Jan 07 11:02:36 node-a ceph-osd[12848]: 2021-01-07 11:02:36.132 7fdc448fcc80 -1 AuthRegistry(0x55b38a220140) no keyring found at /var/lib/ceph/osd/ceph-7/keyring, disabling cephx
Jan 07 11:02:36 node-a ceph-osd[12848]: 2021-01-07 11:02:36.136 7fdc448fcc80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-7/keyring: (2) No such file or directory
Jan 07 11:02:36 node-a ceph-osd[12848]: 2021-01-07 11:02:36.136 7fdc448fcc80 -1 AuthRegistry(0x7ffdac313ed8) no keyring found at /var/lib/ceph/osd/ceph-7/keyring, disabling cephx
Jan 07 11:02:36 node-a ceph-osd[12848]: failed to fetch mon config (--no-mon-config to skip)
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Main process exited, code=exited, status=1/FAILURE
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Failed with result 'exit-code'.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Service RestartSec=100ms expired, scheduling restart.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Scheduled restart job, restart counter is at 2.
Jan 07 11:02:36 node-a systemd[1]: Stopped Ceph object storage daemon osd.7.
Jan 07 11:02:36 node-a systemd[1]: Starting Ceph object storage daemon osd.7...
Jan 07 11:02:36 node-a systemd[1]: Started Ceph object storage daemon osd.7.
Jan 07 11:02:36 node-a ceph-osd[12861]: 2021-01-07 11:02:36.380 7fafb1ea5c80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-7/keyring: (2) No such file or directory
Jan 07 11:02:36 node-a ceph-osd[12861]: 2021-01-07 11:02:36.380 7fafb1ea5c80 -1 AuthRegistry(0x55a8d2ff6140) no keyring found at /var/lib/ceph/osd/ceph-7/keyring, disabling cephx
Jan 07 11:02:36 node-a ceph-osd[12861]: 2021-01-07 11:02:36.380 7fafb1ea5c80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-7/keyring: (2) No such file or directory
Jan 07 11:02:36 node-a ceph-osd[12861]: 2021-01-07 11:02:36.380 7fafb1ea5c80 -1 AuthRegistry(0x7ffc0036e298) no keyring found at /var/lib/ceph/osd/ceph-7/keyring, disabling cephx
Jan 07 11:02:36 node-a ceph-osd[12861]: failed to fetch mon config (--no-mon-config to skip)
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Main process exited, code=exited, status=1/FAILURE
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Failed with result 'exit-code'.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Service RestartSec=100ms expired, scheduling restart.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Scheduled restart job, restart counter is at 3.
Jan 07 11:02:36 node-a systemd[1]: Stopped Ceph object storage daemon osd.7.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Start request repeated too quickly.
Jan 07 11:02:36 node-a systemd[1]: ceph-osd@7.service: Failed with result 'exit-code'.
Jan 07 11:02:36 node-a systemd[1]: Failed to start Ceph object storage daemon osd.7.
## manual start ##
# mount /dev/sda1 /var/lib/ceph/osd/ceph-7/
# ls -lh /var/lib/ceph/osd/ceph-7
total 56K
-rw-r--r-- 1 root root 402 Mar 15 2018 activate.monmap
-rw-r--r-- 1 ceph ceph 3 Mar 15 2018 active
lrwxrwxrwx 1 ceph ceph 58 Mar 15 2018 block -> /dev/disk/by-partuuid/93c12ed7-5361-434e-ab17-bf0cf40fe3dd
[snip]
# chown ceph:ceph /dev/sda2
# systemctl start ceph-osd@7.service

History

#1 Updated by Ronny Aasen 15 days ago

Hello

If one starts the OSDs manually with the method above:

mount /dev/sda1 /var/lib/ceph/osd/ceph-7
chown ceph:ceph /dev/sda2
systemctl start ceph-osd@7.service

then ceph-volume simple scan works and generates the /etc/ceph/osd/*.json files,
and ceph-volume simple activate --all detects, starts, and enables the OSDs,
so they come up again on reboot.
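
For anyone hitting the same problem, the workaround might be scripted roughly as follows. This is only a sketch: print_osd_workaround is a hypothetical helper name, the device names are the ones reported for osd.7, and the script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: prints the workaround commands instead of running them.
# print_osd_workaround is a hypothetical helper, not part of ceph-volume.
print_osd_workaround() {
    osd_id=$1      # OSD id, e.g. 7
    data_part=$2   # small data partition, e.g. /dev/sda1
    block_dev=$3   # bluestore block partition, e.g. /dev/sda2

    echo "mount ${data_part} /var/lib/ceph/osd/ceph-${osd_id}"
    echo "chown ceph:ceph ${block_dev}"
    echo "systemctl start ceph-osd@${osd_id}.service"
}

# Values for osd.7 as reported in this ticket.
print_osd_workaround 7 /dev/sda1 /dev/sda2

# With the OSDs running, these two commands were reported to generate
# /etc/ceph/osd/*.json and to enable the OSDs for the next boot.
echo "ceph-volume simple scan"
echo "ceph-volume simple activate --all"
```

Each OSD on the node needs its own print_osd_workaround call with the matching data and block partitions. The scan and activate steps only need to run once, after all OSDs are up.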
