Bug #20807

Error in boot.log - Failed to start Ceph disk activation - Luminous

Added by Oscar Segarra over 6 years ago. Updated over 6 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
Category:
OSD
Target version:
% Done:

0%

Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-disk
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[root@vdicnode01 ~]# ceph -v
ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)

When I boot my Ceph node (I have an all-in-one setup), I get the following messages in boot.log:

[FAILED] Failed to start Ceph disk activation: /dev/sdb2.
See 'systemctl status ceph-disk@dev-sdb2.service' for details.
[FAILED] Failed to start Ceph disk activation: /dev/sdb1.
See 'systemctl status ceph-disk@dev-sdb1.service' for details.

[root@vdicnode01 ~]# systemctl status ceph-disk@dev-sdb1.service
● ceph-disk@dev-sdb1.service - Ceph disk activation: /dev/sdb1
   Loaded: loaded (/usr/lib/systemd/system/ceph-disk@.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2017-07-27 23:37:23 CEST; 1h 52min ago
  Process: 740 ExecStart=/bin/sh -c timeout $CEPH_DISK_TIMEOUT flock /var/lock/ceph-disk-$(basename %f) /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f (code=exited, status=1/FAILURE)
 Main PID: 740 (code=exited, status=1/FAILURE)

Jul 27 23:37:23 vdicnode01 sh[740]: main(sys.argv[1:])
Jul 27 23:37:23 vdicnode01 sh[740]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5682, in main
Jul 27 23:37:23 vdicnode01 sh[740]: args.func(args)
Jul 27 23:37:23 vdicnode01 sh[740]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4891, in main_trigger
Jul 27 23:37:23 vdicnode01 sh[740]: raise Error('return code ' + str(ret))
Jul 27 23:37:23 vdicnode01 sh[740]: ceph_disk.main.Error: Error: return code 1
Jul 27 23:37:23 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service: main process exited, code=exited, status=1/FAILURE
Jul 27 23:37:23 vdicnode01 systemd[1]: Failed to start Ceph disk activation: /dev/sdb1.
Jul 27 23:37:23 vdicnode01 systemd[1]: Unit ceph-disk@dev-sdb1.service entered failed state.
Jul 27 23:37:23 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service failed.

[root@vdicnode01 ~]# systemctl status ceph-disk@dev-sdb2.service
● ceph-disk@dev-sdb2.service - Ceph disk activation: /dev/sdb2
   Loaded: loaded (/usr/lib/systemd/system/ceph-disk@.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2017-07-27 23:37:23 CEST; 1h 52min ago
  Process: 744 ExecStart=/bin/sh -c timeout $CEPH_DISK_TIMEOUT flock /var/lock/ceph-disk-$(basename %f) /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f (code=exited, status=1/FAILURE)
 Main PID: 744 (code=exited, status=1/FAILURE)

Jul 27 23:37:23 vdicnode01 sh[744]: main(sys.argv[1:])
Jul 27 23:37:23 vdicnode01 sh[744]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5682, in main
Jul 27 23:37:23 vdicnode01 sh[744]: args.func(args)
Jul 27 23:37:23 vdicnode01 sh[744]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4891, in main_trigger
Jul 27 23:37:23 vdicnode01 sh[744]: raise Error('return code ' + str(ret))
Jul 27 23:37:23 vdicnode01 sh[744]: ceph_disk.main.Error: Error: return code 1
Jul 27 23:37:23 vdicnode01 systemd[1]: ceph-disk@dev-sdb2.service: main process exited, code=exited, status=1/FAILURE
Jul 27 23:37:23 vdicnode01 systemd[1]: Failed to start Ceph disk activation: /dev/sdb2.
Jul 27 23:37:23 vdicnode01 systemd[1]: Unit ceph-disk@dev-sdb2.service entered failed state.
Jul 27 23:37:23 vdicnode01 systemd[1]: ceph-disk@dev-sdb2.service failed.

I have created an entry in /etc/fstab in order to mount the journal disk automatically:

/dev/sdb1               /var/lib/ceph/osd/ceph-0   xfs  defaults,noatime  1 2

But when I boot, I get the same error message.
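
The fstab entry itself can be exercised without rebooting; a minimal check (a sketch, assuming the mount point /var/lib/ceph/osd/ceph-0 already exists) would be:

[root@vdicnode01 ~]# mount -a                          # mount everything in /etc/fstab that is not already mounted
[root@vdicnode01 ~]# findmnt /var/lib/ceph/osd/ceph-0  # confirm the device and mount options actually in use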

When I execute ceph -s, the OSD looks to be working perfectly:

[root@vdicnode01 ~]# ceph -s
  cluster:
    id:     61881df3-1365-4139-a586-92b5eca9cf18
    health: HEALTH_WARN
            Degraded data redundancy: 5/10 objects degraded (50.000%), 128 pgs unclean, 128 pgs degraded, 128 pgs undersized
            128 pgs not scrubbed for 86400

  services:
    mon: 1 daemons, quorum vdicnode01
    mgr: vdicnode01(active)
    osd: 1 osds: 1 up, 1 in

  data:
    pools:   1 pools, 128 pgs
    objects: 5 objects, 1349 bytes
    usage:   1073 MB used, 39785 MB / 40858 MB avail
    pgs:     5/10 objects degraded (50.000%)
             128 active+undersized+degraded

History

#1 Updated by Loïc Dachary over 6 years ago

  • Description updated (diff)
  • Assignee set to Loïc Dachary

#2 Updated by Loïc Dachary over 6 years ago

  • Status changed from New to Need More Info

If I understand correctly, you have a machine with just one OSD, on /dev/sdb. After you boot, it works as expected (that's what the ceph -s shows). But when you examine the ceph-disk systemd units (with systemctl status etc.), you see that both of them failed, and you wonder why that is the case. Am I right?

If this is indeed what is going on, I suggest you remove the line you added to fstab, reboot, and see if you get a different result. It would also help to get the relevant logs from journalctl, as sketched below.
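
For reference, the per-unit journal entries could be pulled with something along these lines (a sketch; the unit names are the ones shown in boot.log above):

[root@vdicnode01 ~]# journalctl -b -u ceph-disk@dev-sdb1.service   # messages from the current boot only
[root@vdicnode01 ~]# journalctl -b -u ceph-disk@dev-sdb2.service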

#3 Updated by Oscar Segarra over 6 years ago

Hi,

You are absolutely right. In my environment I have just one OSD, on /dev/sdb, and I'd like to be sure everything is working perfectly before adding new OSDs. On the other hand, since you are working hard on the new release, I thought it would be better to report this in order to help you deliver a better storage solution.

Error messages appear in boot.log with and without the line in /etc/fstab.

I'm on holiday now, and I will be able to send the journalctl log next week.

#4 Updated by Oscar Segarra over 6 years ago

Hi,

I attach the complete log:

[FAILED] Failed to start Ceph disk activation: /dev/sdb1.
See 'systemctl status ceph-disk@dev-sdb1.service' for details.
[FAILED] Failed to start Ceph disk activation: /dev/sdb2.
See 'systemctl status ceph-disk@dev-sdb2.service' for details.
[  OK  ] Started LVM2 PV scan on device 8:2.
[  OK  ] Started LSB: Bring up/down networking.
[  OK  ] Reached target Network.
         Starting Postfix Mail Transport Agent...
         Starting OpenSSH server daemon...
[  OK  ] Reached target Network is Online.
[  OK  ] Started Ceph cluster monitor daemon.
         Starting Ceph cluster monitor daemon...
[  OK  ] Reached target ceph target allowing to start/stop all ceph-mon@.service instances at once.
         Starting Ceph object storage daemon osd.0...
[  OK  ] Started Ceph cluster manager daemon.
         Starting Ceph cluster manager daemon...
[  OK  ] Reached target ceph target allowing to start/stop all ceph-mgr@.service instances at once.
         Starting Dynamic System Tuning Daemon...
         Starting System Logging Service...
         Starting Logout off all iSCSI sessions on shutdown...
         Starting Notify NFS peers of a restart...
[  OK  ] Started Logout off all iSCSI sessions on shutdown.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
         Starting Permit User Sessions...
         Starting Virtualization daemon...
         Starting Crash recovery kernel arming...
[  OK  ] Started Notify NFS peers of a restart.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Command Scheduler.
         Starting Command Scheduler...
         Starting Terminate Plymouth Boot Screen...
         Starting Wait for Plymouth Boot Screen to Quit...
[  OK  ] Started Ceph object storage daemon osd.0.
[  OK  ] Reached target ceph target allowing to start/stop all ceph-osd@.service instances at once.
[  OK  ] Reached target ceph target allowing to start/stop all ceph*@.service instances at once.
[root@vdicnode01 ~]# service ceph-disk@dev-sdb1 status
Redirecting to /bin/systemctl status  ceph-disk@dev-sdb1.service
● ceph-disk@dev-sdb1.service - Ceph disk activation: /dev/sdb1
   Loaded: loaded (/usr/lib/systemd/system/ceph-disk@.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2017-08-08 19:58:31 CEST; 10min ago
  Process: 741 ExecStart=/bin/sh -c timeout $CEPH_DISK_TIMEOUT flock /var/lock/ceph-disk-$(basename %f) /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f (code=exited, status=1/FAILURE)
 Main PID: 741 (code=exited, status=1/FAILURE)

Aug 08 19:58:31 vdicnode01 sh[741]: main(sys.argv[1:])
Aug 08 19:58:31 vdicnode01 sh[741]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5650, in main
Aug 08 19:58:31 vdicnode01 sh[741]: args.func(args)
Aug 08 19:58:31 vdicnode01 sh[741]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4853, in main_trigger
Aug 08 19:58:31 vdicnode01 sh[741]: raise Error('return code ' + str(ret))
Aug 08 19:58:31 vdicnode01 sh[741]: ceph_disk.main.Error: Error: return code 1
Aug 08 19:58:31 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service: main process exited, code=exited, status=1/FAILURE
Aug 08 19:58:31 vdicnode01 systemd[1]: Failed to start Ceph disk activation: /dev/sdb1.
Aug 08 19:58:31 vdicnode01 systemd[1]: Unit ceph-disk@dev-sdb1.service entered failed state.
Aug 08 19:58:31 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service failed.
[root@vdicnode01 ~]#
[root@vdicnode01 ~]#
[root@vdicnode01 ~]#
[root@vdicnode01 ~]# service ceph-disk@dev-sdb1 start
Redirecting to /bin/systemctl start  ceph-disk@dev-sdb1.service
Job for ceph-disk@dev-sdb1.service failed because the control process exited with error code. See "systemctl status ceph-disk@dev-sdb1.service" and "journalctl -xe" for details.
[root@vdicnode01 ~]# systemctl status ceph-disk@dev-sdb1.service
● ceph-disk@dev-sdb1.service - Ceph disk activation: /dev/sdb1
   Loaded: loaded (/usr/lib/systemd/system/ceph-disk@.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2017-08-08 20:09:04 CEST; 7s ago
  Process: 3148 ExecStart=/bin/sh -c timeout $CEPH_DISK_TIMEOUT flock /var/lock/ceph-disk-$(basename %f) /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f (code=exited, status=1/FAILURE)
 Main PID: 3148 (code=exited, status=1/FAILURE)

Aug 08 20:09:04 vdicnode01 sh[3148]: main(sys.argv[1:])
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5650, in main
Aug 08 20:09:04 vdicnode01 sh[3148]: args.func(args)
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4853, in main_trigger
Aug 08 20:09:04 vdicnode01 sh[3148]: raise Error('return code ' + str(ret))
Aug 08 20:09:04 vdicnode01 sh[3148]: ceph_disk.main.Error: Error: return code 1
Aug 08 20:09:04 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service: main process exited, code=exited, status=1/FAILURE
Aug 08 20:09:04 vdicnode01 systemd[1]: Failed to start Ceph disk activation: /dev/sdb1.
Aug 08 20:09:04 vdicnode01 systemd[1]: Unit ceph-disk@dev-sdb1.service entered failed state.
Aug 08 20:09:04 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service failed.
[root@vdicnode01 ~]# journalctl -xe
Aug 08 20:08:08 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:08] "GET /health_data HTTP/1.1" 200 32505 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:13 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:13] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:13 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:13] "GET /health_data HTTP/1.1" 200 32503 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:18 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:18] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:18 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:18] "GET /health_data HTTP/1.1" 200 32505 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:23 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:23] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:23 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:23] "GET /health_data HTTP/1.1" 200 32505 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:28 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:28] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:28 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:28] "GET /health_data HTTP/1.1" 200 32505 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:33 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:33] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:33 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:33] "GET /health_data HTTP/1.1" 200 32505 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:38 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:38] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:38 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:38] "GET /health_data HTTP/1.1" 200 32479 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:43 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:43] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:43 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:43] "GET /health_data HTTP/1.1" 200 32477 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:48 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:48] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:48 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:48] "GET /health_data HTTP/1.1" 200 32475 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:53 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:53] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:53 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:53] "GET /health_data HTTP/1.1" 200 32475 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:58 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:58] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:08:58 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:08:58] "GET /health_data HTTP/1.1" 200 32479 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:09:03 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:09:03] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:09:03 vdicnode01 polkitd[756]: Registered Authentication Agent for unix-process:3131:64999 (system bus name :1.27 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
Aug 08 20:09:03 vdicnode01 systemd[1]: Starting Ceph disk activation: /dev/sdb1...
-- Subject: Unit ceph-disk@dev-sdb1.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit ceph-disk@dev-sdb1.service has begun starting up.
Aug 08 20:09:03 vdicnode01 sh[3148]: main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdb1', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x18dfde8>, log_stdout=True, prepend_to_path='/usr/bin', prog='ceph-disk', setgroup=None, setuser=None, statedir='/var/lib
Aug 08 20:09:03 vdicnode01 sh[3148]: command: Running command: /usr/sbin/init --version
Aug 08 20:09:03 vdicnode01 sh[3148]: command_check_call: Running command: /usr/bin/chown ceph:ceph /dev/sdb1
Aug 08 20:09:03 vdicnode01 sh[3148]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdb1
Aug 08 20:09:03 vdicnode01 sh[3148]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdb1
Aug 08 20:09:03 vdicnode01 sh[3148]: main_trigger: trigger /dev/sdb1 parttype 4fbd7e29-9d25-41b8-afd0-062c0ceff05d uuid f1a4c29b-f210-44e5-9668-ecce4db0f18e
Aug 08 20:09:03 vdicnode01 sh[3148]: command: Running command: /usr/sbin/ceph-disk --verbose activate /dev/sdb1
Aug 08 20:09:03 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:09:03] "GET /health_data HTTP/1.1" 200 32479 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:09:04 vdicnode01 sh[3148]: main_trigger:
Aug 08 20:09:04 vdicnode01 sh[3148]: main_trigger: main_activate: path = /dev/sdb1
Aug 08 20:09:04 vdicnode01 sh[3148]: get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
Aug 08 20:09:04 vdicnode01 sh[3148]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdb1
Aug 08 20:09:04 vdicnode01 sh[3148]: command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdb1
Aug 08 20:09:04 vdicnode01 sh[3148]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
Aug 08 20:09:04 vdicnode01 sh[3148]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
Aug 08 20:09:04 vdicnode01 sh[3148]: mount: Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.8j1k2s with options noatime,inode64
Aug 08 20:09:04 vdicnode01 sh[3148]: command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/tmp/mnt.8j1k2s
Aug 08 20:09:04 vdicnode01 sh[3148]: command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.8j1k2s
Aug 08 20:09:04 vdicnode01 sh[3148]: activate: Cluster uuid is 61881df3-1365-4139-a586-92b5eca9cf18
Aug 08 20:09:04 vdicnode01 sh[3148]: command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
Aug 08 20:09:04 vdicnode01 sh[3148]: mount_activate: Failed to activate
Aug 08 20:09:04 vdicnode01 sh[3148]: unmount: Unmounting /var/lib/ceph/tmp/mnt.8j1k2s
Aug 08 20:09:04 vdicnode01 sh[3148]: command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.8j1k2s
Aug 08 20:09:04 vdicnode01 sh[3148]: Traceback (most recent call last):
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/sbin/ceph-disk", line 9, in <module>
Aug 08 20:09:04 vdicnode01 sh[3148]: load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5699, in run
Aug 08 20:09:04 vdicnode01 sh[3148]: main(sys.argv[1:])
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5650, in main
Aug 08 20:09:04 vdicnode01 sh[3148]: args.func(args)
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3754, in main_activate
Aug 08 20:09:04 vdicnode01 sh[3148]: reactivate=args.reactivate,
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3517, in mount_activate
Aug 08 20:09:04 vdicnode01 sh[3148]: (osd_id, cluster) = activate(path, activate_key_template, init)
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3664, in activate
Aug 08 20:09:04 vdicnode01 sh[3148]: ' with fsid %s' % ceph_fsid)
Aug 08 20:09:04 vdicnode01 sh[3148]: ceph_disk.main.Error: Error: No cluster conf found in /etc/ceph with fsid 61881df3-1365-4139-a586-92b5eca9cf18
Aug 08 20:09:04 vdicnode01 sh[3148]: Traceback (most recent call last):
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/sbin/ceph-disk", line 9, in <module>
Aug 08 20:09:04 vdicnode01 sh[3148]: load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5699, in run
Aug 08 20:09:04 vdicnode01 sh[3148]: main(sys.argv[1:])
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5650, in main
Aug 08 20:09:04 vdicnode01 sh[3148]: args.func(args)
Aug 08 20:09:04 vdicnode01 sh[3148]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4853, in main_trigger
Aug 08 20:09:04 vdicnode01 sh[3148]: raise Error('return code ' + str(ret))
Aug 08 20:09:04 vdicnode01 sh[3148]: ceph_disk.main.Error: Error: return code 1
Aug 08 20:09:04 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service: main process exited, code=exited, status=1/FAILURE
Aug 08 20:09:04 vdicnode01 systemd[1]: Failed to start Ceph disk activation: /dev/sdb1.
-- Subject: Unit ceph-disk@dev-sdb1.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit ceph-disk@dev-sdb1.service has failed.
--
-- The result is failed.
Aug 08 20:09:04 vdicnode01 systemd[1]: Unit ceph-disk@dev-sdb1.service entered failed state.
Aug 08 20:09:04 vdicnode01 systemd[1]: ceph-disk@dev-sdb1.service failed.
Aug 08 20:09:04 vdicnode01 polkitd[756]: Unregistered Authentication Agent for unix-process:3131:64999 (system bus name :1.27, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Aug 08 20:09:08 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:09:08] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:09:09 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:09:09] "GET /health_data HTTP/1.1" 200 32453 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:09:13 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:09:13] "GET /toplevel_data HTTP/1.1" 200 173 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
Aug 08 20:09:14 vdicnode01 ceph-mgr[1158]: 192.168.100.1 - - [08/Aug/2017:20:09:14] "GET /health_data HTTP/1.1" 200 32453 "http://192.168.100.101:7000/health" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" 
[root@vdicnode01 ~]# cat /etc/ceph/ceph.conf
[global]
fsid = d6b54a37-1cbe-483a-94c0-703e072aa6fd
public_network = 192.168.100.0/24
cluster_network = 192.168.100.0/24
mon_initial_members = vdicnode01
mon_host = 192.168.100.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

osd pool default size = 2
rbd_default_format = 2
rbd_cache = false

[root@vdicnode01 ~]#

#5 Updated by Loïc Dachary over 6 years ago

Aug 08 20:09:04 vdicnode01 sh[3148]: ceph_disk.main.Error: Error: No cluster conf found in /etc/ceph with fsid 61881df3-1365-4139-a586-92b5eca9cf18

It looks like this is your problem: the fsid in /etc/ceph/ceph.conf (d6b54a37-1cbe-483a-94c0-703e072aa6fd) does not match the cluster fsid the OSD was prepared with (61881df3-1365-4139-a586-92b5eca9cf18), so ceph-disk refuses to activate it.
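
A minimal way to compare the two values side by side (a sketch, assuming the OSD data directory is mounted at the default /var/lib/ceph/osd/ceph-0):

[root@vdicnode01 ~]# grep fsid /etc/ceph/ceph.conf           # fsid declared in the configuration file
[root@vdicnode01 ~]# cat /var/lib/ceph/osd/ceph-0/ceph_fsid  # cluster fsid recorded on the OSD data partition
[root@vdicnode01 ~]# ceph -s | grep id:                      # fsid of the running cluster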

#6 Updated by Oscar Segarra over 6 years ago

Hi Loic,

I have already seen this line...

Can you provide further information about how to fix this error? Can I update it manually?

Thanks a lot!
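
One possible manual fix, assuming the fsid reported by ceph -s (61881df3-1365-4139-a586-92b5eca9cf18) is the one the cluster actually uses, would be to point /etc/ceph/ceph.conf at it (a sketch; back up the file first and double-check the value before retrying):

[root@vdicnode01 ~]# cp /etc/ceph/ceph.conf /etc/ceph/ceph.conf.bak
[root@vdicnode01 ~]# sed -i 's/^fsid = .*/fsid = 61881df3-1365-4139-a586-92b5eca9cf18/' /etc/ceph/ceph.conf
[root@vdicnode01 ~]# systemctl start ceph-disk@dev-sdb1.service   # retry the activation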

#7 Updated by Loïc Dachary over 6 years ago

  • Status changed from Need More Info to Rejected

I'm not sure why this inconsistency happens. Since it is outside the scope of this issue, I invite you to discuss it on the mailing list if it is still relevant. I'm rejecting this issue because it does not look like a bug. Feel free to ask for it to be re-opened if you disagree.
