Bug #12696
ceph-osd starts before network, fails
Status: Closed
Description
In our environment, we found at startup that some of the OSDs fail to boot; the log output is as follows:
OSD log:
2015-08-11 17:32:15.443843 7fe538fad880 0 ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), process ceph-osd, pid 7898
2015-08-11 17:32:15.447847 7fe538fad880 -1 unable to find any IP address in networks: 111.111.111.0/24
Message log:
Aug 11 17:32:15 ceph2 avahi-daemon[1779]: Registering new address record for fe80::92e2:baff:fe57:ec05 on enp7s0f1.*.
Aug 11 17:32:15 ceph2 systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 56 --pid-file /var/run/ceph/osd.56.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Aug 11 17:32:15 ceph2 systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 56 --pid-file /var/run/ceph/osd.56.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:mount_activate: activate /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.0ebfabba-ae88-4ec8-befa-af12354da21e
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:Mounting /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.0ebfabba-ae88-4ec8-befa-af12354da21e on /var/lib/ceph/tmp/mnt.G_UXjH with options rw,noexec,nodev,noatime,nodiratime,nobarrier
Aug 11 17:32:15 ceph2 bash[7894]: 2015-08-11 17:32:15.447847 7fe538fad880 -1 unable to find any IP address in networks: 111.111.111.0/24
Aug 11 17:32:15 ceph2 systemd[1]: run-7893.service: main process exited, code=exited, status=1/FAILURE
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:clean_mpoint: begin cleaning up mpoint for osd.51
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:clean_mpoint: disk path has no changed
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:clean mount point finished
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:ceph osd.51 already mounted in position; unmounting ours.
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.G_UXjH
Aug 11 17:32:15 ceph2 ceph[2350]: WARNING:ceph-disk:Starting ceph osd.51...
Aug 11 17:32:15 ceph2 ceph[2350]: === osd.51 ===
Aug 11 17:32:15 ceph2 avahi-daemon[1779]: Joining mDNS multicast group on interface enp7s0f1.IPv4 with address 111.111.111.242.
Aug 11 17:32:15 ceph2 avahi-daemon[1779]: New relevant interface enp7s0f1.IPv4 for mDNS.
Aug 11 17:32:15 ceph2 avahi-daemon[1779]: Registering new address record for 111.111.111.242 on enp7s0f1.IPv4.
Aug 11 17:32:16 ceph2 network[6084]: Bringing up interface enp7s0f1: [ OK ]
Cause analysis:
When the server node boots, the cluster network is being configured at the same time the OSD processes start. If an OSD scans the interfaces before the cluster network address is up, it fails with the error above.
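At the service level, this race is commonly avoided by ordering the OSD unit after the network is fully up. A hedged sketch of a systemd drop-in, assuming a ceph-osd@.service unit as shipped in later Ceph releases (the log above shows transient run-*.service units, which a drop-in like this would not cover):

    # /etc/systemd/system/ceph-osd@.service.d/10-network-online.conf
    # Delay OSD start until the network is fully configured; this requires a
    # wait-online service (e.g. NetworkManager-wait-online) to be enabled.
    [Unit]
    Wants=network-online.target
    After=network-online.target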
Proposed solution:
Retry in the function fill_in_one_address (src/common/pick_address.cc) until an address in the cluster network can be found; see the sketch below.
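As a hedged illustration of the retry idea (hypothetical names; the real fill_in_one_address in pick_address.cc has a different signature and uses Ceph's own config and logging), here is a minimal standalone C++ sketch that scans interfaces with getifaddrs() and retries until an IPv4 address inside the configured network appears:

    #include <arpa/inet.h>   // inet_pton, inet_ntop
    #include <ifaddrs.h>     // getifaddrs, freeifaddrs
    #include <netinet/in.h>  // sockaddr_in
    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <string>
    #include <thread>

    // Return true and fill `out` if some interface has an IPv4 address
    // inside `cidr` (e.g. "111.111.111.0/24").
    static bool find_address_in_network(const std::string& cidr, std::string& out) {
      const auto slash = cidr.find('/');
      if (slash == std::string::npos) return false;
      in_addr net{};
      if (inet_pton(AF_INET, cidr.substr(0, slash).c_str(), &net) != 1) return false;
      const int prefix = std::stoi(cidr.substr(slash + 1));
      // Netmask in network byte order; guard the prefix==0 shift.
      const uint32_t mask = prefix ? htonl(~uint32_t{0} << (32 - prefix)) : 0;

      ifaddrs* ifas = nullptr;
      if (getifaddrs(&ifas) != 0) return false;
      bool found = false;
      for (ifaddrs* ifa = ifas; ifa; ifa = ifa->ifa_next) {
        if (!ifa->ifa_addr || ifa->ifa_addr->sa_family != AF_INET) continue;
        const auto* sin = reinterpret_cast<const sockaddr_in*>(ifa->ifa_addr);
        if ((sin->sin_addr.s_addr & mask) == (net.s_addr & mask)) {
          char buf[INET_ADDRSTRLEN];
          inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf));
          out = buf;
          found = true;
          break;
        }
      }
      freeifaddrs(ifas);
      return found;
    }

    int main() {
      const std::string network = "111.111.111.0/24";  // cluster network from the log
      std::string addr;
      for (int attempt = 1; attempt <= 30; ++attempt) {  // ~60 s of retries
        if (find_address_in_network(network, addr)) {
          std::printf("found %s in %s\n", addr.c_str(), network.c_str());
          return 0;
        }
        std::fprintf(stderr, "no address in %s yet (attempt %d/30), retrying\n",
                     network.c_str(), attempt);
        std::this_thread::sleep_for(std::chrono::seconds(2));
      }
      std::fprintf(stderr, "unable to find any IP address in networks: %s\n",
                   network.c_str());
      return 1;
    }

A bounded retry like this keeps the daemon from failing permanently when it merely raced the network scripts, while still surfacing the original fatal error if the network never comes up.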
Updated by huanwen ren over 8 years ago
Updated by Sage Weil over 8 years ago
- Subject changed from fill_in_one_address's not retry on pick_address.cc to ceph-osd starts before network, fails
- Status changed from New to Need More Info
- Assignee set to Sage Weil
- Priority changed from Normal to Urgent
- Source changed from other to Community (user)
Updated by Sage Weil over 8 years ago
- what OS/distro is this?
- is the ceph service enabled? the final start should have done an activate-all which would have caught this case. Do you have the full startup log so we can see what happens the second time osd.51 is started?
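For reference, that catch-up pass can be run by hand with ceph-disk's activate-all subcommand, which re-activates all prepared OSD data partitions (the same mechanism the final start is expected to invoke):

    # Re-activate every prepared OSD partition; useful to confirm
    # osd.51 comes up on a second attempt once the network is up.
    ceph-disk activate-all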
Updated by Samuel Just over 8 years ago
- Status changed from Need More Info to Can't reproduce
Updated by huanwen ren over 8 years ago
- File messages_error added
Sage Weil wrote:
- what OS/distro is this?
- is the ceph service enabled? the final start should have done an activate-all which would have caught this case. Do you have the full startup log so we can see what happens the second time osd.51 is started?
Sorry for the late reply.
My OS info:
Kernel version: Linux version 3.10.0-123.el7.x86_64; Red Hat Enterprise Linux Server release 7.0 (Maipo)
Updated by huanwen ren over 8 years ago
Full startup log:
Please see the attached file.