Bug #43417

Updated by Kefu Chai almost 3 years ago

I configured a loopback alias (lo:0) with an IP address on the same network segment as the cluster. The network configuration is as follows:

 <pre> 
 eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>    mtu 1500 
         inet 192.168.211.42    netmask 255.255.255.0    broadcast 192.168.211.255 
         inet6 fe80::ea61:1fff:fe16:e7b7    prefixlen 64    scopeid 0x20<link> 
         ether e8:61:1f:16:e7:b7    txqueuelen 1000    (Ethernet) 
         RX packets 2845662864    bytes 2454056684356 (2.2 TiB) 
         RX errors 0    dropped 8338    overruns 0    frame 0 
         TX packets 2573756982    bytes 2075412304729 (1.8 TiB) 
         TX errors 0    dropped 0 overruns 0    carrier 0    collisions 0 
         device memory 0xc5800000-c5fffff 
 lo: flags=73<UP,LOOPBACK,RUNNING>    mtu 65536 
         inet 127.0.0.1    netmask 255.0.0.0 
         inet6 ::1    prefixlen 128    scopeid 0x10<host> 
         loop    txqueuelen 1    (Local Loopback) 
         RX packets 61609662    bytes 155594418449 (144.9 GiB) 
         RX errors 0    dropped 0    overruns 0    frame 0 
         TX packets 61609662    bytes 155594418449 (144.9 GiB) 
         TX errors 0    dropped 0 overruns 0    carrier 0    collisions 0 
 lo:0: flags=73<UP,LOOPBACK,RUNNING>    mtu 65536 
         inet 192.168.211.200    netmask 255.255.255.255 
         loop    txqueuelen 1    (Local Loopback) 
 </pre> 
 The Ceph configuration is as follows:

 <pre> 
 [global] 
 mon_initial_members = 172e18e211e**, 172e18e211e**,172e18e211e** 
 mon_host = 192.168.211.***,192.168.211.***,192.168.211.*** 
 auth_cluster_required = cephx 
 auth_service_required = cephx 
 auth_client_required = cephx 
 public network = 192.168.211.0/24 
 </pre> 


 

 The problem is:
 <pre> 
 [root@172e18e211e42 ~]# systemctl status ceph-osd@8 
 ● ceph-osd@8.service - Ceph object storage daemon osd.8 
    Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled) 
    Active: active (running) since Tue 2019-12-24 09:33:42 CST; 12s ago 
   Process: 1553818 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS) 
  Main PID: 1553823 (ceph-osd) 
    CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@8.service 
            └─1553823 /usr/bin/ceph-osd -f --cluster ceph --id 8 --setuser ceph --setgroup ceph 

 Dec 24 09:33:42 172e18e211e42 systemd[1]: Starting Ceph object storage daemon osd.8... 
 Dec 24 09:33:42 172e18e211e42 systemd[1]: Started Ceph object storage daemon osd.8. 
 Dec 24 09:33:42 172e18e211e42 ceph-osd[1553823]: 2019-12-24 09:33:42.981 7f6304c02d80 -1 Falling back to public interface 
 Dec 24 09:33:45 172e18e211e42 ceph-osd[1553823]: 2019-12-24 09:33:45.820 7f6304c02d80 -1 osd.8 9095 log_to_monitors {default=true} 
 Dec 24 09:33:45 172e18e211e42 ceph-osd[1553823]: 2019-12-24 09:33:45.849 7f62f722a700 -1 osd.8 9095 set_numa_affinity unable to identify public interface 'lo:0' numa node: (2) No such file or directory 
 Dec 24 09:33:51 172e18e211e42 ceph-osd[1553823]: 2019-12-24 09:33:51.758 7f62f722a700 -1 osd.8 9131 set_numa_affinity unable to identify public interface 'lo:0' numa node: (2) No such file or directory 
 </pre> 

 


 The OSD can't restart. I traced the problem to /src/common/ipaddr.cc, where the function

 <pre><code class="cpp"> 
 const struct ifaddrs *find_ipv4_in_subnet(const struct ifaddrs *addrs, 
					   const struct sockaddr_in *net, 
					   unsigned int prefix_len, 
					   int numa_node) 
 </code></pre> 

 can't get the right IP address.
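
 To illustrate the suspected behaviour, here is a minimal sketch, not the actual Ceph code. It assumes the lookup returns the first interface whose address falls inside the public network, and that lo:0 is listed before eth1. @Iface@, @in_subnet@, @pick_iface@ and the @skip_loopback@ switch are hypothetical names made up for this example. With the addresses from this report, the plain subnet match selects lo:0 (192.168.211.200 is inside 192.168.211.0/24), while ignoring interfaces flagged @IFF_LOOPBACK@ selects eth1.

```cpp
#include <arpa/inet.h>   // inet_pton, htonl
#include <net/if.h>      // IFF_UP, IFF_LOOPBACK
#include <netinet/in.h>  // in_addr
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a getifaddrs() list.
struct Iface {
  std::string name;    // e.g. "eth1" or "lo:0"
  std::string ipv4;    // dotted-quad address
  unsigned int flags;  // IFF_* flags as reported for the interface
};

// True if `ip` lies inside the IPv4 network `net`/`prefix_len`.
bool in_subnet(const std::string& ip, const std::string& net,
               unsigned int prefix_len) {
  assert(prefix_len <= 32);
  in_addr a{}, n{};
  if (inet_pton(AF_INET, ip.c_str(), &a) != 1) return false;
  if (inet_pton(AF_INET, net.c_str(), &n) != 1) return false;
  // Shifting a 32-bit value by 32 is undefined, so /0 is handled separately.
  const uint32_t mask =
      prefix_len == 0 ? 0 : htonl(~uint32_t{0} << (32 - prefix_len));
  return (a.s_addr & mask) == (n.s_addr & mask);
}

// Return the first interface whose address is in the subnet.
// With skip_loopback set, interfaces flagged IFF_LOOPBACK are ignored.
const Iface* pick_iface(const std::vector<Iface>& ifaces,
                        const std::string& net, unsigned int prefix_len,
                        bool skip_loopback) {
  for (const auto& i : ifaces) {
    if (skip_loopback && (i.flags & IFF_LOOPBACK)) continue;
    if (in_subnet(i.ipv4, net, prefix_len)) return &i;
  }
  return nullptr;
}
```

 Calling @pick_iface@ on the list lo, lo:0, eth1 with network 192.168.211.0 and prefix length 24 shows the difference between the two modes. Whether skipping loopback interfaces is the right fix for Ceph is an open question; the sketch only shows why a bare subnet match can end up on lo:0.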
