Bug #47300
closed
mount.ceph fails to understand AAAA records from SRV record
Description
Hello,
Unsure if this belongs to CephFS or RADOS :-). I have seen numerous issues here regarding IPv6/AAAA records and SRV records that were all fixed in older versions, such as https://tracker.ceph.com/issues/23078 and https://tracker.ceph.com/issues/23174
I am trying to get my clients to mount a CephFS on my IPv6-only Ceph cluster, using an SRV record to obtain monitor information. This does not work.
Client kernel: 5.4.0
Client ceph version: 15.2.3-0ubuntu0.20.04.1
SRV record:
;; ANSWER SECTION:
_ceph-mon._tcp.example.dev. 78 IN SRV 10 60 3300 node1.example.dev.
_ceph-mon._tcp.example.dev. 78 IN SRV 10 60 6789 node3.example.dev.
_ceph-mon._tcp.example.dev. 78 IN SRV 10 60 6789 node2.example.dev.
node[1-3] only have an AAAA record, no A record.
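For reference, the fields of the SRV answer above can be pulled apart as in the following minimal Python sketch. The names `SrvRecord` and `parse_srv_answer` are hypothetical helpers introduced here for illustration; this is not how mount.ceph or the Ceph monitor client parses DNS responses.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_answer(text: str) -> List[SrvRecord]:
    """Parse dig-style SRV answer lines (hypothetical helper, illustration only)."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: name TTL class "SRV" priority weight port target
        if len(fields) != 8 or fields[2] != "IN" or fields[3] != "SRV":
            continue  # skips the ";; ANSWER SECTION:" header and blank lines
        priority, weight, port = (int(f) for f in fields[4:7])
        records.append(SrvRecord(priority, weight, port, fields[7].rstrip(".")))
    return records


ANSWER = """\
;; ANSWER SECTION:
_ceph-mon._tcp.example.dev. 78 IN SRV 10 60 3300 node1.example.dev.
_ceph-mon._tcp.example.dev. 78 IN SRV 10 60 6789 node3.example.dev.
_ceph-mon._tcp.example.dev. 78 IN SRV 10 60 6789 node2.example.dev.
"""

for record in parse_srv_answer(ANSWER):
    print(record.target, record.port)
```

Each target (node1, node2, node3) then still has to be resolved to an address, which in this setup means an AAAA lookup only.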
My fstab:
:/backups /data/cephfs ceph name=backups,fs=backups,noatime,_netdev 0 2
Attempting to mount without a /etc/ceph/ceph.conf:
mount /data/cephfs
did not load config file, using default settings.
2020-09-04T09:21:41.173+0200 ffffad442010 -1 Errors while parsing config file!
2020-09-04T09:21:41.173+0200 ffffad442010 -1 parse_file: filesystem error: cannot get file size: No such file or directory [ceph.conf]
2020-09-04T09:21:41.173+0200 ffffad442010 -1 Errors while parsing config file!
2020-09-04T09:21:41.173+0200 ffffad442010 -1 parse_file: filesystem error: cannot get file size: No such file or directory [ceph.conf]
2020-09-04T09:21:41.189+0200 ffffad442010 -1 res_query() failed
2020-09-04T09:21:41.193+0200 ffffad442010 -1 res_query() failed
no monitors specified to connect to.
2020-09-04T09:21:41.193+0200 ffffad442010 -1 res_query() failed
unable to determine mon addresses
Attempting to mount with the following ceph.conf:
[global]
ms bind ipv6 = true
mount /data/cephfs
server name not found: 2001:db8:2000:8a:ec4:7aff:fe31:d638:6789 (Name or service not known)
failed to resolve source
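The string in that error appears to be the IPv6 address of one monitor with the port 6789 appended directly, without brackets, which makes it neither a valid IPv6 literal nor a resolvable host name. That reading is inferred from the output and is not confirmed as the root cause in this report. A minimal Python sketch of the usual bracketed form, with `format_mon_addr` as a hypothetical helper:

```python
import ipaddress


def format_mon_addr(host: str, port: int) -> str:
    """Join host and port, bracketing IPv6 literals (hypothetical helper)."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, so treat it as a host name
    return f"[{host}]:{port}" if is_v6 else f"{host}:{port}"


# The unbracketed string from the error is not a valid IPv6 literal:
try:
    ipaddress.ip_address("2001:db8:2000:8a:ec4:7aff:fe31:d638:6789")
    unbracketed_is_valid = True
except ValueError:
    unbracketed_is_valid = False

print(unbracketed_is_valid)
print(format_mon_addr("2001:db8:2000:8a:ec4:7aff:fe31:d638", 6789))
```

With brackets, the boundary between address and port is unambiguous, which matches the bracketed form that `mount | grep ceph` prints for a working mount.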
If I change my fstab and list the monitor nodes explicitly, as below, it works fine, but that defeats the purpose of the SRV record.
node1,node2,node3:/backups /data/cephfs ceph name=backups,fs=backups,noatime,_netdev 0 2
mount | grep ceph
[2001:db8:2000:8a:ae1f:6bff:fe05:9e70],[2001:db8:2000:8a:ec4:7aff:fede:d796],[2001:db8:2000:8a:ec4:7aff:fe31:d638]:/backups on /data/cephfs type ceph (rw,noatime,name=backups,secret=<hidden>,acl,mds_namespace=backups)
Side note: this only works with `ms bind ipv6 = true` in ceph.conf. Without that option you get the following error, probably because it only searches for an A record:
mount /data/cephfs
2020-09-04T09:34:25.965+0200 ffff92502010 -1 res_query() failed
2020-09-04T09:34:25.969+0200 ffff92502010 -1 res_query() failed
no monitors specified to connect to.
2020-09-04T09:34:25.973+0200 ffff92502010 -1 res_query() failed
So far I have not found any combination of settings that obtains the monitor addresses from an SRV record and actually gives me a working CephFS mount.
Documentation: https://docs.ceph.com/docs/master/rados/configuration/mon-lookup-dns/
Updated by Josh Durgin over 3 years ago
Thanks for the detailed description. The earlier fix clearly depends on ms_bind_ipv6: https://github.com/ceph/ceph/pull/20530
The mon client should try both IPv4 and IPv6 resolution to be more user friendly, and should not require the ms_bind_ipv6 option, which doesn't make sense for clients.
The other part of the bug, where the resolution fails even with that setting, needs further investigation.
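As an illustration of trying both address families, here is a minimal Python sketch that asks the resolver for any family instead of choosing one up front. It uses the standard `socket.getaddrinfo` call; `resolve_all` is a hypothetical helper, and this is not the Ceph monitor client code or the eventual fix.

```python
import socket
from typing import List, Tuple


def resolve_all(host: str, port: int) -> List[Tuple[socket.AddressFamily, str]]:
    """Return (family, address) pairs for every A and AAAA result (hypothetical helper)."""
    infos = socket.getaddrinfo(
        host,
        port,
        family=socket.AF_UNSPEC,  # do not restrict the lookup to IPv4 or IPv6
        type=socket.SOCK_STREAM,
    )
    return [(family, sockaddr[0]) for family, _type, _proto, _canon, sockaddr in infos]


# Numeric literals resolve without any DNS traffic, which keeps the example offline.
print(resolve_all("127.0.0.1", 3300))
if socket.has_ipv6:
    print(resolve_all("::1", 6789))
```

A host that only has an AAAA record would yield only `AF_INET6` entries here, with no client-side option needed to select the family.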
Updated by Daniël Vos over 2 years ago
Issue still present on 16.2.6 (ceph packages 16.2.6-1focal, kernel 5.11.0-38-generic)
ms bind ipv6 = true is no longer required, but the following error remains:
# mount /data/cephfs
server name not found: 2001:db8:2000:8a:ec4:7aff:fe31:d638:6789 (Name or service not known)
failed to resolve source
Updated by Radoslaw Zarzynski about 2 years ago
- Assignee set to Matan Breizman
- Priority changed from High to Normal
Lowering the priority as there are no recent reports about the issue and assigning as it's a good way to learn about networking / msgr. Matan, please reach out to me if you need help. This isn't urgent.
Updated by Matan Breizman about 2 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 46051
Updated by Kefu Chai almost 2 years ago
- Status changed from Fix Under Review to Resolved
Updated by Matan Breizman almost 2 years ago
- Status changed from Resolved to Pending Backport
- Backport set to pacific,quincy
Updated by Backport Bot almost 2 years ago
- Copied to Backport #55513: quincy: mount.ceph fails to understand AAAA records from SRV record added
Updated by Backport Bot almost 2 years ago
- Copied to Backport #55514: pacific: mount.ceph fails to understand AAAA records from SRV record added
Updated by Matan Breizman almost 2 years ago
- Status changed from Pending Backport to Resolved