Bug #43385

Monitor lookup in DNS needs to be case-insensitive

Added by Harald Staub over 4 years ago. Updated over 1 year ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
low-hanging-fruit
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

For a long time, we have been running with a ceph.conf that has no "mon host" line and relies on DNS SRV records instead:
https://docs.ceph.com/docs/master/rados/configuration/mon-lookup-dns/

Yesterday this broke for a cluster. At least there is a good error message (although the interesting line is missing in some cases), e.g.:
$ ceph -s
unable to get monitor info from DNS SRV with service name: ceph-mon
no monitors specified to connect to.
2019-12-18 14:45:56.699083 7fa6c8ef7700 -1 resolved target not in search domain: s0001.s1.scloud.switch.ch / .S1.scloud.switch.ch
[errno 2] error connecting to the cluster

Note the capital "S" in the domain name.

The error message ("resolved target not in search domain:") is in
https://github.com/ceph/ceph/blob/nautilus/src/common/dns_resolve.cc

The problem is that the string comparison there needs to be case-insensitive.
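A minimal sketch of what a case-insensitive check could look like. This is not the code from dns_resolve.cc; the function names `ascii_lower`, `dns_name_equal_icase` and `in_search_domain_icase` are hypothetical, and the sketch assumes the check is a suffix match of the resolved target against the search domain, as the error message suggests. Only ASCII letters are folded, since DNS case-insensitivity applies to ASCII.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers for illustration only; not taken from Ceph.

// Fold a single ASCII uppercase letter to lowercase, leave other bytes alone.
inline char ascii_lower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Compare two DNS names byte by byte, ignoring ASCII case.
inline bool dns_name_equal_icase(std::string_view a, std::string_view b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (ascii_lower(a[i]) != ascii_lower(b[i])) {
      return false;
    }
  }
  return true;
}

// True if `target` ends with `domain`, ignoring ASCII case.
inline bool in_search_domain_icase(std::string_view target,
                                   std::string_view domain) {
  if (domain.size() > target.size()) {
    return false;
  }
  return dns_name_equal_icase(target.substr(target.size() - domain.size()),
                              domain);
}
```

With such a helper, the pair from the log above ("s0001.s1.scloud.switch.ch" and ".S1.scloud.switch.ch") would be accepted, while a target in a different domain would still be rejected.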

Some background information: DNS is case-insensitive. DNS servers are free to answer requests with any mix of lowercase and uppercase letters; the behavior depends on the implementation. In our case, there was a request (from outside the cluster) containing the uppercase letter. The name server echoed this in its reply, and the answer landed in the cache of a resolver. Now every system served by this resolver gets the error.

The error can also be provoked by the client alone, using a resolv.conf that contains
search S1.scloud.switch.ch switch.ch
which results in:

$ host s0013
s0013.S1.scloud.switch.ch has IPv6 address 2001:620:5ca1:8001::1013
$ ceph -s
unable to get monitor info from DNS SRV with service name: ceph-mon
no monitors specified to connect to.
2019-12-19 14:52:50.761073 7f56d7ed1700 -1 resolved target not in search domain: s0013.s1.scloud.switch.ch / .S1.scloud.switch.ch
[errno 2] error connecting to the cluster

History

#1 Updated by Brad Hubbard over 4 years ago

  • Tags set to low-hanging-fruit

#2 Updated by Laura Flores over 1 year ago

  • Tags set to low-hanging-fruit
