Bug #49654

closed

iSCSI stops working after Upgrade 15.2.4 -> 15.2.9

Added by Frank Holtz about 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
pacific
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We have four iSCSI gateways deployed with "ceph orch". Two gateways at a time form one IQN and share a /23 network.

During the upgrade, I restarted the gateways s103 and s104 by changing the included hosts in "ceph orch deploy" and then redeploying both gateways. At first I redeployed only one iSCSI gateway. That gateway stopped providing any LUNs. Restarting the second gateway had the same result.

If I stop the iscsi-service, gwcli reports the stopped node is down.

Mapping a new LUN with gwcli shows the following message:

Failed - ceph-UUID-iscsi.iscsi.s103.ljhzwr cannot be used to perform this operation because it is not defined within the gateways configuration

The rbd-target-api.log shows this relevant message:

2021-03-08 14:36:51,744     INFO [gateway.py:236:define()] - Configuration does not have an entry for this host(ceph-UUID-iscsi.iscsi.s103.ljhzwr) - nothing to define to LIO
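
Both messages are consistent with a lookup of the local hostname in the gateway configuration object. What follows is a minimal sketch of such a check, not the actual ceph-iscsi code. It assumes the configuration holds a "gateways" mapping keyed by the name each gateway was registered under. The mapping layout, the helper `is_defined_gateway`, and the `s103.example.com` / `s104.example.com` names are illustrative assumptions and are not taken from the attached gateway.conf.

```python
def is_defined_gateway(config, hostname):
    """Return True if hostname has an entry in the (assumed) gateways section."""
    return hostname in config.get("gateways", {})


# Hypothetical configuration: gateways registered under their host FQDNs.
config = {"gateways": {"s103.example.com": {}, "s104.example.com": {}}}

# Name reported inside the container after the upgrade (from the log line).
reported = "ceph-UUID-iscsi.iscsi.s103.ljhzwr"

for name in ("s103.example.com", reported):
    state = "defined" if is_defined_gateway(config, name) else "not defined"
    print("%s: %s" % (name, state))
```

Under these assumptions, any change in the name the daemon sees for itself makes the lookup miss, even though the gateway entry is still present in the configuration.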


Files

gateway.conf (19.2 KB) - gateway.conf object - Frank Holtz, 03/08/2021 02:22 PM
ceph-s.txt (1.16 KB) - ceph status - Frank Holtz, 03/08/2021 02:33 PM
rbd-target-api.log (5.16 KB) - rbd-target-api.log - Frank Holtz, 03/08/2021 02:41 PM
gwcli-ls.txt (12.5 KB) - gwcli ls - Frank Holtz, 03/08/2021 02:53 PM
iscsi-gateway.cfg (400 Bytes) - iscsi-gateway.cfg s103 - Frank Holtz, 03/08/2021 03:03 PM

Related issues 1 (0 open, 1 closed)

Related to Orchestrator - Bug #50306: /etc/hosts is not passed to ceph containers. clusters that were relying on /etc/hosts for name resolution will have strange behavior (Resolved)

Actions #1

Updated by Frank Holtz about 3 years ago

Inside the iSCSI container, the wrong hostname is reported (CentOS 8 + podman):

# python3
Python 3.6.8 (default, Aug 24 2020, 17:57:11)
[GCC 8.3.1 20191121 (Red Hat 8.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.getfqdn()
'ceph-UUID-iscsi.iscsi.s101.kkbxij'

The reason is in /etc/hosts inside the container:
cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
#...other entries including own ip (not iscsi-ip)....
127.0.1.1 s101 s101 ceph-6ba5bfc8-009c-11eb-9f56-0025b5ff7900-iscsi.iscsi.s101.kkbxij
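
The name reported by socket.getfqdn() follows from how CPython picks its result: it resolves the hostname, then returns the first name containing a dot among the canonical name and its aliases, and falls back to the canonical name otherwise. The sketch below is a simplified, offline emulation of that selection applied to the last /etc/hosts line. `parse_hosts_line` and `pick_fqdn` are hypothetical helper names, not library functions, and the real call goes through the system resolver instead of parsing the file itself.

```python
def parse_hosts_line(line):
    """Split one /etc/hosts line into (address, canonical_name, aliases)."""
    fields = line.split("#", 1)[0].split()  # drop trailing comment, if any
    if len(fields) < 2:
        raise ValueError("not a host entry: %r" % line)
    return fields[0], fields[1], fields[2:]


def pick_fqdn(canonical, aliases):
    """Mimic the selection in socket.getfqdn(): the first dotted name wins."""
    for name in [canonical] + list(aliases):
        if "." in name:
            return name
    return canonical  # no dotted candidate: fall back to the canonical name


line = ("127.0.1.1 s101 s101 "
        "ceph-6ba5bfc8-009c-11eb-9f56-0025b5ff7900-iscsi.iscsi.s101.kkbxij")
_, canonical, aliases = parse_hosts_line(line)
print(pick_fqdn(canonical, aliases))
```

Neither "s101" entry contains a dot, so the container name is the first dotted candidate and is what gets returned.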

To fix this, the FQDN was set as the hostname. After rebooting the node, everything is fine.

Actions #2

Updated by Sebastian Wagner about 3 years ago

  • Project changed from Ceph to Orchestrator
  • Category deleted (common)
Actions #3

Updated by Sebastian Wagner about 3 years ago

  • Priority changed from Normal to High
Actions #4

Updated by Sebastian Wagner almost 3 years ago

  • Related to Bug #50306: /etc/hosts is not passed to ceph containers. clusters that were relying on /etc/hosts for name resolution will have strange behavior added
Actions #5

Updated by Sebastian Wagner almost 3 years ago

  • Status changed from New to In Progress
Actions #6

Updated by Sebastian Wagner almost 3 years ago

  • Pull request ID set to 41483
Actions #7

Updated by Sage Weil almost 3 years ago

  • Status changed from In Progress to Pending Backport
  • Backport set to pacific
Actions #8

Updated by Sebastian Wagner almost 3 years ago

  • Status changed from Pending Backport to Resolved