Project

General

Profile

Actions

Bug #63286

open

[NVMeOF Gateway service fails to deploy at nvmeof v0.0.5

Added by Rahul Lepakshi 7 months ago. Updated 6 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

At nvmeof v0.0.5, nvmeof service fails to deploy
Image - quay.io/ceph/nvmeof:v0.0.5 ---> built on 23-oct-2023

OMAP entities -

[ceph: root@argo023 /]# rados -p rbd2 listomapvals nvmeof.state
omap_version
value (1 bytes) :
00000000  31                                                |1|
00000001

Failure logs - GW node-1 - argo027

[root@argo027 ~]# journalctl -u ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service
Oct 23 09:54:07 argo027 systemd[1]: Starting Ceph nvmeof.rbd2.argo027.mzlzao for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534...
Oct 23 09:54:07 argo027 bash[1559307]: Trying to pull quay.io/ceph/nvmeof:v0.0.5...
Oct 23 09:54:08 argo027 bash[1559307]: Getting image source signatures
Oct 23 09:54:08 argo027 bash[1559307]: Copying blob sha256:f86465781a50211aaece789713bd5aacad03d4cb233a7a4d2f679264fd424fcf
Oct 23 09:54:08 argo027 bash[1559307]: Copying blob sha256:fbd86e10d72176abd2c5d311951b344cfbaed882e7d3df969fc47452e7f5f2b2
Oct 23 09:54:08 argo027 bash[1559307]: Copying blob sha256:a8cad91fa812d8ddab2e139ca7b303698c718e70a8012ab148e6078f66e23203
Oct 23 09:54:08 argo027 bash[1559307]: Copying blob sha256:ea9ecfb6e74027ed827ba3b6e5984455189d6133e33237a88da7244f3759e888
Oct 23 09:54:08 argo027 bash[1559307]: Copying blob sha256:55110d5706bfb5980f03e6cd0dfa84595ab17ba371a030b1dd66325a4e92e3e7
Oct 23 09:54:08 argo027 bash[1559307]: Copying blob sha256:bedf9e690ab8f8d00d7122ea17ca348abb3f06262af519523d248e4cc7eb12a3
Oct 23 09:54:09 argo027 bash[1559307]: Copying blob sha256:037219d705dec76ff0975293fa81413e6d1348ecd86a3211b113f288beb85f04
Oct 23 09:54:16 argo027 bash[1559307]: Copying config sha256:d32a205e274d587de90e2c24ad21f15712b97b0f99f9978d148c9225d405059d
Oct 23 09:54:16 argo027 bash[1559307]: Writing manifest to image destination
Oct 23 09:54:16 argo027 bash[1559307]: Storing signatures
Oct 23 09:54:16 argo027 podman[1559307]:
Oct 23 09:54:16 argo027 podman[1559307]: 2023-10-23 09:54:16.514570907 +0000 UTC m=+8.690389132 container create ebac228f81d20ab0a2c74ec433b8ca9119c1adc3784bb7fadc1c8443ee002490 (image=quay.io/ceph/nvmeof:v0.>
Oct 23 09:54:16 argo027 podman[1559307]: 2023-10-23 09:54:16.560116027 +0000 UTC m=+8.735934246 container init ebac228f81d20ab0a2c74ec433b8ca9119c1adc3784bb7fadc1c8443ee002490 (image=quay.io/ceph/nvmeof:v0.0.>
Oct 23 09:54:16 argo027 podman[1559307]: 2023-10-23 09:54:16.564053846 +0000 UTC m=+8.739872067 container start ebac228f81d20ab0a2c74ec433b8ca9119c1adc3784bb7fadc1c8443ee002490 (image=quay.io/ceph/nvmeof:v0.0>
Oct 23 09:54:16 argo027 podman[1559307]: 2023-10-23 09:54:07.84945393 +0000 UTC m=+0.025272152 image pull  quay.io/ceph/nvmeof:v0.0.5
Oct 23 09:54:16 argo027 bash[1559307]: ebac228f81d20ab0a2c74ec433b8ca9119c1adc3784bb7fadc1c8443ee002490
Oct 23 09:54:16 argo027 systemd[1]: Started Ceph nvmeof.rbd2.argo027.mzlzao for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534.
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.server:Starting gateway client.nvmeof.rbd2.argo027.mzlzao
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: DEBUG:control.server:Starting serve
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: DEBUG:control.server:Configuring server client.nvmeof.rbd2.argo027.mzlzao
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.server:SPDK Target Path: /usr/local/bin/nvmf_tgt
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.server:SPDK Socket: /var/tmp/spdk.sock
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.server:Starting /usr/local/bin/nvmf_tgt -u -r /var/tmp/spdk.sock
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.server:Attempting to initialize SPDK: rpc_socket: /var/tmp/spdk.sock, conn_retries: 300, tim>
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO: Setting log level to WARN
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:JSONRPCClient(/var/tmp/spdk.sock):Setting log level to WARN
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: [2023-10-23 09:54:16.804329] Starting SPDK v23.01.1 / DPDK 22.11.1 initialization...
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: [2023-10-23 09:54:16.804406] [ DPDK EAL parameters: nvmf --no-shconf -c 0x1 --no-pci --huge-unlink --log->
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: TELEMETRY: No legacy callbacks, legacy socket not created
Oct 23 09:54:16 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: [2023-10-23 09:54:16.919265] app.c: 712:spdk_app_start: *NOTICE*: Total cores available: 1
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: [2023-10-23 09:54:17.010922] reactor.c: 926:reactor_run: *NOTICE*: Reactor started on core 0
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: [2023-10-23 09:54:17.058342] accel_sw.c: 681:sw_accel_module_init: *NOTICE*: Accel framework software mod>
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: DEBUG:control.server:create_transport: tcp options: {"in_capsule_data_size": 8192, "max_io_qpairs_per_ctr>
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: [2023-10-23 09:54:17.196003] tcp.c: 629:nvmf_tcp_create: *NOTICE*: *** TCP Transport Init ***
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.server:Starting ceph nvmeof discovery service
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.state:First gateway: created object nvmeof.state
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:Using NVMeoF gateway version 0.0.5
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:Using SPDK version 23.01.1
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:Using Ceph version 17.2.6
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:NVMeoF gateway built on: 2023-10-22 18:27:39 UTC
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:NVMeoF gateway Git repository: https://github.com/ceph/ceph-nvmeof.git
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:NVMeoF gateway Git branch: devel
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:NVMeoF gateway Git commit: 69574a7a0891ccbf1de8181e44ad8d97c293e892
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:Using configuration file /src/ceph-nvmeof.conf
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:Configuration file content:
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:============================================================================
Oct 23 09:54:17 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1559503]: INFO:control.grpc:# This file is generated by cephadm.
...skipping...
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     if name != member.name:
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/usr/lib64/python3.9/types.py", line 178, in __get__
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     return self.fget(instance)
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/server.py", line 46, in sigchld_handler
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     raise SystemExit(f"Gateway subprocess terminated {pid=} {exit_code=}")
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: SystemExit: Gateway subprocess terminated pid=6 exit_code=1
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: INFO:control.server:Aborting SPDK(client.nvmeof.rbd2.argo027.mzlzao) pid 3...
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: INFO:control.server:Terminating discovery service...
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: Traceback (most recent call last):
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/__main__.py", line 35, in <module>
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     gateway.serve()
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/server.py", line 117, in serve
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     self.server = grpc.server(futures.ThreadPoolExecutor(max_workers=1))
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/__pypackages__/3.9/lib/grpc/__init__.py", line 2053, in server
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     from grpc import _server  # pylint: disable=cyclic-import
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/__pypackages__/3.9/lib/grpc/_server.py", line 863, in <module>
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     class _ServerStage(enum.Enum):
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/usr/lib64/python3.9/enum.py", line 1008, in unique
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     if name != member.name:
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/usr/lib64/python3.9/types.py", line 178, in __get__
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     return self.fget(instance)
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/server.py", line 46, in sigchld_handler
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     raise SystemExit(f"Gateway subprocess terminated {pid=} {exit_code=}")
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: SystemExit: Gateway subprocess terminated pid=6 exit_code=1
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: During handling of the above exception, another exception occurred:
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: Traceback (most recent call last):
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     return _run_code(code, main_globals, None,
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     exec(code, run_globals)
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/__main__.py", line 36, in <module>
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     gateway.keep_alive()
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/server.py", line 93, in __exit__
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     self._stop_discovery()
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:   File "/src/control/server.py", line 278, in _stop_discovery
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]:     os.kill(self.discovery_pid, signal.SIGINT)
Oct 23 09:55:04 argo027 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo027-mzlzao[1561164]: ProcessLookupError: [Errno 3] No such process
Oct 23 09:55:04 argo027 podman[1561217]: 2023-10-23 09:55:04.679215062 +0000 UTC m=+0.030549301 container died c5e760242961995b839b50b3355f7fad3435d2a690d175e8c0a54cb391d4cfac (image=quay.io/ceph/nvmeof:v0.0.>
Oct 23 09:55:04 argo027 podman[1561217]: 2023-10-23 09:55:04.695618098 +0000 UTC m=+0.046952309 container remove c5e760242961995b839b50b3355f7fad3435d2a690d175e8c0a54cb391d4cfac (image=quay.io/ceph/nvmeof:v0.>
Oct 23 09:55:04 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Main process exited, code=exited, status=1/FAILURE
Oct 23 09:55:04 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Failed with result 'exit-code'.
Oct 23 09:55:04 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Consumed 1.601s CPU time.
Oct 23 09:55:15 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Scheduled restart job, restart counter is at 5.
Oct 23 09:55:15 argo027 systemd[1]: Stopped Ceph nvmeof.rbd2.argo027.mzlzao for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534.
Oct 23 09:55:15 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Consumed 1.601s CPU time.
Oct 23 09:55:15 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Start request repeated too quickly.
Oct 23 09:55:15 argo027 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo027.mzlzao.service: Failed with result 'exit-code'.
Oct 23 09:55:15 argo027 systemd[1]: Failed to start Ceph nvmeof.rbd2.argo027.mzlzao for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534.

Failure logs - GW node-2 - argo028

[root@argo028 ~]# journalctl -u ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service
Oct 23 09:54:17 argo028 systemd[1]: Starting Ceph nvmeof.rbd2.argo028.rdguob for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534...
Oct 23 09:54:17 argo028 bash[1532362]: Trying to pull quay.io/ceph/nvmeof:v0.0.5...
Oct 23 09:54:18 argo028 bash[1532362]: Getting image source signatures
Oct 23 09:54:18 argo028 bash[1532362]: Copying blob sha256:f86465781a50211aaece789713bd5aacad03d4cb233a7a4d2f679264fd424fcf
Oct 23 09:54:18 argo028 bash[1532362]: Copying blob sha256:fbd86e10d72176abd2c5d311951b344cfbaed882e7d3df969fc47452e7f5f2b2
Oct 23 09:54:18 argo028 bash[1532362]: Copying blob sha256:a8cad91fa812d8ddab2e139ca7b303698c718e70a8012ab148e6078f66e23203
Oct 23 09:54:18 argo028 bash[1532362]: Copying blob sha256:bedf9e690ab8f8d00d7122ea17ca348abb3f06262af519523d248e4cc7eb12a3
Oct 23 09:54:18 argo028 bash[1532362]: Copying blob sha256:ea9ecfb6e74027ed827ba3b6e5984455189d6133e33237a88da7244f3759e888
Oct 23 09:54:18 argo028 bash[1532362]: Copying blob sha256:55110d5706bfb5980f03e6cd0dfa84595ab17ba371a030b1dd66325a4e92e3e7
Oct 23 09:54:19 argo028 bash[1532362]: Copying blob sha256:037219d705dec76ff0975293fa81413e6d1348ecd86a3211b113f288beb85f04
Oct 23 09:54:26 argo028 bash[1532362]: Copying config sha256:d32a205e274d587de90e2c24ad21f15712b97b0f99f9978d148c9225d405059d
Oct 23 09:54:26 argo028 bash[1532362]: Writing manifest to image destination
Oct 23 09:54:26 argo028 bash[1532362]: Storing signatures
Oct 23 09:54:26 argo028 podman[1532362]:
Oct 23 09:54:26 argo028 podman[1532362]: 2023-10-23 09:54:26.186046847 +0000 UTC m=+8.428543455 container create 5a1cf4d17b585f010e8f40c9f48b9941feeb985c181343a274e4cf4b4abccf9e (image=quay.io/ceph/nvmeof:v0.>
Oct 23 09:54:26 argo028 podman[1532362]: 2023-10-23 09:54:17.783613522 +0000 UTC m=+0.026110123 image pull  quay.io/ceph/nvmeof:v0.0.5
Oct 23 09:54:26 argo028 podman[1532362]: 2023-10-23 09:54:26.221922013 +0000 UTC m=+8.464418620 container init 5a1cf4d17b585f010e8f40c9f48b9941feeb985c181343a274e4cf4b4abccf9e (image=quay.io/ceph/nvmeof:v0.0.>
Oct 23 09:54:26 argo028 podman[1532362]: 2023-10-23 09:54:26.22594186 +0000 UTC m=+8.468438473 container start 5a1cf4d17b585f010e8f40c9f48b9941feeb985c181343a274e4cf4b4abccf9e (image=quay.io/ceph/nvmeof:v0.0.>
Oct 23 09:54:26 argo028 bash[1532362]: 5a1cf4d17b585f010e8f40c9f48b9941feeb985c181343a274e4cf4b4abccf9e
Oct 23 09:54:26 argo028 systemd[1]: Started Ceph nvmeof.rbd2.argo028.rdguob for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534.
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.server:Starting gateway client.nvmeof.rbd2.argo028.rdguob
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: DEBUG:control.server:Starting serve
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: DEBUG:control.server:Configuring server client.nvmeof.rbd2.argo028.rdguob
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.server:SPDK Target Path: /usr/local/bin/nvmf_tgt
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.server:SPDK Socket: /var/tmp/spdk.sock
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.server:Starting /usr/local/bin/nvmf_tgt -u -r /var/tmp/spdk.sock
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.server:Attempting to initialize SPDK: rpc_socket: /var/tmp/spdk.sock, conn_retries: 300, tim>
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO: Setting log level to WARN
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:JSONRPCClient(/var/tmp/spdk.sock):Setting log level to WARN
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: [2023-10-23 09:54:26.464928] Starting SPDK v23.01.1 / DPDK 22.11.1 initialization...
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: [2023-10-23 09:54:26.465023] [ DPDK EAL parameters: nvmf --no-shconf -c 0x1 --no-pci --huge-unlink --log->
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: TELEMETRY: No legacy callbacks, legacy socket not created
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: [2023-10-23 09:54:26.580671] app.c: 712:spdk_app_start: *NOTICE*: Total cores available: 1
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: [2023-10-23 09:54:26.686182] reactor.c: 926:reactor_run: *NOTICE*: Reactor started on core 0
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: [2023-10-23 09:54:26.735221] accel_sw.c: 681:sw_accel_module_init: *NOTICE*: Accel framework software mod>
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: DEBUG:control.server:create_transport: tcp options: {"in_capsule_data_size": 8192, "max_io_qpairs_per_ctr>
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: [2023-10-23 09:54:26.857801] tcp.c: 629:nvmf_tcp_create: *NOTICE*: *** TCP Transport Init ***
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.server:Starting ceph nvmeof discovery service
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.state:nvmeof.state omap object already exists.
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:Using NVMeoF gateway version 0.0.5
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:Using SPDK version 23.01.1
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:Using Ceph version 17.2.6
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:NVMeoF gateway built on: 2023-10-22 18:27:39 UTC
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:NVMeoF gateway Git repository: https://github.com/ceph/ceph-nvmeof.git
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:NVMeoF gateway Git branch: devel
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:NVMeoF gateway Git commit: 69574a7a0891ccbf1de8181e44ad8d97c293e892
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:Using configuration file /src/ceph-nvmeof.conf
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:Configuration file content:
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:============================================================================
Oct 23 09:54:26 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1532557]: INFO:control.grpc:# This file is generated by cephadm.
...skipping...
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/server.py", line 124, in serve
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     gateway_state.start_update()
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/state.py", line 411, in start_update
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     self.omap.register_watch(notify_event)
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/state.py", line 309, in register_watch
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     self.watch = self.ioctx.watch(self.omap_name, _watcher_callback)
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/server.py", line 46, in sigchld_handler
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     raise SystemExit(f"Gateway subprocess terminated {pid=} {exit_code=}")
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: SystemExit: Gateway subprocess terminated pid=6 exit_code=1
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: INFO:control.server:Aborting SPDK(client.nvmeof.rbd2.argo028.rdguob) pid 3...
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: INFO:control.server:Stopping the server...
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: INFO:control.server:Terminating discovery service...
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: Traceback (most recent call last):
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/__main__.py", line 35, in <module>
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     gateway.serve()
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/server.py", line 124, in serve
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     gateway_state.start_update()
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/state.py", line 411, in start_update
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     self.omap.register_watch(notify_event)
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/state.py", line 309, in register_watch
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     self.watch = self.ioctx.watch(self.omap_name, _watcher_callback)
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/server.py", line 46, in sigchld_handler
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     raise SystemExit(f"Gateway subprocess terminated {pid=} {exit_code=}")
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: SystemExit: Gateway subprocess terminated pid=6 exit_code=1
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: During handling of the above exception, another exception occurred:
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: Traceback (most recent call last):
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     return _run_code(code, main_globals, None,
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     exec(code, run_globals)
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/__main__.py", line 36, in <module>
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     gateway.keep_alive()
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/server.py", line 93, in __exit__
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     self._stop_discovery()
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:   File "/src/control/server.py", line 278, in _stop_discovery
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]:     os.kill(self.discovery_pid, signal.SIGINT)
Oct 23 09:55:14 argo028 ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534-nvmeof-rbd2-argo028-rdguob[1534252]: ProcessLookupError: [Errno 3] No such process
Oct 23 09:55:14 argo028 podman[1534341]: 2023-10-23 09:55:14.417704477 +0000 UTC m=+0.033019868 container died c670c049a1ca79045d1e2602031f270e28dd76160fcd1744d1d23053ab6d402f (image=quay.io/ceph/nvmeof:v0.0.>
Oct 23 09:55:14 argo028 podman[1534341]: 2023-10-23 09:55:14.435523286 +0000 UTC m=+0.050838670 container remove c670c049a1ca79045d1e2602031f270e28dd76160fcd1744d1d23053ab6d402f (image=quay.io/ceph/nvmeof:v0.>
Oct 23 09:55:14 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Main process exited, code=exited, status=1/FAILURE
Oct 23 09:55:14 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Failed with result 'exit-code'.
Oct 23 09:55:14 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Consumed 1.681s CPU time.
Oct 23 09:55:24 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Scheduled restart job, restart counter is at 5.
Oct 23 09:55:24 argo028 systemd[1]: Stopped Ceph nvmeof.rbd2.argo028.rdguob for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534.
Oct 23 09:55:24 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Consumed 1.681s CPU time.
Oct 23 09:55:24 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Start request repeated too quickly.
Oct 23 09:55:24 argo028 systemd[1]: ceph-27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534@nvmeof.rbd2.argo028.rdguob.service: Failed with result 'exit-code'.
Oct 23 09:55:24 argo028 systemd[1]: Failed to start Ceph nvmeof.rbd2.argo028.rdguob for 27ca60e2-6e35-11ee-a2a5-ac1f6b0a1534.

Actions #1

Updated by Aviv Caro 6 months ago

Fixed.

Actions

Also available in: Atom PDF