Bug #46529

cephadm: error removing storage for container "...-mon": remove /var/lib/containers/storage/overlay/.../merged: device or resource busy

Added by Sebastian Wagner 4 months ago. Updated 5 days ago.

Status: Resolved
Priority: Urgent
Category: cephadm (binary)
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

/a/teuthology-2020-07-12_07:01:02-rados-master-distro-basic-smithi/5217488 on centos_7.6

2020-07-12T15:39:57.736 INFO:journalctl@ceph.mon.c.smithi145.stdout:Jul 12 15:39:57 smithi145 bash[9707]: Error: error removing storage for container "ceph-567f836e-c455-11ea-a06e-001a4aab830c-mon.c": remove /var/lib/containers/storage/overlay/3e200c9a65162f9c55a682cb3ea6b559b07a6569a910eb1057ea4f995067f9eb/merged: device or resource busy
2020-07-12T15:39:58.066 INFO:journalctl@ceph.mon.c.smithi145.stdout:Jul 12 15:39:58 smithi145 bash[9707]: Error: error creating container storage: the container name "ceph-567f836e-c455-11ea-a06e-001a4aab830c-mon.c" is already in use by "4588abec9d4e9d178e083064e81a5766f0e2d800170592bc51a2d31f247e09b5". You have to remove that container to be able to reuse that name.: that name is already in use

/a/teuthology-2020-07-12_07:01:02-rados-master-distro-basic-smithi/5217586 on centos_7.6

2020-07-12T16:13:53.582 INFO:journalctl@ceph.mon.b.smithi101.stdout:Jul 12 16:13:53 smithi101 bash[9557]: Error: error removing storage for container "ceph-0cd81794-c45a-11ea-a06e-001a4aab830c-mon.b": remove /var/lib/containers/storage/overlay/5315326bcdd6353c37a96033711571342823d18d5207df15c7f9bc485e6c1be8/merged: device or resource busy
2020-07-12T16:13:53.918 INFO:journalctl@ceph.mon.b.smithi101.stdout:Jul 12 16:13:53 smithi101 bash[9557]: Error: error creating container storage: the container name "ceph-0cd81794-c45a-11ea-a06e-001a4aab830c-mon.b" is already in use by "642817748ec62641582f40d49f61b5b92a10277b3f8c4cc8638e68562a0ac1f0". You have to remove that container to be able to reuse that name.: that name is already in use

This might be kernel related.
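
To triage further runs, the two error signatures quoted above can be matched mechanically. The following is a minimal sketch, not part of teuthology or cephadm: the function name `scan` and both regular expressions are assumptions derived only from the log lines in this ticket, and other podman versions may word these messages differently.

```python
import re

# Patterns derived from the log excerpts in this ticket (assumption:
# the wording is stable across the affected podman builds).
REMOVE_RE = re.compile(
    r'error removing storage for container "(?P<name>[^"]+)": '
    r'remove (?P<path>\S+): device or resource busy'
)
IN_USE_RE = re.compile(
    r'the container name "(?P<name>[^"]+)" is already in use by '
    r'"(?P<cid>[0-9a-f]+)"'
)


def scan(lines):
    """Collect both failure signatures from an iterable of log lines.

    Returns (busy, in_use):
      busy   maps container name -> overlay path that could not be removed
      in_use maps container name -> ID of the container still holding the name
    """
    busy, in_use = {}, {}
    for line in lines:
        m = REMOVE_RE.search(line)
        if m:
            busy[m.group("name")] = m.group("path")
            continue
        m = IN_USE_RE.search(line)
        if m:
            in_use[m.group("name")] = m.group("cid")
    return busy, in_use
```

A container name that appears in both dictionaries matches the pattern reported here: the storage removal fails, and the following start attempt is rejected because the name is still taken. For a real run, pass an open `teuthology.log` file object as `lines`.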


Related issues

Related to Orchestrator - Bug #44990: cephadm: exec: "/usr/bin/ceph-mon": stat /usr/bin/ceph-mon: no such file or directory New
Related to Orchestrator - Bug #46704: container_linux.go:349: "exec: \"stat\": executable file not found New

History

#1 Updated by Sebastian Wagner 4 months ago

  • Related to Bug #44990: cephadm: exec: "/usr/bin/ceph-mon": stat /usr/bin/ceph-mon: no such file or directory added

#3 Updated by Sebastian Wagner 4 months ago

/a/kchai-2020-07-18_13:35:09-rados-wip-kefu-testing-2020-07-18-1927-distro-basic-smithi/5237560

also centos 7.6 (based on master)

#5 Updated by Sebastian Wagner 4 months ago

  • Priority changed from High to Urgent

#6 Updated by Sebastian Wagner 4 months ago

Some thoughts:

Right now, this conflict makes `suites/rados/thrash-old-clients` the only suite that tests cephadm on CentOS 7, and it looks like we have a problem with podman on CentOS 7.6. I see two options:

1. We revert https://github.com/ceph/ceph/pull/35719 and continue to test cephadm on CentOS 7. Then, we'd need someone with in-depth podman experience to debug the issue we see with CentOS 7.6.
2. Alternatively, we revert https://github.com/ceph/ceph/pull/32377 and test `thrash-old-clients` using the traditional package-based deployment.

Might be related to https://github.com/containers/podman/issues/2553#issuecomment-504229382
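
For whoever debugs this on a CentOS 7.6 node: a generic way to investigate "device or resource busy" on an overlay `merged` directory is to check which processes still have that path mounted in their mount namespace. The sketch below is only an illustration of that idea and is not taken from the linked podman issue. The helper names `mount_points` and `pids_holding` are hypothetical, and it relies only on the documented `/proc/<pid>/mountinfo` layout, where the fifth field is the mount point.

```python
import glob


def mount_points(mountinfo_text):
    """Return the mount-point field (5th column) of each mountinfo line."""
    points = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            points.append(fields[4])
    return points


def pids_holding(path, proc_root="/proc"):
    """List PIDs whose mount namespace still contains a mount at `path`.

    Needs root to read other users' mountinfo; unreadable or vanished
    entries are skipped.
    """
    holders = []
    for info in glob.glob(f"{proc_root}/[0-9]*/mountinfo"):
        try:
            with open(info) as fh:
                text = fh.read()
        except OSError:
            continue  # process exited or permission denied
        if path in mount_points(text):
            holders.append(int(info.split("/")[-2]))
    return sorted(holders)
```

Calling `pids_holding()` with the full `.../merged` path from the error message would show whether some other process (for example one started in a private mount namespace) is keeping the mount alive. Whether that is what happens here is unconfirmed.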

#7 Updated by Brad Hubbard 4 months ago

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224005
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5223947
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224121
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224035
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224061

#8 Updated by Sebastian Wagner 4 months ago

Brad Hubbard wrote:

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224005

thrash-old-clients on centos_7.6

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5223947

thrash-old-clients on centos_7.6

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224121

thrash-old-clients on centos_7.6

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224035

thrash-old-clients on centos_7.6

/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224061

thrash-old-clients on centos_7.6

#9 Updated by Sebastian Wagner 4 months ago

Seems podman on CentOS 7 is broken?

#10 Updated by Brad Hubbard 4 months ago

  • Backport set to octopus

#11 Updated by Brad Hubbard 4 months ago

All on CentOS 7.6.

/a/yuriw-2020-08-05_14:55:18-rados-wip-yuri-testing-2020-08-04-2244-octopus-distro-basic-smithi/5289047
/a/yuriw-2020-08-05_14:55:18-rados-wip-yuri-testing-2020-08-04-2244-octopus-distro-basic-smithi/5289135
/a/yuriw-2020-08-05_14:55:18-rados-wip-yuri-testing-2020-08-04-2244-octopus-distro-basic-smithi/5289017

#13 Updated by Sebastian Wagner 3 months ago

2020-08-21T02:46:48.316 INFO:journalctl@ceph.mon.c.smithi040.stdout:Aug 21 02:46:48 smithi040 bash[8791]: time="2020-08-21T02:46:48Z" level=error msg="unable to remove container 93488819ddcce8e2873eb8ab7665ce8761a15f308a2b5c05a9d579bb4983ed38 after failing to start and attach to it" 
2020-08-21T02:46:48.367 INFO:journalctl@ceph.mon.c.smithi040.stdout:Aug 21 02:46:48 smithi040 bash[8791]: Error: container_linux.go:345: starting container process caused "exec: \"/usr/bin/ceph-mon\": stat /usr/bin/ceph-mon: no such file or directory" 
2020-08-21T02:46:48.368 INFO:journalctl@ceph.mon.c.smithi040.stdout:Aug 21 02:46:48 smithi040 bash[8791]: : OCI runtime error

/a/yuriw-2020-08-20_00:20:21-rados-wip-yuri7-testing-2020-08-19-2051-octopus-distro-basic-smithi/5360912/teuthology.log

#14 Updated by Sebastian Wagner 3 months ago

  • Related to Bug #46704: container_linux.go:349: "exec: \"stat\": executable file not found added

#15 Updated by Neha Ojha 3 months ago

/a/teuthology-2020-08-26_07:01:02-rados-master-distro-basic-smithi/5377136
/a/teuthology-2020-08-26_07:01:02-rados-master-distro-basic-smithi/5377278
/a/teuthology-2020-08-26_07:01:02-rados-master-distro-basic-smithi/5377385

#16 Updated by Deepika Upadhyay 3 months ago

yuriw-2020-08-26_18:16:40-rados-wip-yuri-testing-2020-08-26-1631-octopus-distro-basic-smithi/: 5378365, 5378277, 5378451, 5378510

yuriw-2020-08-27_00:49:53-rados-wip-yuri8-testing-2020-08-26-2329-octopus-distro-basic-smithi/5379093

#18 Updated by Sebastian Wagner 3 months ago

  • Status changed from New to Fix Under Review
  • Assignee set to Yuri Weinstein
  • Pull request ID set to 36915

#19 Updated by Yuri Weinstein 3 months ago

  • Assignee changed from Yuri Weinstein to Abhishek Lekshmanan

#21 Updated by Neha Ojha 3 months ago

  • Status changed from Fix Under Review to Pending Backport

#22 Updated by Yuri Weinstein 2 months ago

Yuri Weinstein wrote:

https://github.com/ceph/ceph/pull/36931 - octopus PR

merged

#23 Updated by Sebastian Wagner 5 days ago

  • Status changed from Pending Backport to Resolved
