Bug #58685
RHEL 9 (cgroups v2) - the pid limits ARE enforced as compared to RHEL8 (cgroup v1)
% Done:
0%
Source:
Tags:
backport_processed
Backport:
quincy, pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Description
Downstream bug - https://bugzilla.redhat.com/show_bug.cgi?id=2165644
Mark Kogan and I have been working on and discussing this issue. As the tracker title says, we need to take care of this change: in the RGW container, we saw the container crash when we tried to use an RGW thread pool size of 2048.
2023-01-30T14:01:04.421+0000 7f650419f600 -1 *** Caught signal (Aborted) **
 in thread 7f650419f600 thread_name:radosgw

 ceph version 17.2.5-63.el9cp (5c1d62abbfba4f16a4ecda23145329df253ac85a) quincy (stable)
 1: /lib64/libc.so.6(+0x54d90) [0x7f65079dfd90]
 2: /lib64/libc.so.6(+0xa154c) [0x7f6507a2c54c]
 3: raise()
 4: abort()
 5: /lib64/libstdc++.so.6(+0xa1a21) [0x7f6507c50a21]
 6: /lib64/libstdc++.so.6(+0xad39c) [0x7f6507c5c39c]
 7: /lib64/libstdc++.so.6(+0xad407) [0x7f6507c5c407]
 8: /lib64/libstdc++.so.6(+0xad669) [0x7f6507c5c669]
 9: (std::__throw_system_error(int)+0x9b) [0x7f6507c537f8]
 10: /lib64/libstdc++.so.6(+0xdbafd) [0x7f6507c8aafd]
 11: (RGWAsioFrontend::run()+0x1bc) [0x7f65081f181c]
 12: (radosgw_Main(int, char const**)+0x4bbe) [0x7f65083453de]
 13: /lib64/libc.so.6(+0x3feb0) [0x7f65079caeb0]
 14: __libc_start_main()
 15: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events -
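The abort above is the RGW process exhausting the container's pids cgroup limit while spawning its thread pool. As a quick sanity check from inside a container, the effective limit can be read from the pids controller's interface file, whose location differs between cgroup v1 and v2 (a minimal sketch; the helper name is ours, the paths are the standard kernel interface files, and a value of "max" means unlimited):

```shell
# Print the effective pid/task limit for the current cgroup.
#   cgroup v2 (unified hierarchy): /sys/fs/cgroup/pids.max
#   cgroup v1:                     /sys/fs/cgroup/pids/pids.max
pids_limit() {
    for f in /sys/fs/cgroup/pids.max /sys/fs/cgroup/pids/pids.max; do
        if [ -r "$f" ]; then
            # "max" means unlimited; a number is the enforced cap.
            cat "$f"
            return 0
        fi
    done
    echo "no pids controller limit found"
}

pids_limit
```

In an affected RHEL 9 (cgroups v2) container this would show the podman default cap rather than "max".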
Mark's and my internal discussion:
Mark Kogan, 12:48 AM
@Vikhyat Umrao - please note cgroupVersion:

// RHEL 8:
podman info --debug | grep -i cgroup
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1

// RHEL 9:
podman info --debug | grep -i cgroup
  cgroupControllers:
  cgroupManager: systemd
  cgroupVersion: v2

Mark Kogan, 1:24 AM, Edited
@Vikhyat Umrao - updating that with this test application running in a Fedora pod:

podman run -it --rm --replace --name fedora fedora

[root@331514e9f2a9 /]# cat threads.cpp
#include <iostream>
#include <thread>
#include <vector>
#include <unistd.h>
#include <chrono>

void print_thread_id(int id) {
    using namespace std::chrono_literals;
    std::cout << "Thread " << id << std::endl;
    std::this_thread::sleep_for(2000ms);
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 4096; ++i) {
        threads.push_back(std::thread(print_thread_id, i));
    }
    for (auto& t : threads) {
        t.join();
    }
    return 0;
}

[root@331514e9f2a9 /]# clang++ threads.cpp -pthread -o threads

The results show that under RHEL 8 (cgroups v1) there is no enforcement - the test can reach:

Thread 4093
Thread 4094
Thread 4095

BUT on RHEL 9 (cgroups v2) the pid limits ARE enforced:

Thread 2043
Thread 2044
Thread 2045
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
Aborted (core dumped)

Vikhyat Umrao
@Mark Kogan - as always, great work, mate. I can add that I have also verified it has nothing to do with the host OS version, whether RHEL 8 or RHEL 9. That is why the 5.3 version works fine on a RHEL 9 host while the RHCS 6.0 version does not: it is mainly down to the container image's RHEL version, specifically the cgroup version, as you said. 5.3 is built on a RHEL 8 container image and 6.0 is built on a RHEL 9 container image. I have also tested that 6.0, built with the RHEL 9 image, has the same issue on a RHEL 8 host, which confirms that the issue is with the container image used to build RHCS 6. With this, we need to think wider: it is not only impacting the RGW container, it can impact all the other Ceph containers - MDS, OSD, MON, MGR, etc. I will move this bug to Cephadm and will also open a Rook bug; we need to take care of it in both product lines.

Matt Benjamin, 11:28 AM
yes...

Vikhyat Umrao, 11:59 AM
@Adam King please review this thread. I will be moving this BZ to cephadm - https://bugzilla.redhat.com/show_bug.cgi?id=2165644 - and will be opening a new bug for Rook. This is a behavior change with podman/cgroups v2 in RHEL 9.

Vikhyat Umrao, 12:04 PM, Edited
I am testing the solution of setting --pids-limit=-1 to see if it fixes the issue.

Vikhyat Umrao, 12:17 PM
# cat unit.run | grep pids-limit
/usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --authfile=/etc/ceph/podman-auth.json --net=host --entrypoint /usr/bin/radosgw --init --name ceph-04bf3460-a7b4-11ed-bf7c-000af7995756-rgw-rgws-f22-h21-000-6048r-ghroyj -d --log-driver journald --conmon-pidfile /run/ceph-06bf3460-a7b4-11ed-bf7c-000af7995756@rgw.rgws.f22-h21-000-6048r.ghroyj.service-pid --cidfile /run/ceph-06bf3460-a7b4-11ed-bf7c-000af7995756@rgw.rgws.f22-h21-000-6048r.ghroyj.service-cid --pids-limit -1 --cgroups=split -e CONTAINER_IMAGE=registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:7672426dd2265ccabc6550cce3ffa0711c44e1ce04c20a93b2955707c4494f85 -e NODE_NAME=f22-h21-000-6048r.rdu2.scalelab.redhat.com -e CEPH_USE_RANDOM_NONCE=1 -e TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 -v /var/run/ceph/06bf3460-a7b4-11ed-bf7c-000af7995756:/var/run/ceph:z -v /var/log/ceph/06bf3460-a7b4-11ed-bf7c-000af7995756:/var/log/ceph:z -v /var/lib/ceph/06bf3460-a7b4-11ed-bf7c-000af7995756/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /var/lib/ceph/06bf3460-a7b4-11ed-bf7c-000af7995756/rgw.rgws.f22-h21-000-6048r.ghroyj:/var/lib/ceph/radosgw/ceph-rgw.rgws.f22-h21-000-6048r.ghroyj:z -v /var/lib/ceph/06bf3460-a7b4-11ed-bf7c-000af7995756/rgw.rgws.f22-h21-000-6048r.ghroyj/config:/etc/ceph/ceph.conf:z registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:7672426dd2265ccabc6550cce3ffa0711c44e1ce04c20a93b2955707c4494f85 -n client.rgw.rgws.f22-h21-000-6048r.ghroyj -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false

Bingo - so the solution is working.

Vikhyat Umrao, 12:19 PM
I also found out that the man page in RHEL 9 has a typo :( - will open a RHEL 9 bug.

https://docs.podman.io/en/latest/markdown/podman-run.1.html
  --pids-limit=limit
  Tune the container's pids limit. Set to -1 to have unlimited pids for the container. The default is 2048 on systems that support "pids" cgroup controller.

That is from the above link, but the man page in RHEL 9 says:

# man podman-run | grep -A3 pids-limit
  --pids-limit=limit
  Tune the container's pids limit. Set to -1 to have unlimited pids for the container. The default is 4096 on systems that support "pids" cgroup controller.

So the man page says 4096 but the URL says 2048, and per our testing it is 2048, so it looks like the man page needs to be fixed.

Vikhyat Umrao, 27 min
rook bug - https://bugzilla.redhat.com/show_bug.cgi?id=2168722

Vikhyat Umrao, 20 min
man page bug - https://bugzilla.redhat.com/show_bug.cgi?id=2168727
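Besides the generated unit.run carrying --pids-limit -1 directly, the same override can also be expressed per service through a cephadm service specification, depending on the cephadm version's support for extra_container_args (a sketch; the service id and hostname below are placeholders, not values from this cluster):

```yaml
service_type: rgw
service_id: myrgw
placement:
  hosts:
    - host1
extra_container_args:
  - "--pids-limit=-1"
```

Applied with `ceph orch apply -i <spec-file>`, cephadm appends the extra argument to the podman command line it generates for the daemon, lifting the default pid cap for that service only.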
Related issues
History
#1 Updated by Vikhyat Umrao about 1 year ago
Adam - FYI!
#2 Updated by Adam King about 1 year ago
- Status changed from New to In Progress
- Assignee set to Adam King
- Backport set to quincy, pacific
- Pull request ID set to 50083
#3 Updated by Adam King about 1 year ago
- Status changed from In Progress to Pending Backport
#4 Updated by Backport Bot about 1 year ago
- Copied to Backport #58882: pacific: RHEL 9 (cgroups v2) - the pid limits ARE enforced as compared to RHEL8 (cgroup v1) added
#5 Updated by Backport Bot about 1 year ago
- Copied to Backport #58883: quincy: RHEL 9 (cgroups v2) - the pid limits ARE enforced as compared to RHEL8 (cgroup v1) added
#6 Updated by Backport Bot about 1 year ago
- Tags set to backport_processed
#7 Updated by Adam King about 1 year ago
- Status changed from Pending Backport to Resolved