Bug #13995
ceph-osd "nonce collision" due to unshared pid namespaces
Description
I found Issue #13032 yesterday, after hitting a brick wall trying to deploy Ceph on and off over the last couple of months. Everybody has said it was a network problem, but I never found one.
I added "--pid=host" to the Docker containers running ceph-osd, and after that everything was instantly stable and working as expected.
Issue #13032 seems to affect ceph-osd when multiple OSDs run in different pid namespaces (containers): the pid is then always the same for each OSD.
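To illustrate the suspected mechanism, here is a minimal sketch in Python. It assumes the messenger nonce is taken straight from the process ID, which is how I read Issue #13032, and that each ceph-osd is the first process in its container and therefore sees pid 1. The function names are made up for the illustration and are not Ceph code.

```python
import secrets


def pid_nonce(pid: int) -> int:
    """Hypothetical nonce scheme: reuse the process ID unchanged."""
    return pid


def random_nonce() -> int:
    """Hypothetical alternative: 64 random bits, independent of the pid."""
    return secrets.randbits(64)


def has_collision(nonces) -> bool:
    """Return True if at least two daemons ended up with the same nonce."""
    return len(set(nonces)) < len(nonces)


# Assumption: three containers, each with its own pid namespace, so the
# daemon in every container observes the same pid.
pids_seen_in_containers = [1, 1, 1]

pid_nonces = [pid_nonce(p) for p in pids_seen_in_containers]
rand_nonces = [random_nonce() for _ in pids_seen_in_containers]

print("pid-based nonces collide:", has_collision(pid_nonces))
print("random nonces collide:", has_collision(rand_nonces))
```

With "--pid=host" the containers share the host's pid namespace, so each daemon gets a distinct pid and a pid-derived nonce would no longer collide. That would match what I saw.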
When I launched one OSD in a container, everything was fine.
Two OSDs in two different containers on the same host also seemed to work all right.
But when I launched a third one on a second host, it couldn't connect to the OSDs on the host where two were running.
ceph-osd logged ~100k messages per second (not exactly sure, but it was a lot) and ~500MB of stdout/stderr per minute. Each OSD used 250-350% CPU just trying to reconnect, and netstat went from 25 TCP sessions to over 100k within a couple of seconds. The cluster was inherently unstable, since most OSDs reported most other OSDs as broken or similar.
I tried logging with debug ms = 10 and got ~600MB of logs per second. If you want debug logs, I can post a second or two of them somewhere.
Maybe these reconnect attempts should be throttled in the daemon as well. If 100k new connections per second doesn't work, it's usually not a network or IP stack problem but an application problem. Throttling them somehow (with a config option to raise the limit, so as not to affect future deploys where it's actually needed?) would also have made troubleshooting much easier, since I could then have matched the logs between the different OSDs, and maybe also run with debug ms = 10.
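As a rough idea of the kind of throttling I mean, here is a sketch of capped exponential backoff in Python. This is not how ceph-osd works today; the function and its parameters (base, cap, attempts, jitter) are hypothetical, with cap standing in for the config option mentioned above.

```python
import random


def backoff_delays(base=0.1, cap=30.0, attempts=8, jitter=0.0, rng=None):
    """Yield the delay in seconds to wait before each reconnect attempt.

    The delay doubles after every failed attempt and is limited by `cap`
    (hypothetical stand-in for a configurable limit).
    """
    rng = rng or random.Random()
    delay = base
    for _ in range(attempts):
        # Optional jitter spreads out peers that fail at the same moment.
        yield delay + rng.uniform(0, jitter * delay)
        delay = min(cap, delay * 2)


delays = list(backoff_delays(base=0.1, cap=1.0, attempts=6))
print(delays)  # [0.1, 0.2, 0.4, 0.8, 1.0, 1.0]
```

Even a low cap like this would limit each peer to a handful of reconnect attempts per second, which should keep the logs small enough to compare between OSDs.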
These are the log lines from the nodes that get stuck in an infinite reconnect loop; this very rapidly fills the host's disk space:
http://pastebin.com/SU3ed7hG
These are logs from a node stuck in an infinite reconnect loop:
http://pastebin.com/M63PJVFE
These are logs from a node receiving connections:
http://pastebin.com/atQS7a8n
The problem exists at least on Ubuntu with 9.2.0 and Hammer.
History
#1 Updated by Samuel Just about 8 years ago
- Assignee set to Loïc Dachary
#2 Updated by Samuel Just about 8 years ago
I don't quite understand the problem. Is it that the pid and hostname were the same?
#3 Updated by Loïc Dachary about 8 years ago
- Status changed from New to Need More Info
That's an interesting problem :-) It would be great if you could provide steps to reproduce the problem. I don't see how it can happen if the OSD is running in an unprivileged container with no bind mount of any kind. But maybe you're doing something slightly different?
#4 Updated by Samuel Just over 7 years ago
- Status changed from Need More Info to Can't reproduce
Feel free to reopen once there is more information.