Bug #23537
closed
libceph: monX xxxxxx session lost, hunting for new mon
Description
Possibly related to #17664.
I run Luminous 12.2.2 on both the client and the cluster. Kernel on the CephFS client: Linux mmwork 4.13.0-37-generic #42-Ubuntu SMP Wed Mar 7 14:13:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux.
I don't know whether this is the same bug, but the symptoms are exactly the same. dmesg is full of lines like:
```
[Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:37:36 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon
[Sun Apr 1 23:37:36 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:38:07 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
[Sun Apr 1 23:38:07 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:38:37 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon
[Sun Apr 1 23:38:37 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:39:08 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
[Sun Apr 1 23:39:08 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:39:39 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon
[Sun Apr 1 23:39:39 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:40:09 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon
[Sun Apr 1 23:40:09 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:40:40 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon
[Sun Apr 1 23:40:40 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:41:11 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon
[Sun Apr 1 23:41:11 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:41:42 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
```
Note the interval: each session is lost about 30 seconds after it is established. The same 30-second timeout was mentioned in the linked ticket (#17664).
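To check the interval on another client, the established/lost pairs can be measured from the log itself. The following is only a rough sketch, assuming the human-readable timestamp format shown above (as printed by `dmesg -T`) and an English locale for the day and month names; the function name `session_lifetimes` is made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] libceph: (?P<mon>mon\d+) \S+ "
    r"session (?P<event>established|lost)"
)


def session_lifetimes(dmesg_text):
    """Return a list of (mon, seconds) for each established -> lost pair."""
    established = {}  # mon name -> time its current session was established
    lifetimes = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a libceph monitor session line
        # Assumes the timestamp layout shown in this ticket.
        ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
        mon = match.group("mon")
        if match.group("event") == "established":
            established[mon] = ts
        elif mon in established:
            delta = ts - established.pop(mon)
            lifetimes.append((mon, delta.total_seconds()))
    return lifetimes
```

Passing the output of `dmesg -T` to this function should give lifetimes of roughly 30 seconds on an affected client; on the log above the values are 30 or 31 seconds.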
Updated by Марк Коренберг about 6 years ago
Important: on another machine with the same OS, everything works fine.
Updated by Greg Farnum about 6 years ago
- Project changed from Ceph to Linux kernel client
Updated by Patrick Donnelly about 6 years ago
- Category set to libceph
- Assignee set to Ilya Dryomov
- Source set to Community (user)
- Release deleted (luminous)
- ceph-qa-suite deleted (fs)
Updated by Ilya Dryomov about 6 years ago
Марк Коренберг wrote:
Important: on another machine with same OS everything is fine.
Another client machine where you mount cephfs with "mount -t ceph ..."?
Updated by Ilya Dryomov about 6 years ago
v12.2.2 includes the fix for #17664.
Do these messages appear right after you mount or later? Do they go away if you remount (i.e. umount + mount)?
Updated by Ilya Dryomov almost 6 years ago
- Status changed from New to Fix Under Review
Updated by Ilya Dryomov almost 6 years ago
I think I found the issue. The fix should be in soon and will be backported to stable kernels.
Updated by Ilya Dryomov almost 6 years ago
- Status changed from Fix Under Review to 15
Updated by Ilya Dryomov almost 6 years ago
- Status changed from 15 to Pending Backport
Updated by Ilya Dryomov almost 6 years ago
- Status changed from Pending Backport to Resolved
The fix is in stable kernels 4.9.98, 4.14.39 and 4.16.7.
Updated by Марк Коренберг almost 6 years ago
Linux mmwork 4.15.0-21-generic #22-Ubuntu SMP Tue May 1 13:26:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
The problem still reproduces on this kernel.
I will wait for 4.17.
Updated by Ilya Dryomov almost 6 years ago
Naturally -- the required fix isn't present in 4.15.0-21-generic.
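For anyone checking their own client, here is a rough sketch that compares a `uname -r` string against the stable releases listed above. It rests on two assumptions that are not confirmed in this ticket: that 4.17 and later mainline kernels carry the fix, and that the version string reflects upstream numbering. Distribution kernels may backport fixes without changing the base version, so a `False` result only means the fix cannot be inferred from the version string. The function name `has_upstream_fix` is made up for illustration.

```python
import re

# Stable releases named in this ticket as containing the fix:
# series (major, minor) -> first fixed patch level.
FIXED_IN_STABLE = {(4, 9): 98, (4, 14): 39, (4, 16): 7}

# Assumption (not stated in the ticket): mainline 4.17 and later have the fix.
FIRST_FIXED_MAINLINE = (4, 17)


def has_upstream_fix(release):
    """Guess from a `uname -r` string whether the upstream fix is present."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = (int(part) for part in match.groups())
    series = (major, minor)
    if series >= FIRST_FIXED_MAINLINE:
        return True
    if series in FIXED_IN_STABLE:
        return patch >= FIXED_IN_STABLE[series]
    # Series with no listed backport (for example 4.13 or 4.15).
    return False
```

With this table, `has_upstream_fix("4.15.0-21-generic")` returns `False`, which matches the comment above, and `has_upstream_fix("4.14.39")` returns `True`.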