Bug #23537

libceph: monX xxxxxx session lost, hunting for new mon

Added by Марк Коренберг over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: libceph
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: kcephfs
Crash signature (v1):
Crash signature (v2):

Description

Possibly related to #17664.

I run Luminous 12.2.2 on both the client and the cluster. Kernel on the CephFS client: Linux mmwork 4.13.0-37-generic #42-Ubuntu SMP Wed Mar 7 14:13:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux.

I don't know whether this is the same bug, but the symptoms are exactly the same. dmesg shows many lines like these:

```
[Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:37:36 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon

[Sun Apr 1 23:37:36 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:38:07 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon

[Sun Apr 1 23:38:07 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:38:37 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon

[Sun Apr 1 23:38:37 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:39:08 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon

[Sun Apr 1 23:39:08 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:39:39 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon

[Sun Apr 1 23:39:39 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:40:09 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon

[Sun Apr 1 23:40:09 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:40:40 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon

[Sun Apr 1 23:40:40 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:41:11 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon

[Sun Apr 1 23:41:11 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:41:42 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
```

Note the timing: each session is lost about 30 seconds after it is established. The same 30-second timeout was mentioned in the linked ticket.
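The interval can be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original report: it assumes log lines in the timestamped format shown above (as printed by `dmesg -T` in an English locale), and the names `LINE_RE` and `session_durations` are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] libceph: (?P<mon>mon\d+) \S+ "
    r"session (?P<event>established|lost)"
)


def session_durations(log_text):
    """Return a list of (mon, seconds) for each established -> lost pair."""
    started = {}      # mon name -> time its session was established
    durations = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # blank line or unrelated kernel message
        ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
        if m["event"] == "established":
            started[m["mon"]] = ts
        elif m["mon"] in started:
            begin = started.pop(m["mon"])
            durations.append((m["mon"], (ts - begin).total_seconds()))
    return durations
```

Fed the log excerpt above, every session comes out at 30 or 31 seconds, which matches the observation about the timeout.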

History

#1 Updated by Марк Коренберг over 4 years ago

Important: on another machine with the same OS, everything is fine.

#2 Updated by Greg Farnum over 4 years ago

  • Project changed from Ceph to Linux kernel client

#3 Updated by Patrick Donnelly over 4 years ago

  • Category set to libceph
  • Assignee set to Ilya Dryomov
  • Source set to Community (user)
  • Release deleted (luminous)
  • ceph-qa-suite deleted (fs)

#4 Updated by Ilya Dryomov over 4 years ago

Марк Коренберг wrote:

Important: on another machine with the same OS, everything is fine.

Another client machine where you mount cephfs with "mount -t ceph ..."?

#5 Updated by Ilya Dryomov over 4 years ago

v12.2.2 includes the fix for #17664.

Do these messages appear right after you mount or later? Do they go away if you remount (i.e. umount + mount)?

#6 Updated by Ilya Dryomov over 4 years ago

  • Status changed from New to Fix Under Review

#7 Updated by Ilya Dryomov over 4 years ago

I think I found the issue. The fix should be in soon and will be backported to stable kernels.

#8 Updated by Ilya Dryomov over 4 years ago

  • Status changed from Fix Under Review to Pending Backport

#10 Updated by Ilya Dryomov over 4 years ago

  • Status changed from Pending Backport to Resolved

The fix is in 4.9.98, 4.14.39, and 4.16.7.
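For anyone checking a client against these releases, here is a rough sketch of the comparison. It only knows the three stable series named in this comment and returns `None` for anything else, because this list says nothing about other series. Distro kernels (for example Ubuntu's `-generic` builds) number their releases differently and carry their own backports, so a `None` result means "check the distro changelog", not "safe". `FIXED_IN` and `has_fix` are hypothetical names.

```python
import re

# First release in each stable series that carries the fix, per this ticket.
FIXED_IN = {(4, 9): 98, (4, 14): 39, (4, 16): 7}


def has_fix(release):
    """Return True/False for a listed series, None if the series is not listed."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = (int(part) for part in m.groups())
    first_fixed = FIXED_IN.get((major, minor))
    if first_fixed is None:
        return None  # series not covered by the list above
    return patch >= first_fixed
```

The release string is what `uname -r` prints. A mainline or stable kernel such as `4.14.38` gets a definite answer, while `4.15.0-21-generic` falls outside the list.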

#11 Updated by Марк Коренберг over 4 years ago

Linux mmwork 4.15.0-21-generic #22-Ubuntu SMP Tue May 1 13:26:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

The problem still reproduces on this kernel.

Well, I will wait for 4.17.

#12 Updated by Ilya Dryomov over 4 years ago

Naturally -- the required fix isn't present in 4.15.0-21-generic.
