https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2018-10-19T22:45:58Z
Ceph
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=123220
2018-10-19T22:45:58Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-1 priority-6 priority-high2" href="/issues/36389">Bug #36389</a>: untar encounters unexpected EPERM on kclient/multimds cluster with thrashing</i> added</li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=123222
2018-10-19T22:46:04Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-9 priority-7 priority-highest closed" href="/issues/36349">Bug #36349</a>: mds: src/mds/MDCache.cc: 1637: FAILED ceph_assert(follows >= realm->get_newest_seq())</i> added</li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=123226
2018-10-19T22:48:15Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Subject</strong> changed from <i>msg: messages are not queued but not sent</i> to <i>msg: messages are queued but not sent</i></li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=123951
2018-10-31T23:25:25Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-10 priority-6 priority-high2 closed" href="/issues/36666">Bug #36666</a>: msg: rejoin message queued but not sent</i> added</li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=126818
2019-01-04T21:46:07Z
Sage Weil
sage@newdream.net
<ul><li><strong>Project</strong> changed from <i>Ceph</i> to <i>RADOS</i></li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=131827
2019-03-12T23:16:35Z
Greg Farnum
gfarnum@redhat.com
<ul><li><strong>Project</strong> changed from <i>RADOS</i> to <i>Messengers</i></li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=131949
2019-03-13T16:22:15Z
Patrick Donnelly
pdonnell@redhat.com
<ul></ul><p>I haven't seen this recently. I usually check by grepping for "no longer laggy" in the MDS logs from multimds suite runs. Right now those results are cluttered with false positives caused by valgrind, so I have opened <a class="issue tracker-1 status-3 priority-6 priority-high2 closed" title="Bug: qa: tolerate longer heartbeat timeouts when using valgrind (Resolved)" href="https://tracker.ceph.com/issues/38723">#38723</a> to get rid of those.</p>
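<p>The grep described above can be sketched in Python as follows. This is only an illustration: the function name <code>find_laggy_recoveries</code> is hypothetical, and the timestamp layout in the regex is an assumption based on the MDS log format of this period.</p>

```python
import re

# Assumed line layout: "<date> <time.fraction> <thread> <level> <subsystem> <message>".
# Only the leading timestamp and the "no longer laggy" text are matched.
RECOVERED = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*no longer laggy"
)


def find_laggy_recoveries(lines):
    """Return the timestamps of log lines reporting that the MDS is no longer laggy."""
    hits = []
    for line in lines:
        match = RECOVERED.match(line)
        if match:
            hits.append(match.group(1))
    return hits


if __name__ == "__main__":
    sample = [
        "2019-03-13 21:51:06.321 1625f700 5 mds.beacon.i received beacon reply up:active seq 797 rtt 12.7697",
        "2019-03-13 21:51:06.322 1625f700 0 mds.beacon.i MDS is no longer laggy",
    ]
    for stamp in find_laggy_recoveries(sample):
        print(stamp)
```

<p>Each hit marks a point where the MDS had previously considered itself laggy, so the surrounding log lines are the place to look for delayed beacon replies.</p>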
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=131976
2019-03-14T03:50:52Z
Patrick Donnelly
pdonnell@redhat.com
<ul></ul><p>No, the problem still exists:</p>
<pre>
2019-03-13 21:50:45.156 1b269700 5 mds.beacon.i Sending beacon up:active seq 795
2019-03-13 21:50:45.189 1b269700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] --> [v2:172.21.15.8:3301/0,v1:172.21.15.8:6790/0] -- mdsbeacon(4456/i up:active seq 795 v409) v7 -- 0x1f2eac40 con 0x2105ce10
2019-03-13 21:50:45.189 1b269700 20 mds.beacon.i sender thread waiting interval 4s
...
2019-03-13 21:50:57.721 1b269700 5 mds.beacon.i Sending beacon up:active seq 798
2019-03-13 21:50:57.721 1b269700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] --> [v2:172.21.15.8:3301/0,v1:172.21.15.8:6790/0] -- mdsbeacon(4456/i up:active seq 798 v409) v7 -- 0x2824fcc0 con 0x2105ce10
2019-03-13 21:50:57.722 1b269700 20 mds.beacon.i sender thread waiting interval 4s
...
2019-03-13 21:51:06.111 18263700 5 mds.6.18 laggy, deferring lock(a=nudge inest 0x50000002025.head) v1
2019-03-13 21:51:06.140 18263700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 952 ==== mdsmap(e 410) v1 ==== 2276+0+0 (crc 0 0 0) 0x121258d0 con 0x2105ce10
2019-03-13 21:51:06.229 18263700 5 mds.i handle_mds_map old map epoch 410 <= 414, discarding
2019-03-13 21:51:06.229 18263700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 953 ==== mdsmap(e 411) v1 ==== 2272+0+0 (crc 0 0 0) 0x149d8780 con 0x2105ce10
2019-03-13 21:51:06.229 18263700 5 mds.i handle_mds_map old map epoch 411 <= 414, discarding
2019-03-13 21:51:06.229 18263700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 954 ==== mdsmap(e 412) v1 ==== 2268+0+0 (crc 0 0 0) 0x1b5b45f0 con 0x2105ce10
2019-03-13 21:51:06.230 18263700 5 mds.i handle_mds_map old map epoch 412 <= 414, discarding
2019-03-13 21:51:06.230 18263700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 955 ==== mdsmap(e 413) v1 ==== 2264+0+0 (crc 0 0 0) 0x1b627980 con 0x2105ce10
2019-03-13 21:51:06.231 18263700 5 mds.i handle_mds_map old map epoch 413 <= 414, discarding
2019-03-13 21:51:06.231 18263700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 956 ==== mdsmap(e 414) v1 ==== 2264+0+0 (crc 0 0 0) 0x1b653d30 con 0x2105ce10
2019-03-13 21:51:06.231 18263700 5 mds.i handle_mds_map old map epoch 414 <= 414, discarding
...
2019-03-13 21:51:06.311 1625f700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 957 ==== mdsbeacon(4456/i up:active seq 795 v414) v7 ==== 126+0+0 (crc 0 0 0) 0x14681c10 con 0x2105ce10
2019-03-13 21:51:06.312 1625f700 5 mds.beacon.i received beacon reply up:active seq 795 rtt 21.1565
2019-03-13 21:51:06.320 1625f700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 958 ==== mdsbeacon(4456/i up:active seq 796 v414) v7 ==== 126+0+0 (crc 0 0 0) 0x1f4576e0 con 0x2105ce10
2019-03-13 21:51:06.320 1625f700 5 mds.beacon.i received beacon reply up:active seq 796 rtt 17.0176
2019-03-13 21:51:06.321 1625f700 1 -- [v2:172.21.15.169:6836/237079033,v1:172.21.15.169:6837/237079033] <== mon.2 v2:172.21.15.8:3301/0 959 ==== mdsbeacon(4456/i up:active seq 797 v414) v7 ==== 126+0+0 (crc 0 0 0) 0x20898ff0 con 0x2105ce10
2019-03-13 21:51:06.321 1625f700 5 mds.beacon.i received beacon reply up:active seq 797 rtt 12.7697
2019-03-13 21:51:06.322 1625f700 0 mds.beacon.i MDS is no longer laggy
</pre>
<p>From: /ceph/teuthology-archive/pdonnell-2019-03-13_19:07:54-multimds-master-distro-basic-smithi/3718401/remote/smithi169/log/ceph-mds.i.log.gz</p>
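<p>As a cross-check on the <code>rtt</code> values in the excerpt above, the delay between "Sending beacon" and the matching "received beacon reply" can be recomputed from the timestamps. The sketch below is a minimal illustration, not part of Ceph: <code>beacon_delays</code> and the regexes are hypothetical and assume the line layout shown in this log.</p>

```python
import re
from datetime import datetime

# Assumed layout: "<date> <time.fraction> <thread> <level> mds.beacon.<name> <message>".
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
SENT = re.compile(TS + r"\s+\S+\s+\d+\s+mds\.beacon\.\S+ Sending beacon \S+ seq (\d+)")
REPLY = re.compile(TS + r"\s+\S+\s+\d+\s+mds\.beacon\.\S+ received beacon reply \S+ seq (\d+)")


def parse_ts(text):
    """Parse a log timestamp such as '2019-03-13 21:50:45.156'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def beacon_delays(lines):
    """Map beacon seq -> seconds between sending the beacon and receiving its reply."""
    sent = {}
    delays = {}
    for line in lines:
        match = SENT.match(line)
        if match:
            sent[int(match.group(2))] = parse_ts(match.group(1))
            continue
        match = REPLY.match(line)
        if match:
            seq = int(match.group(2))
            if seq in sent:
                delays[seq] = (parse_ts(match.group(1)) - sent[seq]).total_seconds()
    return delays
```

<p>For seq 795 this gives 21.156 s (sent at 21:50:45.156, reply at 21:51:06.312), which agrees with the logged <code>rtt 21.1565</code>. Beacons with no reply in the input are simply absent from the result.</p>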
<p>The mon receives all of these beacons in rapid succession:</p>
<pre>
2019-03-13 21:51:06.227 1d9a4700 1 -- [v2:172.21.15.8:3301/0,v1:172.21.15.8:6790/0] <== mds.6 v2:172.21.15.169:6836/237079033 672 ==== mdsbeacon(4456/i up:active seq 795 v409) v7 ==== 416+0+0 (crc 0 0 0) 0x1a0de5a0 con 0x2292ebc0
...
2019-03-13 21:51:06.243 1d9a4700 1 -- [v2:172.21.15.8:3301/0,v1:172.21.15.8:6790/0] <== mds.6 v2:172.21.15.169:6836/237079033 677 ==== mdsbeacon(4456/i up:active seq 800 v414) v7 ==== 416+0+0 (crc 0 0 0) 0x21f87780 con 0x2292ebc0
</pre>
<p>From: /ceph/teuthology-archive/pdonnell-2019-03-13_19:07:54-multimds-master-distro-basic-smithi/3718401/remote/smithi008/log/ceph-mon.c.log.gz</p>
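<p>The burst is easy to quantify from the mon log: beacons seq 795 through 800, which the MDS sender thread issued at roughly 4 s intervals, arrive within 16 ms of each other. The following sketch measures that spread. The name <code>arrival_span</code> and the regex are hypothetical and assume the messenger line layout shown in the excerpt above.</p>

```python
import re
from datetime import datetime

# Assumed layout of a mon-side receive line:
# "<timestamp> <thread> <level> -- <local addrs> <== mds.N <peer addr> <n> ==== mdsbeacon(<gid/name> <state> seq <seq> ...) ..."
MON_BEACON = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*<== mds\.\d+ .*==== mdsbeacon\(\S+ \S+ seq (\d+) "
)


def arrival_span(lines):
    """Return (first_seq, last_seq, seconds between first and last arrival), or None."""
    arrivals = []
    for line in lines:
        match = MON_BEACON.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            arrivals.append((stamp, int(match.group(2))))
    if not arrivals:
        return None
    arrivals.sort()
    first, last = arrivals[0], arrivals[-1]
    return first[1], last[1], (last[0] - first[0]).total_seconds()
```

<p>Applied to the two mon lines quoted above, this returns seq 795, seq 800, and a span of 0.016 s, compared with the more than 12 s over which the MDS logged sending seq 795 through 798.</p>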
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=152240
2019-11-20T22:37:42Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-1 priority-5 priority-high3" href="/issues/42920">Bug #42920</a>: mds: removed from map due to dropped (?) beacons</i> added</li></ul>
Messengers - Bug #36540: msg: messages are queued but not sent
https://tracker.ceph.com/issues/36540?journal_id=157084
2020-01-27T13:16:43Z
Sage Weil
sage@newdream.net
<ul><li><strong>Target version</strong> changed from <i>v14.0.0</i> to <i>v15.0.0</i></li></ul>