Bug #39150

mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum

Added by Patrick Donnelly over 3 years ago. Updated 6 months ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: pacific, octopus
Regression: No
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):

0037073cc3ac263b34152d0d00e30b5f8ee6ff25d309aa22de81e99b33c18e5b
01fe407aa36177ebade5c10cd1ca1b8a7e214dc44b7c3314f3a05c82d96070b2
02b9953dd602d50e803b32065040392754234059b7c78dbc52c05b2cf2480c82
0a4ae0b1fe10851f11f6119256e4526088bc54ae8a871695a79859a5d9bf6cdd
0c3729fca27d98d58d528e908dbd0e0f4b7ac708e5778cf41976ba2dc992d280
0d31619405b90977aa8ba93d1aa0de302f38b31523f5b24750f7f734101749cf
0f71e7c40c0d7a571deb6d0293f20c9f9e86954651cd7a3e670b6cfddec2e64a
1006527e6b73331da4af4f329a2c91ad60fc1a1e42ed371668c2e70a1900c38c
171bb3b4c4a5cddd807d880095b932d6136043f45d52a6479c845b412e251ebf
1774b4da6e43135a7cbc32baece2a1e6e32d5e0f4988feaaa25bad7ea499152d
1cc2c183c985a12ffd90fcc22301ee5fc8924cdbaac7bb86aa9b3f605c35abcc
1e38abda93a6eb21a156dfdad370df3d9aa07fce82bc0ae6f1db39ce35a58520
1f5f72ffb7ec30455b1238ff046a80619957b4a1ea90d6381df4f7260ced3cf8
2288ee26e57c0d86ab231111fb0853c2879ee6c2244f706cd754fc1f4aa4f6df
27e43265a72bd977462170c1aaa6eb587b69863be57e32bd9d96d5658d1a6c91
28e96f48d734371f3de2ce562c28ebc591a27763f6b3f721002cef5a43a370c8
2e57dd0d327fecc770fcc99f20dcbf57dbffcc3ab1ba00d488136a011c638163
37d2da5484651ba0bc0aa425b2a3029217a642f2f592f3ffb6dbb108b2216ea8
3804ce6bd27df78d651ae85fc5659bb303913ef600ecf91cabfbd3c9c797182f
3a662d0984c17fbeaf5198bd6b20be5866810c652cca481c894f7cdc721d6340
444bbd61bba8eaedcc3fcf88d69982332d54f0b64cc7a2c1420cd365b11688b2
45aa2b2ae51cb0358e27161cf2d1afa122cd3de237120cc26267d2c29eeb7761
46616716f881b169815ad1afe5ec3cdebcc89bf3ae50637fd0b5fb8ac01217af
477a9f47c1c91cddfcdb38bd4f4b68e95fd99a04276d84dbd382620150915ec3
4d653e9c3ee37041dd2a1cf556ea466db3e74addb7a8d3efb38d8e8a268096d3
50ab467369eeda4735cc2c2219da44016ed98552e8699e20991538571b305388
52364301bebf6da1fb9330f7bfd29c1bd45df1202ea8a70121eb483ea3a481a1
53808cea7fc2878d50b4f983cf23e0aa685b169c2090b5eb3f41e4b5a1e55b76
56053032bcc45b55a16d484e58c0947a58ce158e1dc089dba7552500d49ad4ea
5614ee01349834181c19990869a934cd459b960d250443800b648a6066e07877
5782d6d8fd10557472172a01104b5f377ce91d57ff1b38bb3c983212e3cf8747
599feb545b52e9426933c8bbea51bfdf817c45ac92a5c7c90be45c76b6b5ee2b
5d3237426b682a9d0e501eb9b29651d7021758ba763acef287ba8b65ae4aa10c
5de79f6af8fceb6d07b74a2c0086040b1290b9ddfe1485c7bc3a35c9e734868c
651712564ef713504b84edd17288d949dc1819923a3d8d4d454e6568b8edec46
653733cbc179da1bfb82ec5a73c4dadc94f4fa16a4a208fb3f3cd1d3d4a2c5fd
69ea3e5e6912d49c32c7958efa0d9af843dff3b41a061f95ed32ef73efdcb7d7
6c1cb309b68664d00282027d1ba1a2acf957b1efecca681b4c4eac5bd71e66f2
6d6faaa1380cc76166e1de40fca37961ee556cfe7829a7ce0b4768206fc254d4
713793a0fd9f314787ab0987df569ebca008a8cef81556382e2b142ac54f9faf
7600e2c4dff3b2745d43e62675bfed1d676cccb48d230e55e6eeb46278f6259f
76e9418d6773677225fcbe95066de253a805150c03ef12353fc0db9b1977897d
789d789696a0daa5932cdbd5b021ebefd702026e6f9fdd10c60487cecf050129
7e7e314e491555efd254bc82aa1b8249c2d5565d5c2c3e048087588ea80997f9
7f30774fdeff3299c20c49dfe32fface87fbaf5cf0ff88f87db05c6b85307b89
85d9d0551f1c5bc8ab885b25e713f95695dffc22a9e3feb329902231896efed0
8adcfb87b088c80f48f4ebf8f4d48a4fa179f2f29303d16bd695173fbb1c83b9
8c0c708b30c7b150d192dc3774327092e0fa3fb14956334c1e3d45c7a0a7d180
8c5ccf65bc699e123490e8acf8058effd49e839bed779f73b2fc2ebe0047d8c1
904c128123cd9b50d6196d9eecaf2b48c094ec2120a7b514ecfbba12ee620d44
9a7de784614490d603daf107bd08718ae6bcbb5f165bbd36f1958473c90c8298
9c8794042eb1d0677892b6a951a5d0f2c58a6886618add013631426358a5d933
a35620c323a01c1ac660eee1042235afda85316d49cc27522f17ae5255862990
a8249d527fa8b1ac3df0c46b5bb94360b0bedd04ff96d334580744199bcb1fdd
aef8fbf142955ddf1007a8f60d4839f38e5fd4380e4b99faf0b864341b141a94
af2cd66391faea2b5e52d9db535990a6164530a554d42557b62e5ece2e325bae
c7d7213859ab7cdabcc40049aff5482ebbf1b9e92d6e65a376ea1d5e89787cf6
cd97d88a6d6863362ec58bec80a97ed89fc94292b4c2325a2439060b9fc6f2e3
d2e21097b6d6975cd5ebe9ffbeae31af93dea3545392d4b713cb63a940a92779
dfabce7a9eda4d0e0f2f65e643eb9bc5054b82e58dba7206943c62a9d22229db
e1dcde46df2a08231398b4c3abdd316deb41af98b9522f3b03291f09f189a6b1
e8d16c9ff6b75e2f25f60c19928e13173834600b3eb8a40066b18e5ac337aad1
eda591d1e7428ff097fd5ee27dcdb530cf85125c053ba7d0e0204e5fdb9f4a17
eefd2f54edd310e2481ab0a0f4c2bee7ddaa126a6f04f4b4bdf05e8ac99ea3a5
f27e6ecc2ef24e60aafa709e2c7b9c096beedb5933431e04435ee39cc51acaa7
f4b631574d3a65262f2fef7adaff609d8a476e66c09d53cca0802e7846c5bada
f91c2ad8dc044afabdaf82169a6c9fb16fa5a52c3818cb7fa4c987ebfb666b26
fea18c11258f3ec945cfe4f500a055bcb5f0a6182df6186ca428c6fda911f59e
feed246b5b8fd7b55bd6b4ad942e3de80c1f04fc1b8446a615ae8a7310c9b3ed
0f9f5cdbdedac2f97ac4cfb856f5363ade0d5442d19905dadc0d8cc7c5a1c203
1b851a647fd99035eae4617269c162ddc0ca8088f0e45a42b9f3d917dd9ffbac
1e19e924a858513e6aa3312fb6c0ffe0aba3f7ff4a30f15579eb0c9745ca9f0d
24d2e457991a4b25d11701ef84b0aa1a701b0a8bbe3931f9c3906c314755fd91
2b70a0749b6df7529a37abed998dc5ac904d0bb59057b3a5af7242bb277c41a0
2c4c2a8df54c774c379179d243ba3f3b63948555b8e85bfb4a0bbee688116485
3235d384e0b57dd21b629902401560c997fae3630d1a5e5d44f6c7dc23e70b24
33da0e0b37012332a8e4fd115e84893bfa2708ab00aa3e40b44eedd345b4f904
399c6f4d1900f90b5d32e1e5ddfaf0739b2f069cec6aa84114771231a43aa08f
442cb77a4b48fe3793975d545b71b1eeacc1df8fda34aee20a16f2db6e9758f3
49e33478f626549cdb2c000a820af6e34705f1cbe8b623daa3952854134282c9
5854e2b9ad73a03e32c718cd35e8245960facba999ca431ed47b4298d85c0f27
58e4d5fe6ce12be326e8a45f4b65b471041e2937277038d6dc5bef2c4840be09
5b672d3969d42e32f3575fa1962291d057545c40b00f5fd200ffe68be83eaee2
601cd339ce16e4f5d90403fdf6facf938310770e0e6c4fe33ab145f0829c0b8b
6b4d9327010e2a32fbe8592545c480d3b56f55c035bc965adee4feda68a65dae
6b9731c07a7be7b7a70a360846e4c226a3f3697e6c3574c68a125b64b8fddbaa
72909647829fe10423ed58cb6b1bb25cf4361a8a30ce089b8bbcc56cef5e2385
76f85fe836a932a6d1431fcdd4afd6f22fef14d6d4ff72b7eec92f6fdb8b0009
76fad4931c66319a5751ce8747761c9dd56a2ade4d907b784f55749982245f29
869811946de06c60d23a983592e607ec6568ea9d58332b96a703eb99cca708fe
b2f926ff53c12f90e56c8d507bd201607aa2bc82c8316a0397042bd3b72a2c28
b3c6b23926968cf461baa5dd8ea945eb053261c7c27d6294ba725cb020ddb6a8
b573f573e49aa6653cd0698715c21169d2c4f9bf371f725283e69b575107135f
b76d37d5658cb5fae9eeba1ec5af8c716811b017ee4724545d37ce9b2a72a277
b7df6df15efe4c849c74fadd44c7115187ea01ce976bc608738821d7a0b4e73d
ba67bb3a2c61b9e25af95d2acb9d80e337c41fd60ffdb747f22df9f63b50a3ee
beed93a7f6b8abd789185f0d6a73bf8856848f2aa403f29e42b4ddf8f7fae9bf
c44c794bfe1ed9b017e4c93a7031dde8893d4aaaef68f4f9bbfe9121b3f73b78
d325b85126e4565493175a6cb48a60524bfeafe12a86af91c8a5946640758ad0
d95742e5f7278bd7d1a68ed42c74e03a9a84b736f83d1ea847b5a325af2d35f6
dad1ebbed3ea9cf6c701953e66b600f7355b8b43e0659e1fda718bda3502a647
e2ec02436f52889d93736082fd68aab1968aefad43126248a35a08df4ec9ac3c
e439219388e62c8828274c4bf2fdc57b8f045dc893181c473def02aa20cd9ca0
e688e3029b5a0c8e7a2b26d3eafc90cc4256208cdf3fd42ef3afe4fbdea367cf
e713a6ef3b71d08b098d997dfb751be11092bb9cb9d5f9f355a70fef521adf9d
ee8c42c4eae844f6a790ec53e8371bd0bd3fa690f360fb4bd220539f78b9112a
f401d58951145efcbc1b4a36ce439273068bad073f28b459ae767186c65477ee
fb779840c44a694e9e53599633dbecc5b0ac5522a9e8478ed7d0f6bd02c776d4


Description

2019-04-06T09:27:34.791 INFO:tasks.ceph.mds.b:Sent signal 15
2019-04-06T09:27:34.791 INFO:tasks.ceph.mon.a:Sent signal 15
2019-04-06T09:27:34.792 INFO:tasks.ceph.mon.c:Sent signal 15
2019-04-06T09:27:34.792 INFO:tasks.ceph.mon.b:Sent signal 15
2019-04-06T09:27:34.803 INFO:tasks.ceph.mon.a.smithi085.stderr:2019-04-06 09:27:34.801 7f854e356700 -1 received  signal: Terminated from /usr/bin/python /usr/bin/daemon-helper kill ceph-mon -f --cluster ceph -i a  (PID: 17117) UID: 0
2019-04-06T09:27:34.803 INFO:tasks.ceph.mon.a.smithi085.stderr:2019-04-06 09:27:34.801 7f854e356700 -1 mon.a@0(electing) e1 *** Got Signal Terminated ***
2019-04-06T09:27:34.807 INFO:tasks.ceph.mon.c.smithi180.stderr:2019-04-06 09:27:34.806 7f26a7a1e700 -1 received  signal: Terminated from /usr/bin/python /usr/bin/daemon-helper kill ceph-mon -f --cluster ceph -i c  (PID: 17101) UID: 0
2019-04-06T09:27:34.807 INFO:tasks.ceph.mon.c.smithi180.stderr:2019-04-06 09:27:34.806 7f26a7a1e700 -1 mon.c@2(electing) e1 *** Got Signal Terminated ***
2019-04-06T09:27:34.807 INFO:tasks.ceph.mon.b.smithi180.stderr:2019-04-06 09:27:34.806 7fc14ddde700 -1 received  signal: Terminated from /usr/bin/python /usr/bin/daemon-helper kill ceph-mon -f --cluster ceph -i b  (PID: 17099) UID: 0
2019-04-06T09:27:34.808 INFO:tasks.ceph.mon.b.smithi180.stderr:2019-04-06 09:27:34.806 7fc14ddde700 -1 mon.b@1(electing) e1 *** Got Signal Terminated ***
2019-04-06T09:27:34.808 INFO:tasks.ceph.mds.b.smithi180.stderr:2019-04-06 09:27:34.806 7f97d9dd8700 -1 received  signal: Terminated from /usr/bin/python /usr/bin/daemon-helper kill ceph-mds -f --cluster ceph -i b  (PID: 19872) UID: 0
2019-04-06T09:27:34.808 INFO:tasks.ceph.mds.b.smithi180.stderr:2019-04-06 09:27:34.806 7f97d9dd8700 -1 mds.b *** got signal Terminated ***
2019-04-06T09:27:34.939 INFO:tasks.ceph.mon.c.smithi180.stderr:/build/ceph-15.0.0-122-gcf4d304/src/mon/Monitor.cc: In function 'virtual Monitor::~Monitor()' thread 7f26b7a03340 time 2019-04-06 09:27:34.940966
2019-04-06T09:27:34.939 INFO:tasks.ceph.mon.c.smithi180.stderr:/build/ceph-15.0.0-122-gcf4d304/src/mon/Monitor.cc: 267: FAILED ceph_assert(session_map.sessions.empty())
2019-04-06T09:27:34.941 INFO:tasks.ceph.mon.c.smithi180.stderr: ceph version 15.0.0-122-gcf4d304 (cf4d304f05231b6375986616bc965edc8181a4e1) octopus (dev)
2019-04-06T09:27:34.941 INFO:tasks.ceph.mon.c.smithi180.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f26aebba0d2]
2019-04-06T09:27:34.941 INFO:tasks.ceph.mon.c.smithi180.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f26aebba2ad]
2019-04-06T09:27:34.942 INFO:tasks.ceph.mon.c.smithi180.stderr: 3: (Monitor::~Monitor()+0x962) [0x69dfc2]
2019-04-06T09:27:34.942 INFO:tasks.ceph.mon.c.smithi180.stderr: 4: (Monitor::~Monitor()+0x9) [0x69e039]
2019-04-06T09:27:34.942 INFO:tasks.ceph.mon.c.smithi180.stderr: 5: (main()+0x2801) [0x578df1]
2019-04-06T09:27:34.942 INFO:tasks.ceph.mon.c.smithi180.stderr: 6: (__libc_start_main()+0xf0) [0x7f26ad0fd830]
2019-04-06T09:27:34.942 INFO:tasks.ceph.mon.c.smithi180.stderr: 7: (_start()+0x29) [0x65bee9]
2019-04-06T09:27:34.942 INFO:tasks.ceph.mon.c.smithi180.stderr:*** Caught signal (Aborted) **
2019-04-06T09:27:34.943 INFO:tasks.ceph.mon.c.smithi180.stderr: in thread 7f26b7a03340 thread_name:ceph-mon

From: /ceph/teuthology-archive/pdonnell-2019-04-06_02:21:29-fs-wip-pdonnell-testing-20190405.231924-distro-basic-smithi/3814565/teuthology.log

Seems there were other issues with the mons during that run as well. Mons lost quorum around 08:58:28.846.
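
When scanning a teuthology.log like the one above for this failure, the assert line can be picked apart mechanically. The following is a minimal sketch, not part of teuthology or Ceph: the `parse_assert_line` helper and its regular expression are hypothetical, and they assume the `<file>: <line>: FAILED ceph_assert(<condition>)` layout seen in the excerpt.

```python
import re

# Hypothetical pattern for the "FAILED ceph_assert(...)" line in the excerpt.
# The file path is assumed to contain no whitespace and no colons.
ASSERT_RE = re.compile(
    r"(?P<file>[^\s:]+\.(?:cc|h)): (?P<line>\d+): "
    r"FAILED ceph_assert\((?P<cond>.*)\)\s*$"
)

def parse_assert_line(log_line):
    """Return (source_file, line_number, condition), or None if no assert is found."""
    m = ASSERT_RE.search(log_line)
    if m is None:
        return None
    return m.group("file"), int(m.group("line")), m.group("cond")

# Log line copied from the description above.
sample = (
    "2019-04-06T09:27:34.939 INFO:tasks.ceph.mon.c.smithi180.stderr:"
    "/build/ceph-15.0.0-122-gcf4d304/src/mon/Monitor.cc: 267: "
    "FAILED ceph_assert(session_map.sessions.empty())"
)

print(parse_assert_line(sample))
```

Note that the line number differs between builds (267 here, 262 and 287 in later reports), so matching on the condition and function is likely more robust than matching on the line number.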


Related issues

Related to RADOS - Bug #56192: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) (New)
Duplicated by RADOS - Bug #51882: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) (Duplicate)
Duplicated by RADOS - Bug #52199: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) (Duplicate)
Duplicated by RADOS - Bug #52198: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) (Duplicate)
Duplicated by RADOS - Bug #52142: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) (Duplicate)
Copied to RADOS - Backport #53659: pacific: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum (Resolved)
Copied to RADOS - Backport #53660: octopus: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum (Resolved)

History

#1 Updated by Greg Farnum over 3 years ago

  • Subject changed from mon: "FAILED ceph_assert(session_map.sessions.empty())" to mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
  • Priority changed from High to Normal

The monitor was out of quorum for 30 minutes; it probably has to do with holding on to client connections or else not cleaning up the session map from when it was last in. I'm not sure this is high priority though since it's a crash on shutdown in a failure scenario...

#2 Updated by Patrick Donnelly over 3 years ago

Greg Farnum wrote:

The monitor was out of quorum for 30 minutes; it probably has to do with holding on to client connections or else not cleaning up the session map from when it was last in. I'm not sure this is high priority though since it's a crash on shutdown in a failure scenario...

Unless you're looking at something different, quorum was lost during the test, not at shutdown. The MDS thrasher had successfully thrashed an MDS (killed it, and a standby took over) about 8 seconds earlier.

#3 Updated by Greg Farnum over 3 years ago

mon.c timeline:
2019-04-06 08:58:28.846 hits a lease timeout and triggers the election process
2019-04-06 08:58:28.846 first output of "probing" state
2019-04-06 08:58:28.850 first output of "electing" state
2019-04-06 09:27:34.942 crash output line

It does not output the "peon" or "leader" state again in those 29 minutes; it times out 291 elections and starts 294 during that time. I don't know why it happened but mon.c was out of quorum that whole time.
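
For reference, the "29 minutes" figure follows directly from the first and last timestamps in the timeline. This is only an arithmetic sketch; the `elapsed` helper is hypothetical, and the per-election average is a plain division of the numbers quoted above, not a statement about any configured timeout.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

def elapsed(start, end):
    """Return the timedelta between two timestamps in the mon log format."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)

# Timestamps taken from the mon.c timeline above.
lease_timeout = "2019-04-06 08:58:28.846"
crash = "2019-04-06 09:27:34.942"

gap = elapsed(lease_timeout, crash)
print("out of quorum for:", gap)

# 291 timed-out elections were counted in that window.
timed_out_elections = 291
print("average seconds per timed-out election:",
      gap.total_seconds() / timed_out_elections)
```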

#4 Updated by Sage Weil over 3 years ago

mon.c is failing to connect to mon.a:

2019-04-06 09:19:20.484 7f269f20d700  1 --2- [v2:172.21.15.180:3301/0,v1:172.21.15.180:6790/0] >> [v2:172.21.15.85:3300/0,v1:172.21.15.85:6789/0] conn(0x3137200 0x2f77600 secure :-1 s=BANNER_CONNECTING pgs=3162 cs=280 l=0 rx=0x4171da0 tx=0x4b2b080)._handle_peer_banner_payload supported=0 required=0
2019-04-06 09:19:20.484 7f269f20d700  1 --2- [v2:172.21.15.180:3301/0,v1:172.21.15.180:6790/0] >> [v2:172.21.15.85:3300/0,v1:172.21.15.85:6789/0] conn(0x3137200 0x2f77600 secure :-1 s=START_CONNECT pgs=3162 cs=281 l=0 rx=0x4171da0 tx=0x4b2b080)._fault waiting 15.000000

same in the other direction:
2019-04-06 09:19:00.906 7f8546b47700  1 --2- [v2:172.21.15.85:3300/0,v1:172.21.15.85:6789/0] >> [v2:172.21.15.180:3301/0,v1:172.21.15.180:6790/0] conn(0x361b680 0x33f9b80 secure :-1 s=BANNER_CONNECTING pgs=581 cs=284 l=0 rx=0x8ed3ab0 tx=0x4eb4b80)._handle_peer_banner_payload supported=0 required=0
2019-04-06 09:19:00.906 7f8546b47700  1 --2- [v2:172.21.15.85:3300/0,v1:172.21.15.85:6789/0] >> [v2:172.21.15.180:3301/0,v1:172.21.15.180:6790/0] conn(0x361b680 0x33f9b80 secure :-1 s=START_CONNECT pgs=581 cs=285 l=0 rx=0x8ed3ab0 tx=0x4eb4b80)._fault waiting 15.000000
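
To follow the reconnect loop across many such lines, the `key=value` fields inside `conn(...)` can be extracted and compared. The sketch below is an assumption-laden helper, not a Ceph tool: `parse_conn` and both regular expressions are hypothetical, and the field names (`s`, `pgs`, `cs`, ...) are taken verbatim from the log text without interpreting them further.

```python
import re

# Hypothetical patterns based only on the log excerpt above.
CONN_RE = re.compile(r"conn\((?P<body>[^)]*)\)\.(?P<func>\w+)")
FIELD_RE = re.compile(r"(\w+)=(\S+)")

def parse_conn(line):
    """Return the key=value fields of conn(...) plus the logging function name."""
    m = CONN_RE.search(line)
    if m is None:
        return None
    fields = dict(FIELD_RE.findall(m.group("body")))
    fields["func"] = m.group("func")
    return fields

# The conn(...) portions of the two mon.c lines quoted above.
lines = [
    "conn(0x3137200 0x2f77600 secure :-1 s=BANNER_CONNECTING pgs=3162 cs=280 "
    "l=0 rx=0x4171da0 tx=0x4b2b080)._handle_peer_banner_payload "
    "supported=0 required=0",
    "conn(0x3137200 0x2f77600 secure :-1 s=START_CONNECT pgs=3162 cs=281 "
    "l=0 rx=0x4171da0 tx=0x4b2b080)._fault waiting 15.000000",
]

for line in lines:
    info = parse_conn(line)
    print(info["func"], info["s"], info["cs"])
```

On the two mon.c lines this shows the connection falling back from `BANNER_CONNECTING` to `START_CONNECT` while `cs` moves from 280 to 281, matching what is visible by eye in the excerpt.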

#5 Updated by Sage Weil over 3 years ago

(Not surprisingly, MON_DOWN is in the ceph.log too, and the run would have failed on that had it not failed for some other reason. Will keep an eye out for that!)

#6 Updated by Patrick Donnelly over 3 years ago

/ceph/teuthology-archive/pdonnell-2019-04-17_06:12:56-kcephfs-wip-pdonnell-testing-20190417.032809-distro-basic-smithi/3857629/teuthology.log

#7 Updated by Neha Ojha over 3 years ago

/a/yuriw-2019-06-07_19:41:42-rados-wip-yuri4-testing-2019-06-07-1600-nautilus-distro-basic-smithi/4012630/

#8 Updated by Patrick Donnelly almost 3 years ago

/ceph/teuthology-archive/pdonnell-2020-02-15_16:51:06-fs-wip-pdonnell-testing-20200215.033325-distro-basic-smithi/4767980/teuthology.log

#9 Updated by Neha Ojha over 1 year ago

/a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049676

#10 Updated by Sridhar Seshasayee over 1 year ago

/a/sseshasa-2021-05-17_11:08:21-rados-wip-sseshasa-testing-2021-05-17-1504-distro-basic-smithi/6118250

#11 Updated by Neha Ojha over 1 year ago

  • Priority changed from Normal to Urgent
  • Backport set to pacific

/a/yuriw-2021-06-02_18:33:05-rados-wip-yuri3-testing-2021-06-02-0826-pacific-distro-basic-smithi/6147408

#12 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-06-28_17:32:48-rados-wip-yuri2-testing-2021-06-28-0858-pacific-distro-basic-smithi/6239590

#13 Updated by Sage Weil over 1 year ago

  • Duplicated by Bug #51882: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) added

#14 Updated by Neha Ojha over 1 year ago

  • Backport changed from pacific to pacific, octopus

#15 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-08-06_16:31:19-rados-wip-yuri-master-8.6.21-distro-basic-smithi/6324701

#16 Updated by Telemetry Bot over 1 year ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v15.2.10, v15.2.11, v15.2.12, v15.2.13, v15.2.2, v15.2.3, v15.2.4, v15.2.5, v15.2.6, v15.2.7, v15.2.8, v15.2.9 added

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=4d653e9c3ee37041dd2a1cf556ea466db3e74addb7a8d3efb38d8e8a268096d3

Assert condition: session_map.sessions.empty()
Assert function: virtual Monitor::~Monitor()

Sanitized backtrace:

    pthread_getname_np()
    ceph::logging::Log::dump_recent()
    Monitor::~Monitor()
    Monitor::~Monitor()
    main()
    __libc_start_main()
    _start()

Crash dump sample:
{
    "assert_condition": "session_map.sessions.empty()",
    "assert_file": "mon/Monitor.cc",
    "assert_func": "virtual Monitor::~Monitor()",
    "assert_line": 262,
    "assert_msg": "mon/Monitor.cc: In function 'virtual Monitor::~Monitor()' thread 7f4ff8c8c6c0 time 2021-08-03T11:49:35.421508+0000\nmon/Monitor.cc: 262: FAILED ceph_assert(session_map.sessions.empty())",
    "assert_thread_name": "ceph-mon",
    "backtrace": [
        "(()+0x12b20) [0x7f4fed96ab20]",
        "(pthread_getname_np()+0x48) [0x7f4fed96bd98]",
        "(ceph::logging::Log::dump_recent()+0x428) [0x7f4ff01c4978]",
        "(()+0x4be2db) [0x555e399352db]",
        "(()+0x12b20) [0x7f4fed96ab20]",
        "(gsignal()+0x10f) [0x7f4fec5d27ff]",
        "(abort()+0x127) [0x7f4fec5bcc35]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f4fefe74d61]",
        "(()+0x27af2a) [0x7f4fefe74f2a]",
        "(Monitor::~Monitor()+0xef6) [0x555e39704c26]",
        "(Monitor::~Monitor()+0xd) [0x555e39704c7d]",
        "(main()+0x565e) [0x555e396974ee]",
        "(__libc_start_main()+0xf3) [0x7f4fec5be7b3]",
        "(_start()+0x2e) [0x555e396c0d8e]" 
    ],
    "ceph_version": "15.2.13",
    "crash_id": "2021-08-03T11:49:35.767310Z_62904f71-57d0-4a50-93a8-264c4cc6ff32",
    "entity_name": "mon.465717d0783140bdb59100800078d74713f06fc3",
    "os_id": "centos",
    "os_name": "CentOS Linux",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "ceph-mon",
    "stack_sig": "c7d7213859ab7cdabcc40049aff5482ebbf1b9e92d6e65a376ea1d5e89787cf6",
    "timestamp": "2021-08-03T11:49:35.767310Z",
    "utsname_machine": "x86_64",
    "utsname_release": "4.19.0-17-amd64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP Debian 4.19.194-3 (2021-07-18)" 
}
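
For comparing crash dumps like the one above, it can help to reduce each backtrace frame to its symbol name. The following is a rough sketch only: `symbol_names` and `FRAME_RE` are hypothetical, the JSON is a shortened copy of the sample above, and this is not claimed to be how the telemetry service derives its sanitized backtrace or signatures.

```python
import json
import re

# Hypothetical pattern for frames of the form "(symbol+0xOFFSET) [0xADDRESS]".
FRAME_RE = re.compile(r"^\((?P<sym>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")

def symbol_names(backtrace):
    """Return symbol names for resolved frames, skipping unresolved '()' frames."""
    names = []
    for frame in backtrace:
        m = FRAME_RE.match(frame.strip())
        if m and m.group("sym") != "()":
            names.append(m.group("sym"))
    return names

# Shortened copy of the crash dump sample above (subset of keys and frames).
dump = json.loads("""
{
    "assert_condition": "session_map.sessions.empty()",
    "assert_func": "virtual Monitor::~Monitor()",
    "assert_line": 262,
    "backtrace": [
        "(()+0x12b20) [0x7f4fed96ab20]",
        "(gsignal()+0x10f) [0x7f4fec5d27ff]",
        "(abort()+0x127) [0x7f4fec5bcc35]",
        "(Monitor::~Monitor()+0xef6) [0x555e39704c26]",
        "(main()+0x565e) [0x555e396974ee]"
    ]
}
""")

print(dump["assert_func"], "->", dump["assert_condition"])
for name in symbol_names(dump["backtrace"]):
    print("  ", name)
```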

#17 Updated by jianwei zhang over 1 year ago

{
    "crash_id": "2021-08-26T03:38:46.109584Z_c0f5c111-a3bc-4210-8edd-e72cb5344590",
    "timestamp": "2021-08-26T03:38:46.109584Z",
    "process_name": "ceph-mon",
    "entity_name": "mon.c",
    "ceph_version": "v15.2.8.1.0.0",
    "utsname_hostname": "node-102",
    "utsname_sysname": "Linux",
    "utsname_release": "3.10.0-862.el7.x86_64",
    "utsname_version": "#1 SMP Fri Apr 20 16:44:24 UTC 2018",
    "utsname_machine": "x86_64",
    "os_name": "CentOS Linux",
    "os_id": "centos",
    "os_version_id": "7",
    "os_version": "7 (Core)",
    "assert_condition": "session_map.sessions.empty()",
    "assert_func": "virtual Monitor::~Monitor()",
    "assert_file": "/SDS-CICD/release/ceph15-tancz/rpmbuild/BUILD/ceph-15.2.8.1.0.0/src/mon/Monitor.cc",
    "assert_line": 262,
    "assert_thread_name": "ceph-mon",
    "assert_msg": "src/mon/Monitor.cc: In function 'virtual Monitor::~Monitor()' thread 7f8e02893340 time 2021-08-26T11:38:46.105871+0800\nsrc/mon/Monitor.cc: 262: FAILED ceph_assert(session_map.sessions.empty())\n",
    "backtrace": [
        "(()+0xf5d0) [0x7f8df78ce5d0]",
        "(gsignal()+0x37) [0x7f8df66c4207]",
        "(abort()+0x148) [0x7f8df66c58f8]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x19b) [0x7f8df9ac4c9e]",
        "(()+0x269e17) [0x7f8df9ac4e17]",
        "(Monitor::~Monitor()+0x846) [0x557a6eceded6]",
        "(Monitor::~Monitor()+0x9) [0x557a6ecedf29]",
        "(main()+0x260a) [0x557a6ec7ba9a]",
        "(__libc_start_main()+0xf5) [0x7f8df66b03d5]",
        "(()+0x2304f0) [0x557a6ecac4f0]" 
    ]
}

#18 Updated by Neha Ojha over 1 year ago

  • Duplicated by Bug #52199: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) added

#19 Updated by Neha Ojha over 1 year ago

  • Duplicated by Bug #52198: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) added

#20 Updated by Neha Ojha about 1 year ago

  • Duplicated by Bug #52142: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) added

#21 Updated by Deepika Upadhyay about 1 year ago

  • Crash signature (v1) updated (diff)

2021-10-02T17:30:34.842 INFO:tasks.ceph.mon.a.smithi063.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.6-216-g6e2fe4ec/rpm/el8/BUILD/ceph-16.2.6-216-g6e2fe4ec/src/mon/Monitor.cc: In function 'virtual Monitor::~Monitor()' thread 4045240 time 2021-10-02T17:30:34.839243+0000
2021-10-02T17:30:34.843 INFO:tasks.ceph.mon.a.smithi063.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.6-216-g6e2fe4ec/rpm/el8/BUILD/ceph-16.2.6-216-g6e2fe4ec/src/mon/Monitor.cc: 287: FAILED ceph_assert(session_map.sessions.empty())

/ceph/teuthology-archive/yuriw-2021-10-02_15:03:31-rados-wip-yuri2-testing-2021-10-01-0902-pacific-distro-basic-smithi/6417691/teuthology.log

#22 Updated by Sage Weil about 1 year ago

/a/sage-2021-10-28_02:19:01-rados-wip-sage3-testing-2021-10-27-1300-distro-basic-smithi/6464204

with logs!

#23 Updated by Aishwarya Mathuria about 1 year ago

/a/yuriw-2021-11-20_18:01:41-rados-wip-yuri8-testing-2021-11-20-0807-distro-basic-smithi/6516396

#24 Updated by Sage Weil 12 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 44337

#25 Updated by Sage Weil 12 months ago

  • Status changed from Fix Under Review to Pending Backport

#26 Updated by Backport Bot 12 months ago

  • Copied to Backport #53659: pacific: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum added

#27 Updated by Backport Bot 12 months ago

  • Copied to Backport #53660: octopus: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum added

#28 Updated by Telemetry Bot 9 months ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v14.2.2, v15.2.14, v15.2.15 added

#29 Updated by Telemetry Bot 9 months ago

  • Crash signature (v1) updated (diff)

#30 Updated by Telemetry Bot 9 months ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v16.2.0, v16.2.2, v16.2.4, v16.2.5, v16.2.6, v16.2.7 added

#31 Updated by Telemetry Bot 9 months ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)

#32 Updated by Telemetry Bot 9 months ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)

#33 Updated by Telemetry Bot 9 months ago

  • Crash signature (v1) updated (diff)
  • Affected Versions v14.2.0, v14.2.1, v14.2.10, v14.2.11, v14.2.13, v14.2.16, v14.2.4, v14.2.5, v14.2.7, v14.2.8, v15.2.0 added

#34 Updated by Neha Ojha 6 months ago

  • Status changed from Pending Backport to Resolved
  • Crash signature (v1) updated (diff)

#35 Updated by Telemetry Bot 6 months ago

  • Related to Bug #56192: crash: virtual Monitor::~Monitor(): assert(session_map.sessions.empty()) added
