Bug #65704: mds+valgrind: beacon thread blocked for 60+ seconds - CephFS - Ceph

Actions

Copy link

Bug #65704

open

mds+valgrind: beacon thread blocked for 60+ seconds

Added by Patrick Donnelly 21 days ago. Updated 20 days ago.

Status:

New

Priority:

Urgent

Assignee:

Leonid Usov

Category:

Correctness/Safety

Target version:

Ceph - v20.0.0

% Done:

Source:

Q/A

Tags:

Backport:

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(FS):

MDS, quiesce

Labels (FS):

qa-failure

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

This one is really weird and my working theory is that this is related to the quiesce database. Test symptom:

2024-04-30T08:17:04.679 DEBUG:teuthology.run:Teuthology command: teuthology --owner scheduled_pdonnell@teuthology --name pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi --archive /home/teuthworker/archive/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680611 --verbose --description fs/verify/{begin/{0-install 1-ceph 2-logrotate 3-modules} clusters/1a5s-mds-1c-client conf/{client mds mgr mon osd} distro/{ubuntu/{overrides ubuntu_latest}} mount/fuse objectstore-ec/bluestore-ec-root overrides/{ignorelist_health ignorelist_wrongly_marked_down mon-debug pg_health session_timeout} ranks/3 tasks/fsstress validater/valgrind} -- /home/teuthworker/archive/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680611/orig.config.yaml
...
2024-04-30T08:45:39.650 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.c is failed for ~128s
2024-04-30T08:45:39.650 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.d is failed for ~291s
2024-04-30T08:45:46.363 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.c is failed for ~134s
2024-04-30T08:45:46.363 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.d is failed for ~298s
2024-04-30T08:45:53.077 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.c is failed for ~141s
2024-04-30T08:45:53.078 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.d is failed for ~305s
2024-04-30T08:45:53.078 INFO:tasks.daemonwatchdog.daemon_watchdog:BARK! unmounting mounts and killing all daemons

From: /teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680611/teuthology.log

2024-04-30T08:39:32.618+0000 13368640  5 mds.beacon.d Sending beacon up:active seq 23
2024-04-30T08:39:32.618+0000 13368640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.96:3300/0,v1:172.21.15.96:6789/0] -- mdsbeacon(4255/d up:active seq=23 v16) -- 0x1b2d2d90 con 0xc2618f0
2024-04-30T08:39:32.618+0000 13368640 20 mds.beacon.d sender thread waiting interval 4s
...
2024-04-30T08:39:33.630+0000 db5d640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] <== mon.1 v2:172.21.15.96:3300/0 87 ==== mdsbeacon(4255/d up:active seq=23 v16) ==== 130+0+0 (secure 0 0 0) 0xc318090 con 0xc2618f0
...
2024-04-30T08:39:40.833+0000 13368640  5 mds.beacon.d Sending beacon up:active seq 25
2024-04-30T08:39:40.833+0000 13368640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.96:3300/0,v1:172.21.15.96:6789/0] -- mdsbeacon(4255/d up:active seq=25 v16) -- 0x1abd1400 con 0xc2618f0
2024-04-30T08:39:40.833+0000 13368640 20 mds.beacon.d sender thread waiting interval 4s
...
2024-04-30T08:40:42.093+0000 13368640  0 mds.beacon.d missed beacon ack from the monitors
2024-04-30T08:40:42.093+0000 13368640  1 heartbeat_map is_healthy 'MDSRank' had timed out after 60.000003815s
2024-04-30T08:40:42.093+0000 13368640  0 mds.beacon.d Skipping beacon heartbeat to monitors (last acked 65.3553s ago); MDS internal heartbeat is not healthy!
2024-04-30T08:40:42.097+0000 13368640 20 mds.beacon.d sender thread waiting interval 0.5s

From: /teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680611/remote/smithi096/log/ceph-mds.d.log.gz

The alarming problem is that the beacon upkeep thread apparently slept for about 60 seconds! This should only be possible if something is sitting on the Beacon::mutex but I don't see any recent changes that would cause this.

Further adding to the confusion, another rank immediately followed with this problem:

2024-04-30T08:40:36.700+0000 13368640  5 mds.beacon.c Sending beacon up:active seq 44
2024-04-30T08:40:36.700+0000 13368640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.5:3300/0,v1:172.21.15.5:6789/0] -- mdsbeacon(4206/c up:active seq=44 v16) -- 0x1af30fe0 con 0xc274200
2024-04-30T08:40:36.704+0000 13368640 20 mds.beacon.c sender thread waiting interval 4s
2024-04-30T08:40:36.712+0000 db5d640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] <== mon.0 v2:172.21.15.5:3300/0 91 ==== mdsbeacon(4206/c up:active seq=44 v16) ==== 130+0+0 (secure 0 0 0) 0x1af37bb0 con 0xc274200
2024-04-30T08:40:36.712+0000 db5d640  5 mds.beacon.c received beacon reply up:active seq 44 rtt 0.0119999
2024-04-30T08:40:36.712+0000 13368640 20 mds.beacon.c sender thread waiting interval 3.988s
...
2024-04-30T08:40:39.968+0000 11364640  1 mds.c Updating MDS map to version 17 from mon.0
2024-04-30T08:40:39.968+0000 11364640 10 mds.c      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2,11=minor log segments,12=quiesce subvolumes}
2024-04-30T08:40:39.968+0000 11364640 10 mds.c  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2,11=minor log segments,12=quiesce subvolumes}
2024-04-30T08:40:39.968+0000 11364640 10 mds.c my gid is 4206
2024-04-30T08:40:39.968+0000 11364640 10 mds.c map says I am mds.1.11 state up:active
2024-04-30T08:40:39.968+0000 11364640 10 mds.c msgr says I am [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041]
2024-04-30T08:40:39.972+0000 11364640 10 mds.c  peer mds gid 4255 removed from map
...
2024-04-30T08:42:17.850+0000 13368640  5 mds.beacon.c Sending beacon up:active seq 45
2024-04-30T08:42:17.850+0000 13368640 20 mds.beacon.c sender thread waiting interval 4s
...
2024-04-30T08:42:18.506+0000 12b67640  1 mds.beacon.c MDS connection to Monitors appears to be laggy; 101.807s since last acked beacon

From: /teuthology/pdonnell-2024-04-30_05:04:19-fs-wip-pdonnell-testing-20240429.210911-debug-distro-default-smithi/7680611/remote/smithi005/log/ceph-mds.c.log.gz

Why this may be related to the quiescedb:

2024-04-30T08:40:40.060+0000 11364640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.5:3300/0,v1:172.21.15.5:6789/0] -- mon_subscribe({osdmap=33}) -- 0x1af8f410 con 0xc274200
2024-04-30T08:40:40.060+0000 db5d640  1 --2- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] >>  conn(0x1af8faa0 0x1af921f0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).accept
2024-04-30T08:40:40.068+0000 1a375640 10 quiesce.mgr.4206 <membership_upkeep> epoch:17 leader:false->true members:4206,4205
2024-04-30T08:40:40.072+0000 1a375640 15 quiesce.mgr.4206 <membership_upkeep> Updated membership
2024-04-30T08:40:40.088+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:40.088+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x1af984d0
2024-04-30T08:40:40.088+0000 1a375640  1 --2- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] >> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] conn(0x1af98720 0x1af9d200 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
2024-04-30T08:40:40.088+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x1af984d0 con 0x1af98720
2024-04-30T08:40:41.092+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:41.092+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x1b0a6270
2024-04-30T08:40:41.092+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x1b0a6270 con 0x1af98720
2024-04-30T08:40:42.096+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:42.096+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x1a8af500
2024-04-30T08:40:42.096+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x1a8af500 con 0x1af98720
2024-04-30T08:40:43.100+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:43.100+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0xc1bc9c0
2024-04-30T08:40:43.100+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0xc1bc9c0 con 0x1af98720
2024-04-30T08:40:44.104+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:44.104+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x1ad5a310
2024-04-30T08:40:44.104+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x1ad5a310 con 0x1af98720
2024-04-30T08:40:45.108+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:45.108+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x188d2b40
2024-04-30T08:40:45.108+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x188d2b40 con 0x1af98720
2024-04-30T08:40:46.112+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:46.112+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x1b2e5910
2024-04-30T08:40:46.112+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x1b2e5910 con 0x1af98720
2024-04-30T08:40:47.115+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:47.115+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- ?+0 0x1a64ed60
2024-04-30T08:40:47.115+0000 1a375640  1 -- [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] --  -- 0x1a64ed60 con 0x1af98720

It's one of the few threads doing work. It may be innocent but bears looking into.

Actions

Copy link

Updated by Patrick Donnelly 21 days ago

Description updated (diff)

Actions

Copy link

Updated by Leonid Usov 21 days ago

Subject changed from mds: quiesce upkeep (?) appears to block MDS to mds: quiesce appears to block MDS

2024-04-30T08:40:40.068+0000 1a375640 10 quiesce.mgr.4206 <membership_upkeep> epoch:17 leader:false->true members:4206,4205
2024-04-30T08:40:40.072+0000 1a375640 15 quiesce.mgr.4206 <membership_upkeep> Updated membership
2024-04-30T08:40:40.088+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:40.088+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:41.092+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205
2024-04-30T08:40:42.096+0000 1a375640 10 quiesce.mds.1 <operator()> sending listing q-db[v:(17:0) sets:0/0] to the peer 4205

This is harmless, the leader (4206) is bootstrapping. Bootstrap won’t complete until the leader gets acks from all members of the cluster, and apparently, member 4205 doesn’t respond. QDB doesn’t try to resolve this on its own, it believes that if somebody configured a cluster including 4205 that node must be responsive, and if it isn’t, then a new cluster membership is coming with that one removed. Until then, once a second the leader will ping members it has never heard from.

If this is the only evidence against it, I wouldn't consider the quiesce db a root cause.

Actions

Copy link

Updated by Patrick Donnelly 21 days ago

Subject changed from mds: quiesce appears to block MDS to mds+valgind: beacon thread blocked for 60+ seconds

Actions

Copy link

Updated by Leonid Usov 21 days ago · Edited

Looking at remote/smithi096/log/ceph-mds.d.log.gz, I get an impression that the node is cut from network for a few mins.

After this log line

2024-04-30T08:39:38.781+0000 11364640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] <== client.4687 172.21.15.5:0/2886621917 1 ==== command(tid 0: {"prefix": "get_command_descriptions"}) ==== 62+0+0 (crc 0 0 0) 0
x1ad46210 con 0x1b935ab0

this node only attempts to send messages but gets nothing from the outside world

2024-04-30T08:39:38.781+0000 1ab76640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] send_to--> mds [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --  -- ?+0 0x1b3fcb20
2024-04-30T08:39:38.781+0000 1ab76640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --  -- 0x1b3fcb20 con 0x1b08c260
2024-04-30T08:39:39.829+0000 f360640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> 172.21.15.5:0/2886621917 -- command_reply(tid 0: 0 ) -- 0xc331d70 con 0x1b935ab0
2024-04-30T08:39:39.829+0000 1ab76640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] send_to--> mds [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --  -- ?+0 0xc09c250
2024-04-30T08:39:39.829+0000 1ab76640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] --  -- 0xc09c250 con 0x1b08c260
2024-04-30T08:39:40.833+0000 13368640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.96:3300/0,v1:172.21.15.96:6789/0] -- mdsbeacon(4255/d up:active seq=25 v16) -- 0x1abd1400 con 0xc2618f0
2024-04-30T08:39:40.833+0000 19373640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] send_to--> mds [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] -- mds_metrics from rank=0 carrying 0 metric upda
tes -- ?+0 0x1abd35d0
2024-04-30T08:39:40.833+0000 19373640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] -- mds_metrics from rank=0 carrying 0 metric updates -- 0x1a
bd35d0 con 0xc238fb0
2024-04-30T08:39:40.837+0000 13b69640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.5:6800/2925405611,v1:172.21.15.5:6801/2925405611] -- mgrreport(unknown.d +0-0 packed 1614) -- 0x13f4e5b0 con 0
x13c888c0
2024-04-30T08:39:40.837+0000 11b65640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] <== mds.0 v2:172.21.15.96:6832/560761291 0 ==== mds_metrics from rank=0 carrying 0 metric updates ==== 0+0+0 (unknown 0 0 0) 0x1
abd35d0 con 0xc238fb0
2024-04-30T08:39:40.841+0000 19b74640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] send_to--> mds [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] -- mdsping -- ?+0 0x13f4ebc0
2024-04-30T08:39:40.841+0000 19b74640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] -- mdsping -- 0x13f4ebc0 con 0xc238fb0
2024-04-30T08:39:40.841+0000 19b74640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] send_to--> mds [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] -- mdsping -- ?+0 0x1b2a6dc0
2024-04-30T08:39:40.841+0000 19b74640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.5:6834/2742512041,v1:172.21.15.5:6835/2742512041] -- mdsping -- 0x1b2a6dc0 con 0x1b08c260
2024-04-30T08:39:40.841+0000 19b74640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] send_to--> mds [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] -- mdsping -- ?+0 0x1abd4db0
2024-04-30T08:39:40.845+0000 19b74640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] --> [v2:172.21.15.96:6836/2517227941,v1:172.21.15.96:6837/2517227941] -- mdsping -- 0x1abd4db0 con 0x1b0758d0

( <== mds.0 is a loopback receipt )

and this continues until we suddenly start getting pongs from mds.2 a whole minute after

2024-04-30T08:40:42.097+0000 e35e640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] <== mds.2 v2:172.21.15.96:6836/2517227941 32 ==== mds_metrics from rank=2 carrying 0 metric updates ==== 22+0+0 (crc 0 0 0) 0x189
c5000 con 0x1b0758d0
2024-04-30T08:40:42.097+0000 e35e640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] <== mds.2 v2:172.21.15.96:6836/2517227941 33 ==== mds_metrics from rank=2 carrying 0 metric updates ==== 22+0+0 (crc 0 0 0) 0xc2e
c690 con 0x1b0758d0
2024-04-30T08:40:42.101+0000 e35e640  1 -- [v2:172.21.15.96:6832/560761291,v1:172.21.15.96:6833/560761291] <== mds.2 v2:172.21.15.96:6836/2517227941 34 ==== mds_metrics from rank=2 carrying 0 metric updates ==== 22+0+0 (crc 0 0 0) 0x13e
e7bf0 con 0x1b0758d0

Maybe, it's the messenger thread that was blocked?

Actions

Copy link

Updated by Patrick Donnelly 21 days ago

Subject changed from mds+valgind: beacon thread blocked for 60+ seconds to mds+valgrind: beacon thread blocked for 60+ seconds

Actions

Copy link

Updated by Patrick Donnelly 21 days ago

Leonid Usov wrote in #note-4:

Looking at remote/smithi096/log/ceph-mds.d.log.gz, I get an impression that the node is cut from network for a few mins.

After this log line

[...]

this node only attempts to send messages but gets nothing from the outside world

[...]

( <== mds.0 is a loopback receipt )

and this continues until we suddenly start getting pongs from mds.2 a whole minute after

[...]

Maybe, it's the messenger thread that was blocked?

Could be. But why would the beacon upkeep thread also fail to acquire (presumably) the Beacon::mutex?

Actions

Copy link

Updated by Leonid Usov 21 days ago

The alarming problem is that the beacon upkeep thread apparently slept for about 60 seconds! This should only be possible if something is sitting on the Beacon::mutex but I don't see any recent changes that would cause this.

I see what you mean. I think that we may be looking here at another case of an overloaded containerized stand

      dout(20) << "sender thread waiting interval " << interval << "s" << dendl;
      if (cvar.wait_for(lock, interval*1s) == std::cv_status::timeout) {
        if (sent) {
          //missed beacon ack because we timedout after a beacon send
          dout(0) << "missed beacon ack from the monitors" << dendl;

the difference of 60 seconds between the messages could mean that the beacon mutex was held, but there is no other call that would hold this mutex for so long. I think the system didn't properly serve the container for a whole minute.

Actions

Copy link