Bug #50825

qa: snaptest-git-ceph hang during mon thrashing v2

Added by Patrick Donnelly about 1 month ago. Updated 10 days ago.

Status:
Need More Info
Priority:
High
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-05-14T22:15:23.866 INFO:tasks.workunit:Running workunit fs/snaps/snaptest-git-ceph.sh...
...
2021-05-15T01:15:27.761 INFO:tasks.workunit.client.0.smithi102.stderr:Updating files:   0% (20/10638)^MUpdating files:   1% (107/10638)^MUpdating files:   1% (131/10638)^MUpdating files:   2% (213/10638)^MUpdating files:   2% (224/10638)^MUpdating files:   3% (320/10638)^MUpdating files:   4% (426/10638)^MUpdating files:   5% (532/10638)^MUpdating files:   5% (618/10638)^MUpdating files:   6% (639/10638)^MUpdating files:   6% (675/10638)^MUpdating files:   7% (745/10638)^MUpdating files:   7% (792/10638)^MUpdating files:   7% (813/10638)^MUpdating files:   8% (852/10638)^MUpdating files:   8% (889/10638)^MUpdating files:   9% (958/10638)^MUpdating files:   9% (979/10638)^MUpdating files:   9% (1052/10638)^MUpdating files:  10% (1064/10638)^MUpdating files:  10% (1133/10638)^MUpdating files:  11% (1171/10638)^MUpdating files:  11% (1215/10638)^MUpdating files:  11% (1233/10638)^MUpdating files:  12% (1277/10638)^MUpdating files:  12% (1356/10638)^MUpdating files:  12% (1377/10638)^MUpdating files:  13% (1383/10638)^MUpdating files:  13% (1400/10638)^MUpdating files:  13% (1470/10638)^MUpdating files:  13% (1475/10638)^MUpdating files:  13% (1476/10638)^MUpdating files:  14% (1490/10638)^MUpdating files:  15% (1596/10638)^MUpdating files:  15% (1657/10638)^MUpdating files:  15% (1678/10638)^MUpdating files:  15% (1679/10638)^MUpdating files:  16% (1703/10638)^MUpdating files:  17% (1809/10638)^MUpdating files:  17% (1866/10638)^MUpdating files:  17% (1871/10638)^MUpdating files:  17% (1896/10638)^MUpdating files:  17% (1914/10638)^MUpdating files:  18% (1915/10638)^MUpdating files:  18% (1926/10638)^MUpdating files:  18% (1967/10638)^MUpdating files:  18% (1972/10638)^MUpdating files:  19% (2022/10638)^MUpdating files:  19% (2055/10638)^MUpdating files:  19% (2080/10638)^MUpdating files:  20% (2128/10638)^MUpdating files:  20% (2144/10638)^MUpdating files:  20% (2191/10638)^MUpdating files:  21% (2234/10638)^MUpdating files:  21% (2261/10638)^MUpdating files:  22% 
(2341/10638)^MUpdating files:  22% (2385/10638)^MUpdating files:  22% (2426/10638)^MUpdating files:  23% (2447/10638)^MUpdating files:  23% (2486/10638)^MUpdating files:  23% (2542/10638)^MUpdating files:  24% (2554/10638)^MUpdating files:  24% (2572/10638)^MUpdating files:  25% (2660/10638)^MUpdating files:  25% (2664/10638)^MUpdating files:  25% (2700/10638)^MUpdating files:  25% (2757/10638)^MUpdating files:  26% (2766/10638)^MUpdating files:  26% (2865/10638)^MUpdating files:  27% (2873/10638)^MUpdating files:  27% (2895/10638)^MUpdating files:  27% (2906/10638)^MUpdating files:  27% (2970/10638)^MUpdating files:  28% (2979/10638)^MUpdating files:  28% (3017/10638)^MUpdating files:  28% (3035/10638)^MUpdating files:  29% (3086/10638)^MUpdating files:  29% (3151/10638)^MUpdating files:  29% (3152/10638)^MUpdating files:  29% (3176/10638)^MUpdating files:  30% (3192/10638)^MUpdating files:  30% (3267/10638)^MUpdating files:  31% (3298/10638)^MUpdating files:  31% (3332/10638)^MUpdating files:  31% (3369/10638)^MUpdating files:  31% (3371/10638)^MUpdating files:  32% (3405/10638)^MUpdating files:  32% (3421/10638)^MUpdating files:  32% (3498/10638)^M++ retry
2021-05-15T01:15:27.761 INFO:tasks.workunit.client.0.smithi102.stderr:++ rm -rf ceph
2021-05-15T01:15:36.753 INFO:tasks.mon_thrash.mon_thrasher:killing mon.b
2021-05-15T01:15:36.753 INFO:tasks.mon_thrash.mon_thrasher:reviving mon.b
2021-05-15T01:15:36.753 INFO:tasks.ceph.mon.b:Restarting daemon
2021-05-15T01:15:36.754 DEBUG:teuthology.orchestra.run.smithi183:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f --cluster ceph -i b
2021-05-15T01:15:36.757 INFO:tasks.ceph.mon.b:Started

From: /ceph/teuthology-archive/pdonnell-2021-05-14_21:45:42-fs-master-distro-basic-smithi/6115755/teuthology.log

Might be related to #50021 and #50824

History

#1 Updated by Xiubo Li about 1 month ago

  • Status changed from New to In Progress

#2 Updated by Xiubo Li 30 days ago

2021-05-14T22:15:23.866 DEBUG:teuthology.orchestra.run.smithi102:workunit test fs/snaps/snaptest-git-ceph.sh> mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e78e41c7f45263bfc3d22dafa953b7e485aac84d TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/fs/snaps/snaptest-git-ceph.sh
2021-05-14T22:15:27.763 INFO:tasks.workunit.client.0.smithi102.stderr:+ set -e
2021-05-14T22:15:27.763 INFO:tasks.workunit.client.0.smithi102.stderr:+ retried=false
2021-05-14T22:15:27.763 INFO:tasks.workunit.client.0.smithi102.stderr:+ trap -- retry EXIT
2021-05-14T22:15:27.764 INFO:tasks.workunit.client.0.smithi102.stderr:+ rm -rf ceph
2021-05-14T22:15:27.764 INFO:tasks.workunit.client.0.smithi102.stderr:+ timeout 1800 git clone git://git.ceph.com/ceph.git
...
2021-05-14T22:15:27.766 INFO:tasks.workunit.client.0.smithi102.stderr:Cloning into 'ceph'...
...

2021-05-15T01:15:27.761 INFO:tasks.workunit.client.0.smithi102.stderr:Updating files:   0% (20/10638)^MUpdating files:   1% (107/10638)^MUpdating files:   1% (131/10638)^MUpdating files:   2% (213/10638)^MUpdating files:   2% (224/10638)^MUpdating files:   3% (320/10638)^MUpdating files:   4% (426/10638)^MUpdating files:   5% (532/10638)^MUpdating files:   5% (618/10638)^MUpdating files:   6% (639/10638)^MUpdating files:   6% (675/10638)^MUpdating files:   7% (745/10638)^MUpdating files:   7% (792/10638)^MUpdating files:   7% (813/10638)^MUpdating files:   8% (852/10638)^MUpdating files:   8% (889/10638)^MUpdating files:   9% (958/10638)^MUpdating files:   9% (979/10638)^MUpdating files:   9% (1052/10638)^MUpdating files:  10% (1064/10638)^MUpdating files:  10% (1133/10638)^MUpdating files:  11% (1171/10638)^MUpdating files:  11% (1215/10638)^MUpdating files:  11% (1233/10638)^MUpdating files:  12% (1277/10638)^MUpdating files:  12% (1356/10638)^MUpdating files:  12% (1377/10638)^MUpdating files:  13% (1383/10638)^MUpdating files:  13% (1400/10638)^MUpdating files:  13% (1470/10638)^MUpdating files:  13% (1475/10638)^MUpdating files:  13% (1476/10638)^MUpdating files:  14% (1490/10638)^MUpdating files:  15% (1596/10638)^MUpdating files:  15% (1657/10638)^MUpdating files:  15% (1678/10638)^MUpdating files:  15% (1679/10638)^MUpdating files:  16% (1703/10638)^MUpdating files:  17% (1809/10638)^MUpdating files:  17% (1866/10638)^MUpdating files:  17% (1871/10638)^MUpdating files:  17% (1896/10638)^MUpdating files:  17% (1914/10638)^MUpdating files:  18% (1915/10638)^MUpdating files:  18% (1926/10638)^MUpdating files:  18% (1967/10638)^MUpdating files:  18% (1972/10638)^MUpdating files:  19% (2022/10638)^MUpdating files:  19% (2055/10638)^MUpdating files:  19% (2080/10638)^MUpdating files:  20% (2128/10638)^MUpdating files:  20% (2144/10638)^MUpdating files:  20% (2191/10638)^MUpdating files:  21% (2234/10638)^MUpdating files:  21% (2261/10638)^MUpdating files:  22% 
(2341/10638)^MUpdating files:  22% (2385/10638)^MUpdating files:  22% (2426/10638)^MUpdating files:  23% (2447/10638)^MUpdating files:  23% (2486/10638)^MUpdating files:  23% (2542/10638)^MUpdating files:  24% (2554/10638)^MUpdating files:  24% (2572/10638)^MUpdating files:  25% (2660/10638)^MUpdating files:  25% (2664/10638)^MUpdating files:  25% (2700/10638)^MUpdating files:  25% (2757/10638)^MUpdating files:  26% (2766/10638)^MUpdating files:  26% (2865/10638)^MUpdating files:  27% (2873/10638)^MUpdating files:  27% (2895/10638)^MUpdating files:  27% (2906/10638)^MUpdating files:  27% (2970/10638)^MUpdating files:  28% (2979/10638)^MUpdating files:  28% (3017/10638)^MUpdating files:  28% (3035/10638)^MUpdating files:  29% (3086/10638)^MUpdating files:  29% (3151/10638)^MUpdating files:  29% (3152/10638)^MUpdating files:  29% (3176/10638)^MUpdating files:  30% (3192/10638)^MUpdating files:  30% (3267/10638)^MUpdating files:  31% (3298/10638)^MUpdating files:  31% (3332/10638)^MUpdating files:  31% (3369/10638)^MUpdating files:  31% (3371/10638)^MUpdating files:  32% (3405/10638)^MUpdating files:  32% (3421/10638)^MUpdating files:  32% (3498/10638)^M++ retry
2021-05-15T01:15:27.761 INFO:tasks.workunit.client.0.smithi102.stderr:++ rm -rf ceph

The workunit had already timed out after 3 hours. It seems the `timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/fs/snaps/snaptest-git-ceph.sh` process was hung for the entire 3 hours.
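As a quick sanity check of the 3-hour figure, the two timestamps from the excerpt above can be subtracted. This is only a minimal sketch: the constant and function names are hypothetical, and it assumes the `++ retry` line comes from the `trap -- retry EXIT` handler firing when `timeout 3h` ended the script.

```python
from datetime import datetime, timedelta

# Timestamps copied from the log excerpt above.
WORKUNIT_START = "2021-05-14T22:15:23.866"  # workunit command issued
RETRY_TRAP = "2021-05-15T01:15:27.761"      # "++ retry" line (assumed EXIT trap)

# Limit taken from `timeout 3h` on the workunit command line.
OUTER_TIMEOUT = timedelta(hours=3)


def elapsed(start: str, end: str) -> timedelta:
    """Return the wall-clock time between two teuthology log timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


delta = elapsed(WORKUNIT_START, RETRY_TRAP)
print(f"elapsed: {delta}")
print(f"reached outer timeout: {delta >= OUTER_TIMEOUT}")
```

The difference comes out at roughly 3 hours and 4 seconds, which is consistent with the script being ended by the outer `timeout 3h` rather than finishing on its own.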

#3 Updated by Xiubo Li 29 days ago

I suspect this is also caused by a bug in the `git` tool, but there is no remote/ directory for this test.

#4 Updated by Xiubo Li 10 days ago

  • Status changed from In Progress to Need More Info

Hi Patrick,

There were no obvious logs for this one. It could also be the git tool crash issue from Tracker #50824, but there was no remote/ dir, so I couldn't confirm it.

If this can be reproduced in the future, please share the link, thanks.
