Bug #62510

closed

Bug #59413: cephfs: qa snaptest-git-ceph.sh failed with "got remote process result: 128"

snaptest-git-ceph.sh failure with fs/thrash

Added by Venky Shankar 8 months ago. Updated 2 months ago.

Status:
Duplicate
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/vshankar-2023-08-16_11:13:33-fs-wip-vshankar-testing-20230809.035933-testing-default-smithi/7369825

fs/thrash/multifs/{begin/{0-install 1-ceph 2-logrotate} clusters/1a3s-mds-2c-client conf/{client mds mon osd} distro/{ubuntu_latest} mount/fuse msgr-failures/none objectstore/bluestore-bitmap overrides/{frag ignorelist_health ignorelist_wrongly_marked_down multifs session_timeout thrashosds-health} tasks/{1-thrash/mds 2-workunit/cfuse_workunit_snaptests}}

Related issues 1 (0 open, 1 closed)

Related to Linux kernel client - Bug #48640: qa: snapshot mismatch during mds thrashing (Resolved, assignee: Xiubo Li)

Actions #1

Updated by Venky Shankar 8 months ago

Xiubo, please take this one.

Actions #2

Updated by Venky Shankar 8 months ago

  • Subject changed from snaptest-git-ceph.sh failures with fs/thrash to snaptest-git-ceph.sh failure with fs/thrash
Actions #3

Updated by Xiubo Li 8 months ago

Venky Shankar wrote:

Xiubo, please take this one.

Sure.

Actions #5

Updated by Xiubo Li 8 months ago

  • Status changed from New to In Progress
Actions #6

Updated by Xiubo Li 8 months ago

Venky Shankar wrote:

/a/vshankar-2023-08-16_11:13:33-fs-wip-vshankar-testing-20230809.035933-testing-default-smithi/7369825

[...]

It failed when cloning the ceph git repo from https://git.ceph.com/ceph.git:

2023-08-16T11:52:18.665 INFO:tasks.workunit.client.1.smithi125.stderr:error: RPC failed; curl 56 GnuTLS recv error (-9): Error decoding the received TLS packet.
2023-08-16T11:52:18.670 INFO:tasks.workunit.client.1.smithi125.stderr:fetch-pack: unexpected disconnect while reading sideband packet
2023-08-16T11:52:18.677 INFO:tasks.workunit.client.1.smithi125.stderr:fatal: early EOF
2023-08-16T11:52:18.685 INFO:tasks.workunit.client.1.smithi125.stderr:fatal: fetch-pack: invalid index-pack output
2023-08-16T11:52:18.935 INFO:tasks.workunit.client.1.smithi125.stderr:+ retry
2023-08-16T11:52:18.935 INFO:tasks.workunit.client.1.smithi125.stderr:+ rm -rf ceph
2023-08-16T11:52:18.937 INFO:tasks.workunit.client.1.smithi125.stderr:+ timeout 3600 git clone https://git.ceph.com/ceph.git
2023-08-16T11:52:18.943 INFO:tasks.workunit.client.1.smithi125.stderr:Cloning into 'ceph'...
...
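The `+ retry` lines in the trace come from the workunit's clone retry path: on a failed clone it removes the partial checkout and tries again. A minimal sketch of that pattern (function names here are hypothetical stand-ins, not the actual workunit code):

```shell
#!/bin/sh
# Sketch of the clone-and-retry pattern visible in the '+ retry' trace
# lines above. do_clone is a hypothetical stand-in for the traced command
# "timeout 3600 git clone https://git.ceph.com/ceph.git".
do_clone() {
    timeout 3600 git clone https://git.ceph.com/ceph.git "$1"
}

clone_with_retry() {
    dest=$1
    if ! do_clone "$dest"; then
        rm -rf "$dest"       # discard the partial checkout, as the log shows
        do_clone "$dest"     # single retry attempt
    fi
}
```

The GnuTLS recv error above is a transient network/server failure while fetching from git.ceph.com, which is why the script retries rather than failing outright.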
Actions #7

Updated by Xiubo Li 8 months ago

Venky Shankar wrote:

Another one, but with kclient

https://pulpito.ceph.com/vshankar-2023-08-23_03:59:53-fs-wip-vshankar-testing-20230822.060131-testing-default-smithi/7377275/

This one failed while checking the snapshot v0.71:

2023-08-23T07:30:51.960 INFO:tasks.workunit.client.0.smithi104.stderr:+ for v in $versions
2023-08-23T07:30:51.960 INFO:tasks.workunit.client.0.smithi104.stderr:+ '[' 71 -eq 48 ']'
2023-08-23T07:30:51.960 INFO:tasks.workunit.client.0.smithi104.stderr:+ ver=v0.71
2023-08-23T07:30:51.961 INFO:tasks.workunit.client.0.smithi104.stdout:checking v0.71
2023-08-23T07:30:51.961 INFO:tasks.workunit.client.0.smithi104.stderr:+ echo checking v0.71
2023-08-23T07:30:51.961 INFO:tasks.workunit.client.0.smithi104.stderr:+ cd .snap/v0.71
2023-08-23T07:30:51.963 INFO:tasks.workunit.client.0.smithi104.stderr:+ git diff --exit-code
...
2023-08-23T07:30:53.702 INFO:tasks.workunit.client.0.smithi104.stdout:diff --git a/src/msg/Pipe.cc b/src/msg/Pipe.cc
2023-08-23T07:30:53.702 INFO:tasks.workunit.client.0.smithi104.stdout:index 66b64d0097a..e69de29bb2d 100644
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:--- a/src/msg/Pipe.cc
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:+++ b/src/msg/Pipe.cc
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:@@ -1,2251 +0,0 @@
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:-// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*-
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:-// vim: ts=8 sw=2 smarttab
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:-/*
2023-08-23T07:30:53.703 INFO:tasks.workunit.client.0.smithi104.stdout:- * Ceph - scalable distributed file system
2023-08-23T07:30:53.704 INFO:tasks.workunit.client.0.smithi104.stdout:- *
...
2023-08-23T07:30:53.994 INFO:tasks.workunit.client.0.smithi104.stdout:-    len -= did;
2023-08-23T07:30:53.994 INFO:tasks.workunit.client.0.smithi104.stdout:-    buf += did;
2023-08-23T07:30:53.994 INFO:tasks.workunit.client.0.smithi104.stdout:-    //lgeneric_dout(cct, DBL) << "tcp_write did " << did << ", " << len << " left" << dendl;
2023-08-23T07:30:53.995 INFO:tasks.workunit.client.0.smithi104.stdout:-  }
2023-08-23T07:30:53.995 INFO:tasks.workunit.client.0.smithi104.stdout:-  return 0;
2023-08-23T07:30:53.995 INFO:tasks.workunit.client.0.smithi104.stdout:-}
2023-08-23T07:30:53.995 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
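The diff output above is produced by the workunit's per-version snapshot check: it enters the `.snap/<ver>` directory for each tagged version and runs `git diff --exit-code`; any output means the snapshot no longer matches the committed tree. An illustrative condensation of that check (names are assumptions, not the script's actual functions):

```shell
#!/bin/sh
# Illustrative sketch of the snapshot consistency check seen in the trace:
# enter a snapshot directory and verify the working tree still matches the
# commit. A non-zero exit (i.e. a diff, like the emptied Pipe.cc above)
# flags a snapshot content mismatch.
check_snapshot() {
    snapdir=$1
    # run in a subshell so the caller's working directory is untouched
    ( cd "$snapdir" && git diff --exit-code >/dev/null )
}
```

Note the failing blob in the log: `index 66b64d0097a..e69de29bb2d` shows src/msg/Pipe.cc truncated to git's empty blob (e69de29... is the hash of an empty file), i.e. the snapshot read back empty file content rather than a textual change.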
Actions #8

Updated by Xiubo Li 8 months ago

Xiubo Li wrote:

Venky Shankar wrote:

Another one, but with kclient

https://pulpito.ceph.com/vshankar-2023-08-23_03:59:53-fs-wip-vshankar-testing-20230822.060131-testing-default-smithi/7377275/

This one failed while checking the snapshot v0.71:

[...]

This is a known issue, tracked in https://tracker.ceph.com/issues/48640, and this test failure occurred on the RHEL distro, which hasn't included the fix yet:

2023-08-23T06:32:59.108 DEBUG:teuthology.orchestra.run.smithi104:> echo no | sudo yum reinstall kernel || true
2023-08-23T06:32:59.438 INFO:teuthology.orchestra.run.smithi104.stdout:Updating Subscription Management repositories.
2023-08-23T06:32:59.438 INFO:teuthology.orchestra.run.smithi104.stdout:Unable to read consumer identity
2023-08-23T06:32:59.696 INFO:teuthology.orchestra.run.smithi104.stdout:Last metadata expiration check: 0:00:01 ago on Wed 23 Aug 2023 06:32:58 AM UTC.
2023-08-23T06:32:59.738 INFO:teuthology.orchestra.run.smithi104.stdout:Installed package kernel-4.18.0-372.9.1.el8.x86_64 (from anaconda) not available.
2023-08-23T06:32:59.772 INFO:teuthology.orchestra.run.smithi104.stderr:Error: No packages marked for reinstall.
2023-08-23T06:32:59.819 DEBUG:teuthology.orchestra.run.smithi104:> sudo yum reinstall -y kernel || true
2023-08-23T06:33:00.148 INFO:teuthology.orchestra.run.smithi104.stdout:Updating Subscription Management repositories.
2023-08-23T06:33:00.148 INFO:teuthology.orchestra.run.smithi104.stdout:Unable to read consumer identity
2023-08-23T06:33:00.400 INFO:teuthology.orchestra.run.smithi104.stdout:Last metadata expiration check: 0:00:02 ago on Wed 23 Aug 2023 06:32:58 AM UTC.
2023-08-23T06:33:00.439 INFO:teuthology.orchestra.run.smithi104.stdout:Installed package kernel-4.18.0-372.9.1.el8.x86_64 (from anaconda) not available.
2023-08-23T06:33:00.470 INFO:teuthology.orchestra.run.smithi104.stderr:Error: No packages marked for reinstall.
2023-08-23T06:33:00.515 DEBUG:teuthology.orchestra.run.smithi104:> rpm -q kernel | sort -rV | head -n 1
2023-08-23T06:33:00.596 INFO:teuthology.orchestra.run.smithi104.stdout:kernel-4.18.0-372.9.1.el8.x86_64
2023-08-23T06:33:00.596 DEBUG:teuthology.task.kernel:get_latest_image_version_rpm: 4.18.0-372.9.1.el8.x86_64

I will backport it to downstream later.

I will leave this tracker to follow the issue in the description.
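As an aside on the log above: teuthology identifies the installed kernel with `rpm -q kernel | sort -rV | head -n 1`, relying on GNU sort's version sort. The same idiom works for any list of version strings:

```shell
#!/bin/sh
# The kernel probe in the trace relies on GNU sort's version sort (-V):
# -rV orders version strings highest-first, so head -n 1 yields the newest.
latest_version() {
    sort -rV | head -n 1
}

# Example: the kernel NVR from the log plus a hypothetical newer build.
printf '4.18.0-372.9.1.el8.x86_64\n4.18.0-425.3.1.el8.x86_64\n' | latest_version
# prints 4.18.0-425.3.1.el8.x86_64
```

Here 4.18.0-372.9.1.el8 (the kernel in the failing run) sorts below the newer build, which is exactly how the probe determines whether a fixed kernel is installed.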

Actions #9

Updated by Xiubo Li 8 months ago

  • Status changed from In Progress to Duplicate
  • Parent task set to #59413

Xiubo Li wrote:

Venky Shankar wrote:

/a/vshankar-2023-08-16_11:13:33-fs-wip-vshankar-testing-20230809.035933-testing-default-smithi/7369825

[...]

It failed when cloning the ceph git repo from https://git.ceph.com/ceph.git:

[...]

This is a duplicate of https://tracker.ceph.com/issues/59413.

Actions #10

Updated by Venky Shankar 7 months ago

  • Related to Bug #48640: qa: snapshot mismatch during mds thrashing added
Actions #11

Updated by Venky Shankar 7 months ago

Xiubo, this is still waiting on fix to https://tracker.ceph.com/issues/48640, yes?

Actions #12

Updated by Xiubo Li 7 months ago

Venky Shankar wrote:

Xiubo, this is still waiting on fix to https://tracker.ceph.com/issues/48640, yes?

No, this has been fixed by https://tracker.ceph.com/issues/48640 in kclient upstream.

Actions #13

Updated by Venky Shankar 7 months ago

Xiubo Li wrote:

Venky Shankar wrote:

Xiubo, this is still waiting on fix to https://tracker.ceph.com/issues/48640, yes?

No, this has been fixed by https://tracker.ceph.com/issues/48640 in kclient upstream.

Right, but you did mention backporting the fix downstream to RHEL.

Actions #14

Updated by Xiubo Li 7 months ago

Venky Shankar wrote:

Xiubo Li wrote:

Venky Shankar wrote:

Xiubo, this is still waiting on fix to https://tracker.ceph.com/issues/48640, yes?

No, this has been fixed by https://tracker.ceph.com/issues/48640 in kclient upstream.

Right, but you did mention backporting the fix downstream to RHEL.

Please see the latest fix for this in https://tracker.ceph.com/issues/59343. The corresponding patches were only recently applied to mainline and haven't been backported to downstream yet.

The patchwork link for kclient: https://patchwork.kernel.org/project/ceph-devel/patch/20230511100911.361132-1-xiubli@redhat.com/

I will do it next week.

Actions #15

Updated by Venky Shankar 7 months ago

Xiubo Li wrote:

Venky Shankar wrote:

Xiubo Li wrote:

Venky Shankar wrote:

Xiubo, this is still waiting on fix to https://tracker.ceph.com/issues/48640, yes?

No, this has been fixed by https://tracker.ceph.com/issues/48640 in kclient upstream.

Right, but you did mention backporting the fix downstream to RHEL.

Please see the latest fix for this in https://tracker.ceph.com/issues/59343. The corresponding patches were only recently applied to mainline and haven't been backported to downstream yet.

The patchwork link for kclient: https://patchwork.kernel.org/project/ceph-devel/patch/20230511100911.361132-1-xiubli@redhat.com/

I will do it next week.

Nice - that's what I was looking for: whether it's in downstream RHEL or not. Thx, Xiubo.

Actions #16

Updated by Venky Shankar 5 months ago

Any update on this, Xiubo?

Actions #17

Updated by Xiubo Li 5 months ago

Venky Shankar wrote:

Any update on this, Xiubo?

Already done weeks ago; please see Jira https://issues.redhat.com/browse/RHEL-16412. The MR has been approved and is waiting to be applied by the RHEL team downstream.
