Bug #43905

qa: test_rebuild_inotable infinite loop

Added by Patrick Donnelly 19 days ago. Updated 6 days ago.

Status: Closed
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): qa-suite
Labels (FS):
Pull request ID:
Crash signature:

Description

2020-01-26T00:41:08.331 INFO:teuthology.orchestra.run.smithi023:> cd /home/ubuntu/cephtest/mnt.0 && sudo mkdir dir1
2020-01-26T00:41:08.373 INFO:teuthology.orchestra.run.smithi023:> sudo adjust-ulimits daemon-helper kill python3 -c '
2020-01-26T00:41:08.373 INFO:teuthology.orchestra.run.smithi023:> import os
2020-01-26T00:41:08.373 INFO:teuthology.orchestra.run.smithi023:> import stat
2020-01-26T00:41:08.374 INFO:teuthology.orchestra.run.smithi023:>
2020-01-26T00:41:08.374 INFO:teuthology.orchestra.run.smithi023:> print(os.stat("/home/ubuntu/cephtest/mnt.0/dir1").st_ino)
2020-01-26T00:41:08.374 INFO:teuthology.orchestra.run.smithi023:> '
2020-01-26T00:41:08.507 INFO:teuthology.orchestra.run.smithi023.stdout:1099511627776
2020-01-26T00:41:08.693 INFO:teuthology.orchestra.run.smithi023:> cd /home/ubuntu/cephtest/mnt.0 && sudo setfattr -n ceph.dir.pin -v 1 dir1
2020-01-26T00:41:09.779 INFO:teuthology.orchestra.run.smithi023:> cd /home/ubuntu/cephtest/mnt.0 && sudo touch dir1/file1
2020-01-26T00:41:09.818 INFO:teuthology.orchestra.run.smithi023:> sudo adjust-ulimits daemon-helper kill python3 -c '
2020-01-26T00:41:09.818 INFO:teuthology.orchestra.run.smithi023:> import os
2020-01-26T00:41:09.818 INFO:teuthology.orchestra.run.smithi023:> import stat
2020-01-26T00:41:09.818 INFO:teuthology.orchestra.run.smithi023:>
2020-01-26T00:41:09.818 INFO:teuthology.orchestra.run.smithi023:> print(os.stat("/home/ubuntu/cephtest/mnt.0/dir1/file1").st_ino)
2020-01-26T00:41:09.818 INFO:teuthology.orchestra.run.smithi023:> '
2020-01-26T00:41:09.949 INFO:teuthology.orchestra.run.smithi023.stdout:1099511627777
2020-01-26T00:41:10.137 INFO:teuthology.orchestra.run.smithi023:> cd /home/ubuntu/cephtest/mnt.0 && sudo rm -f dir1/file1
...
2020-01-26T05:10:48.681 INFO:teuthology.orchestra.run.smithi023:> cd /home/ubuntu/cephtest/mnt.0 && sudo rm -f dir1/file1
2020-01-26T05:10:48.964 INFO:teuthology.orchestra.run.smithi023:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-26T05:10:48.967 INFO:teuthology.orchestra.run.smithi112:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-26T05:10:48.969 INFO:teuthology.orchestra.run.smithi116:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-26T05:10:48.972 INFO:teuthology.orchestra.run.smithi140:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-26T05:10:48.974 INFO:teuthology.orchestra.run.smithi177:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-26T05:10:48.977 INFO:teuthology.orchestra.run.smithi194:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-26T05:10:49.757 INFO:teuthology.orchestra.run.smithi023:> cd /home/ubuntu/cephtest/mnt.0 && sudo touch dir1/file1
2020-01-26T05:10:49.792 DEBUG:teuthology.exit:Got signal 15; running 2 handlers...

from: /ceph/teuthology-archive/pdonnell-2020-01-25_00:06:52-kcephfs-wip-pdonnell-testing-20200124.211519-distro-basic-smithi/4703590/teuthology.log

This is probably caused by https://github.com/ceph/ceph/pull/32816

I also noticed that this check

https://github.com/ceph/ceph/blob/88c49d483a9d9aeb6bbddd44b8857847133f62b2/qa/tasks/cephfs/test_data_scan.py#L623

looks like a mistake. Should it not be (1 << 40)?
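
For reference, the inode numbers printed in the log above can be compared against both candidate thresholds. This is only arithmetic on the logged values; it does not reproduce the test's actual check.

```python
# Inode numbers printed by the two stat calls in the log above.
dir1_ino = 1099511627776
file1_ino = 1099511627777

# dir1 sits exactly at 1 << 40, and file1 is the next number after it.
print(dir1_ino == 1 << 40)     # True
print(file1_ino >= 1 << 40)    # True
print(file1_ino >= 2 << 40)    # False
```

So the logged file inode is already at or above (1 << 40) but below (2 << 40).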

History

#1 Updated by Zheng Yan 18 days ago

(2 << 40) is correct because the inode numbers of rank 1 start at (2 << 40).

#2 Updated by Zheng Yan 18 days ago

It's a bug revealed by "mds: cleanup '* -> excl' check in Locker::file_eval()".

#3 Updated by Patrick Donnelly 11 days ago

Zheng Yan wrote:

(2 << 40) is correct because the inode numbers of rank 1 start at (2 << 40).

That's pretty weird. It should be (1 << 41).

#4 Updated by Zheng Yan 11 days ago

void InoTable::reset_state()
{
  // use generic range. FIXME THIS IS CRAP
  free.clear();
  //#ifdef __LP64__
  uint64_t start = (uint64_t)(rank+1) << 40;
  uint64_t len = (uint64_t)1 << 40;
  //#else
  //# warning this looks like a 32-bit system, using small inode numbers.
  //  uint64_t start = (uint64_t)(mds->get_nodeid()+1) << 25;
  //  uint64_t end = ((uint64_t)(mds->get_nodeid()+2) << 25) - 1;
  //#endif
  free.insert(start, len);

  projected_free = free;
}
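
A minimal sketch of the same range arithmetic in Python, to make the per-rank numbers concrete. The helper name `ino_range` is hypothetical and not part of Ceph; it only mirrors the `start` and `len` expressions in the snippet above.

```python
from typing import Tuple


def ino_range(rank: int) -> Tuple[int, int]:
    """Return (start, length) as computed in InoTable::reset_state().

    Hypothetical helper for illustration only.
    """
    start = (rank + 1) << 40
    length = 1 << 40
    return start, length


rank0_start, rank0_len = ino_range(0)
rank1_start, rank1_len = ino_range(1)

print(rank0_start)                            # 1099511627776
print(rank1_start)                            # 2199023255552
print(rank1_start == (2 << 40) == (1 << 41))  # True
```

Rank 0 starts at 1099511627776, which matches the inode number of dir1 in the log, and rank 1 starts at (2 << 40), which is the same value as (1 << 41).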

#5 Updated by Zheng Yan 6 days ago

  • Status changed from New to Closed

It's a bug in the test branch.
