Bug #1200

closed

4-MDS fsstress remote ino lookup cycle

Added by Greg Farnum almost 13 years ago. Updated over 7 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%


Description

Notice how it's missing an expected ino in a dir that's marked complete, and then it just tries to do the lookup again.

2011-06-16 17:52:39.329752 7f3f8629e710 mds0.server dispatch_client_request client_request(client4115:10004 lookup #40000000022/f6f) v1
2011-06-16 17:52:39.329757 7f3f8629e710 mds0.server rdlock_path_pin_ref request(client4115:10004 cr=0x1602480) #40000000022/f6f
2011-06-16 17:52:39.329762 7f3f8629e710 mds0.cache traverse: opening base ino 40000000022 snap head
2011-06-16 17:52:39.329766 7f3f8629e710 mds0.cache traverse: path seg depth 0 'f6f' snapid head
2011-06-16 17:52:39.329771 7f3f8629e710 mds0.cache.dir(40000000022) lookup (head, 'f6f')
2011-06-16 17:52:39.329776 7f3f8629e710 mds0.cache.dir(40000000022)   hit -> (f6f,head)
2011-06-16 17:52:39.329781 7f3f8629e710 mds0.cache remote link to 4000000002e, which i don't have
2011-06-16 17:52:39.329785 7f3f8629e710 mds0.cache _get_waiter retryrequest
2011-06-16 17:52:39.329789 7f3f8629e710 mds0.cache open_remote_ino on 4000000002e
2011-06-16 17:52:39.329800 7f3f8629e710 -- 10.0.1.205:6803/14232 --> mds0 10.0.1.205:6803/14232 -- mds_table_request(anchortable query 8 bytes) v1 -- ?+0 0x158ba00
2011-06-16 17:52:39.329813 7f3f8629e710 -- 10.0.1.205:6803/14232 <== mds0 10.0.1.205:6803/14232 0 ==== mds_table_request(anchortable query 8 bytes) v1 ==== 0+0+0 (0 0 0) 0x158ba00 con 0x1137280
2011-06-16 17:52:39.329820 7f3f8629e710 mds0.anchorserver handle_lookup mds_table_request(anchortable query 8 bytes) v1 ino 4000000002e
2011-06-16 17:52:39.329827 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(4000000002e 40000000026/2685483254 1 v99)
2011-06-16 17:52:39.329832 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(40000000026 40000000022/3293366682 5 v525)
2011-06-16 17:52:39.329837 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(40000000022 1000000001c/3920418541 8 v525)
2011-06-16 17:52:39.329842 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(1000000001c 1000000000d/2876722588 18 v525)
2011-06-16 17:52:39.329847 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(1000000000d 1/3591908529 18 v525)
2011-06-16 17:52:39.329856 7f3f8629e710 -- 10.0.1.205:6803/14232 --> 10.0.1.205:6803/14232 -- mds_table_request(anchortable query_reply tid 536 177 bytes) v1 -- ?+0 0x135ea00 con 0x1137280
2011-06-16 17:52:39.329868 7f3f8629e710 -- 10.0.1.205:6803/14232 <== mds0 10.0.1.205:6803/14232 0 ==== mds_table_request(anchortable query_reply tid 536 177 bytes) v1 ==== 0+0+0 (0 0 0) 0x135ea00 con 0x1137280
2011-06-16 17:52:39.329875 7f3f8629e710 mds0.tableclient(anchortable) handle_request mds_table_request(anchortable query_reply tid 536 177 bytes) v1
2011-06-16 17:52:39.329880 7f3f8629e710 mds0.anchorclient handle_anchor_reply mds_table_request(anchortable query_reply tid 536 177 bytes) v1
2011-06-16 17:52:39.329886 7f3f8629e710 mds0.cache open_remote_ino_2 on 4000000002e, trace depth is 5
2011-06-16 17:52:39.329891 7f3f8629e710 mds0.cache  5: a(4000000002e 40000000026/2685483254 1 v99)
2011-06-16 17:52:39.329896 7f3f8629e710 mds0.cache  4: a(40000000026 40000000022/3293366682 5 v525)
2011-06-16 17:52:39.329900 7f3f8629e710 mds0.cache  3: a(40000000022 1000000001c/3920418541 8 v525)
2011-06-16 17:52:39.329919 7f3f8629e710 mds0.cache deepest cached inode at 3 is [inode 40000000022 [...2,head] /p7/d3/d24/ auth{1=1,2=1,3=1} v552 na=2 f(v7 m2011-06-16 17:52:09.020522 11=8+3) n(v18 rc2011-06-16 17:52:14.318441 b24205204 a10 100=70+30) (inest mix) (ifile excl) (iversion lock) caps={4115=pAsLsXsFsx/-@10},l=4115 | dirfrag caps replicated dirty 0x16e43a0]
2011-06-16 17:52:39.329933 7f3f8629e710 mds0.cache expected ino 40000000026 in complete dir [dir 40000000022 /p7/d3/d24/ [2,head] auth{1=1,2=1,3=1} v=386 cv=0/0 na=2 state=1610612738|complete f(v7 m2011-06-16 17:52:09.020522 11=8+3) n(v18 rc2011-06-16 17:52:14.318441 b24205204 a10 99=70+29) hs=11+6,ss=0+0 dirty=14 | child replicated dirty 0x19a8000], requerying anchortable
2011-06-16 17:52:39.329937 7f3f8629e710 mds0.cache open_remote_ino on 4000000002e
2011-06-16 17:52:39.329950 7f3f8629e710 -- 10.0.1.205:6803/14232 --> mds0 10.0.1.205:6803/14232 -- mds_table_request(anchortable query 8 bytes) v1 -- ?+0 0x158ba00
2011-06-16 17:52:39.329963 7f3f8629e710 -- 10.0.1.205:6803/14232 <== mds0 10.0.1.205:6803/14232 0 ==== mds_table_request(anchortable query 8 bytes) v1 ==== 0+0+0 (0 0 0) 0x158ba00 con 0x1137280
2011-06-16 17:52:39.329969 7f3f8629e710 mds0.anchorserver handle_lookup mds_table_request(anchortable query 8 bytes) v1 ino 4000000002e
2011-06-16 17:52:39.329975 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(4000000002e 40000000026/2685483254 1 v99)
2011-06-16 17:52:39.329979 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(40000000026 40000000022/3293366682 5 v525)
2011-06-16 17:52:39.329984 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(40000000022 1000000001c/3920418541 8 v525)
2011-06-16 17:52:39.329989 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(1000000001c 1000000000d/2876722588 18 v525)
2011-06-16 17:52:39.329993 7f3f8629e710 mds0.anchorserver handle_lookup  adding a(1000000000d 1/3591908529 18 v525)
2011-06-16 17:52:39.330002 7f3f8629e710 -- 10.0.1.205:6803/14232 --> 10.0.1.205:6803/14232 -- mds_table_request(anchortable query_reply tid 536 177 bytes) v1 -- ?+0 0x135ea00 con 0x1137280
2011-06-16 17:52:39.330012 7f3f8629e710 -- 10.0.1.205:6803/14232 <== mds0 10.0.1.205:6803/14232 0 ==== mds_table_request(anchortable query_reply tid 536 177 bytes) v1 ==== 0+0+0 (0 0 0) 0x135ea00 con 0x1137280
2011-06-16 17:52:39.330018 7f3f8629e710 mds0.tableclient(anchortable) handle_request mds_table_request(anchortable query_reply tid 536 177 bytes) v1
2011-06-16 17:52:39.330023 7f3f8629e710 mds0.anchorclient handle_anchor_reply mds_table_request(anchortable query_reply tid 536 177 bytes) v1
2011-06-16 17:52:39.330028 7f3f8629e710 mds0.cache open_remote_ino_2 on 4000000002e, trace depth is 5
2011-06-16 17:52:39.330033 7f3f8629e710 mds0.cache  5: a(4000000002e 40000000026/2685483254 1 v99)
2011-06-16 17:52:39.330037 7f3f8629e710 mds0.cache  4: a(40000000026 40000000022/3293366682 5 v525)
2011-06-16 17:52:39.330042 7f3f8629e710 mds0.cache  3: a(40000000022 1000000001c/3920418541 8 v525)
2011-06-16 17:52:39.330057 7f3f8629e710 mds0.cache deepest cached inode at 3 is [inode 40000000022 [...2,head] /p7/d3/d24/ auth{1=1,2=1,3=1} v552 na=2 f(v7 m2011-06-16 17:52:09.020522 11=8+3) n(v18 rc2011-06-16 17:52:14.318441 b24205204 a10 100=70+30) (inest mix) (ifile excl) (iversion lock) caps={4115=pAsLsXsFsx/-@10},l=4115 | dirfrag caps replicated dirty 0x16e43a0]
2011-06-16 17:52:39.330071 7f3f8629e710 mds0.cache expected ino 40000000026 in complete dir [dir 40000000022 /p7/d3/d24/ [2,head] auth{1=1,2=1,3=1} v=386 cv=0/0 na=2 state=1610612738|complete f(v7 m2011-06-16 17:52:09.020522 11=8+3) n(v18 rc2011-06-16 17:52:14.318441 b24205204 a10 99=70+29) hs=11+6,ss=0+0 dirty=14 | child replicated dirty 0x19a8000], got same anchor a(40000000026 40000000022/3293366682 5 v525) 2x in a row
2011-06-16 17:52:39.330077 7f3f8629e710 mds0.server dispatch_client_request client_request(client4115:10004 lookup #40000000022/f6f) v1
2011-06-16 17:52:39.330082 7f3f8629e710 mds0.server rdlock_path_pin_ref request(client4115:10004 cr=0x1602480) #40000000022/f6f

Logs in kai:~gregf/logs/fsstress/odd_lookup_bug
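
The cycle in the log above can be sketched with a toy model (all names and data structures here are illustrative simplifications, not actual Ceph MDS code): the trace walk stops at the deepest cached dir, the expected child ino is absent even though the dir is complete, so the anchortable is requeried, but the table returns the identical trace and the walk fails the same way.

```python
# Toy model of the open_remote_ino retry cycle. Hypothetical structures,
# not Ceph's real types; inos and parents taken from the log above.

# anchortable: ino -> parent dir ino (a flattened anchor trace)
anchortable = {
    "4000000002e": "40000000026",
    "40000000026": "40000000022",
    "40000000022": "1000000001c",
}

# cache: only dir 40000000022 is cached; it is marked complete but
# does NOT contain the expected child ino 40000000026.
cache = {
    "40000000022": {"complete": True, "entries": set()},
}

def open_remote_ino(ino, last_anchor=None):
    # Walk up the anchor trace until the parent dir is in cache
    # (the toy table always reaches a cached dir, so no KeyError here).
    child = ino
    while anchortable[child] not in cache:
        child = anchortable[child]
    parent = anchortable[child]
    d = cache[parent]
    if child in d["entries"]:
        return child  # found: would keep descending (elided)
    if d["complete"]:
        # Complete dir lacks the expected ino, so the anchor data is
        # presumed stale and the table is requeried -- but it returns
        # the same anchor, so only the 2x-in-a-row check stops the loop.
        anchor = (child, parent)
        if last_anchor == anchor:
            return "got same anchor 2x in a row"
        return open_remote_ino(ino, last_anchor=anchor)

print(open_remote_ino("4000000002e"))  # -> got same anchor 2x in a row
```

The point of the sketch: neither side can make progress, because the dir's "complete" flag and the anchortable's trace contradict each other, and requerying changes nothing.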

Actions #1

Updated by Greg Farnum almost 13 years ago

Looks like this is the same root cause as #1199.

Actions #2

Updated by Sage Weil almost 13 years ago

  • Target version changed from v0.30 to v0.31
Actions #3

Updated by Greg Farnum almost 13 years ago

  • Status changed from New to Duplicate

Yep, this is the same as #1199.

Actions #4

Updated by Sage Weil almost 13 years ago

  • Target version changed from v0.31 to v0.32
Actions #5

Updated by Sage Weil almost 13 years ago

  • Position set to 711
Actions #6

Updated by Sage Weil almost 13 years ago

  • Target version changed from v0.32 to v0.31
  • Position deleted (710)
  • Position set to 722
Actions #7

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Target version deleted (v0.31)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.
