Project

General

Profile

Actions

Bug #3470

closed

ceph-fuse: hang when Fr cap revoked

Added by Sage Weil over 11 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Immediate
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

the fuse client (0.53-309-gca89880) gets a revoke on Fr, but doesn't respond. processes using the fuse mount hang.

2012-11-09 07:09:08.028498 7f895cdfa700  1 -- 10.2.0.253:0/3905 --> mds.0 10.2.0.1:6800/16077 -- client_caps(update ino 1000000368a 9370 seq 32 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -- ?+0 0x7f8940260cf0
2012-11-09 07:09:08.072645 7f895cdfa700  1 -- 10.2.0.253:0/3905 <== mds.0 10.2.0.1:6800/16077 12831 ==== client_caps(revoke ino 1000000368a 9370 seq 33 caps=pAsLsXsFc dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+0+0 (3777778058 0 0) 0x7f8848433f10 con 0x7f8938007fe0
2012-11-09 07:09:08.072745 7f895cdfa700  1 -- 10.2.0.253:0/3905 --> 10.2.0.1:6810/19242 -- osd_op(client.4603.0:93027 1000000368a.000004d6 [write 4106752~16384 [2@0]] 0.c8d77fe2 snapc 1=[]) v4 -- ?+0 0x7f89402d9c40 con 0x7f894006ce30
2012-11-09 07:09:08.131507 7f895cdfa700  1 -- 10.2.0.253:0/3905 <== mds.0 10.2.0.1:6800/16077 12832 ==== client_caps(revoke ino 1000000368a 9370 seq 34 caps=pAsLsXs dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+0+0 (2492815406 0 0) 0x7f8848433f10 con 0x7f8938007fe0
2012-11-09 07:09:08.132018 7f895cdfa700  1 -- 10.2.0.253:0/3905 <== mds.0 10.2.0.1:6800/16077 12833 ==== client_caps(grant ino 1000000368a 9370 seq 35 caps=pAsLsXsFr dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+0+0 (1267389476 0 0) 0x7f8848433f10 con 0x7f8938007fe0
2012-11-09 07:09:08.133110 7f895cdfa700  1 -- 10.2.0.253:0/3905 <== mds.0 10.2.0.1:6800/16077 12834 ==== client_caps(grant ino 1000000368a 9370 seq 36 caps=pAsLsXsFrw dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+0+0 (1041501089 0 0) 0x7f8848433f10 con 0x7f8938007fe0
2012-11-09 07:09:08.134941 7f895cdfa700  1 -- 10.2.0.253:0/3905 <== mds.0 10.2.0.1:6800/16077 12835 ==== client_caps(revoke ino 1000000368a 9370 seq 37 caps=pAsLsXsFr dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+0+0 (4167948737 0 0) 0x7f8848433f10 con 0x7f8938007fe0
2012-11-09 07:09:08.140996 7f895cdfa700  1 -- 10.2.0.253:0/3905 <== osd.2 10.2.0.1:6810/19242 1287 ==== osd_op_reply(93027 1000000368a.000004d6 [write 4106752~16384 [2@0]] ondisk = 0) v4 ==== 119+0+0 (490479954 0 0) 0x7f88cc97a890 con 0x7f894006ce30

the mds is revoking because another node did a stat/getattr:

2012-11-09 07:09:08.008991 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(grant ino 1000000368a 9370 seq 26 caps=pAsxLsXsFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929
) v2 -- ?+0 0x7fc5fc604b10 con 0x7fc554005140
2012-11-09 07:09:08.009057 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_caps(revoke ino 1000000368a 9391 seq 16 caps=pLs dirty=- wanted=Fc follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -- ?+0 0x7fc5fc6053c0 con 0x
7fc554003600
2012-11-09 07:09:08.009694 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7933 ==== client_request(client.4654:1769 getattr pAsLsXsFs #1000000368a) v1 ==== 114+0+0 (3209972863 0 0) 0x7fc518000dd0 con 0x7fc554003600
2012-11-09 07:09:08.011280 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 27 caps=pAsxLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v
2 -- ?+0 0x7fc5fcea5dc0 con 0x7fc554005140
2012-11-09 07:09:08.011389 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_caps(grant ino 1000000368a 9391 seq 17 caps=pLsFcb dirty=- wanted=Fc follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -- ?+0 0x7fc5fc41a000 con 
0x7fc554003600
2012-11-09 07:09:08.011574 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7934 ==== client_caps(update ino 1000000368a 9391 seq 16 caps=pLs dirty=- wanted=Fc follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+
0+0 (1928355378 0 0) 0x7fc5180028a0 con 0x7fc554003600
2012-11-09 07:09:08.012494 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4603 10.2.0.253:0/3905 11080 ==== client_caps(update ino 1000000368a 9370 seq 27 caps=pAsxLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07
.559929) v2 ==== 180+0+0 (2563743033 0 0) 0x7fc5080027c0 con 0x7fc554005140
2012-11-09 07:09:08.014180 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 28 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2
 -- ?+0 0x7fc5fc7cd580 con 0x7fc554005140
2012-11-09 07:09:08.014287 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_caps(revoke ino 1000000368a 9391 seq 18 caps=pLs dirty=- wanted=Fc follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -- ?+0 0x7fc5fc616810 con 0x
7fc554003600
2012-11-09 07:09:08.015336 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(grant ino 1000000368a 9370 seq 29 caps=pAsLsXsxFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2
 -- ?+0 0x7fc5fc75e5b0 con 0x7fc554005140
2012-11-09 07:09:08.016324 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4603 10.2.0.253:0/3905 11081 ==== client_caps(update ino 1000000368a 9370 seq 28 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.
559929) v2 ==== 180+0+0 (4210975244 0 0) 0x7fc508002be0 con 0x7fc554005140
2012-11-09 07:09:08.018277 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 30 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2
 -- ?+0 0x7fc5fc794a30 con 0x7fc554005140
2012-11-09 07:09:08.018930 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(grant ino 1000000368a 9370 seq 31 caps=pAsxLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2
 -- ?+0 0x7fc5fc698e80 con 0x7fc554005140
2012-11-09 07:09:08.018995 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7935 ==== client_caps(update ino 1000000368a 9391 seq 18 caps=pLs dirty=- wanted=Fc follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 180+
0+0 (2013859213 0 0) 0x7fc5180028a0 con 0x7fc554003600
2012-11-09 07:09:08.019691 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4603 10.2.0.253:0/3905 11082 ==== client_caps(update ino 1000000368a 9370 seq 30 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.
559929) v2 ==== 180+0+0 (4027003315 0 0) 0x7fc5080027c0 con 0x7fc554005140
2012-11-09 07:09:08.021607 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_reply(???:1769 = 0 Success) v1 -- ?+0 0x7fc5fc732ee0 con 0x7fc554003600
2012-11-09 07:09:08.022473 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_caps(revoke ino 1000000368a 9391 seq 20 caps=pLs dirty=- wanted=Fc follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -- ?+0 0x7fc5fc69d930 con 0x
7fc554003600
2012-11-09 07:09:08.022917 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7936 ==== client_request(client.4654:1770 open #1000000368a) v1 ==== 114+0+0 (2178596178 0 0) 0x7fc518004a80 con 0x7fc554003600
2012-11-09 07:09:08.024356 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 32 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2
 -- ?+0 0x7fc5fc7ae8b0 con 0x7fc554005140
2012-11-09 07:09:08.024545 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_reply(???:1770 = 0 Success) v1 -- ?+0 0x7fc5fc64f000 con 0x7fc554003600
2012-11-09 07:09:08.025058 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7937 ==== client_caps(update ino 1000000368a 9391 seq 20 caps=pLs dirty=- wanted=pFscr follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 ==== 1
80+0+0 (3952929753 0 0) 0x7fc5180028a0 con 0x7fc554003600
2012-11-09 07:09:08.025866 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4603 10.2.0.253:0/3905 11083 ==== client_caps(update ino 1000000368a 9370 seq 32 caps=pAsLsXsFcb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.
559929) v2 ==== 180+0+0 (1018572166 0 0) 0x7fc508002be0 con 0x7fc554005140
2012-11-09 07:09:08.026549 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7938 ==== client_request(client.4654:1771 getattr pAsLsXsFs #1000000368a) v1 ==== 114+0+0 (2247241141 0 0) 0x7fc518004a80 con 0x7fc554003600
2012-11-09 07:09:08.068532 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 33 caps=pAsLsXsFc dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 
-- ?+0 0x7fc5fc6b4930 con 0x7fc554005140
2012-11-09 07:09:08.126587 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_reply(???:1771 = 0 Success) v1 -- ?+0 0x7fc5fc6d9400 con 0x7fc554003600
2012-11-09 07:09:08.127461 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 34 caps=pAsLsXs dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 --
 ?+0 0x7fc5fc6852e0 con 0x7fc554005140
2012-11-09 07:09:08.127542 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.246:0/3727 -- client_caps(revoke ino 1000000368a 9391 seq 23 caps=pAsLsXsFr dirty=- wanted=pFscr follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -- ?+0 0x7fc5fdbad1
40 con 0x7fc554003600
2012-11-09 07:09:08.127923 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(grant ino 1000000368a 9370 seq 35 caps=pAsLsXsFr dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 -
- ?+0 0x7fc5fdbaf510 con 0x7fc554005140
2012-11-09 07:09:08.128238 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7939 ==== client_caps(update ino 1000000368a 9391 seq 23 caps=pAsLsXsFr dirty=- wanted=pFscr follows 0 size 5196677120/0 ts 2 mtime 2012-11-09 07:09:07.559929) v2 
==== 180+0+0 (1554033813 0 0) 0x7fc5180028a0 con 0x7fc554003600
2012-11-09 07:09:08.129012 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(grant ino 1000000368a 9370 seq 36 caps=pAsLsXsFrw dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 
-- ?+0 0x7fc5fc6f3050 con 0x7fc554005140
2012-11-09 07:09:08.129218 7fc618c0e700  1 -- 10.2.0.1:6800/16077 <== client.4654 10.2.0.246:0/3727 7940 ==== client_request(client.4654:1772 getattr pAsLsXsFs #1000000368a) v1 ==== 114+0+0 (3612065550 0 0) 0x7fc518003e90 con 0x7fc554003600
2012-11-09 07:09:08.130854 7fc618c0e700  1 -- 10.2.0.1:6800/16077 --> 10.2.0.253:0/3905 -- client_caps(revoke ino 1000000368a 9370 seq 37 caps=pAsLsXsFr dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 5196677120/17179869184 ts 2 mtime 2012-11-09 07:09:07.559929) v2 
-- ?+0 0x7fc5fc66e720 con 0x7fc554005140

mds is 0.48.2argonaut-30-g96888dd

Actions #1

Updated by Sage Weil over 11 years ago

  • Status changed from 12 to Resolved

fixed by commit:ad4bd4e23e1c3e1d8156e57c83c26e66d475cdd7

Actions #2

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.

Actions

Also available in: Atom PDF