Project

General

Profile

Actions

Bug #4832

closed

mds: failed auth_unpin assert

Added by Greg Farnum almost 11 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2013-04-26T05:27:42.650 INFO:teuthology.task.ceph.mds.a.err:mds/CInode.cc: In function 'virtual void CInode::auth_unpin(void*)' thread 7fd8d1ac1700 time 2013-04-26 05:27:44.459767
2013-04-26T05:27:42.651 INFO:teuthology.task.ceph.mds.a.err:mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
2013-04-26T05:27:42.687 INFO:teuthology.task.ceph.mds.a.err: ceph version 0.60-666-ga5cade1 (a5cade1fe7338602fb2bbfa867433d825f337c87)
2013-04-26T05:27:42.687 INFO:teuthology.task.ceph.mds.a.err: 1: (CInode::auth_unpin(void*)+0x3b6) [0x698e96]
2013-04-26T05:27:42.687 INFO:teuthology.task.ceph.mds.a.err: 2: (Mutation::drop_local_auth_pins()+0x34) [0x576e44]
2013-04-26T05:27:42.687 INFO:teuthology.task.ceph.mds.a.err: 3: (Mutation::cleanup()+0x9) [0x577209]
2013-04-26T05:27:42.688 INFO:teuthology.task.ceph.mds.a.err: 4: (Locker::file_update_finish(CInode*, Mutation*, bool, client_t, Capability*, MClientCaps*)+0x188) [0x61fb98]
2013-04-26T05:27:42.688 INFO:teuthology.task.ceph.mds.a.err: 5: (C_Locker_FileUpdate_finish::finish(int)+0x2c) [0x632bbc]
2013-04-26T05:27:42.688 INFO:teuthology.task.ceph.mds.a.err: 6: (Context::complete(int)+0xa) [0x4bf4ba]
2013-04-26T05:27:42.688 INFO:teuthology.task.ceph.mds.a.err: 7: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x95) [0x4e3835]
2013-04-26T05:27:42.688 INFO:teuthology.task.ceph.mds.a.err: 8: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x19f) [0x6e47df]
2013-04-26T05:27:42.688 INFO:teuthology.task.ceph.mds.a.err: 9: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe55) [0x700705]
2013-04-26T05:27:42.689 INFO:teuthology.task.ceph.mds.a.err: 10: (MDS::handle_core_message(Message*)+0xae8) [0x4df168]
2013-04-26T05:27:42.689 INFO:teuthology.task.ceph.mds.a.err: 11: (MDS::_dispatch(Message*)+0x2f) [0x4df32f]
2013-04-26T05:27:42.689 INFO:teuthology.task.ceph.mds.a.err: 12: (MDS::ms_dispatch(Message*)+0x1d3) [0x4e0db3]
2013-04-26T05:27:42.689 INFO:teuthology.task.ceph.mds.a.err: 13: (DispatchQueue::entry()+0x3f1) [0x82e8e1]
2013-04-26T05:27:42.689 INFO:teuthology.task.ceph.mds.a.err: 14: (DispatchQueue::DispatchThread::entry()+0xd) [0x7ab89d]
2013-04-26T05:27:42.689 INFO:teuthology.task.ceph.mds.a.err: 15: (()+0x7e9a) [0x7fd8d5997e9a]
2013-04-26T05:27:42.690 INFO:teuthology.task.ceph.mds.a.err: 16: (clone()+0x6d) [0x7fd8d43baccd]
2013-04-26T05:27:42.690 INFO:teuthology.task.ceph.mds.a.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

/a/teuthology-2013-04-26_02:29:14-fs-next-testing-basic/1454


Files

ceph-mds.0.log.bz2 (8.5 MB) ceph-mds.0.log.bz2 Sage Weil, 06/06/2013 09:53 PM
patch (690 Bytes) patch Zheng Yan, 06/07/2013 09:36 PM

Related issues 1 (0 open1 closed)

Related to CephFS - Bug #5105: mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)Duplicate05/17/2013

Actions
Actions #1

Updated by Greg Farnum almost 11 years ago

  • Description updated (diff)
Actions #2

Updated by Sage Weil almost 11 years ago

  • Priority changed from Urgent to High
Actions #3

Updated by Sage Weil almost 11 years ago

hit this again:

ubuntu@teuthology:/a/teuthology-2013-05-02_01:00:12-rados-next-testing-basic/5291$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: 17e474922926008cb3ac8693bec20348f1db6fd3
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        filestore flush min: 0
    fs: xfs
    log-whitelist:
    - slow request
    sha1: 17c14b251d0b81a7443ce669d86a04dbfeae962e
  s3tests:
    branch: next
  workunit:
    sha1: 17c14b251d0b81a7443ce669d86a04dbfeae962e
roles:
- - mon.0
  - mon.1
  - mon.2
  - mds.0
  - client.0
- - osd.0
- - osd.1
- - osd.2
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- thrashosds:
    chance_down: 1.0
    powercycle: true
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/ffsb.sh

Actions #4

Updated by Sage Weil almost 11 years ago

  • Priority changed from High to Urgent

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-08_01:00:07-rados-master-testing-basic/8600

Actions #5

Updated by Sage Weil almost 11 years ago

  • Assignee set to Sage Weil
Actions #6

Updated by Tamilarasi muthamizhan almost 11 years ago

ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:06-rados-master-testing-basic/10278

ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:06-rados-master-testing-basic/10278$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: b5b09be30cf99f9c699e825629f02e3bce555d44
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        filestore flush min: 0
    fs: xfs
    log-whitelist:
    - slow request
    sha1: b353da6f682d223ba14812da0fe814eca72ad6f5
  s3tests:
    branch: master
  workunit:
    sha1: b353da6f682d223ba14812da0fe814eca72ad6f5
roles:
- - mon.0
  - mon.1
  - mon.2
  - mds.0
  - client.0
- - osd.0
- - osd.1
- - osd.2
targets:
  ubuntu@plana59.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC2ijQopwlP92QQaUVB/Dx8HLmRokvfKoN4+2xGBvej1Qb/P67By9WNF92tQspMpLSpvqodRaXxcKixvrlLEJJQ16lY9xje2jNDWRjJ/AjnAg79qRA7N6L6BC7rccZ8oZWXqOpNKIKRNUUyn+XwtVgkbbcUr5NjQG2CfsLPtoNkdX98a/UyvhefomanU6oStJHFIptLXe2WjgW6DoXofu9qIpyZrO75XFOIlEpC2QTXU6eTVGRxhL2qtVVR42z7bcrNuVFr4TTTNvqaabyvW2mPZNz3a8ojh/pb8gIZGFYydiG8aOsS9dapltOysmdwfBPXHsAoD/49vbPEHIdyjarp
  ubuntu@plana65.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC+uJxIXpb3ibScQKkwN34uM+kyyhn7pDbju4+u6A0Dj+c9sOAr2QZpHEyMpJycM6qiYZa8tavxE0fUaTwWNQE3mGQYQZIJDIld/mWvVpVsIC4q6ExwaCTr0b4+AxFmo4vsAKwIWKYu3JJUfvS++8VZdxeJpENaQ5ZU9Uyi1ssQZokzEjQafnQ1KgW8jyo5uOgd2JOXh7cWHyJodVGaCAW5wYEplCRF1m+Z4i7rRuVb8FA47bEXQb8Qe0pA0tc+VWy/poYsh8OPFHyklaJ0qSPp3ph6DvrDINKKLUBqFXt7GiMxvHcRzSHTp6jxMANYg8jnnf28tdbjL0bWnhNfy2D9
  ubuntu@plana72.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQCJofY7+jrgKw/C8b97BugITAXus6OM2HLdrJWiSu8PIyqMRYYLBcDS8n1rBEcRIYREU2S8EiARcmyE0wRRrVNDtTZ4pH8d2L1uIJDhAj0Kt2fYscas7MQvESY9DwEnyno+Id60cRIJ5u5ZIBAAtocLeSE69CL/mbewJxgj9H2wOMvjDLfubN0ybMx9DEbYttPfVXRkpZngYDk9MzovOTr7rxDFJvJbz8VJfn8MguE0lXefM+OlVeqDZ8Y4DH/bsLopZ874LTF7L7h1MFHgSPaA2pDGDH6kOo7bY+lFjmVll1jRxjCS3zUlTyep7VGjXnKkRthjQuIjoqEnesGEhh
  ubuntu@plana77.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDaTe7DFODgdj462dM6uttDNslntHcsOcsgmcFPfRT2ziIh3SkjZy8hgHu1aGIUcIn81GxUUcsSmo1Q9mvLe+rg6jBwW7tZpm8ntCsK6MKH1Xa77bvODWNoXQ+RU68RwPMkYUOFgxMNLXFZsAJojaNwKYnITVABS0Y20tiOKsxiIjCA9X1RIOTuPNwCQRvejE1iOnwhKE0kuo3iXhdw//hmuoO9lc0F53/kT/TPZWSXc3SvVbpdVlUG7H1ZUNZsLKjhWSFDnbAdftu/3Bv4VWxSGQgKBPFQzEw5MuHFw6WYrupm/C8iRICe76lrjeYCAdqASFuesSoPtV8+bIWvi5HB
tasks:
- internal.lock_machines:
  - 4
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install: null
- ceph: null
- thrashosds:
    chance_down: 1.0
    powercycle: true
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/ffsb.sh

Actions #7

Updated by Tamilarasi muthamizhan almost 11 years ago

recent logs: ubuntu@teuthology:/a/teuthology-2013-05-14_01:00:46-kernel-next-testing-basic/13128

Actions #8

Updated by Sage Weil almost 11 years ago

  • Status changed from New to Need More Info

cranked up mds logs in qa.. should get useful info next time we hit this.

Actions #9

Updated by Sage Weil almost 11 years ago

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-21_01:00:40-fs-next-testing-basic/18590

Actions #10

Updated by Sage Weil almost 11 years ago

  • Status changed from Need More Info to 12
Actions #11

Updated by Sage Weil almost 11 years ago

ubuntu@teuthology:/a/teuthology-2013-05-23_01:00:08-rados-next-testing-basic/20276

Actions #12

Updated by Sage Weil almost 11 years ago

it's getting recovered twice:

ubuntu@plana95:/var/log/ceph$ grep 'inode 1000000005c' ceph-mds.0.log  | grep recovered
2013-06-05 17:35:31.579351 7fbc4a516700 10 mds.0.cache _recovered r=0 size=10264 mtime=2013-06-05 17:11:06.112971 for [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsLsXsFscr/Fxcb@12},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:35:43.693341 7fbc4a516700 10 mds.0.cache _recovered r=0 size=10264 mtime=2013-06-05 17:11:06.112402 for [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v1907 s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsLsXsFsxcrwb/Fxcb@13},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
 -1512> 2013-06-05 17:35:31.579351 7fbc4a516700 10 mds.0.cache _recovered r=0 size=10264 mtime=2013-06-05 17:11:06.112971 for [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsLsXsFscr/Fxcb@12},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
   -67> 2013-06-05 17:35:43.693341 7fbc4a516700 10 mds.0.cache _recovered r=0 size=10264 mtime=2013-06-05 17:11:06.112402 for [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v1907 s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsLsXsFsxcrwb/Fxcb@13},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]

Actions #13

Updated by Sage Weil almost 11 years ago

lgo is here flab:/home/sage/tmp/4832

Actions #14

Updated by Sage Weil almost 11 years ago

ck=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:11:29.451546 7fbc4a516700 10 mds.0.locker  issue_seq 2 != 1 on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsXsFsxcrwb/Fxcb@2},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420355 7fbc48411700 10 mds.0.locker  revoking pAsXsFsxcrwb on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsXsFsxcrwb/Fxcb@2},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420372 7fbc48411700 10 mds.0.locker eval 2496 [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 needsrecover s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420385 7fbc48411700  7 mds.0.locker file_eval wanted= loner_wanted= other_wanted=  filelock=(ifile excl) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 needsrecover s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420402 7fbc48411700  7 mds.0.locker simple_sync on (ifile excl) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 needsrecover s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420415 7fbc48411700 10 mds.0.cache queue_file_recover [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 needsrecover s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420430 7fbc48411700 15 mds.0.cache _queue_file_recover [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 needsrecover s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=0 0x20935a0]
2013-06-05 17:30:58.420443 7fbc48411700 10 mds.0.cache.ino(1000000005c) auth_pin by 0x1699000 on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=1+0(0x1699000) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0] now 1+0
2013-06-05 17:30:58.420478 7fbc48411700 10 mds.0.cache.ino(1000000005c) auth_pin by 0x2093d20 on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0] now 2+0
2013-06-05 17:30:58.420506 7fbc48411700 10 mds.0.locker simple_eval (iauth sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.420520 7fbc48411700 10 mds.0.locker simple_eval (ilink sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.420533 7fbc48411700 10 mds.0.locker simple_eval (ixattr sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.420550 7fbc48411700  7 mds.0.locker issue_caps allowed=pAsLsXs, xlocker allowed=pAsLsXsFscr on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3} | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509686 7fbc4a516700 10 mds.0.locker  clearing stale flag on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3} | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509700 7fbc4a516700 10 mds.0.locker eval 2496 [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3} | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509715 7fbc4a516700 10 mds.0.locker eval_gather (ifile excl->sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509732 7fbc4a516700  7 mds.0.locker eval_gather finished gather on (ifile excl->sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509747 7fbc4a516700 10 mds.0.locker simple_eval (iauth sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509760 7fbc4a516700 10 mds.0.locker simple_eval (ilink sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509773 7fbc4a516700 10 mds.0.locker simple_eval (ixattr sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509786 7fbc4a516700 10 mds.0.locker scatter_eval (inest lock) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509799 7fbc4a516700 10 mds.0.locker simple_eval (iflock sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509812 7fbc4a516700 10 mds.0.locker simple_eval (ipolicy sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:30:58.509825 7fbc4a516700  7 mds.0.locker issue_caps loner client.4119 allowed=pAsLsXsFscr, xlocker allowed=pAsLsXsFscr, others allowed=pAsLsXs on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@3},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402312 7fbc48411700 10 mds.0.locker  revoking pAsLsXsFscr on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=pAsLsXsFscr/Fxcb@4},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402330 7fbc48411700 10 mds.0.locker eval_gather (ifile excl->sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) needsrecover recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402347 7fbc48411700  7 mds.0.locker eval_gather finished gather on (ifile excl->sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) needsrecover recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402361 7fbc48411700 10 mds.0.cache queue_file_recover [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) needsrecover recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402377 7fbc48411700 15 mds.0.cache _queue_file_recover [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) needsrecover recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402394 7fbc48411700 10 mds.0.locker eval 2496 [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119 | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402407 7fbc48411700 10 mds.0.locker eval_gather (ifile excl->sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402424 7fbc48411700  7 mds.0.locker eval_gather finished gather on (ifile excl->sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402439 7fbc48411700 10 mds.0.locker simple_eval (iauth sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402452 7fbc48411700 10 mds.0.locker simple_eval (ilink sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402466 7fbc48411700 10 mds.0.locker simple_eval (ixattr sync) on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5},l=4119(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
2013-06-05 17:32:03.402482 7fbc48411700  7 mds.0.locker issue_caps allowed=pAsLsXs, xlocker allowed=pAsLsXsFscr on [inode 1000000005c [2,head] ~mds0/stray2/1000000005c auth v317 ap=2+0(0x1699000,0x2093d20) recovering s=10264 nl=0 n(v0 b10264 1=1+0) (ifile excl->sync) (iversion lock) cr={4119=0-4194304@1} caps={4119=-/Fxcb@5} | ptrwaiter=0 request=0 lock=0 caps=1 dirty=1 authpin=1 0x20935a0]
Actions #15

Updated by Sage Weil almost 11 years ago

2013-06-05 17:30:58.396512 7fbc48411700 10 mds.0.locker revoke_stale_caps for client.4119
2013-06-05 17:30:58.504088 7fbc4a516700 10 mds.0.locker resume_stale_caps for client.4119
2013-06-05 17:32:03.401133 7fbc48411700 10 mds.0.locker revoke_stale_caps for client.4119
2013-06-05 17:32:03.413379 7fbc4a516700 10 mds.0.locker resume_stale_caps for client.4119
2013-06-05 17:33:08.405577 7fbc48411700 10 mds.0.locker revoke_stale_caps for client.4119
2013-06-05 17:33:08.417829 7fbc4a516700 10 mds.0.locker resume_stale_caps for client.4119
2013-06-05 17:34:13.409957 7fbc48411700 10 mds.0.locker revoke_stale_caps for client.4119
2013-06-05 17:34:13.422847 7fbc4a516700 10 mds.0.locker resume_stale_caps for client.4119
2013-06-05 17:35:18.414239 7fbc48411700 10 mds.0.locker revoke_stale_caps for client.4119
2013-06-05 17:35:18.427285 7fbc4a516700 10 mds.0.locker resume_stale_caps for client.4119

- need to make sure teh stale > unstale -> stale etc transitions interact properly with the recovery
why is it going stale in the first place?
Actions #16

Updated by Sage Weil almost 11 years ago

full log attached for posterity. see wip-4832

Actions #17

Updated by Sage Weil almost 11 years ago

  • Status changed from 7 to Resolved

commit:a08d62045657713bf0a5372bf14136082ec3b17e

Actions #18

Updated by Zheng Yan almost 11 years ago

  • File patch patch added
  • Status changed from Resolved to 12

that commit breaks filelock eval gather

Actions #19

Updated by Sage Weil almost 11 years ago

aie.. thanks

Actions #20

Updated by Sage Weil almost 11 years ago

  • Status changed from 12 to Resolved
Actions #21

Updated by Greg Farnum over 10 years ago

I've backported these two patches to the cuttlefish branch as well.
(Plus a cuttlefish-4832 branch on top of v0.61.8, but that will go away when we do another point release.)

Actions #22

Updated by Greg Farnum almost 8 years ago

  • Component(FS) MDS added
Actions

Also available in: Atom PDF