Bug #23837

closed

client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail

Added by Ivan Guan about 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Community (dev)
Tags:
Backport: mimic,luminous,jewel
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client, osdc
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

#0  0x00007fecc1c5bfcb in raise () from /lib64/libpthread.so.0
#1  0x00007fecc316a1b5 in reraise_fatal (signum=6) at global/signal_handler.cc:71
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:133
#3  <signal handler called>
#4  0x00007fecc0a7b5f7 in raise () from /lib64/libc.so.6
#5  0x00007fecc0a7cce8 in abort () from /lib64/libc.so.6
#6  0x00007fecc3265957 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x7fecc33f2daf "in->oset.objects.empty()",
    file=file@entry=0x7fecc33f1373 "client/Client.cc", line=line@entry=3136,
    func=func@entry=0x7fecc33f6a20 <Client::put_inode(Inode*, int)::__PRETTY_FUNCTION__> "void Client::put_inode(Inode*, int)") at common/assert.cc:78
#7  0x00007fecc30863a4 in Client::put_inode (this=this@entry=0x7fecce02c000, in=in@entry=0x7fecf5792d00, n=n@entry=1) at client/Client.cc:3136
#8  0x00007fecc30b9766 in Client::_ll_put (this=this@entry=0x7fecce02c000, in=in@entry=0x7fecf5792d00, num=num@entry=4) at client/Client.cc:9859
#9  0x00007fecc30ba106 in Client::ll_forget (this=0x7fecce02c000, in=0x7fecf5792d00, count=count@entry=4) at client/Client.cc:9900
#10 0x00007fecc30572d1 in fuse_ll_forget (req=0x7fed1a37bf00, ino=1099512610042, nlookup=3) at client/fuse_ll.cc:136
#11 0x00007fecc2a72054 in do_batch_forget () from /lib64/libfuse.so.2
#12 0x00007fecc2a71bdb in fuse_ll_process_buf () from /lib64/libfuse.so.2
#13 0x00007fecc2a6e471 in fuse_do_work () from /lib64/libfuse.so.2
#14 0x00007fecc1c54dc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007fecc0b3c28d in clone () from /lib64/libc.so.6

Writing a file can probably be summarized as the following steps:

Sometimes, however, the client drops the inode's FILE_BUFFER | FILE_CACHE caps before the bh write
callback has run. If the last holder of a reference to the inode then calls put_inode, it finds that
the inode's oset cannot be emptied. When the client finds that an inode's size is smaller than the size
the MDS has recorded, it performs the invalidate_inode_cache operation.

Code Path:

handle_client_reply -> insert_trace -> add_update_inode -> update_inode_file_bits -> _invalidate_inode_cache -> discard_set -> ObjectCacher::bh_remove

Writing a bh increases its Object's ref, and the ref is decreased again in the write callback. So if the write callback has not run yet and we delete
the Tx bh, the Object's data may become empty, but we cannot do lru_unpin because the Object's ref is not equal to 1. When others later decrease the
inode's ref to zero, put_inode tries to empty the inode's oset. Because one Object cannot be closed, the assert fails.
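
The reference-count race described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: ToyObject, ToyObjectSet and the two scenario functions are hypothetical names invented for illustration, not the real Client or ObjectCacher classes, and the model keeps only the ref count and the number of attached bhs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a cached object (not the real osdc class).
struct ToyObject {
    int ref = 1;                   // 1 means only the cache itself holds a reference
    std::size_t buffer_heads = 0;  // bhs still attached to this object

    // Assumption taken from the description: an object can only be closed
    // when it has no data left and its ref is back to 1.
    bool can_close() const { return buffer_heads == 0 && ref == 1; }
};

// Hypothetical stand-in for the inode's oset.
struct ToyObjectSet {
    std::map<std::string, ToyObject> objects;

    // Start writing a bh: the bh enters Tx and the object gains a reference.
    void start_write(const std::string& oid) {
        ToyObject& ob = objects[oid];
        ++ob.buffer_heads;
        ++ob.ref;
    }

    // Write callback: drop the reference taken in start_write.
    void finish_write(const std::string& oid) {
        auto it = objects.find(oid);
        if (it != objects.end()) {
            assert(it->second.ref > 1);
            --it->second.ref;
        }
    }

    // Cache invalidation: remove every bh (even one in Tx), then close the
    // object only if nothing else still references it.
    void discard(const std::string& oid) {
        auto it = objects.find(oid);
        if (it == objects.end()) return;
        it->second.buffer_heads = 0;
        if (it->second.can_close()) objects.erase(it);
    }
};

// Scenario from the report: the cache is invalidated while the write is in flight.
// Returns the condition that put_inode asserts on (oset.objects.empty()).
bool oset_empty_when_discard_races_with_tx() {
    ToyObjectSet oset;
    oset.start_write("obj0");
    oset.discard("obj0");  // ref is still 2, so the object cannot be closed
    return oset.objects.empty();
}

// Control scenario: the write callback runs before the cache is invalidated.
bool oset_empty_when_write_completes_first() {
    ToyObjectSet oset;
    oset.start_write("obj0");
    oset.finish_write("obj0");
    oset.discard("obj0");
    return oset.objects.empty();
}
```

In this model the first scenario leaves one object behind in the set, which corresponds to the failing in->oset.objects.empty() check in the backtrace, while the second scenario leaves the set empty.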


Files

fuse_write_1.png (56.6 KB) Ivan Guan, 04/24/2018 08:47 AM

Related issues 5 (0 open, 5 closed)

Related to CephFS - Bug #24101: mds: deadlock during fsstress workunit with 9 actives (Closed, Zheng Yan, 05/11/2018)
Has duplicate CephFS - Bug #24087: client: assert during shutdown after blacklisted (Duplicate, Patrick Donnelly, 05/10/2018)
Copied to CephFS - Backport #24207: luminous: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail (Resolved, Zheng Yan)
Copied to CephFS - Backport #24208: jewel: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail (Rejected)
Copied to CephFS - Backport #24209: mimic: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail (Resolved, Zheng Yan)
Actions #2

Updated by Patrick Donnelly about 6 years ago

  • Subject changed from ceps-fuse deleted inode's Bufferhead witch was in STATE::Tx would lead a assert fail. to ceps-fuse: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail
  • Due date deleted (04/25/2018)
  • Category set to Correctness/Safety
  • Status changed from New to Fix Under Review
  • Assignee set to Ivan Guan
  • Priority changed from Normal to High
  • Target version changed from v14.0.0 to v13.0.0
  • Start date deleted (04/24/2018)
  • Source set to Community (dev)
  • Backport set to luminous
  • Labels (FS) crash added
Actions #3

Updated by Patrick Donnelly almost 6 years ago

  • Subject changed from ceps-fuse: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail to client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail
  • Target version changed from v13.0.0 to v14.0.0
  • Backport changed from luminous to mimic,luminous
  • Component(FS) deleted (ceph-fuse)
Actions #4

Updated by Patrick Donnelly almost 6 years ago

  • Target version changed from v14.0.0 to v13.2.0
  • Backport changed from mimic,luminous to luminous
Actions #5

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from Pending Backport to In Progress

Kicking this back to In Progress. Please see comments in original PR. It has been reverted by https://github.com/ceph/ceph/pull/21975

Actions #8

Updated by Patrick Donnelly almost 6 years ago

  • Related to Bug #24101: mds: deadlock during fsstress workunit with 9 actives added
Actions #9

Updated by Zheng Yan almost 6 years ago

  • Status changed from In Progress to Fix Under Review
Actions #10

Updated by Patrick Donnelly almost 6 years ago

  • Assignee changed from Ivan Guan to Zheng Yan
  • Backport changed from luminous to luminous,jewel
  • Affected Versions v10.2.2 added
  • Component(FS) osdc added
Actions #11

Updated by Patrick Donnelly almost 6 years ago

  • Description updated (diff)

Fixed formatting.

Actions #12

Updated by Patrick Donnelly almost 6 years ago

  • Has duplicate Bug #24087: client: assert during shutdown after blacklisted added
Actions #13

Updated by Patrick Donnelly almost 6 years ago

  • Priority changed from High to Urgent
Actions #14

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Target version changed from v13.2.0 to v14.0.0
  • Backport changed from luminous,jewel to mimic,luminous,jewel
Actions #15

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #24207: luminous: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail added
Actions #16

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #24208: jewel: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail added
Actions #17

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #24209: mimic: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail added
Actions #18

Updated by Patrick Donnelly over 5 years ago

  • Status changed from Pending Backport to Resolved