Bug #23837

client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail

Added by Ivan Guan 8 months ago. Updated about 2 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version:
Start date:
Due date:
% Done: 0%
Source: Community (dev)
Tags:
Backport: mimic,luminous,jewel
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client, osdc
Labels (FS): crash
Pull request ID:

Description

#0  0x00007fecc1c5bfcb in raise () from /lib64/libpthread.so.0
#1  0x00007fecc316a1b5 in reraise_fatal (signum=6) at global/signal_handler.cc:71
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:133
#3  <signal handler called>
#4  0x00007fecc0a7b5f7 in raise () from /lib64/libc.so.6
#5  0x00007fecc0a7cce8 in abort () from /lib64/libc.so.6
#6  0x00007fecc3265957 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x7fecc33f2daf "in->oset.objects.empty()",
    file=file@entry=0x7fecc33f1373 "client/Client.cc", line=line@entry=3136,
    func=func@entry=0x7fecc33f6a20 <Client::put_inode(Inode*, int)::__PRETTY_FUNCTION__> "void Client::put_inode(Inode*, int)") at common/assert.cc:78
#7  0x00007fecc30863a4 in Client::put_inode (this=this@entry=0x7fecce02c000, in=in@entry=0x7fecf5792d00, n=n@entry=1) at client/Client.cc:3136
#8  0x00007fecc30b9766 in Client::_ll_put (this=this@entry=0x7fecce02c000, in=in@entry=0x7fecf5792d00, num=num@entry=4) at client/Client.cc:9859
#9  0x00007fecc30ba106 in Client::ll_forget (this=0x7fecce02c000, in=0x7fecf5792d00, count=count@entry=4) at client/Client.cc:9900
#10 0x00007fecc30572d1 in fuse_ll_forget (req=0x7fed1a37bf00, ino=1099512610042, nlookup=3) at client/fuse_ll.cc:136
#11 0x00007fecc2a72054 in do_batch_forget () from /lib64/libfuse.so.2
#12 0x00007fecc2a71bdb in fuse_ll_process_buf () from /lib64/libfuse.so.2
#13 0x00007fecc2a6e471 in fuse_do_work () from /lib64/libfuse.so.2
#14 0x00007fecc1c54dc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007fecc0b3c28d in clone () from /lib64/libc.so.6

Writing a file can be summarized as a series of steps (see the attached fuse_write_1.png).

But sometimes the client drops the inode's FILE_BUFFER | FILE_CACHE caps before the bh write
callback has run. Then, when the last holder of a reference to the inode calls put_inode, it finds
that the inode's oset cannot be emptied. When the client finds that an inode's size is smaller than
the size the MDS has recorded, it performs an invalidate_inode_cache operation.

Code Path:

handle_client_reply -> insert_trace -> add_update_inode -> update_inode_file_bits -> _invalidate_inode_cache -> discard_set -> ObjectCacher::bh_remove

Writing a bh increases its Object's ref, and the ref is decreased again in the write callback. So if the write has not called back yet and we delete the Tx bh,
the Object's data may become empty, but we cannot lru_unpin it because the Object's ref is not equal to 1. When the inode's ref is later decreased to zero by
other holders, put_inode tries to empty the inode's oset. Because one Object cannot be closed, the oset stays non-empty and the assert fails.
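
The ordering problem described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Ceph's code: Object, ObjectSet, start_write, write_callback, discard_set and oset_empty_at_put_inode are hypothetical, simplified stand-ins that borrow names from this ticket. The only behaviour taken from the description is that an in-flight write holds an extra ref on the Object and that an Object can be closed only when it has no bh left and its ref equals 1.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical, simplified model of the objects involved; NOT Ceph's real classes.
struct Object {
    int ref = 1;               // 1 means only the cache itself holds the object
    std::size_t bh_count = 0;  // number of buffer heads attached to the object

    // Assumption from the ticket: closing requires no bh and ref == 1.
    bool can_close() const { return bh_count == 0 && ref == 1; }
};

struct ObjectSet {
    std::map<int, Object> objects;  // keyed by a made-up object id
};

// Start writing one bh: it enters Tx and the in-flight write pins the object.
void start_write(Object& o) {
    ++o.bh_count;
    ++o.ref;
}

// Write callback: drop the pin taken in start_write.
void write_callback(Object& o) {
    --o.ref;
}

// Drop every bh (including ones still in Tx), then close the objects that can be closed.
void discard_set(ObjectSet& oset) {
    for (auto it = oset.objects.begin(); it != oset.objects.end();) {
        it->second.bh_count = 0;
        if (it->second.can_close()) {
            it = oset.objects.erase(it);
        } else {
            ++it;  // still pinned by an in-flight write, so it stays in the set
        }
    }
}

// Returns whether the oset is empty at the point where put_inode would assert.
bool oset_empty_at_put_inode(bool callback_before_discard) {
    ObjectSet oset;
    Object& o = oset.objects[0];
    start_write(o);
    if (callback_before_discard) {
        write_callback(o);
    }
    discard_set(oset);
    return oset.objects.empty();
}
```

In this model, oset_empty_at_put_inode(true) leaves the set empty, while oset_empty_at_put_inode(false) leaves one pinned Object behind, which corresponds to the state in which the in->oset.objects.empty() assertion from the backtrace would fail.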

fuse_write_1.png (56.6 KB) Ivan Guan, 04/24/2018 08:47 AM


Related issues

Related to fs - Bug #24101: mds: deadlock during fsstress workunit with 9 actives Closed 05/11/2018
Duplicated by fs - Bug #24087: client: assert during shutdown after blacklisted Duplicate 05/10/2018
Copied to fs - Backport #24207: luminous: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail Resolved
Copied to fs - Backport #24208: jewel: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail Rejected
Copied to fs - Backport #24209: mimic: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail Resolved

History

#2 Updated by Patrick Donnelly 8 months ago

  • Subject changed from ceps-fuse deleted inode's Bufferhead witch was in STATE::Tx would lead a assert fail. to ceps-fuse: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail
  • Due date deleted (04/25/2018)
  • Category set to Correctness/Safety
  • Status changed from New to Need Review
  • Assignee set to Ivan Guan
  • Priority changed from Normal to High
  • Target version changed from v14.0.0 to v13.0.0
  • Start date deleted (04/24/2018)
  • Source set to Community (dev)
  • Backport set to luminous
  • Labels (FS) crash added

#3 Updated by Patrick Donnelly 8 months ago

  • Subject changed from ceps-fuse: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail to client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail
  • Target version changed from v13.0.0 to v14.0.0
  • Backport changed from luminous to mimic,luminous
  • Component(FS) deleted (ceph-fuse)

#4 Updated by Patrick Donnelly 7 months ago

  • Target version changed from v14.0.0 to v13.2.0
  • Backport changed from mimic,luminous to luminous

#5 Updated by Patrick Donnelly 7 months ago

  • Status changed from Need Review to Pending Backport

#7 Updated by Patrick Donnelly 7 months ago

  • Status changed from Pending Backport to In Progress

Kicking this back to In Progress. Please see comments in original PR. It has been reverted by https://github.com/ceph/ceph/pull/21975

#8 Updated by Patrick Donnelly 7 months ago

  • Related to Bug #24101: mds: deadlock during fsstress workunit with 9 actives added

#9 Updated by Zheng Yan 7 months ago

  • Status changed from In Progress to Need Review

#10 Updated by Patrick Donnelly 7 months ago

  • Assignee changed from Ivan Guan to Zheng Yan
  • Backport changed from luminous to luminous,jewel
  • Affected Versions v10.2.2 added
  • Component(FS) osdc added

#11 Updated by Patrick Donnelly 7 months ago

  • Description updated (diff)

Fixed formatting.

#12 Updated by Patrick Donnelly 7 months ago

  • Duplicated by Bug #24087: client: assert during shutdown after blacklisted added

#13 Updated by Patrick Donnelly 7 months ago

  • Priority changed from High to Urgent

#14 Updated by Patrick Donnelly 7 months ago

  • Status changed from Need Review to Pending Backport
  • Target version changed from v13.2.0 to v14.0.0
  • Backport changed from luminous,jewel to mimic,luminous,jewel

#15 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #24207: luminous: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail added

#16 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #24208: jewel: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail added

#17 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #24209: mimic: client: deleted inode's Bufferhead which was in STATE::Tx would lead a assert fail added

#18 Updated by Patrick Donnelly about 2 months ago

  • Status changed from Pending Backport to Resolved
