Bug #55329

closed

qa: add test case for fsync crash issue

Added by Xiubo Li about 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
qa-suite
Labels (FS):
qa
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is the test case for https://tracker.ceph.com/issues/55327.


Related issues 4 (0 open, 4 closed)

Related to CephFS - Feature #55283: qa: add fsync/sync stuck waiting for unsafe request test (Resolved, Xiubo Li)
Related to Linux kernel client - Bug #55327: kclient: BUG: kernel NULL pointer dereference, address: 0000000000000008 (Resolved, Xiubo Li)
Copied to CephFS - Backport #55660: pacific: qa: add test case for fsync crash issue (Resolved, Xiubo Li)
Copied to CephFS - Backport #55661: quincy: qa: add test case for fsync crash issue (Resolved, Xiubo Li)
Actions #1

Updated by Xiubo Li about 2 years ago

This can be reproduced very easily with the following kernel patch:

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 6a9bf58478c8..1e65771acd1d 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2333,6 +2333,8 @@ static int unsafe_request_wait(struct inode *inode)
                        list_for_each_entry(req, &ci->i_unsafe_dirops,
                                            r_unsafe_dir_item) {
                                s = req->r_session;                                                                                                    
+                               if (!s)
+                                       pr_err("file sync s = NULL\n");
                                if (unlikely(s && s->s_mds >= max_sessions)) {
                                        spin_unlock(&ci->i_unsafe_lock);
                                        for (i = 0; i < max_sessions; i++) {
@@ -2353,6 +2355,8 @@ static int unsafe_request_wait(struct inode *inode)
                        list_for_each_entry(req, &ci->i_unsafe_iops,
                                            r_unsafe_target_item) {
                                s = req->r_session;
+                               if (!s)
+                                       pr_err("file sync s = NULL\n");
                                if (unlikely(s && s->s_mds >= max_sessions)) {
                                        spin_unlock(&ci->i_unsafe_lock);
                                        for (i = 0; i < max_sessions; i++) {
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 4aaa7b14136e..b7a549831e80 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -12,6 +12,7 @@
 #include <linux/bits.h>
 #include <linux/ktime.h>
 #include <linux/bitmap.h>
+#include <linux/delay.h>

 #include "super.h" 
 #include "crypto.h" 
@@ -3331,6 +3332,7 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
        dout("submit_request on %p for inode %p\n", req, dir);
        mutex_lock(&mdsc->mutex);
        __register_request(mdsc, req, dir);
+       msleep(50);
        __do_request(mdsc, req);
        err = req->r_err;
        mutex_unlock(&mdsc->mutex);
@@ -5121,6 +5123,8 @@ static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)

                        /* send flush mdlog request to MDS */
                        s = req->r_session;
+                       if (!s)
+                               pr_err("filesystem sync s = NULL\n");
                        if (s && last_session != s) {
                                send_flush_mdlog(s);
                                ceph_put_mds_session(last_session); 

With the patch applied, the kernel log repeatedly shows the request session being NULL during file sync:

[root@lxbceph1 ceph-client]# cat /proc/kmsg 
<5>[157326.712297] Key type ceph registered
<6>[157326.717311] libceph: loaded (mon/osd proto 15/24)
<6>[157326.884746] ceph: loaded (mds proto 32)
<6>[157326.964666] libceph: mon1 (1)10.72.47.117:40452 session established
<6>[157326.969637] libceph: client7954 fsid 21ceb5eb-f83c-49b5-aa2e-aa688552c0ae
<6>[157329.006844] libceph: mon0 (1)10.72.47.117:40450 session established
<6>[157329.011848] libceph: client8172 fsid 21ceb5eb-f83c-49b5-aa2e-aa688552c0ae
<3>[157331.567093] ceph: file sync s = NULL
<3>[157333.491179] ceph: file sync s = NULL
<3>[157337.656049] ceph: file sync s = NULL
<3>[157339.402788] ceph: file sync s = NULL
<3>[157343.138492] ceph: file sync s = NULL
<3>[157347.785086] ceph: file sync s = NULL
<3>[157352.702209] ceph: file sync s = NULL
<3>[157358.280873] ceph: file sync s = NULL
<3>[157362.672753] ceph: file sync s = NULL
<3>[157367.634606] ceph: file sync s = NULL
<3>[157372.784346] ceph: file sync s = NULL
<3>[157379.742185] ceph: file sync s = NULL
<3>[157383.058874] ceph: file sync s = NULL
......
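
For anyone who wants to exercise this path from user space, here is a minimal sketch of a workload that should hit the window the msleep(50) above widens. It rests on my reading of the patch: a request sits on the directory's unsafe list between __register_request() and __do_request() with r_session still unset, so an fsync() on that directory racing with create/unlink in it walks the list and sees s == NULL. The function name run_fsync_race and the fsync-race-N file names are made up for illustration. This is not the test case from the pull request.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: issue directory operations (create + unlink) in a loop. */
static int create_unlink_loop(const char *dir, int iterations)
{
    char path[4096];

    for (int i = 0; i < iterations; i++) {
        snprintf(path, sizeof(path), "%s/fsync-race-%d", dir, i);
        int fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd < 0)
            return -errno;
        close(fd);
        if (unlink(path) < 0)
            return -errno;
    }
    return 0;
}

/*
 * Hypothetical helper: fsync() the directory in the parent while a child
 * process creates and unlinks files in it. Returns 0 on success or a
 * negative errno value on failure.
 */
int run_fsync_race(const char *dir, int iterations)
{
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -errno;

    pid_t pid = fork();
    if (pid < 0) {
        int err = errno;
        close(dfd);
        return -err;
    }
    if (pid == 0)
        _exit(create_unlink_loop(dir, iterations) ? 1 : 0);

    int ret = 0, status = 0;
    for (;;) {
        /* Poll for child exit; keep syncing the directory until then. */
        pid_t w = waitpid(pid, &status, WNOHANG);
        if (w < 0) {
            ret = -errno;
            break;
        }
        if (w == pid)
            break;
        if (fsync(dfd) < 0) {
            ret = -errno;
            waitpid(pid, &status, 0); /* reap the child before returning */
            break;
        }
    }
    close(dfd);

    if (ret == 0 && !(WIFEXITED(status) && WEXITSTATUS(status) == 0))
        ret = -EIO;
    return ret;
}
```

Calling run_fsync_race() with a directory on a kernel CephFS mount and a few hundred iterations, while watching /proc/kmsg, would be one way to check whether the "file sync s = NULL" lines appear. On a kernel without the debug patch the window is much narrower, so many more iterations may be needed.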
Actions #2

Updated by Xiubo Li about 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 45886
Actions #3

Updated by Xiubo Li about 2 years ago

  • Related to Feature #55283: qa: add fsync/sync stuck waiting for unsafe request test added
Actions #4

Updated by Xiubo Li about 2 years ago

  • Related to Bug #55327: kclient: BUG: kernel NULL pointer dereference, address: 0000000000000008 added
Actions #5

Updated by Venky Shankar about 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Backport Bot about 2 years ago

  • Copied to Backport #55660: pacific: qa: add test case for fsync crash issue added
Actions #7

Updated by Backport Bot about 2 years ago

  • Copied to Backport #55661: quincy: qa: add test case for fsync crash issue added
Actions #8

Updated by Xiubo Li almost 2 years ago

  • Status changed from Pending Backport to Resolved