Bug #53859
openqa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
0%
Description
Started seeing this lately - https://pulpito.ceph.com/vshankar-2022-01-13_04:42:40-fs-wip-vshankar-testing-20220106-145819-testing-default-smithi/6612000/
Also happens in master - https://pulpito.ceph.com/vshankar-2022-01-13_11:30:06-fs-master-testing-default-smithi/6612583/
Updated by Patrick Donnelly over 2 years ago
Relevant bits from the log:
2022-01-13T05:23:31.208+0000 7f32cffff700 3 client.5688 ll_open 0x10000000000.head 2 = 0 (0x7f32c8002d00) 2022-01-13T05:23:31.209+0000 7f32cffff700 8 client.5688 _ll_forget 0x10000000000 1 2022-01-13T05:23:31.209+0000 7f32cffff700 20 client.5688 _ll_put 0x7f32e0009670 0x10000000000 1 -> 1 2022-01-13T05:23:31.209+0000 7f32e77fe700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/0 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1 2022-01-13T05:23:31.209+0000 7f32e77fe700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/0 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1 2022-01-13T05:23:31.209+0000 7f32e4ff9700 20 client.5688 _ll_get 0x7f32e0009670 0x10000000000 -> 2 2022-01-13T05:23:31.209+0000 7f32e4ff9700 8 client.5688 _ll_getattr 0x10000000000.head 2022-01-13T05:23:31.209+0000 7f32e4ff9700 10 client.5688 _getattr mask pAsLsXsFs issued=1 2022-01-13T05:23:31.209+0000 7f32e4ff9700 10 client.5688 fill_stat on 0x10000000000 snap/devhead mode 0100664 mtime 2022-01-13T05:23:18.645267+0000 ctime 2022-01-13T05:23:18.646363+0000 2022-01-13T05:23:31.209+0000 7f32e4ff9700 3 client.5688 ll_getattr 0x10000000000.head = 0 2022-01-13T05:23:31.209+0000 7f32e4ff9700 8 client.5688 _ll_forget 0x10000000000 1 2022-01-13T05:23:31.209+0000 7f32e4ff9700 20 client.5688 _ll_put 0x7f32e0009670 0x10000000000 1 -> 1 2022-01-13T05:23:31.212+0000 7f32e77fe700 1 -- 192.168.0.1:0/2718057660 <== mds.0 v2:172.21.15.153:6835/2865775467 6 ==== client_caps(grant ino 0x10000000000 5 seq 3 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 0/4194304 ts 1/18446744073709551615 mtime 2022-01-13T05:23:18.645267+0000 tws 2) v11 ==== 252+0+0 (crc 0 0 0) 0x7f32f400e040 con 0x560ab8f738a0 2022-01-13T05:23:31.212+0000 7f32e77fe700 5 client.5688 set_cap_epoch_barrier epoch = 57 2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 mds.0 seq now 1 2022-01-13T05:23:31.212+0000 7f32e77fe700 5 client.5688 handle_cap_grant on in 0x10000000000 mds.0 seq 3 caps now pAsxLsXsxFsxcrwb was pAsxLsXsxFsxcrwb 2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 update_inode_file_time 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/0 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) pAsxLsXsxFsxcrwb ctime 2022-01-13T05:23:18.646363+0000 mtime 2022-01-13T05:23:18.645267+0000 2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 max_size 0 -> 4194304 2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 caps unchanged at pAsxLsXsxFsxcrwb 2022-01-13T05:23:31.213+0000 7f32cffff700 3 client.5688 ll_flush 0x7f32c8002d00 0x10000000000 2022-01-13T05:23:31.213+0000 7f32cffff700 10 client.5688 _flush: 0x7f32c8002d00 on inode 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) no async_err state 2022-01-13T05:23:31.213+0000 7f32e4ff9700 3 client.5688 ll_release (fh)0x7f32c8002d00 0x10000000000 2022-01-13T05:23:31.213+0000 7f32e4ff9700 8 client.5688 _release_fh 0x7f32c8002d00 mode 3 on 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 _flush 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 nothing to flush 2022-01-13T05:23:31.213+0000 7f32e4ff9700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 check_caps on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) wanted - used - issued pAsxLsXsxFsxcrwb revoking - flags=0 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 cap mds.0 issued pAsxLsXsxFsxcrwb implemented pAsxLsXsxFsxcrwb revoking - 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 delaying cap release 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 cap_delay_requeue on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) 2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 _release_fh 0x7f32c8002d00 on inode 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) no async_err state 2022-01-13T05:23:31.213+0000 7f32e4ff9700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1
From: /ceph/teuthology-archive/vshankar-2022-01-13_04:42:40-fs-wip-vshankar-testing-20220106-145819-testing-default-smithi/6612000/remote/smithi052/log/ceph-client.0.51171.log.gz
Don't see a read call... maybe it's being optimized out since the file is known to be empty? If so, not sure how this ever worked. Would suggest experimenting with a file with contents to see if that makes a difference.
Updated by Xiubo Li over 2 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 44679
Updated by Xiubo Li over 2 years ago
When the 'test_file' is empty the kernel won't trigger go do the real read, so the libcephfs won't get a chance to call read. Seems will call a getattr before calling read do decide whether to call read or not.
Updated by Xiubo Li over 2 years ago
From kernel fs and fuse code:
1567 static ssize_t fuse_file_read_iter(struct kiocb *iocb, struct iov_iter *to) 1568 { 1569 struct file *file = iocb->ki_filp; 1570 struct fuse_file *ff = file->private_data; 1571 struct inode *inode = file_inode(file); 1572 1573 if (fuse_is_bad(inode)) 1574 return -EIO; 1575 1576 if (FUSE_IS_DAX(inode)) 1577 return fuse_dax_read_iter(iocb, to); 1578 1579 if (!(ff->open_flags & FOPEN_DIRECT_IO)) 1580 return fuse_cache_read_iter(iocb, to); 1581 else 1582 return fuse_direct_read_iter(iocb, to); 1583 }
In Line#1580 above when doing the cache read it will update the attribute first, which will fire a 'getattr' request to libcephfs:
989 static ssize_t fuse_cache_read_iter(struct kiocb *iocb, struct iov_iter *to) 990 { 991 struct inode *inode = iocb->ki_filp->f_mapping->host; 992 struct fuse_conn *fc = get_fuse_conn(inode); 993 994 /* 995 * In auto invalidate mode, always update attributes on read. 996 * Otherwise, only update if we attempt to read past EOF (to ensure 997 * i_size is up to date). 998 */ 999 if (fc->auto_inval_data || 1000 (iocb->ki_pos + iov_iter_count(to) > i_size_read(inode))) { 1001 int err; 1002 err = fuse_update_attributes(inode, iocb->ki_filp, STATX_SIZE); 1003 if (err) 1004 return err; 1005 } 1006 1007 return generic_file_read_iter(iocb, to); 1008 }
The 'getattr' will get the inode size related fields, which is zero.
And then when doing the generic_file_read_iter()-->filemap_read():
2628 do { 2629 cond_resched(); 2630 2631 /* 2632 * If we've already successfully copied some data, then we 2633 * can no longer safely return -EIOCBQUEUED. Hence mark 2634 * an async read NOWAIT at that point. 2635 */ 2636 if ((iocb->ki_flags & IOCB_WAITQ) && already_read) 2637 iocb->ki_flags |= IOCB_NOWAIT; 2638 2639 if (unlikely(iocb->ki_pos >= i_size_read(inode))) 2640 break; 2641 2642 error = filemap_get_pages(iocb, iter, &pvec); 2643 if (error < 0) 2644 break;
When it see that "iocb->ki_pos >= i_size_read(inode)" it will return directly without fire real read to libcephfs.
This is why when the read offset >= file size, we won't see the 'read' in libcephfs.
I have checked the direct IO in kernel & fuse code, also has the same logic.
Updated by Patrick Donnelly over 2 years ago
That explains the current behavior but why did this test not fail in the past?
Updated by Xiubo Li over 2 years ago
Patrick Donnelly wrote:
That explains the current behavior but why did this test not fail in the past?
Is it possible that the 'test_file' was created by other test cases and wasn't removed ? Then the 'test_file' was not empty ?
Updated by Venky Shankar over 2 years ago
Xiubo Li wrote:
Patrick Donnelly wrote:
That explains the current behavior but why did this test not fail in the past?
Is it possible that the 'test_file' was created by other test cases and wasn't removed ? Then the 'test_file' was not empty ?
That's possible. I've run into such things earlier where a previous test does not cleanup files and the subsequent test (depending on what its expecting) fails.
Updated by Xiubo Li about 2 years ago
Venky Shankar wrote:
Xiubo Li wrote:
Patrick Donnelly wrote:
That explains the current behavior but why did this test not fail in the past?
Is it possible that the 'test_file' was created by other test cases and wasn't removed ? Then the 'test_file' was not empty ?
That's possible. I've run into such things earlier where a previous test does not cleanup files and the subsequent test (depending on what its expecting) fails.
Okay, before I thought the previous test files should be cleaned up anyway.