Bug #64961

open

ceph-fuse: crash when try to open & trunc a encrypted file

Added by Xiubo Li about 1 month ago. Updated about 1 month ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Mount a kclient and encrypt it, then use a ceph-fuse client to open and truncate an encrypted file. The call fails with a read-only filesystem error, but the file size is still truncated successfully, and at the same time the ceph-fuse daemon crashes:

    -9> 2024-03-18T14:53:28.233+0800 7fbade000640 10 client.4286 _release_fh 0x7fba800032e0 on inode 0x1000000000d.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={2=0} mode=100644 size=0/0 nlink=1 btime=2024-03-18T14:15:08.337113+0800 mtime=2024-03-18T14:53:27.906965+0800 ctime=2024-03-18T14:53:27.906965+0800 change_attr=14 caps=pAsxLsXsxFscb(0=pAsxLsXsxFscb) objectset[0x1000000000d ts 6/0 objects 0 dirty_or_tx 0] parents=0x1000000000c.head["+R3FPdfhbcIGXae7a2+ZSg"] 0x7fbaa8009220) no async_err state
    -8> 2024-03-18T14:53:28.233+0800 7fbade000640 20 client.4286 put_inode on 0x1000000000d.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={2=0} mode=100644 size=0/0 nlink=1 btime=2024-03-18T14:15:08.337113+0800 mtime=2024-03-18T14:53:27.906965+0800 ctime=2024-03-18T14:53:27.906965+0800 change_attr=14 caps=pAsxLsXsxFscb(0=pAsxLsXsxFscb) objectset[0x1000000000d ts 6/0 objects 0 dirty_or_tx 0] parents=0x1000000000c.head["+R3FPdfhbcIGXae7a2+ZSg"] 0x7fbaa8009220) n = 1
    -7> 2024-03-18T14:53:28.233+0800 7fbaea000640  1 -- 192.168.0.100:0/1186915634 <== mds.0 v2:192.168.0.100:6813/4235335776 12 ==== client_caps(trunc ino 0x1000000000d 62 seq 5 caps=pAsxLsXsxFscb dirty=- wanted=pAsxXsxFxwb follows 0 size 0/4194304 ts 6/0 mtime 2024-03-18T14:53:27.906965+0800 ctime 2024-03-18T14:53:27.906965+0800 change_attr 14 tws 4) v12 ==== 316+0+0 (crc 0 0 0) 0x7fbae400e7b0 con 0x5627ae355150
    -6> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286  mds.0 seq now 4
    -5> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5540 lxb ---------------------------
    -4> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5542 lxb ---------------------------
    -3> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5544 lxb ---------------------------
    -2> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5547 lxb ---------------------------
    -1> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5549 lxb ---------------------------
     0> 2024-03-18T14:53:28.239+0800 7fbaea000640 -1 *** Caught signal (Aborted) **
 in thread 7fbaea000640 thread_name:ms_dispatch

 ceph version 19.0.0-2007-g49f7bc2afcf (49f7bc2afcf1d8c2a0c49dc924490ec7189afe5e) squid (dev)
 1: /lib64/libc.so.6(+0x54df0) [0x7fbb0b254df0]
 2: /lib64/libc.so.6(+0xa157c) [0x7fbb0b2a157c]
 3: raise()
 4: abort()
 5: /lib64/libstdc++.so.6(+0xa1a01) [0x7fbb0b6a1a01]
 6: /lib64/libstdc++.so.6(+0xad37c) [0x7fbb0b6ad37c]
 7: /lib64/libstdc++.so.6(+0xad3e7) [0x7fbb0b6ad3e7]
 8: /lib64/libstdc++.so.6(+0xad649) [0x7fbb0b6ad649]
 9: (std::__throw_invalid_argument(char const*)+0x41) [0x7fbb0b6a4363]
 10: (long long __gnu_cxx::__stoa<long long, long long, char, int>(long long (*)(char const*, char**, int), char const*, char const*, unsigned long*, int)+0xa7) [0x5627acf40629]
 11: (Client::handle_cap_trunc(MetaSession*, Inode*, boost::intrusive_ptr<MClientCaps const> const&)+0x17d) [0x5627aceb18e3]
 12: (Client::handle_caps(boost::intrusive_ptr<MClientCaps const> const&)+0x284) [0x5627acf1ef14]
 13: (Client::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x251) [0x5627acf32119]
 14: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xb6) [0x7fbb0c6389a4]
 15: (DispatchQueue::entry()+0xb0b) [0x7fbb0c6360bb]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fbb0c7169a9]
 17: (Thread::entry_wrapper()+0x3f) [0x7fbb0c4861a7]
 18: (Thread::_entry_func(void*)+0x9) [0x7fbb0c4861bf]
 19: /lib64/libc.so.6(+0x9f832) [0x7fbb0b29f832]
 20: /lib64/libc.so.6(+0x3f450) [0x7fbb0b23f450]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Actions #1

Updated by Xiubo Li about 1 month ago

Found a new issue: when decoding the fscrypt_file in the client request, the MDS gets fscrypt_file len=8 fscrypt_file =[^@,^@,^@,^@,^@,^@,^@,^@]:

2024-03-19T15:39:41.700+0800 7fdc6a800640  1 -- v2:192.168.0.106:6813/2768574324 <== client.4252 192.168.0.106:0/1007140338 53 ==== client_request(client.4252:3 create owner_uid=1000, owner_gid=1000 #0x10000000000/NONdNre1Nm8PZdX580vvNA 2024-03-19T15:39:41.698758+0800 fscrypt_file len=8 fscrypt_file =[^@,^@,^@,^@,^@,^@,^@,^@] caller_uid=1000, caller_gid=1000{10,1000,}) v6 ==== 306+0+38 (crc 0 0 0) 0x5576004f5880 con 0x557600568400
2024-03-19T15:39:41.700+0800 7fdc6a800640  4 mds.0.server handle_client_request client_request(client.4252:3 create owner_uid=1000, owner_gid=1000 #0x10000000000/NONdNre1Nm8PZdX580vvNA 2024-03-19T15:39:41.698758+0800 fscrypt_file len=8 fscrypt_file =[^@,^@,^@,^@,^@,^@,^@,^@] caller_uid=1000, caller_gid=1000{10,1000,}) v6

While the kclient just encoded 0:

<3>[24576.987630] ceph: encode_mclientrequest_tail 0000000039566a71 tid 3 r_fscrypt_file 0

And then in ceph-fuse:

    -9> 2024-03-19T15:39:59.461+0800 7f151cc00640  1 -- 192.168.0.106:0/2825619564 <== mds.0 v2:192.168.0.106:6813/2768574324 24 ==== client_caps(trunc ino 0x10000000001 4 seq 7 caps=pAsxLsXsxFscb dirty=- wanted=pAsxXsxFxwb follows 0 size 0/4194304 ts 2/0 mtime 2024-03-19T15:39:59.201395+0800 ctime 2024-03-19T15:39:59.201395+0800 change_attr 2) v12 ==== 316+0+0 (crc 0 0 0) 0x7f1518019370 con 0x5637918bc770
    -8> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250  mds.0 seq now 5
    -7> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5540 lxb ---------------------------
    -6> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5542 lxb ---------------------------
    -5> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5544 lxb ---------------------------
    -4> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5547 lxb ---------------------------
    -3> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5549 lxb --------------------------- m->fscrypt_file[0]: ^S
    -2> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5551 lxb --------------------------- string: ^@^@^@^@^@^@^@^S
    -1> 2024-03-19T15:39:59.462+0800 7f1510c00640  3 client.4250 ll_flush 0x7f14c00245e0 0x10000000001
     0> 2024-03-19T15:39:59.467+0800 7f151cc00640 -1 *** Caught signal (Aborted) **
 in thread 7f151cc00640 thread_name:ms_dispatch

 ceph version 19.0.0-2011-g96925e4bc94 (96925e4bc9488c7c248702de9666f3e9b67b8862) squid (dev)
 1: /lib64/libc.so.6(+0x54df0) [0x7f153e654df0]
 2: /lib64/libc.so.6(+0xa157c) [0x7f153e6a157c]
 3: raise()
 4: abort()
 5: /lib64/libstdc++.so.6(+0xa1a01) [0x7f153eaa1a01]
 6: /lib64/libstdc++.so.6(+0xad37c) [0x7f153eaad37c]
 7: /lib64/libstdc++.so.6(+0xad3e7) [0x7f153eaad3e7]
 8: /lib64/libstdc++.so.6(+0xad649) [0x7f153eaad649]
 9: (std::__throw_invalid_argument(char const*)+0x41) [0x7f153eaa4363]
 10: (long long __gnu_cxx::__stoa<long long, long long, char, int>(long long (*)(char const*, char**, int), char const*, char const*, unsigned long*, int)+0xa7) [0x56378fd59ae3]
 11: (Client::handle_cap_trunc(MetaSession*, Inode*, boost::intrusive_ptr<MClientCaps const> const&)+0x1fa) [0x56378fcca964]
 12: (Client::handle_caps(boost::intrusive_ptr<MClientCaps const> const&)+0x284) [0x56378fd382d2]
 13: (Client::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x251) [0x56378fd4b5d3]
 14: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xb6) [0x7f153fa389a4]
 15: (DispatchQueue::entry()+0xb0b) [0x7f153fa360bb]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f153fb16aa7]
 17: (Thread::entry_wrapper()+0x3f) [0x7f153f8861a7]
 18: (Thread::_entry_func(void*)+0x9) [0x7f153f8861bf]
 19: /lib64/libc.so.6(+0x9f832) [0x7f153e69f832]
 20: /lib64/libc.so.6(+0x3f450) [0x7f153e63f450]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The bytes ^@^@^@^@^@^@^@^S are not a valid argument for std::stoll().

I am still checking exactly where the fscrypt_file gets corrupted; it seems to happen during decoding...

Actions #2

Updated by Xiubo Li about 1 month ago

Found the root cause: the fscrypt_file is not corrupted during decoding. In the kclient the fscrypt_file is a u64, and the MDS and ceph-fuse simply convert it to a vector<uint8_t>, which stoll() cannot parse directly. The following example code illustrates this: stoll() fails on the raw bytes, while std::accumulate() does not:

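#include <cstdint>
#include <iostream>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

int main()
{
    // Raw bytes of a size value, least significant byte first.
    std::vector<uint8_t> v = {0, 2, 0x3, 0, 0, 0};

    // Fold the bytes from most significant to least significant.
    long long size = std::accumulate(v.rbegin(), v.rend(), 0ll, [](long long acc, unsigned char val) {
                    return (acc << 8) + val;
                    });
    std::cout << std::hex << "size = " << size << "\n";

    // stoll() expects decimal text, so the raw bytes make it throw
    // std::invalid_argument; left uncaught, that aborts the process.
    try {
        long long bad = std::stoll(std::string(v.rbegin(), v.rend()));
        std::cout << "size = " << bad << "\n";
    } catch (const std::invalid_argument &e) {
        std::cout << "stoll failed: " << e.what() << "\n";
    }
    return 0;
}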

Actions #3

Updated by Xiubo Li about 1 month ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 56326