Bug #64961
ceph-fuse: crash when trying to open & trunc an encrypted file
Description
Mount a kclient and encrypt it, then use a ceph-fuse client to open & trunc a file in it. The call returns failure with a read-only filesystem error, but the size is truncated successfully anyway, and at the same time the ceph-fuse daemon crashes:
-9> 2024-03-18T14:53:28.233+0800 7fbade000640 10 client.4286 _release_fh 0x7fba800032e0 on inode 0x1000000000d.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={2=0} mode=100644 size=0/0 nlink=1 btime=2024-03-18T14:15:08.337113+0800 mtime=2024-03-18T14:53:27.906965+0800 ctime=2024-03-18T14:53:27.906965+0800 change_attr=14 caps=pAsxLsXsxFscb(0=pAsxLsXsxFscb) objectset[0x1000000000d ts 6/0 objects 0 dirty_or_tx 0] parents=0x1000000000c.head["+R3FPdfhbcIGXae7a2+ZSg"] 0x7fbaa8009220) no async_err state
-8> 2024-03-18T14:53:28.233+0800 7fbade000640 20 client.4286 put_inode on 0x1000000000d.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={2=0} mode=100644 size=0/0 nlink=1 btime=2024-03-18T14:15:08.337113+0800 mtime=2024-03-18T14:53:27.906965+0800 ctime=2024-03-18T14:53:27.906965+0800 change_attr=14 caps=pAsxLsXsxFscb(0=pAsxLsXsxFscb) objectset[0x1000000000d ts 6/0 objects 0 dirty_or_tx 0] parents=0x1000000000c.head["+R3FPdfhbcIGXae7a2+ZSg"] 0x7fbaa8009220) n = 1
-7> 2024-03-18T14:53:28.233+0800 7fbaea000640 1 -- 192.168.0.100:0/1186915634 <== mds.0 v2:192.168.0.100:6813/4235335776 12 ==== client_caps(trunc ino 0x1000000000d 62 seq 5 caps=pAsxLsXsxFscb dirty=- wanted=pAsxXsxFxwb follows 0 size 0/4194304 ts 6/0 mtime 2024-03-18T14:53:27.906965+0800 ctime 2024-03-18T14:53:27.906965+0800 change_attr 14 tws 4) v12 ==== 316+0+0 (crc 0 0 0) 0x7fbae400e7b0 con 0x5627ae355150
-6> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 mds.0 seq now 4
-5> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5540 lxb ---------------------------
-4> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5542 lxb ---------------------------
-3> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5544 lxb ---------------------------
-2> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5547 lxb ---------------------------
-1> 2024-03-18T14:53:28.233+0800 7fbaea000640 10 client.4286 handle_cap_trunc : 5549 lxb ---------------------------
0> 2024-03-18T14:53:28.239+0800 7fbaea000640 -1 *** Caught signal (Aborted) **
 in thread 7fbaea000640 thread_name:ms_dispatch

 ceph version 19.0.0-2007-g49f7bc2afcf (49f7bc2afcf1d8c2a0c49dc924490ec7189afe5e) squid (dev)
 1: /lib64/libc.so.6(+0x54df0) [0x7fbb0b254df0]
 2: /lib64/libc.so.6(+0xa157c) [0x7fbb0b2a157c]
 3: raise()
 4: abort()
 5: /lib64/libstdc++.so.6(+0xa1a01) [0x7fbb0b6a1a01]
 6: /lib64/libstdc++.so.6(+0xad37c) [0x7fbb0b6ad37c]
 7: /lib64/libstdc++.so.6(+0xad3e7) [0x7fbb0b6ad3e7]
 8: /lib64/libstdc++.so.6(+0xad649) [0x7fbb0b6ad649]
 9: (std::__throw_invalid_argument(char const*)+0x41) [0x7fbb0b6a4363]
 10: (long long __gnu_cxx::__stoa<long long, long long, char, int>(long long (*)(char const*, char**, int), char const*, char const*, unsigned long*, int)+0xa7) [0x5627acf40629]
 11: (Client::handle_cap_trunc(MetaSession*, Inode*, boost::intrusive_ptr<MClientCaps const> const&)+0x17d) [0x5627aceb18e3]
 12: (Client::handle_caps(boost::intrusive_ptr<MClientCaps const> const&)+0x284) [0x5627acf1ef14]
 13: (Client::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x251) [0x5627acf32119]
 14: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xb6) [0x7fbb0c6389a4]
 15: (DispatchQueue::entry()+0xb0b) [0x7fbb0c6360bb]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fbb0c7169a9]
 17: (Thread::entry_wrapper()+0x3f) [0x7fbb0c4861a7]
 18: (Thread::_entry_func(void*)+0x9) [0x7fbb0c4861bf]
 19: /lib64/libc.so.6(+0x9f832) [0x7fbb0b29f832]
 20: /lib64/libc.so.6(+0x3f450) [0x7fbb0b23f450]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Xiubo Li about 1 month ago
Found a new issue: when decoding the fscrypt_file in the client request we get fscrypt_file len=8 fscrypt_file =[^@,^@,^@,^@,^@,^@,^@,^@]:
2024-03-19T15:39:41.700+0800 7fdc6a800640 1 -- v2:192.168.0.106:6813/2768574324 <== client.4252 192.168.0.106:0/1007140338 53 ==== client_request(client.4252:3 create owner_uid=1000, owner_gid=1000 #0x10000000000/NONdNre1Nm8PZdX580vvNA 2024-03-19T15:39:41.698758+0800 fscrypt_file len=8 fscrypt_file =[^@,^@,^@,^@,^@,^@,^@,^@] caller_uid=1000, caller_gid=1000{10,1000,}) v6 ==== 306+0+38 (crc 0 0 0) 0x5576004f5880 con 0x557600568400
2024-03-19T15:39:41.700+0800 7fdc6a800640 4 mds.0.server handle_client_request client_request(client.4252:3 create owner_uid=1000, owner_gid=1000 #0x10000000000/NONdNre1Nm8PZdX580vvNA 2024-03-19T15:39:41.698758+0800 fscrypt_file len=8 fscrypt_file =[^@,^@,^@,^@,^@,^@,^@,^@] caller_uid=1000, caller_gid=1000{10,1000,}) v6
While the kclient just encoded 0:
<3>[24576.987630] ceph: encode_mclientrequest_tail 0000000039566a71 tid 3 r_fscrypt_file 0
And then in the ceph-fuse:
-9> 2024-03-19T15:39:59.461+0800 7f151cc00640 1 -- 192.168.0.106:0/2825619564 <== mds.0 v2:192.168.0.106:6813/2768574324 24 ==== client_caps(trunc ino 0x10000000001 4 seq 7 caps=pAsxLsXsxFscb dirty=- wanted=pAsxXsxFxwb follows 0 size 0/4194304 ts 2/0 mtime 2024-03-19T15:39:59.201395+0800 ctime 2024-03-19T15:39:59.201395+0800 change_attr 2) v12 ==== 316+0+0 (crc 0 0 0) 0x7f1518019370 con 0x5637918bc770
-8> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 mds.0 seq now 5
-7> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5540 lxb ---------------------------
-6> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5542 lxb ---------------------------
-5> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5544 lxb ---------------------------
-4> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5547 lxb ---------------------------
-3> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5549 lxb --------------------------- m->fscrypt_file[0]: ^S
-2> 2024-03-19T15:39:59.461+0800 7f151cc00640 10 client.4250 handle_cap_trunc : 5551 lxb --------------------------- string: ^@^@^@^@^@^@^@^S
-1> 2024-03-19T15:39:59.462+0800 7f1510c00640 3 client.4250 ll_flush 0x7f14c00245e0 0x10000000001
0> 2024-03-19T15:39:59.467+0800 7f151cc00640 -1 *** Caught signal (Aborted) **
 in thread 7f151cc00640 thread_name:ms_dispatch

 ceph version 19.0.0-2011-g96925e4bc94 (96925e4bc9488c7c248702de9666f3e9b67b8862) squid (dev)
 1: /lib64/libc.so.6(+0x54df0) [0x7f153e654df0]
 2: /lib64/libc.so.6(+0xa157c) [0x7f153e6a157c]
 3: raise()
 4: abort()
 5: /lib64/libstdc++.so.6(+0xa1a01) [0x7f153eaa1a01]
 6: /lib64/libstdc++.so.6(+0xad37c) [0x7f153eaad37c]
 7: /lib64/libstdc++.so.6(+0xad3e7) [0x7f153eaad3e7]
 8: /lib64/libstdc++.so.6(+0xad649) [0x7f153eaad649]
 9: (std::__throw_invalid_argument(char const*)+0x41) [0x7f153eaa4363]
 10: (long long __gnu_cxx::__stoa<long long, long long, char, int>(long long (*)(char const*, char**, int), char const*, char const*, unsigned long*, int)+0xa7) [0x56378fd59ae3]
 11: (Client::handle_cap_trunc(MetaSession*, Inode*, boost::intrusive_ptr<MClientCaps const> const&)+0x1fa) [0x56378fcca964]
 12: (Client::handle_caps(boost::intrusive_ptr<MClientCaps const> const&)+0x284) [0x56378fd382d2]
 13: (Client::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x251) [0x56378fd4b5d3]
 14: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xb6) [0x7f153fa389a4]
 15: (DispatchQueue::entry()+0xb0b) [0x7f153fa360bb]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f153fb16aa7]
 17: (Thread::entry_wrapper()+0x3f) [0x7f153f8861a7]
 18: (Thread::_entry_func(void*)+0x9) [0x7f153f8861bf]
 19: /lib64/libc.so.6(+0x9f832) [0x7f153e69f832]
 20: /lib64/libc.so.6(+0x3f450) [0x7f153e63f450]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
And the bytes ^@^@^@^@^@^@^@^S are invalid arguments for std::stoll().
I am still checking exactly where the fscrypt_file gets corrupted; it seems to happen when decoding...
Updated by Xiubo Li about 1 month ago
Found the root cause: the fscrypt_file is not corrupted when decoding. In the kclient the fscrypt_file is a u64, and the MDS and ceph-fuse simply convert it to a vector<uint8_t>, whose raw bytes cannot be parsed by stoll() directly. The following example code demonstrates this: stoll() will crash, while std::accumulate() won't:
#include <bits/stdc++.h>
using namespace std;

int main()
{
  // Raw bytes of the size value, least significant byte first.
  std::vector<uint8_t> v = {0, 2, 0x3, 0, 0, 0};

  // Walk the bytes in reverse so the most significant byte is folded in first.
  long long size = std::accumulate(v.rbegin(), v.rend(), 0ll,
      [](long long acc, unsigned char val) { return (acc << 8) + val; });
  cout << std::hex << "size = " << size << "\n";

#if 0
  // Crashes: the raw bytes are not decimal digits, so stoll() throws
  // std::invalid_argument.
  long long bad_size = stoll(string(rbegin(v), rend(v)));
  cout << "size = " << bad_size << "\n";
#endif
  return 0;
}
Updated by Xiubo Li about 1 month ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 56326