Bug #36561
msg: big messenger performance regression
0%
Description
I use following code to check how fast messenger is. Request rate of current master code (commit 98fc7ebc99) is only about half of mimic code (commit f51c50eb17)
#define _FILE_OFFSET_BITS 64 #include <features.h> #include <sys/types.h> #include <sys/stat.h> #include <unistd.h> #include <fcntl.h> #include <time.h> #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <cephfs/libcephfs.h> int main(int argc, char *argv[]) { srandom(time(nullptr)); struct ceph_mount_info *cmount = NULL; ceph_create(&cmount, "admin"); ceph_conf_read_file(cmount, "ceph.conf"); ceph_mount(cmount, NULL); ceph_chdir(cmount, argv[1]); int count = 0; time_t start = time(NULL); while (true) { char buf[128]; int ret = ceph_getxattr(cmount, ".", "ceph.dir.entries", buf, sizeof(buf)); if (ret < 0) { printf("getxattr %d\n", ret); return ret; } count++; if (time(NULL) > start) { printf("%d\n", count); count = 0; start = time(NULL); } } ceph_unmount(cmount); return 0; }
It got following results when running single instance of above program,
Master:
About 2700 requests per second, mds uses about 93% cpu.
perf record -g -p <pid of mds>; perf report
Samples: 187K of event 'cycles', Event count (approx.): 80719416629 Children Self Command Shared Object Symbol + 32.78% 0.00% ms_dispatch libpthread-2.27.so [.] start_thread ◆ + 32.78% 0.00% ms_dispatch libceph-common.so.0 [.] Thread::_entry_func ▒ + 32.78% 0.00% ms_dispatch libceph-common.so.0 [.] Thread::entry_wrapper ▒ + 32.78% 0.00% ms_dispatch libceph-common.so.0 [.] DispatchQueue::DispatchThread::entry ▒ + 32.73% 0.03% ms_dispatch libceph-common.so.0 [.] DispatchQueue::entry ▒ + 29.40% 0.00% msgr-worker-0 libstdc++.so.6.0.25 [.] 0x00007fffeae81543 ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > > >::_M_run ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::thread::_Invoker<std::tuple<std::function<void ()> > >::operator() ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::thread::_Invoker<std::tuple<std::function<void ()> > >::_M_invoke<0ul> ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::__invoke<std::function<void ()>> ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::__invoke_impl<void, std::function<void ()>> ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::function<void ()>::operator() ▒ + 29.40% 0.00% msgr-worker-0 libceph-common.so.0 [.] std::_Function_handler<void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}>::_M_invoke ▒ + 29.40% 0.06% msgr-worker-0 libceph-common.so.0 [.] NetworkStack::add_thread(unsigned int)::{lambda()#1}::operator() ▒ + 29.36% 0.07% ms_dispatch libceph-common.so.0 [.] Messenger::ms_deliver_dispatch ▒ + 28.96% 0.40% msgr-worker-0 libceph-common.so.0 [.] EventCenter::process_events ▒ + 28.75% 0.02% ms_dispatch ceph-mds [.] MDSDaemon::ms_dispatch2 ▒ + 28.50% 0.01% ms_dispatch ceph-mds [.] MDSRankDispatcher::ms_dispatch ▒ + 28.45% 0.05% ms_dispatch ceph-mds [.] MDSRank::_dispatch ▒ + 27.54% 0.02% ms_dispatch ceph-mds [.] MDSRank::handle_deferrable_message ▒ + 27.49% 0.02% ms_dispatch ceph-mds [.] Server::dispatch ▒ + 27.22% 0.09% ms_dispatch ceph-mds [.] Server::handle_client_request ▒ + 22.45% 0.04% ms_dispatch ceph-mds [.] Server::dispatch_client_request ▒ + 22.27% 0.07% ms_dispatch ceph-mds [.] Server::handle_client_getattr ▒ + 15.87% 0.02% msgr-worker-0 libceph-common.so.0 [.] C_handle_read::do_request ▒ + 15.79% 0.08% msgr-worker-0 libceph-common.so.0 [.] AsyncConnection::process ▒ + 14.94% 0.02% ms_dispatch ceph-mds [.] Server::respond_to_request ▒ + 14.91% 0.02% msgr-worker-0 libceph-common.so.0 [.] std::function<void (char*, long)>::operator() ▒ + 14.87% 0.02% msgr-worker-0 libceph-common.so.0 [.] std::_Function_handler<void (char*, long), ProtocolV1::read(CtFun<ProtocolV1, char*, int>*, int, char*)::{lambda(ch▒ + 14.81% 0.09% msgr-worker-0 libceph-common.so.0 [.] ProtocolV1::read(CtFun<ProtocolV1, char*, int>*, int, char*)::{lambda(char*, int)#1}::operator() ▒ + 14.27% 0.11% ms_dispatch ceph-mds [.] Server::reply_client_request ▒ + 12.18% 0.03% msgr-worker-0 libceph-common.so.0 [.] CtFun<ProtocolV1, char*, int>::call ▒ + 12.13% 0.04% msgr-worker-0 libceph-common.so.0 [.] CtFun<ProtocolV1, char*, int>::_call<0ul, 1ul> ▒ + 10.73% 0.04% msgr-worker-0 [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe ▒ + 10.69% 0.07% msgr-worker-0 [kernel.vmlinux] [k] do_syscall_64 ▒ + 9.74% 0.03% msgr-worker-0 libceph-common.so.0 [.] C_handle_write::do_request ▒ + 9.64% 0.07% msgr-worker-0 libceph-common.so.0 [.] AsyncConnection::handle_write ▒ + 9.48% 0.15% msgr-worker-0 libceph-common.so.0 [.] ProtocolV1::handle_message_footer ▒ + 7.46% 0.11% msgr-worker-0 libceph-common.so.0 [.] ProtocolV1::write_event ▒ + 6.35% 0.05% ms_dispatch ceph-mds [.] Server::set_trace_dist ▒ + 6.22% 0.00% msgr-worker-0 [unknown] [k] 0000000000000000
Mimic:
About 5000 request per second, mds uses about 45% cpu.
perf record -g -p <pid of mds>; perf report
Samples: 220K of event 'cycles', Event count (approx.): 56214781785 Children Self Command Shared Object Symbol + 25.01% 0.00% ms_dispatch libpthread-2.27.so [.] start_thread + 25.01% 0.00% ms_dispatch libceph-common.so.0 [.] DispatchQueue::DispatchThread::entry + 24.39% 0.56% ms_dispatch libceph-common.so.0 [.] DispatchQueue::entry + 22.86% 0.09% msgr-worker-0 [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe + 22.77% 0.19% msgr-worker-0 [kernel.vmlinux] [k] do_syscall_64 + 20.27% 0.06% ms_dispatch ceph-mds [.] MDSDaemon::ms_dispatch + 20.01% 0.04% ms_dispatch ceph-mds [.] MDSRankDispatcher::ms_dispatch + 19.88% 0.08% ms_dispatch ceph-mds [.] MDSRank::_dispatch + 18.94% 0.04% ms_dispatch ceph-mds [.] MDSRank::handle_deferrable_message + 18.89% 0.03% ms_dispatch ceph-mds [.] Server::dispatch + 18.61% 0.12% ms_dispatch ceph-mds [.] Server::handle_client_request + 16.89% 0.00% msgr-worker-0 libstdc++.so.6.0.25 [.] 0x00007fffec63a543 + 16.86% 0.09% msgr-worker-0 libceph-common.so.0 [.] std::_Function_handler<void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}>::_M_invoke + 16.48% 0.61% msgr-worker-0 libceph-common.so.0 [.] EventCenter::process_events + 15.08% 0.13% ms_dispatch ceph-mds [.] Server::handle_client_getattr + 11.35% 0.00% msgr-worker-0 [unknown] [k] 0000000000000000 + 10.52% 0.19% ms_dispatch ceph-mds [.] Server::reply_client_request + 9.15% 0.05% msgr-worker-0 libpthread-2.27.so [.] __libc_sendmsg + 8.23% 0.05% msgr-worker-0 [kernel.vmlinux] [k] __sys_sendmsg + 8.14% 0.67% msgr-worker-0 libceph-common.so.0 [.] AsyncConnection::process + 8.10% 0.08% msgr-worker-0 [kernel.vmlinux] [k] ___sys_sendmsg + 7.58% 0.03% msgr-worker-0 [kernel.vmlinux] [k] sock_sendmsg + 7.49% 0.02% msgr-worker-0 [kernel.vmlinux] [k] tcp_sendmsg + 7.37% 0.17% msgr-worker-0 [kernel.vmlinux] [k] tcp_sendmsg_locked + 7.29% 0.00% msgr-worker-0 [unknown] [k] 0x0000000100000000 + 7.29% 0.18% msgr-worker-0 libc-2.27.so [.] epoll_wait + 6.87% 0.17% msgr-worker-0 libpthread-2.27.so [.] __libc_read + 6.71% 0.38% msgr-worker-0 [kernel.vmlinux] [k] __tcp_transmit_skb + 6.65% 0.07% ms_dispatch [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe + 6.58% 0.08% ms_dispatch [kernel.vmlinux] [k] do_syscall_64 + 6.52% 0.02% msgr-worker-0 [kernel.vmlinux] [k] __x64_sys_epoll_wait + 6.50% 0.11% msgr-worker-0 [kernel.vmlinux] [k] do_epoll_wait + 6.27% 0.16% msgr-worker-0 [kernel.vmlinux] [k] ep_poll + 5.92% 0.03% msgr-worker-0 [kernel.vmlinux] [k] ksys_read + 5.71% 0.08% msgr-worker-0 [kernel.vmlinux] [k] vfs_read + 5.42% 0.20% msgr-worker-0 [kernel.vmlinux] [k] __vfs_read + 5.32% 0.02% msgr-worker-0 [kernel.vmlinux] [k] __tcp_push_pending_frames + 5.30% 0.19% msgr-worker-0 [kernel.vmlinux] [k] tcp_write_xmit + 5.14% 0.30% msgr-worker-0 libceph-common.so.0 [.] AsyncConnection::handle_write
History
#1 Updated by Zheng Yan over 5 years ago
- Category set to msgr
- Priority changed from Normal to High
- Target version set to v14.0.0
#2 Updated by Zheng Yan over 5 years ago
- Subject changed from msg: big performance regression to msg: big messenger performance regression
#3 Updated by Ricardo Dias over 5 years ago
- Assignee set to Ricardo Dias
#4 Updated by Ricardo Dias over 5 years ago
@Zheng could you also provide the perf.data file for those runs?
#5 Updated by Zheng Yan over 5 years ago
perf.data relies on my environment. all symbols will get lost in your environment. do you really want it?
#6 Updated by Ricardo Dias over 5 years ago
Zheng Yan wrote:
my perf.data relies on my environment. all symbols will get lost in your environment. do you really want it?
Oh, you're right. Nevermind.
I run some quick experiments in my computer and recorded some perf data using the ceph_perf_msgr_(server|client) test application, and I only see similar results to what you reported (for master branch), if I compile in Debug mode. When I compile in Release mode, most of the time is spent on kernel functions and on the "crc32_iscsi_00" function.
In which mode did you compiled your test application?
#7 Updated by Zheng Yan over 5 years ago
Debug mode. I will try ceph compiled in Release mode.
#8 Updated by Zheng Yan over 5 years ago
I checked ceph complied in Release mode. Its performance is about the same as mimic. Sorry for the noisy.
#9 Updated by Ricardo Dias over 5 years ago
- Status changed from New to Won't Fix
#10 Updated by Greg Farnum about 5 years ago
- Project changed from Ceph to Messengers
- Category deleted (
msgr)