Bug #58027
op slow from throttled to header_read
Description
ceph version 15.2.7
The op spends ~500 ms going from the throttled event to header_read:
"description": "osd_repop(client.10960891.0:99657833 3.2d e4817/3582 3:b5e0c452:::rbd_data.a43b952dd64c1a.0000000000013581:head v 4817'36005738, mlcod=4817'36005736)", "initiated_at": "2022-11-15T06:39:34.428593+0000", "age": 456.631042654, "duration": 0.50463920799999995, "type_data": { "flag_point": "commit sent; apply or cleanup", "events": [ { "event": "initiated", "time": "2022-11-15T06:39:34.428593+0000", "duration": 0 }, { "event": "throttled", "time": "2022-11-15T06:39:34.428593+0000", "duration": 0.50122236099999995 }, { "event": "header_read", "time": "2022-11-15T06:39:34.929815+0000", "duration": 3.9079999999999997e-06 }, { "event": "all_read", "time": "2022-11-15T06:39:34.929819+0000", "duration": 1.6670000000000001e-06 }, { "event": "dispatched", "time": "2022-11-15T06:39:34.929821+0000", "duration": 2.6900000000000001e-06 }, { "event": "queued_for_pg", "time": "2022-11-15T06:39:34.929823+0000", "duration": 1.9052999999999999e-05 }, { "event": "reached_pg", "time": "2022-11-15T06:39:34.929842+0000", "duration": 9.4639999999999995e-06 }, { "event": "started", "time": "2022-11-15T06:39:34.929852+0000", "duration": 0.0033629210000000001 }, { "event": "commit_sent", "time": "2022-11-15T06:39:34.933215+0000", "duration": 1.7144000000000001e-05 }, { "event": "done", "time": "2022-11-15T06:39:34.933232+0000", "duration": 0.50463920799999995 } } ] } },
History
#1 Updated by Radoslaw Zarzynski about 1 year ago
- Status changed from New to Closed
Hello! The most important thing is that Octopus is EOL. Second, I'm also not sure whether this is really a bug: seeing a 0.5 s delay between these stages on an overloaded system wouldn't be a surprise. Unfortunately, the description field misses the info about the system state.
#2 Updated by yite gu about 1 year ago
Radoslaw Zarzynski wrote:
Hello! The most important thing is that Octopus is EOL. Second, I'm also not sure whether this is really a bug: seeing a 0.5 s delay between these stages on an overloaded system wouldn't be a surprise. Unfortunately, the description field misses the info about the system state.
Thanks for your reply; I'm planning to move to version 16. But regarding this question, I captured the I/O stack with perf on the msgr-worker thread:
- 5.59% 0.00% msgr-worker-2 libc-2.28.so [.] __GI___clone (inlined)
     __GI___clone (inlined)
     start_thread
   - 0x7fef46a23b72
      - 5.27% 0x56026850997b
         - EventCenter::process_events
            - 4.62% AsyncConnection::process
               - 2.55% AsyncConnection::read
                    AsyncConnection::read_until
                  - AsyncConnection::read_bulk
                     - PosixConnectedSocketImpl::read
                          __libc_read
                          entry_SYSCALL_64_after_hwframe
                          do_syscall_64
                          ksys_read
                          vfs_read
                          new_sync_read
                          sock_read_iter
                          inet_recvmsg
                        - tcp_recvmsg
                           - 2.36% skb_copy_datagram_iter
                              - __skb_datagram_iter
                                 - 1.36% _copy_to_iter
                                    - copyout
                                       - copy_user_enhanced_fast_string
                                          - page_fault
                                             - do_page_fault
                                                  do_user_addr_fault
                                                - handle_mm_fault
                                                   - 0.55% __lru_cache_add
                                                        pagevec_lru_move_fn
                                                      - __pagevec_lru_add_fn
                                                           0.52% mem_cgroup_update_lru_size
                                 - 0.99% __skb_datagram_iter
                                    - 0.68% simple_copy_to_iter
                                         __check_object_size
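(For reference, a capture like the above can be taken roughly as follows; the exact invocation is a suggestion and the thread lookup is illustrative, not taken from the original report:

# tid=$(ps -T -p 39 -o spid,comm | awk '$2 == "msgr-worker-2" {print $1}')
# perf record -g -t "$tid" -- sleep 10
# perf report --stdio

perf record -g samples call graphs for just that thread, and perf report renders the tree shown above.)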
Could page faults be the cause of this latency?
The OSD's page-fault counts:
# pidstat -p 39 -r 1
Linux 5.4.56.bsk.9-amd64 (n6-011-034)   11/20/22   _x86_64_   (128 CPU)

13:39:23   UID   PID   minflt/s   majflt/s   VSZ       RSS       %MEM   Command
13:39:24   167   39    13604.00   0.00       5238560   3206228   0.61   ceph-osd
13:39:25   167   39    7502.00    0.00       5238560   3197992   0.61   ceph-osd
13:39:26   167   39    655.00     0.00       5238560   3180840   0.60   ceph-osd
13:39:27   167   39    1588.00    0.00       5238560   3185420   0.60   ceph-osd
13:39:28   167   39    7014.00    0.00       5238560   3181972   0.60   ceph-osd
13:39:29   167   39    6058.00    0.00       5238560   3182320   0.60   ceph-osd
13:39:30   167   39    11866.00   0.00       5238560   3185644   0.60   ceph-osd
13:39:31   167   39    168.00     0.00       5238560   3180964   0.60   ceph-osd
13:39:32   167   39    4155.00    0.00       5238560   3192220   0.60   ceph-osd
13:39:33   167   39    16519.00   0.00       5238560   3196880   0.61   ceph-osd
13:39:34   167   39    493.00     0.00       5238560   3181448   0.60   ceph-osd
13:39:35   167   39    12038.00   0.00       5238560   3204436   0.61   ceph-osd
13:39:36   167   39    5057.00    0.00       5238560   3181728   0.60   ceph-osd
13:39:37   167   39    4273.00    0.00       5238560   3197488   0.61   ceph-osd
13:39:38   167   39    2989.00    0.00       5238560   3183068   0.60   ceph-osd
13:39:39   167   39    2379.00    0.00       5238560   3183604   0.60   ceph-osd
13:39:40   167   39    3536.00    0.00       5238560   3187676   0.60   ceph-osd
13:39:41   167   39    8212.00    0.00       5238560   3185692   0.60   ceph-osd
13:39:42   167   39    803.00     0.00       5238560   3185140   0.60   ceph-osd
13:39:43   167   39    3699.00    0.00       5238560   3184828   0.60   ceph-osd
13:39:44   167   39    6535.00    0.00       5238560   3184524   0.60   ceph-osd
13:39:45   167   39    349.00     0.00       5238560   3185384   0.60   ceph-osd
13:39:46   167   39    7476.00    0.00       5238560   3192228   0.60   ceph-osd
13:39:47   167   39    1294.00    0.00       5238560   3184652   0.60   ceph-osd
13:39:48   167   39    9576.00    0.00       5238560   3184756   0.60   ceph-osd
13:39:49   167   39    2484.00    0.00       5238560   3188288   0.60   ceph-osd
13:39:50   167   39    12107.00   0.00       5238560   3216644   0.61   ceph-osd
13:39:51   167   39    6238.00    0.00       5238560   3197620   0.61   ceph-osd
13:39:52   167   39    1791.00    0.00       5238560   3186704   0.60   ceph-osd
13:39:53   167   39    5636.00    0.00       5238560   3196064   0.61   ceph-osd
13:39:54   167   39    175.00     0.00       5238560   3186432   0.60   ceph-osd
13:39:55   167   39    606.00     0.00       5238560   3186968   0.60   ceph-osd
13:39:56   167   39    4737.00    0.00       5238560   3187940   0.60   ceph-osd
13:39:57   167   39    214.00     0.00       5238560   3187364   0.60   ceph-osd
13:39:58   167   39    8543.00    0.00       5238560   3205596   0.61   ceph-osd
13:39:59   167   39    7011.00    0.00       5238560   3187972   0.60   ceph-osd
13:40:00   167   39    6673.00    0.00       5238560   3193500   0.61   ceph-osd
13:40:01   167   39    15690.00   0.00       5238560   3195520   0.61   ceph-osd
13:40:02   167   39    5459.00    0.00       5238560   3205512   0.61   ceph-osd
13:40:03   167   39    8159.00    0.00       5238560   3225232   0.61   ceph-osd
13:40:04   167   39    5601.00    0.00       5238560   3237804   0.61   ceph-osd
13:40:05   167   39    5106.00    0.00       5238560   3232152   0.61   ceph-osd
13:40:06   167   39    5089.00    0.00       5238560   3235572   0.61   ceph-osd
13:40:07   167   39    7844.00    0.00       5238560   3232300   0.61   ceph-osd
13:40:08   167   39    4734.00    0.00       5238560   3238236   0.61   ceph-osd
13:40:09   167   39    568.00     0.00       5238560   3229796   0.61   ceph-osd
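The samples above are per-process; a per-thread view would show whether the msgr-worker threads themselves are the ones faulting. Possible invocations (suggestions, not from the original capture; the tid is a placeholder):

# pidstat -t -p 39 -r 1                                              # per-thread minflt/s and majflt/s
# perf stat -e minor-faults,major-faults -t <msgr-worker tid> -- sleep 10   # fault counts for one thread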