Bug #1148
o_direct crash in msgr
Status: closed
% Done: 0%
Description
test_sync_io makes us crash on
root@uml:~/mnt# /host/home/sage/ceph/src/test_sync_io
writing pattern
read_direct buf_align 0 offset 4190208 len 1024
read_sync buf_align 0 offset 4190208 len 1024
read_direct buf_align 0 offset 4190208 len 2048
read_sync buf_align 0 offset 4190208 len 2048
read_direct buf_align 0 offset 4190208 len 4096
read_sync buf_align 0 offset 4190208 len 4096
read_direct buf_align 0 offset 4190208 len 8192
read_sync buf_align 0 offset 4190208 len 8192
read_direct buf_align 0 offset 4190208 len 16384
read_sync buf_align 0 offset 4190208 len 16384
read_direct buf_align 0 offset 4190720 len 1024
read_sync buf_align 0 offset 4190720 len 1024
read_direct buf_align 0 offset 4190720 len 2048
read_sync buf_align 0 offset 4190720 len 2048
read_direct buf_align 0 offset 4190720 len 4096
with
#12 <signal handler called>
#13 memcpy () at arch/um/sys-x86_64/../../x86/lib/memcpy_64.S:70
#14 0x0000000060018991 in copy_to_user (to=0x9696967e34a69e00, from=0x7d0ea91b, n=448) at arch/um/kernel/skas/uaccess.c:165
#15 0x00000000601bf1a1 in memcpy_toiovec (iov=0x8110db20, kdata=0x7d0ea91b "", len=512) at net/core/iovec.c:87
#16 0x00000000601bfb0c in skb_copy_datagram_iovec (skb=0x7dcc9f28, offset=2098112827, to=0x8110db20, len=512) at net/core/datagram.c:317
#17 0x00000000601ea3a1 in tcp_recvmsg (iocb=<value optimized out>, sk=0x80978948, msg=0x8110dae0, len=512, nonblock=<value optimized out>, flags=16384, addr_len=0x8110d924) at net/ipv4/tcp.c:1693
#18 0x0000000060205eef in inet_recvmsg (iocb=0x9696967e34a69e00, sock=<value optimized out>, msg=0x1c0, size=512, flags=<value optimized out>) at net/ipv4/af_inet.c:763
#19 0x00000000601b5c10 in sock_recvmsg (sock=<value optimized out>, msg=<value optimized out>, size=<value optimized out>, flags=<value optimized out>) at net/socket.c:698
#20 0x00000000601b5c79 in kernel_recvmsg (sock=0x9696967e34a69e00, msg=0x7d0ea93b, vec=<value optimized out>, num=<value optimized out>, size=4194320, flags=4194304) at net/socket.c:767
#21 0x000000006023ed2c in ceph_tcp_recvmsg (sock=0x400010, buf=<value optimized out>, len=4194304) at net/ceph/messenger.c:258
#22 0x000000006024059d in try_read (con=0x80f3f820) at net/ceph/messenger.c:1452
#23 0x0000000060240e7e in con_work (work=0x80f3fc38) at net/ceph/messenger.c:2006
Updated by Henry Chang almost 13 years ago
Hi Sage,
I refactored stripe_read a bit and fixed the calculation of page count in ceph_osdc_new_request.
See:
https://github.com/henrycc/ceph-kclient/commit/ff68ae13de400521c1742933df3c55b8f170fe18
https://github.com/henrycc/ceph-kclient/commit/67c9611a622d40488f219258e2839f08f4c07b59
With these changes, both fsstress and your test program, test_sync_io, now pass.
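For readers unfamiliar with why a page count can go wrong on unaligned reads, here is a minimal sketch of counting the pages touched by a byte range. It is an illustration only, not the code from the commits linked above: the helper names (sketch_pages_for, sketch_pages_naive) are hypothetical, and a 4 KiB page size is assumed. The report does not say whether this is the exact error that was fixed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The real value depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [off, off + len). It takes the index of the page just past the last
 * byte and subtracts the index of the page holding the first byte, so
 * the starting offset within the first page is accounted for.
 */
static inline uint64_t sketch_pages_for(uint64_t off, uint64_t len)
{
    assert(off + len >= off); /* the range must not wrap around */

    uint64_t first = off >> SKETCH_PAGE_SHIFT;
    uint64_t end = (off + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    return end - first;
}

/*
 * Naive variant that rounds only the length up to whole pages. It
 * undercounts whenever the range starts partway into a page and
 * spills into the next one.
 */
static inline uint64_t sketch_pages_naive(uint64_t len)
{
    return (len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}
```

Using the last request in the log above (offset 4190720, len 4096): the offset is 512 bytes into a page, so sketch_pages_for(4190720, 4096) returns 2, while sketch_pages_naive(4096) returns 1. For the page-aligned offset 4190208 with the same length, both return 1.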