Bug #61622

open

bdev(0x55fbdb181180 /var/lib/ceph/osd/ceph-0/block) _aio_thread got r=-28 ((28) No space left on device)

Added by Xiubo Li 11 months ago. Updated 8 months ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

https://pulpito.ceph.com/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298111/
https://pulpito.ceph.com/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298113/

2023-06-08T10:29:08.849 INFO:tasks.ceph.osd.1.smithi070.stderr:2023-06-08T10:29:08.842+0000 7fcf7d52d640 -1 Fail to open '/proc/578558/cmdline' error = (2) No such file or directory
2023-06-08T10:29:08.849 INFO:tasks.ceph.osd.1.smithi070.stderr:2023-06-08T10:29:08.842+0000 7fcf7d52d640 -1 received  signal: Hangup from <unknown> (PID: 578558) UID: 0
2023-06-08T10:29:08.850 INFO:tasks.ceph.osd.3.smithi070.stderr:2023-06-08T10:29:08.846+0000 7fc2e70b4640 -1 Fail to open '/proc/578558/cmdline' error = (2) No such file or directory
2023-06-08T10:29:08.850 INFO:tasks.ceph.osd.3.smithi070.stderr:2023-06-08T10:29:08.846+0000 7fc2e70b4640 -1 received  signal: Hangup from <unknown> (PID: 578558) UID: 0
2023-06-08T10:29:08.881 INFO:tasks.ceph.osd.2.smithi070.stderr:2023-06-08T10:29:08.846+0000 7f05e45c9640 -1 Fail to open '/proc/578558/cmdline' error = (2) No such file or directory
2023-06-08T10:29:08.881 INFO:tasks.ceph.osd.2.smithi070.stderr:2023-06-08T10:29:08.846+0000 7f05e45c9640 -1 received  signal: Hangup from <unknown> (PID: 578558) UID: 0
2023-06-08T10:29:08.898 INFO:tasks.ceph.osd.0.smithi070.stderr:2023-06-08T10:29:08.842+0000 7f0da9555640 -1 Fail to open '/proc/578558/cmdline' error = (2) No such file or directory
2023-06-08T10:29:08.898 INFO:tasks.ceph.osd.0.smithi070.stderr:2023-06-08T10:29:08.842+0000 7f0da9555640 -1 received  signal: Hangup from <unknown> (PID: 578558) UID: 0
2023-06-08T10:41:48.405 INFO:tasks.ceph.osd.0.smithi070.stderr:2023-06-08T10:41:48.336+0000 7f0d9f35d640 -1 bdev(0x55fbdb181180 /var/lib/ceph/osd/ceph-0/block) _aio_thread got r=-28 ((28) No space left on device)
2023-06-08T10:41:48.705 INFO:tasks.ceph.osd.0.smithi070.stderr:./src/blk/kernel/KernelDevice.cc: In function 'void KernelDevice::_aio_thread()' thread 7f0d9f35d640 time 2023-06-08T10:41:48.370137+0000
2023-06-08T10:41:48.705 INFO:tasks.ceph.osd.0.smithi070.stderr:./src/blk/kernel/KernelDevice.cc: 633: ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. Please check your kernel log!")
2023-06-08T10:41:48.705 INFO:tasks.ceph.osd.0.smithi070.stderr: ceph version 18.0.0-4313-ga02638f8 (a02638f808b8fe11bccdd01222c0e6e2eb71120b) reef (dev)
2023-06-08T10:41:48.705 INFO:tasks.ceph.osd.0.smithi070.stderr: 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xc6) [0x55fbd824d360]
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr: 2: (KernelDevice::_aio_thread()+0x9bc) [0x55fbd8b0746c]
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr: 3: ceph-osd(+0x157dd21) [0x55fbd8b07d21]
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr: 4: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f0dace94b43]
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr: 5: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f0dacf26a00]
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr:*** Caught signal (Aborted) **
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr: in thread 7f0d9f35d640 thread_name:bstore_aio
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr:2023-06-08T10:41:48.680+0000 7f0d9f35d640 -1 ./src/blk/kernel/KernelDevice.cc: In function 'void KernelDevice::_aio_thread()' thread 7f0d9f35d640 time 2023-06-08T10:41:48.370137+0000
2023-06-08T10:41:48.706 INFO:tasks.ceph.osd.0.smithi070.stderr:./src/blk/kernel/KernelDevice.cc: 633: ceph_abort_msg("Unexpected IO error. This may suggest a hardware issue. Please check your kernel log!")
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr:
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr: ceph version 18.0.0-4313-ga02638f8 (a02638f808b8fe11bccdd01222c0e6e2eb71120b) reef (dev)
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr: 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xc6) [0x55fbd824d360]
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr: 2: (KernelDevice::_aio_thread()+0x9bc) [0x55fbd8b0746c]
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr: 3: ceph-osd(+0x157dd21) [0x55fbd8b07d21]
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr: 4: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f0dace94b43]
2023-06-08T10:41:48.707 INFO:tasks.ceph.osd.0.smithi070.stderr: 5: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f0dacf26a00]
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr:
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr: ceph version 18.0.0-4313-ga02638f8 (a02638f808b8fe11bccdd01222c0e6e2eb71120b) reef (dev)
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr: 1: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f0dace42520]
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr: 2: pthread_kill()
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr: 3: raise()
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr: 4: abort()
2023-06-08T10:41:48.708 INFO:tasks.ceph.osd.0.smithi070.stderr: 5: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x17b) [0x55fbd824d415]
2023-06-08T10:41:48.709 INFO:tasks.ceph.osd.0.smithi070.stderr: 6: (KernelDevice::_aio_thread()+0x9bc) [0x55fbd8b0746c]
2023-06-08T10:41:48.709 INFO:tasks.ceph.osd.0.smithi070.stderr: 7: ceph-osd(+0x157dd21) [0x55fbd8b07d21]
2023-06-08T10:41:48.709 INFO:tasks.ceph.osd.0.smithi070.stderr: 8: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f0dace94b43]
2023-06-08T10:41:48.709 INFO:tasks.ceph.osd.0.smithi070.stderr: 9: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f0dacf26a00]
2023-06-08T10:41:48.787 INFO:tasks.ceph.osd.0.smithi070.stderr:2023-06-08T10:41:48.756+0000 7f0d9f35d640 -1 *** Caught signal (Aborted) **
2023-06-08T10:41:48.787 INFO:tasks.ceph.osd.0.smithi070.stderr: in thread 7f0d9f35d640 thread_name:bstore_aio
2023-06-08T10:41:48.787 INFO:tasks.ceph.osd.0.smithi070.stderr:
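
For reference, the r=-28 in the _aio_thread line is a negated errno value, and 28 is ENOSPC on Linux, which matches the "No space left on device" text in the same message. A minimal sketch of decoding such a line is below. The helper name decode_aio_result is hypothetical and not part of Ceph, and the regular expression only assumes the message layout seen in this log.

```python
import errno
import os
import re

# Log fragment copied from the osd.0 stderr output above.
LINE = ("bdev(0x55fbdb181180 /var/lib/ceph/osd/ceph-0/block) "
        "_aio_thread got r=-28 ((28) No space left on device)")


def decode_aio_result(line):
    """Return (r, errno_name, message) for an '_aio_thread got r=N' line.

    Hypothetical helper for reading logs; returns None if the line does
    not contain the expected fragment.
    """
    match = re.search(r"_aio_thread got r=(-?\d+)", line)
    if match is None:
        return None
    r = int(match.group(1))
    if r >= 0:
        # Non-negative results are not errors, so there is nothing to decode.
        return (r, None, None)
    code = -r
    name = errno.errorcode.get(code, "UNKNOWN")
    return (r, name, os.strerror(code))


if __name__ == "__main__":
    print(decode_aio_result(LINE))
```

This only confirms what the error code means. It does not say whether the ENOSPC came from a genuinely full device or from a fault lower in the stack.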

Actions #1

Updated by Radoslaw Zarzynski 11 months ago

  • Project changed from RADOS to bluestore
  • Status changed from New to Need More Info

1. Has it been observed also on main or solely on the testing branch?
2. This truly might be a hardware issue. Is the dmesg clean?

Actions #2

Updated by Xiubo Li 11 months ago

Radoslaw Zarzynski wrote:

1. Has it been observed also on main or solely on the testing branch?

It can be seen on the main branch too.

2. This truly might be a hardware issue. Is the dmesg clean?

Is it possible that the disk space was used up? This could be hit when running the xfstests-dev test suite, which runs hundreds of test cases in each teuthology job.

Thanks
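
One way to test the "disk space used up" theory on a test node is to sample free space on the storage backing the OSDs while the job runs. The sketch below is only an illustration under assumptions: the function name usage_report and the 5% threshold are hypothetical, and shutil.disk_usage reports on the filesystem containing the given path, so it does not reflect BlueStore's own allocation on a raw block device (for that, `ceph osd df` is the usual place to look).

```python
import shutil


def usage_report(path, warn_below=0.05):
    """Summarize free space for the filesystem that contains `path`.

    Hypothetical helper; `warn_below` is an arbitrary threshold expressed
    as a fraction of total capacity.
    """
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        "path": path,
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "free_fraction": free_fraction,
        "low": free_fraction < warn_below,
    }


if __name__ == "__main__":
    # "." is a placeholder; on a test node this would be the mount point
    # of whatever filesystem or volume backs the OSD data.
    report = usage_report(".")
    print(report)
```

Sampling this periodically during an xfstests-dev run would show whether free space trends toward zero before the abort.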

Actions #3

Updated by Adam Kupczyk 8 months ago

  • Priority changed from Urgent to Normal