Bug #22609
thrash-eio + bluestore fails with "reached maximum tries (3650) after waiting for 21900 seconds"
Status: Can't reproduce
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor
Description
2018-01-06T19:31:37.799 INFO:teuthology.orchestra.run.smithi191:Running: "/bin/sh -c 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage rados --no-log-to-stderr --name client.0 -p unique_pool_0 bench 720 rand'"
2018-01-06T19:31:37.804 DEBUG:teuthology.run_tasks:Unwinding manager radosbench
2018-01-06T19:31:37.827 INFO:tasks.radosbench:joining radosbench (timing out after 21900s)
...
2018-01-06T19:31:37.853 INFO:tasks.radosbench.radosbench.0.smithi191.stdout:  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
...
2018-01-06T19:41:19.991 INFO:tasks.radosbench.radosbench.0.smithi191.stdout:  582      16    414059    414043   711.241         2  0.00481413   0.0194955
2018-01-07T01:42:56.505 INFO:tasks.radosbench.radosbench.0.smithi191.stdout:22276      16    414059    414043   18.5847         0           -   0.0194955
2018-01-07T01:43:00.507 INFO:tasks.radosbench.radosbench.0.smithi191.stdout:22280      16    414059    414043   18.5814         0           -   0.0194955
2018-01-07T01:43:02.238 ERROR:teuthology.run_tasks:Manager failed: radosbench
The cluster stopped processing requests at 19:41: the bench counters never advanced past 414043 finished ops.
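For context, the 21900 s in the error is consistent with a fixed-interval polling loop: 3650 tries at a 6 s sleep is exactly 21900 seconds. A minimal sketch of such a loop, assuming a teuthology-style join that polls the bench process (illustrative only, not the actual task code):

import time

def wait_for(condition, tries=3650, sleep=6):
    # Poll until `condition` returns True. 3650 tries * 6 s sleep
    # = 21900 s, matching the timeout quoted in the failure above.
    # Illustrative sketch only, not teuthology's actual code.
    for attempt in range(1, tries + 1):
        if condition():
            return attempt
        time.sleep(sleep)
    raise RuntimeError(
        'reached maximum tries (%d) after waiting for %d seconds'
        % (tries, tries * sleep))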
The last few log messages from osd.5 were:
2018-01-06 19:34:02.420 7f30fc281700 20 bluestore(/var/lib/ceph/osd/ceph-5) _do_read blob Blob(0x558c57e53030 blob([0x58050000~10000] csum crc32c/0x10000) use_tracker(0x10000 0x10000) SharedBlob(0x558c57e530a0 sbid 0x0)) need 0x0~10000 cache has 0x[]
2018-01-06 19:34:02.420 7f30fc281700 20 bluestore(/var/lib/ceph/osd/ceph-5) _do_read blob Blob(0x558c57e53110 blob([0x58060000~10000] csum crc32c/0x10000) use_tracker(0x10000 0x10000) SharedBlob(0x558c57e53180 sbid 0x0)) need 0x0~10000 cache has 0x[]
2018-01-06 19:34:02.420 7f30fc281700 20 bluestore(/var/lib/ceph/osd/ceph-5) _do_read blob Blob(0x558c57e531f0 blob([0x58070000~10000] csum crc32c/0x10000) use_tracker(0x10000 0x10000) SharedBlob(0x558c57e532
From osd.3:
2018-01-06 19:30:42.645 7fc99add3700 20 bluestore.blob(0x55b618d791e0) get_ref init 0x10000, 10000
2018-01-06 19:30:42.645 7fc99add3700 20 bluestore(/var/lib/ceph/osd/ceph-3) _do_alloc_write lex 0xe0000~10000: 0x0~10000 Blob(0x55b618d791e0 blob([0x5fe88000~10000] csum crc32c/0x10000) use_tracker(0x10000 0x10000) SharedBlob(0x55b618d79250 sbid 0x0))
2018-01-06 19:30:42.645 7fc
Also:
$ ls -lh remote/*/log/ceph-osd.*.log.gz
-rw-rw-r-- 1 teuthworker teuthworker 266M Jan  7 04:32 remote/smithi062/log/ceph-osd.0.log.gz
-rw-rw-r-- 1 teuthworker teuthworker 626M Jan  7 04:30 remote/smithi062/log/ceph-osd.1.log.gz
-rw-rw-r-- 1 teuthworker teuthworker 335M Jan  7 04:31 remote/smithi062/log/ceph-osd.2.log.gz
-rw-rw-r-- 1 teuthworker teuthworker  43G Jan  7 04:29 remote/smithi191/log/ceph-osd.3.log.gz
-rw-rw-r-- 1 teuthworker teuthworker 198M Jan  7 04:29 remote/smithi191/log/ceph-osd.4.log.gz
-rw-rw-r-- 1 teuthworker teuthworker 253M Jan  7 04:29 remote/smithi191/log/ceph-osd.5.log.gz
So I guess smithi191 was out of disk space by then; the osd.3 log alone is 43G.
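A pre-flight check on the log partition would catch this failure mode earlier. A minimal sketch using Python's shutil.disk_usage; the path and threshold are assumptions for illustration:

import shutil

def ensure_free_space(path='/var/log/ceph', min_free_gib=50):
    # Hypothetical guard: with debug bluestore at 20 (as in the logs
    # above), a single OSD log can grow to tens of gigabytes, so fail
    # fast if the log partition is nearly full instead of stalling
    # mid-run when the disk fills up.
    free_gib = shutil.disk_usage(path).free / 2**30
    if free_gib < min_free_gib:
        raise RuntimeError('only %.1f GiB free on %s' % (free_gib, path))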
Updated by Josh Durgin over 6 years ago
- Project changed from RADOS to bluestore
Is this bluestore not handling out-of-space conditions well?
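One way to check whether the OSDs actually hit their full ratios during a run is to poll `ceph osd df`. A rough sketch, assuming the usual JSON layout of that command (field names such as 'nodes' and 'utilization' may vary by release):

import json
import subprocess

def report_full_osds(threshold_pct=90.0):
    # Field names assumed from typical `ceph osd df --format json`
    # output; the default mon_osd_full_ratio is 0.95, so flag anything
    # approaching it.
    out = subprocess.check_output(['ceph', 'osd', 'df', '--format', 'json'])
    for node in json.loads(out).get('nodes', []):
        if node.get('utilization', 0.0) > threshold_pct:
            print('%s is %.1f%% full' % (node['name'], node['utilization']))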
Updated by Sage Weil about 6 years ago
- Status changed from New to Can't reproduce
Haven't seen this failure in quite a while; I think it may be resolved. Reopen if it reappears.