Bug #8100
closed
Rados Bench seq read errors on tiered configuration
Description
Running the following rados bench seq test against a tiered pool setup, which had data written to it via rados bench, results in the following error:
nhm@burnupiY:~$ /usr/bin/rados -c /tmp/cbt/ceph/ceph.conf -p rados-bench-`hostname -s`-3 -b 4096 bench 300 seq --concurrent-ios 32 --no-cleanup
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     0       0         0         0         0         0         -         0
read got 4096
error during benchmark: -5
error 5: (5) Input/output error
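The -5 in the output above is a negated errno value; 5 is EIO, which matches the "Input/output error" text on the last line. Since these runs are automated, a minimal Python sketch for picking the failure out of captured output might look like the following. The helper name `parse_bench_error` is hypothetical (it is not part of rados or cbt), and it assumes the error line keeps the format shown in this report.

```python
import errno
import re

# Matches the failure line that rados bench printed in this report.
_BENCH_ERROR = re.compile(r"error during benchmark: (-?\d+)")


def parse_bench_error(output):
    """Return (code, errno_name) for the first benchmark error, or None.

    Hypothetical helper: assumes the output format shown in this report.
    """
    match = _BENCH_ERROR.search(output)
    if match is None:
        return None
    code = int(match.group(1))
    # The code is a negated errno value, so -5 maps to EIO.
    return code, errno.errorcode.get(abs(code), "UNKNOWN")


sample = """read got 4096
error during benchmark: -5
error 5: (5) Input/output error"""

print(parse_bench_error(sample))
```

Returning None when no error line is present lets a test harness tell a clean run apart from a failed one without depending on the exit status alone.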
ceph -s shows all OSDs are up and in, but some cache pools are at/near their target max:
nhm@burnupiY:~$ ceph -s
    cluster 72746997-0aea-479e-bc48-6153b319cf35
     health HEALTH_WARN 'rados-bench-burnupiY-0-cache' at/near target max; 'rados-bench-burnupiY-1-cache' at/near target max; 'rados-bench-burnupiY-2-cache' at/near target max; 'rados-bench-burnupiY-3-cache' at/near target max
     monmap e1: 1 mons at {a=192.168.10.2:6789/0}, election epoch 2, quorum 0 a
     osdmap e220: 36 osds: 36 up, 36 in
      pgmap v1977: 21248 pgs, 11 pools, 278 GB data, 703 kobjects
            694 GB used, 27907 GB / 28602 GB avail
               21248 active+clean
  client io 6747 kB/s wr, 131 op/s
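To see at a glance which cache pools the warning refers to, the health line can be split into pool names. This is only a sketch: the function name `pools_near_target_max` is hypothetical, and it assumes the warning wording stays exactly as printed by `ceph -s` above.

```python
import re


def pools_near_target_max(health_line):
    """List pool names flagged as at/near target max in a health line.

    Hypothetical helper: assumes the quoting and wording shown in this report.
    """
    return re.findall(r"'([^']+)' at/near target max", health_line)


health = (
    "health HEALTH_WARN 'rados-bench-burnupiY-0-cache' at/near target max; "
    "'rados-bench-burnupiY-1-cache' at/near target max; "
    "'rados-bench-burnupiY-2-cache' at/near target max; "
    "'rados-bench-burnupiY-3-cache' at/near target max"
)

for pool in pools_near_target_max(health):
    print(pool)
```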
Not much to go on yet, but seems to be repeatable. Will do more testing.
Updated by Mark Nelson about 10 years ago
This appears to be happening on non-tiered pools as well, regardless of whether erasure coding or replication is used.
Updated by Greg Farnum about 10 years ago
Did you check for typos? :p Right pool name? That "-3" looks easy to get wrong.
Updated by Mark Nelson about 10 years ago
It's all automated, though I did try manually testing reads from the command line as well. FWIW, with debugging enabled, a number of reads seem to succeed with ondisk = 0 before we hit the Input/output error.
Updated by Mark Nelson about 10 years ago
Through some bisecting and a well-informed guess by Yehuda, it appears that this is being caused by d99f1d9f.
Updated by David Zafman about 10 years ago
- Status changed from New to 7
- Assignee set to David Zafman
Updated by David Zafman about 10 years ago
- Status changed from 7 to Resolved
a3d452acdf2fcf9ad10002c5f24c2548d12952bd