Bug #20333

RBD bench in EC pool w/ overwrites overwhelms OSDs

Added by Greg Farnum about 2 years ago. Updated about 2 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Target version: -
Start date: 06/14/2017
Due date:
% Done: 0%
Source:
Tags: EC pool, bluestore, RBD
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

When running "rbd bench-write" against an RBD image stored in an EC pool, some OSD threads start to time out and eventually suicide, and some OSDs are marked down.
The OSDs are marked up again some time later, by which point the benchmark has already stopped.

This behavior does not happen when using an RBD image in a replicated pool.

ceph version 12.0.3

Steps to reproduce:
// Assuming a cluster deployed with BlueStore OSDs

ceph osd pool create ecpool 12 12 erasure
ceph osd pool set ecpool allow_ec_overwrites true
rbd --data-pool ecpool create --size 1024G test1
rbd bench-write test1 --io-pattern=rand

Output of "ceph -s" before benchmark.

  cluster:
    id:     4508f518-5b6d-32f4-85c3-acdda7675409
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: e8: node3(active), standbys: node2, node1
    osd: 6 osds: 6 up, 6 in

  data:
    pools:   2 pools, 76 pgs
    objects: 1933 objects, 3869 MB
    usage:   12923 MB used, 23334 MB / 36257 MB avail
    pgs:     76 active+clean

  io:
    client:   211 B/s rd, 0 op/s rd, 0 op/s wr
    recovery: 3180 kB/s, 0 keys/s, 1 objects/s

Output of "rbd bench-write"

headnode:~ # rbd bench-write test1 --io-pattern=rand
2017-06-14 13:55:20.765851 7f467a432c80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore
2017-06-14 13:55:20.765972 7f467a432c80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore
rbd: bench-write is deprecated, use rbd bench --io-type write ...
2017-06-14 13:55:20.770011 7f467a432c80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore
bench  type write io_size 4096 io_threads 16 bytes 1073741824 pattern random
  SEC       OPS   OPS/SEC   BYTES/SEC
2017-06-14 13:56:25.772052 7f4667fff700  1 heartbeat_map is_healthy 'tp_librbd thread tp_librbd' had timed out after 60
^C

Output of "ceph osd tree" after benchmark.

ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.03479 root default                                     
-2 0.01160     host node1                                   
 0 0.00580         osd.0       up  1.00000          1.00000 
 3 0.00580         osd.3     down  1.00000          1.00000 
-3 0.01160     host node3                                   
 2 0.00580         osd.2     down  1.00000          1.00000 
 4 0.00580         osd.4       up  1.00000          1.00000 
-4 0.01160     host node2                                   
 1 0.00580         osd.1     down  1.00000          1.00000 
 5 0.00580         osd.5       up  1.00000          1.00000 
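
To pull out just the OSDs that went down from a listing like the one above, a small filter can help. This is a minimal sketch, not part of the original report: `down_osds` is a hypothetical helper name, and it assumes the column layout shown above (ID, WEIGHT, NAME, UP/DOWN, ...), which may differ in other Ceph releases.

```shell
# Hypothetical helper: print the names of OSDs reported as "down".
# Reads `ceph osd tree` text on stdin and assumes the column layout
# shown above, where OSD rows look like: ID WEIGHT osd.N UP/DOWN ...
down_osds() {
    # Match only OSD rows (third field is osd.<number>) whose
    # fourth field is the literal status "down".
    awk '$3 ~ /^osd\.[0-9]+$/ && $4 == "down" { print $3 }'
}

# Usage on a live cluster (assumes the ceph CLI is on PATH):
#   ceph osd tree | down_osds
```

Filtering on the status column rather than grepping for "down" avoids matching the "UP/DOWN" header row.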

I attached the logs of osd.2 and osd.3, two of the OSDs that were marked down. I ran the OSDs with --debug-osd=20/20.

osd.2.log.tar.bz2 (678 KB) Ricardo Dias, 06/14/2017 02:23 PM

osd.3.log.tar.bz2 (739 KB) Ricardo Dias, 06/14/2017 02:23 PM


Related issues

Copied from RADOS - Bug #20295: bluestore: Timeout in tp_osd_tp threads when running RBD bench in EC pool w/ overwrites Resolved 06/14/2017

History

#1 Updated by Greg Farnum about 2 years ago

  • Copied from Bug #20295: bluestore: Timeout in tp_osd_tp threads when running RBD bench in EC pool w/ overwrites added

#2 Updated by Greg Farnum about 2 years ago

Hopefully the RBD client can do something to be a little friendlier? Tracking OSD throttling improvements in the original ticket but suspect there won't be a quick fix there.

#3 Updated by Jason Dillaman about 2 years ago

  • Status changed from New to Need More Info

I'm not really sure what RBD can do in this situation. That test was only 16 concurrent IOs in-flight, so when you have N number of VMs performing IO and OSDs start crashing, ....?

#4 Updated by Greg Farnum about 2 years ago

  • Status changed from Need More Info to Rejected

Sorry, I heard today from Josh that this report involved a vstart cluster and wasn't unique to EC pools in any case.
