Bug #3238

closed

ceph-client: osd BUG_ON() tripped

Added by Alex Elder over 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development

Description

I was trying to reproduce this bug:
    http://tracker.newdream.net/issues/3204
when this BUG_ON() was hit on the client:

static void __remove_osd(struct ceph_osd_client *osdc, struct ceph_osd *osd)
{
        dout("__remove_osd %p\n", osd);
        BUG_ON(!list_empty(&osd->o_requests));    /* <------- HERE */
        rb_erase(&osd->o_node, &osdc->osds);
        list_del_init(&osd->o_osd_lru);
        ceph_con_close(&osd->o_con);
        put_osd(osd);
}

I don't have time to look into it right now, but it seems like it
might not be very hard to work out what leads up to this.

I was running with the current testing branch:
  testing  8767fe8  ceph: convert to use le32_add_cpu()

The test that led to this was repeatedly shutting down and
restarting the network interface on one of my OSD nodes while
running this "fio" command:

    fio \
        --iodepth=32 \
        --numjobs=8 \
        --runtime=5 \
        --ioengine=libaio \
        --group_reporting \
        --direct=1 \
        --eta=always \
        --name=job \
        --bs=65536 \
        --rw=w \
        --filename=/dev/rbd1

On the OSD node I ran this:

    while true; do
        ifdown eth0
        sleep 10
        ifup eth0
        sleep 10
    done
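
For anyone repeating the test, the loop above runs forever and, if interrupted during the first sleep, leaves eth0 down. A bounded variant might look like the sketch below. It is only an illustration, not what was run for this report: the function name flap_interface and the IFDOWN_CMD, IFUP_CMD and SLEEP_CMD variables are hypothetical, added so the loop can be dry-run without touching a real interface.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original report: toggle a network
# interface a fixed number of times instead of looping forever.
# The three *_CMD variables are assumptions added so the commands can be
# swapped out (for example with "echo") for a dry run.
: "${IFDOWN_CMD:=ifdown}"
: "${IFUP_CMD:=ifup}"
: "${SLEEP_CMD:=sleep}"

flap_interface() {
    iface=$1     # interface to toggle, e.g. eth0
    cycles=$2    # number of down/up cycles to perform
    delay=$3     # seconds to wait after each transition

    # If interrupted, bring the interface back up before exiting.
    trap '$IFUP_CMD "$iface"; exit 1' INT TERM

    i=0
    while [ "$i" -lt "$cycles" ]; do
        $IFDOWN_CMD "$iface"
        $SLEEP_CMD "$delay"
        $IFUP_CMD "$iface"
        $SLEEP_CMD "$delay"
        i=$((i + 1))
    done

    # Restore default signal handling once the loop has finished.
    trap - INT TERM
}
```

Calling `flap_interface eth0 30 10` would perform thirty down/up cycles with the same 10-second pauses as the loop above. Setting IFDOWN_CMD="echo down" and IFUP_CMD="echo up" beforehand prints the sequence instead of changing the interface.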

#1

Updated by Sage Weil about 11 years ago

  • Status changed from New to Resolved