Bug #3946: rbd fsx failing in nightly - rbd - Ceph

Added by Sage Weil about 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Urgent
Assignee: Josh Durgin
Source: Q/A
Severity: 3 - minor

Description

6376 FAIL scheduled_teuthology@teuthology collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:btrfs.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writeback.yaml 294s
6380 FAIL scheduled_teuthology@teuthology collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:ext4.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writeback.yaml 249s
6381 FAIL scheduled_teuthology@teuthology collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:ext4.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writethrough.yaml 234s
6384 FAIL scheduled_teuthology@teuthology collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:xfs.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writeback.yaml 284s
6385 FAIL scheduled_teuthology@teuthology collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:xfs.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writethrough.yaml 398s
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-01-27_01:00:03-regression-master-testing-gcov$

Updated by Sage Weil about 11 years ago

Project changed from Ceph to rbd

Updated by Ian Colle about 11 years ago

Assignee set to Josh Durgin

Updated by Josh Durgin about 11 years ago

I'm guessing these are related to the recent ObjectCacher changes, since runs without caching were not affected. The core files seem to be corrupt for some reason, and there's no trace of what happened in the logs.

Updated by Josh Durgin about 11 years ago

Reproducing locally seems to confirm this: the crash is inside ObjectCacher::flush_set(), and a recent change replaced commit_set() with flush_set():

#0  0x0000000000000031 in ?? ()
#1  0x00007f52b94e8800 in ~C_GatherBuilder (this=0x7fff34b20c60, __in_chrg=<value optimized out>) at ./include/Context.h:273
#2  0x00007f52b94e34fe in ObjectCacher::flush_set (this=0x1558910, oset=0x1558dd0, onfinish=0x1675660) at osdc/ObjectCacher.cc:1501
#3  0x00007f52b94aa808 in librbd::ImageCtx::flush_cache (this=0x1558010) at librbd/ImageCtx.cc:517
#4  0x00007f52b94c887c in librbd::_flush (ictx=0x1558010) at librbd/internal.cc:2475
#5  0x00007f52b94c27b0 in librbd::ictx_refresh (ictx=0x1558010) at librbd/internal.cc:1725
#6  0x00007f52b94c0860 in librbd::ictx_check (ictx=0x1558010) at librbd/internal.cc:1530
#7  0x00007f52b94b3cfe in librbd::snap_protect (ictx=0x1558010, snap_name=0x40744a "snap") at librbd/internal.cc:528
#8  0x00007f52b948ef50 in rbd_snap_protect (image=0x1558010, snap_name=0x40744a "snap") at librbd/librbd.cc:841
#9  0x0000000000405006 in do_clone () at test/librbd/fsx.c:837
#10 0x0000000000405b31 in test () at test/librbd/fsx.c:1073
#11 0x00000000004067f1 in main (argc=2, argv=0x7fff34b231e8) at test/librbd/fsx.c:1551

This occurred after just 246 ops in test_librbd_fsx with rbd caching on.

Updated by Josh Durgin about 11 years ago

Status changed from New to Resolved

Just an extra delete in a code path in flush_set that wasn't exercised before. Fixed by commit:3bc21143552b35698c9916c67494336de8964d2a
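
For readers unfamiliar with this class of bug, here is a minimal sketch of how an extra delete on a self-deleting completion callback leads to a crash like the one in the backtrace above. This is not the Ceph code: `Callback`, `flush_sketch`, and the `pending` parameter are hypothetical names made up for illustration, and the commented-out line only marks where a second free would be a bug in this toy version.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a self-deleting completion callback.
struct Callback {
    static int live;                  // number of Callback objects currently alive
    std::function<void()> fn;

    explicit Callback(std::function<void()> f) : fn(std::move(f)) { ++live; }
    ~Callback() { --live; }

    // Runs the callback, then frees the object.
    // The caller must not use or delete the pointer afterwards.
    void complete() {
        fn();
        delete this;
    }
};
int Callback::live = 0;

// Hypothetical flush routine that takes ownership of onfinish.
// Returns true when there was nothing to flush.
bool flush_sketch(bool nothing_dirty, Callback* onfinish, Callback** pending) {
    if (nothing_dirty) {
        onfinish->complete();         // already frees onfinish
        // delete onfinish;           // an extra delete here would be a double free
        return true;
    }
    *pending = onfinish;              // ownership parked until the flush finishes
    return false;
}
```

In this sketch the early-return path is the rarely exercised one, so a stray `delete` there would go unnoticed until a workload happened to flush with nothing dirty. A double free is undefined behavior, which is consistent with symptoms such as a jump to a garbage address or unreadable core files, though the exact failure mode varies.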
