Bug #16077

closed

Seeing a BT while writing and re-sizing on a RBD Image in parallel, with Journaling Enabled

Added by Tanay Ganguly almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Jason Dillaman
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Created attachment 1163026: RBD Log

Description of problem:
While reproducing BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1325932
I am hitting a crash, but this time I have enabled journaling.

Version-Release number of selected component (if applicable):
ceph version 10.2.1-6.el7cp

How reproducible:
2 times

If it does not reproduce easily, repeat the same steps:
start bench-write and run the resize in parallel.

Steps to Reproduce:
1. Create an image, take a snapshot, protect it, and create a clone. Output of rbd info for the clone:
rbd image 'NEW_CLone':
    size 2000 GB in 512000 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.1254862ae8944a
    format: 2
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
    flags:
    parent: cephfs_data/NEW@snap1
    overlap: 2000 GB
    journal: 1254862ae8944a
    mirroring state: disabled

2. Run the resize script (attached as resie.py) and bench-write in parallel:
rbd bench-write -p cephfs_data --image NEW_CLone --io-size 1024 --io-pattern rand
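
The contents of the attached resie.py are not shown in this ticket, so the following is only a rough sketch of how the parallel run in step 2 could be driven. It assumes the rbd CLI is on the PATH, that a plain number passed to --size is interpreted as megabytes, and that alternating between a slightly smaller size and the original size is enough to exercise the resize path. The helper names (resize_sizes, build_resize_cmd, build_bench_cmd, reproduce) and the shrink/grow pattern are hypothetical, not taken from the original script.

```python
import subprocess


def resize_sizes(base_mb, delta_mb, rounds):
    """Yield alternating sizes: shrink by delta_mb, then grow back to base_mb."""
    for _ in range(rounds):
        yield base_mb - delta_mb
        yield base_mb


def build_resize_cmd(pool, image, size_mb):
    # Assumption: a bare number for --size is taken as MB by the rbd CLI.
    # --allow-shrink is needed for the shrinking half of each round.
    return ["rbd", "resize", "-p", pool, "--image", image,
            "--size", str(size_mb), "--allow-shrink"]


def build_bench_cmd(pool, image):
    # Same bench-write invocation as in step 2 of the report.
    return ["rbd", "bench-write", "-p", pool, "--image", image,
            "--io-size", "1024", "--io-pattern", "rand"]


def reproduce(pool, image, base_mb, delta_mb, rounds):
    """Run bench-write in the background and resize the image while it writes."""
    bench = subprocess.Popen(build_bench_cmd(pool, image))
    try:
        for size_mb in resize_sizes(base_mb, delta_mb, rounds):
            if bench.poll() is not None:
                # bench-write already exited (possibly crashed); stop resizing.
                break
            subprocess.run(build_resize_cmd(pool, image, size_mb), check=False)
    finally:
        if bench.poll() is None:
            bench.terminate()
        bench.wait()
    return bench.returncode
```

For the image above (2000 GB, i.e. 2048000 MB), a call such as reproduce("cephfs_data", "NEW_CLone", 2048000, 1024, 20) would run 20 shrink/grow rounds against the running bench-write; a negative return code indicates that bench-write was killed by a signal.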

Actual results:
The rbd client crashes: signal Aborted in thread tp_librbd (see the logs below).

Expected results:
There should not be a crash

Additional info:
Logs


-4> 2016-05-31 04:42:02.168457 7fb750ff9700 -1 librbd::AioCompletion: 0x7fb73c09f980 fail: (22) Invalid argument
-3> 2016-05-31 04:42:02.168477 7fb750ff9700 -1 librbd::AioCompletion: completed invalid aio_type: 0
-2> 2016-05-31 04:42:02.168482 7fb750ff9700 -1 librbd::journal::Replay: AIO modify op failed: (22) Invalid argument
-1> 2016-05-31 04:42:02.168487 7fb750ff9700 -1 librbd::Journal: failed to commit journal event to disk: (22) Invalid argument
0> 2016-05-31 04:42:02.169581 7fb750ff9700 -1 ** Caught signal (Aborted) *
in thread 7fb750ff9700 thread_name:tp_librbd
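
When triaging the full attached log, it can help to pull out only the entries that carry an error code. The sketch below parses lines in the shape shown above (index, timestamp, thread id, level, message) and extracts a parenthesized numeric errno when one is present. The regular expression is inferred from this excerpt alone and is an assumption, not a description of the Ceph log format in general; parse_line is a hypothetical helper.

```python
import re

# Pattern inferred from the excerpt above; other Ceph log lines may differ.
LOG_LINE = re.compile(
    r"^\s*(?P<index>-?\d+)>\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+"
    r"(?P<level>-?\d+)\s+"
    r"(?P<message>.*)$"
)
# Matches a numeric error code such as "(22) " inside the message text.
ERRNO = re.compile(r"\((\d+)\) ")


def parse_line(line):
    """Return a dict of fields for a recent-events log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    err = ERRNO.search(entry["message"])
    entry["errno"] = int(err.group(1)) if err else None
    return entry
```

Applied to the four error lines above, this would report errno 22 (Invalid argument) for entries -4, -2 and -1, and no errno for entry -3.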


Files

qemu-guest-146110.log (141 KB): RBD Log. Tanay Ganguly, 05/31/2016 05:18 AM
resie.py (486 Bytes): Resize Script. Tanay Ganguly, 05/31/2016 05:18 AM

Actions #1

Updated by Jason Dillaman almost 8 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
Actions #2

Updated by Jason Dillaman almost 8 years ago

  • Backport set to jewel
Actions #3

Updated by Jason Dillaman almost 8 years ago

  • Status changed from In Progress to Fix Under Review
Actions #4

Updated by Jason Dillaman almost 8 years ago

  • Backport deleted (jewel)

The fix is included in the PR for issue #15791 and will be backported via that tracker ticket.

Actions #5

Updated by Jason Dillaman almost 8 years ago

  • Status changed from Fix Under Review to Resolved