Bug #15290

closed

rbd journal

Added by Anonymous about 8 years ago. Updated almost 7 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I am using fio to test rbd-mirror. I am not running two rbd-mirror daemons, one within each cluster.
I run only one rbd-mirror, in cluster kk, and it fetches the journal from the other cluster, which is named ceph. When I use fio to write data to an rbd image in cluster ceph, both rbd-mirror and fio fail with an error.

rbd-mirror reports this:

--- begin dump of recent events ---
0> 2016-03-28 23:01:27.113322 7f2bbdffb700 -1 *** Caught signal (Aborted) **
in thread 7f2bbdffb700

ceph version 10.0.5-2265-ge530ade (e530ade4d4c825e6a5b036e016eff6d3affea0d7)
1: (()+0x2cf06a) [0x7f2c0606906a]
2: (()+0xf100) [0x7f2bfb695100]
3: (gsignal()+0x37) [0x7f2bfa4b35f7]
4: (abort()+0x148) [0x7f2bfa4b4ce8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x265) [0x7f2c0609d145]
6: (()+0x278031) [0x7f2c06012031]
7: (journal::JournalPlayer::get_object_player() const+0x16) [0x7f2c06012056]
8: (journal::JournalPlayer::try_pop_front(journal::Entry*, unsigned long*)+0x20a) [0x7f2c06014f3a]
9: (journal::Journaler::try_pop_front(journal::ReplayEntry*, unsigned long*)+0xbf) [0x7f2c0600653f]
10: (rbd::mirror::ImageReplayer::handle_replay_ready()+0xbe) [0x7f2c05f2003e]
11: (rbd::mirror::ImageReplayer::handle_replay_process_ready(int)+0x6c) [0x7f2c05f1c7ac]
12: (Context::complete(int)+0x9) [0x7f2c05f25519]
13: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0x7f2c0608e78e]
14: (ThreadPool::WorkThread::entry()+0x10) [0x7f2c0608f690]
15: (()+0x7dc5) [0x7f2bfb68ddc5]
16: (clone()+0x6d) [0x7f2bfa57428d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 newstore
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
1/ 5 kinetic
1/ 5 fuse
2/-2 (syslog threshold)
99/99 (stderr threshold)
max_recent 10000
max_new 1000
log_file
--- end dump of recent events ---
Aborted

fio reports this:
fio: time_based requires a runtime/timeout setting
randwrite-4k: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=64
fio-2.2.8
Starting 1 thread
rbd engine: RBD version: 0.1.10
journal/JournalPlayer.cc: In function 'const ObjectPlayers& journal::JournalPlayer::get_object_players() const' thread 7f58715f8700 time 2016-03-28 23:07:19.238363
journal/JournalPlayer.cc: 421: FAILED assert(it != m_object_players.end())
ceph version 10.0.5-2265-ge530ade (e530ade4d4c825e6a5b036e016eff6d3affea0d7)
1: (()+0x246d25) [0x7f58a8fd9d25]
2: (()+0x14bf21) [0x7f58a8edef21]
3: (()+0x14bf46) [0x7f58a8edef46]
4: (()+0x14c312) [0x7f58a8edf312]
5: (()+0x14f028) [0x7f58a8ee2028]
6: (()+0x1400aa) [0x7f58a8ed30aa]
7: (()+0xbf8c0) [0x7f58a8e528c0]
8: (()+0x5c4d9) [0x7f58a8def4d9]
9: (()+0x8b774) [0x7f58a8e1e774]
10: (()+0x2378ae) [0x7f58a8fca8ae]
11: (()+0x238780) [0x7f58a8fcb780]
12: (()+0x7dc5) [0x7f589e8f7dc5]
13: (clone()+0x6d) [0x7f589e42128d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted

Actions #1

Updated by Greg Farnum about 7 years ago

  • Project changed from Ceph to rbd
Actions #2

Updated by Jason Dillaman about 7 years ago

  • Status changed from New to Need More Info

@Tianqing Li: are you still seeing this issue? 10.0.5 was a developer release of Jewel so most likely it wasn't completely stable at that point.

Actions #3

Updated by Jason Dillaman almost 7 years ago

  • Status changed from Need More Info to Can't reproduce