Project

General

Profile

Actions

Bug #16352

closed

rbd-mirror: ensure replay status formatter has completed before stopping replay

Added by Jason Dillaman almost 8 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
High
Assignee:
Jason Dillaman
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Relevant Logs

2016-06-16 17:14:25.959655 7f5bed898700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 get_or_send_update: need to update tag cache
2016-06-16 17:14:25.959662 7f5bed898700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] send_mirror_status_update: waiting for replay status
2016-06-16 17:14:25.960090 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: retrieved tag 9: [tid=9, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 08 00 00 00 00 00 00  |................|
00000010  00 01 00 00 00 00 00 00  00                       |.........|
00000019
]
2016-06-16 17:14:25.960104 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: decoded tag 9: [mirror_uuid=, predecessor_mirror_uuid=, predecessor_tag_tid=8, predecessor_entry_tid=1]
2016-06-16 17:14:25.960107 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 send_update_tag_cache: master_tag_tid=8, mirror_tag_tid=2
2016-06-16 17:14:26.209375 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_start_replay: m_remote_journaler=[metadata=[oid=journal.d8eb79e2a9e3, initialized=1, order=24, splay_width=4, pool_id=-1, minimum_set=1, active_set=1, client_id=6faf69ce-0bb4-426a-a026-11d582c08bf2, commit_tid=0, commit_interval=5, commit_position=[positions=[]], registered_clients=[[id=, commit_position=[positions=[[object_number=5, tag_tid=9, entry_tid=1], [object_number=4, tag_tid=9, entry_tid=0], [object_number=7, tag_tid=2, entry_tid=25599], [object_number=6, tag_tid=2, entry_tid=25598]]], state=connected], [id=6faf69ce-0bb4-426a-a026-11d582c08bf2, commit_position=[positions=[[object_number=7, tag_tid=2, entry_tid=25599], [object_number=6, tag_tid=2, entry_tid=25598], [object_number=5, tag_tid=2, entry_tid=25597], [object_number=4, tag_tid=2, entry_tid=25596]]], state=connected]]]]
2016-06-16 17:14:26.224083 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: retrieved tag 8: [tid=8, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 07 00 00 00 00 00 00  |................|
00000010  00 01 00 00 00 00 00 00  00                       |.........|
00000019
]
2016-06-16 17:14:26.224097 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: decoded tag 8: [mirror_uuid=, predecessor_mirror_uuid=, predecessor_tag_tid=7, predecessor_entry_tid=1]
2016-06-16 17:14:26.224101 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 send_update_tag_cache: master_tag_tid=7, mirror_tag_tid=2
2016-06-16 17:14:26.381053 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: retrieved tag 7: [tid=7, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 06 00 00 00 00 00 00  |................|
00000010  00 01 00 00 00 00 00 00  00                       |.........|
00000019
]
2016-06-16 17:14:26.381069 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: decoded tag 7: [mirror_uuid=, predecessor_mirror_uuid=, predecessor_tag_tid=6, predecessor_entry_tid=1]
2016-06-16 17:14:26.381073 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 send_update_tag_cache: master_tag_tid=6, mirror_tag_tid=2
2016-06-16 17:14:26.399747 7f5bed898700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_ready: enter
2016-06-16 17:14:26.399751 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: retrieved tag 6: [tid=6, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 05 00 00 00 00 00 00  |................|
00000010  00 01 00 00 00 00 00 00  00                       |.........|
00000019
]
2016-06-16 17:14:26.399762 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: decoded tag 6: [mirror_uuid=, predecessor_mirror_uuid=, predecessor_tag_tid=5, predecessor_entry_tid=1]
2016-06-16 17:14:26.399767 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 send_update_tag_cache: master_tag_tid=5, mirror_tag_tid=2
2016-06-16 17:14:26.399771 7f5bed898700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] process_entry: processing entry tid=1
2016-06-16 17:14:26.399823 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_process_entry_ready: 
2016-06-16 17:14:26.399829 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_ready: enter
2016-06-16 17:14:26.399846 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] process_entry: processing entry tid=2
2016-06-16 17:14:26.399859 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_process_entry_safe: commit_tid=1, r=-85
2016-06-16 17:14:26.399861 7f5aebfff700 -1 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_process_entry_safe: failed to commit journal event: (85) Interrupted system call should be restarted
2016-06-16 17:14:26.399871 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_complete: r=-85
2016-06-16 17:14:26.399873 7f5aebfff700 -1 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_complete: replay encountered an error: (85) Interrupted system call should be restarted
2016-06-16 17:14:26.399876 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] set_state_description: -85 failed to commit journal event
2016-06-16 17:14:26.399878 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] on_stop_journal_replay: enter
2016-06-16 17:14:26.399880 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] set_state_description: 0 
2016-06-16 17:14:26.399882 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] update_mirror_image_status: 
2016-06-16 17:14:26.399885 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] start_mirror_image_status_update: shut down in-progress: ignoring update
2016-06-16 17:14:26.399887 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] shut_down: r=0
2016-06-16 17:14:26.399895 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_process_entry_safe: commit_tid=2, r=-85
2016-06-16 17:14:26.399897 7f5aebfff700 -1 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_process_entry_safe: failed to commit journal event: (85) Interrupted system call should be restarted
2016-06-16 17:14:26.399900 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_complete: r=-85
2016-06-16 17:14:26.399901 7f5aebfff700 -1 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_complete: replay encountered an error: (85) Interrupted system call should be restarted
2016-06-16 17:14:26.399904 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] set_state_description: -85 failed to commit journal event
2016-06-16 17:14:26.399906 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] on_stop_journal_replay: enter
2016-06-16 17:14:26.399918 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_process_entry_ready: 
2016-06-16 17:14:26.399919 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] handle_replay_ready: enter
2016-06-16 17:14:26.399920 7f5aebfff700 20 rbd::mirror::ImageReplayer: 0x7f5b5c11b490 [3/79b9e6f4-ece0-4a63-91ab-8dc50ca36297] on_stop_journal_replay: enter
2016-06-16 17:14:26.402128 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: retrieved tag 5: [tid=5, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 04 00 00 00 00 00 00  |................|
00000010  00 01 00 00 00 00 00 00  00                       |.........|
00000019
]
2016-06-16 17:14:26.402170 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 handle_update_tag_cache: decoded tag 5: [mirror_uuid=, predecessor_mirror_uuid=, predecessor_tag_tid=4, predecessor_entry_tid=1]
2016-06-16 17:14:26.402178 7f5b7a7fc700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7f5ae426d680 send_update_tag_cache: master_tag_tid=4, mirror_tag_tid=2

Backtrace

#0  0x00007f5bf2d59fcb in raise () from /lib64/libpthread.so.0
#1  0x00007f5bfd8305a5 in reraise_fatal (signum=11) at global/signal_handler.cc:71
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:133
#3  <signal handler called>
#4  0x00007f5bfd7c9f9e in journal::JournalMetadata::get_tag (this=0x0, tag_tid=tag_tid@entry=4, tag=tag@entry=0x7f5ae426d728, 
    on_finish=on_finish@entry=0x7f595355ffa0) at journal/JournalMetadata.cc:578
#5  0x00007f5bfd7c2179 in journal::Journaler::get_tag (this=<optimized out>, tag_tid=tag_tid@entry=4, tag=tag@entry=0x7f5ae426d728, 
    on_finish=on_finish@entry=0x7f595355ffa0) at journal/Journaler.cc:303
#6  0x00007f5bfd68d731 in rbd::mirror::image_replayer::ReplayStatusFormatter<librbd::ImageCtx>::send_update_tag_cache (
    this=this@entry=0x7f5ae426d680, master_tag_tid=4, mirror_tag_tid=mirror_tag_tid@entry=2)
    at tools/rbd_mirror/image_replayer/ReplayStatusFormatter.cc:174
#7  0x00007f5bfd68f8e1 in rbd::mirror::image_replayer::ReplayStatusFormatter<librbd::ImageCtx>::handle_update_tag_cache (
    this=0x7f5ae426d680, master_tag_tid=5, mirror_tag_tid=2, r=<optimized out>)
    at tools/rbd_mirror/image_replayer/ReplayStatusFormatter.cc:207
#8  0x00007f5bfd65b50a in FunctionContext::finish (this=<optimized out>, r=<optimized out>)
    at /usr/include/boost/function/function_template.hpp:767
#9  0x00007f5bfd659209 in Context::complete (this=0x7f59daab77f0, r=<optimized out>) at include/Context.h:64
#10 0x00007f5bfd659209 in Context::complete (this=0x7f5953752e20, r=<optimized out>) at include/Context.h:64
#11 0x00007f5bfd7c5a17 in handle_get_tag (r=<optimized out>, this=0x7f5953752e20) at journal/JournalMetadata.cc:245
#12 journal::utils::rados_state_callback<journal::(anonymous namespace)::C_GetTag, &journal::(anonymous namespace)::C_GetTag::handle_get_tag> (c=<optimized out>, arg=0x7f5953752e20) at journal/Utils.h:36
#13 0x00007f5bf3b8665d in librados::C_AioSafe::finish (this=0x7f5a28964fe0, r=<optimized out>) at librados/AioCompletionImpl.h:199
#14 0x00007f5bf3b6eb49 in Context::complete (this=0x7f5a28964fe0, r=<optimized out>) at include/Context.h:64
#15 0x00007f5bf3c58646 in Finisher::finisher_thread_entry (this=0x7f5c071dec20) at common/Finisher.cc:68
#16 0x00007f5bf2d52dc5 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f5bf1c38ced in clone () from /lib64/libc.so.6

Related issues 1 (0 open1 closed)

Copied to rbd - Backport #16371: jewel: rbd-mirror: ensure replay status formatter has completed before stopping replayResolvedLoïc DacharyActions
Actions #1

Updated by Jason Dillaman almost 8 years ago

  • Description updated (diff)
Actions #2

Updated by Jason Dillaman almost 8 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
Actions #3

Updated by Jason Dillaman almost 8 years ago

  • Status changed from In Progress to Fix Under Review
  • Backport set to jewel
Actions #4

Updated by Mykola Golub almost 8 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #5

Updated by Nathan Cutler almost 8 years ago

  • Copied to Backport #16371: jewel: rbd-mirror: ensure replay status formatter has completed before stopping replay added
Actions #6

Updated by Loïc Dachary over 7 years ago

  • Status changed from Pending Backport to Resolved
Actions

Also available in: Atom PDF