Bug #11298

open

aio gets EPERM when update-grub runs

Added by Igor Megov about 9 years ago. Updated over 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rbd
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

osd.38 suddenly disappears from the cluster at 12:50. Here is a piece of the osd log; the whole log is attached to the issue.

2015-04-01 12:50:54.807675 7fb75fbb1700 -1 journal aio to 2809413632~12288 got (1) Operation not permitted
2015-04-01 12:50:54.809220 7fb75fbb1700 -1 os/FileJournal.cc: In function 'void FileJournal::write_finish_thread_entry()' thread 7fb75fbb1700 time 2015-04-01 12:50:54.807703
os/FileJournal.cc: 1357: FAILED assert(0 == "unexpected aio error")

 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: (FileJournal::write_finish_thread_entry()+0x947) [0xa91467]
 2: (FileJournal::WriteFinisher::entry()+0xd) [0x9cc3bd]
 3: (()+0x80a4) [0x7fb76a2f40a4]
 4: (clone()+0x6d) [0x7fb768e84c2d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

We have a 5-node cluster in production; journals are on SSD.
Ceph version is 0.80.7-1~bpo70+1 running on Debian GNU/Linux 7.6 (wheezy)
with a 3.16-2-amd64 kernel, built from the jessie/sid source package.

I think I can guess the reason for the failure: I added the "elevator=deadline"
parameter to the grub config and ran "update-grub" after that. Grub itself
scans block devices for partitions and filesystems, and maybe it locks
the journal partition in some way. The Ceph journaling code got a "not permitted"
error and exited abnormally.
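For illustration, a minimal sketch of how this hypothesis could play out (hypothetical device path; this is not Ceph or grub code): on current kernels, flipping a block device read-only with the BLKROSET ioctl, which is what blockdev --setro does, makes writes fail with EPERM even through file descriptors that were opened read-write, matching the "(1) Operation not permitted" in the log. Comment #17 below reaches the same mechanism via os-prober.

/* Sketch only -- run against a scratch device. /dev/sdX1 is a placeholder
 * for a journal partition; setting it RO requires CAP_SYS_ADMIN. */
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKROSET */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/sdX1";
    int fd = open(dev, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* What a partition scanner's "lock" can amount to: mark the device RO. */
    int ro = 1;
    if (ioctl(fd, BLKROSET, &ro) < 0) { perror("BLKROSET"); return 1; }

    /* The write now fails even though our fd was opened read-write. */
    char buf[4096] = {0};
    if (write(fd, buf, sizeof(buf)) < 0)
        fprintf(stderr, "write: %s (errno=%d)\n", strerror(errno), errno);
        /* expected: "write: Operation not permitted (errno=1)" */

    ro = 0;
    ioctl(fd, BLKROSET, &ro);   /* restore read-write */
    close(fd);
    return 0;
}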


Files

ceph-osd.38.log.1.bz2 (154 KB) - OSD log on crash - Igor Megov, 04/01/2015 09:53 AM
Actions #1

Updated by Sage Weil almost 9 years ago

Does this still crash, or was it a one-time thing? I can't see why the aio code would return EPERM. :/

Actions #2

Updated by Igor Megov almost 9 years ago

Sage Weil wrote:

Does this still crash, or was it a one-time thing?

No. It works, and I think it was a one-time issue.

I can't see why the aio code would return EPERM. :/

As I wrote above, the crash happened at the same time I
ran 'update-grub'. So maybe there is some kind of contention
issue between the osd daemon, writing to the journal partition,
and grub, scanning block devices to build its device map?

Actions #3

Updated by Loïc Dachary almost 9 years ago

  • Status changed from New to Need More Info
  • Regression set to No

Do you have a way to reproduce this problem? Did the OSD restart OK after the crash? It could very well be that update-grub interferes with ongoing operations, in which case that would be an update-grub bug to report. But if it can be reproduced, maybe we could find a workaround for Ceph. What do you think?

Actions #4

Updated by Igor Megov almost 9 years ago

Loic Dachary wrote:

Do you have a way to reproduce this problem?

The way is quite simple: I got an OSD crash when I ran update-initramfs and update-grub.

What are, in theory, the consequences of a journal write error? Is it reported to the
upper levels (qemu, guest VM) as a generic disk write error? Could data be silently
corrupted?

If you need more detailed info, I'll try to reproduce the bug in a day or two.
But the cluster is under production load, so I don't want to experiment on it too much,
and I have a slight fear for data consistency.

Did the OSD restart OK after the crash?

Yes, the OSD restarted without any problems. Further scrubs also did not
find any errors.

It could very well be that update-grub interferes with ongoing operations, in which case
that would be an update-grub bug to report.

But if it can be reproduced, maybe we could find a workaround for Ceph. What do you think?

Let's try! What feedback do you expect from me?

Actions #5

Updated by Igor Megov almost 9 years ago

Oh! I forgot that a month or so ago the same crash happened one more time,
also on update-grub. Unfortunately the logs have been overwritten by now.

Actions #6

Updated by Sage Weil almost 9 years ago

  • Subject changed from Assert on journal aio to aio gets EPERM when update-grub runs
  • Source changed from other to Community (user)
Actions #7

Updated by Daniel Swarbrick over 8 years ago

I saw this same problem take out 4 OSDs on one cluster, and 5 on another, this morning whilst apt-get installing a kernel update (and presumably triggered by the subsequent update-initramfs / update-grub). All systems are Debian Jessie, running Hammer 0.94.2.

2015-07-23 08:09:15.544183 7f180a48d700 -1 journal aio to 1979523072~528384 wrote 18446744073709551615
2015-07-23 08:09:15.546550 7f180a48d700 -1 os/FileJournal.cc: In function 'void FileJournal::write_finish_thread_entry()' thread 7f180a48d700 time 2015-07-23 08:09:15.544219
os/FileJournal.cc: 1426: FAILED assert(0 == "unexpected aio error")

...

   -10> 2015-07-23 08:09:15.579774 7f17edd6b700  1 -- 10.3.1.31:6813/35194 <== osd.11 10.3.1.33:6813/3774 3180222 ==== osd_repop_reply(client.1496973.0:12306923 2.18b ondisk, result = 0) v1 ==== 83+0+0 (887356516 0 0) 0x19858900 con 0x15219760
    -9> 2015-07-23 08:09:15.579815 7f17edd6b700  5 -- op tracker -- seq: 25769856, time: 2015-07-23 08:09:15.579652, event: header_read, op: osd_repop_reply(client.1496973.0:12306923 2.18b ondisk, result = 0)
    -8> 2015-07-23 08:09:15.579825 7f17edd6b700  5 -- op tracker -- seq: 25769856, time: 2015-07-23 08:09:15.579656, event: throttled, op: osd_repop_reply(client.1496973.0:12306923 2.18b ondisk, result = 0)
    -7> 2015-07-23 08:09:15.579833 7f17edd6b700  5 -- op tracker -- seq: 25769856, time: 2015-07-23 08:09:15.579760, event: all_read, op: osd_repop_reply(client.1496973.0:12306923 2.18b ondisk, result = 0)
    -6> 2015-07-23 08:09:15.579840 7f17edd6b700  5 -- op tracker -- seq: 25769856, time: 0.000000, event: dispatched, op: osd_repop_reply(client.1496973.0:12306923 2.18b ondisk, result = 0)
    -5> 2015-07-23 08:09:15.580920 7f17ee16f700  1 -- 10.3.1.31:6813/35194 <== osd.9 10.3.1.33:6801/3084 1545337 ==== osd_repop_reply(client.1496973.0:12306742 2.124 ondisk, result = 0) v1 ==== 83+0+0 (759416097 0 0) 0x1bb4b3c0 con 0x15218b00
    -4> 2015-07-23 08:09:15.580949 7f17ee16f700  5 -- op tracker -- seq: 25769857, time: 2015-07-23 08:09:15.580812, event: header_read, op: osd_repop_reply(client.1496973.0:12306742 2.124 ondisk, result = 0)
    -3> 2015-07-23 08:09:15.580961 7f17ee16f700  5 -- op tracker -- seq: 25769857, time: 2015-07-23 08:09:15.580815, event: throttled, op: osd_repop_reply(client.1496973.0:12306742 2.124 ondisk, result = 0)
    -2> 2015-07-23 08:09:15.580968 7f17ee16f700  5 -- op tracker -- seq: 25769857, time: 2015-07-23 08:09:15.580911, event: all_read, op: osd_repop_reply(client.1496973.0:12306742 2.124 ondisk, result = 0)
    -1> 2015-07-23 08:09:15.580975 7f17ee16f700  5 -- op tracker -- seq: 25769857, time: 0.000000, event: dispatched, op: osd_repop_reply(client.1496973.0:12306742 2.124 ondisk, result = 0)
     0> 2015-07-23 08:09:15.661407 7f180a48d700 -1 *** Caught signal (Aborted) **
 in thread 7f180a48d700

 ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
 1: /usr/bin/ceph-osd() [0xb1b383]
 2: (()+0xf8d0) [0x7f18161d78d0]
 3: (gsignal()+0x37) [0x7f181467b107]
 4: (abort()+0x148) [0x7f181467c4e8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f1814f66b3d]
 6: (()+0x5ebb6) [0x7f1814f64bb6]
 7: (()+0x5ec01) [0x7f1814f64c01]
 8: (()+0x5ee19) [0x7f1814f64e19]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x247) [0xc19447]
 10: (FileJournal::write_finish_thread_entry()+0x685) [0xab8c35]
 11: (FileJournal::WriteFinisher::entry()+0xd) [0x93c25d]
 12: (()+0x80a4) [0x7f18161d00a4]
 13: (clone()+0x6d) [0x7f181472c04d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Actions #8

Updated by Loïc Dachary about 8 years ago

  • Related to Bug #12100: OSD crash, uexpected aio error in FileJournal.cc added
Actions #9

Updated by Loïc Dachary about 8 years ago

I can't see why the aio code would return EPERM. :/

The error message was bogus in firefly and was fixed later on, hence the confusing EPERM: https://github.com/ceph/ceph/commit/7827e0035e3350ad2d9230f27a1629545f53af5c

Actions #10

Updated by Loïc Dachary about 8 years ago

  • Status changed from Need More Info to Duplicate

I think this is a duplicate of #9570.

The fact that it happens when update-grub runs is interesting. I vaguely remember there was a Jessie-specific issue that caused problems with ongoing I/O during package upgrades. update-grub being run? partprobe being run indirectly? I can't remember, but maybe it will come back to me; it was about two years ago.

Actions #11

Updated by Loïc Dachary about 8 years ago

  • Is duplicate of Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio added
Actions #12

Updated by Ilya Dryomov about 8 years ago

  • Status changed from Duplicate to 12

This isn't related to, let alone a duplicate of, #9570. The result (to the right of "wrote") is correct both here and in #9570; ai->off and ai->len (to the left of "wrote") are correct here, but corrupt due to a use-after-free in #9570. io_getevents() can return an event with res set to anything file->f_op->write_iter() can fail with, which is a lot of errors. This needs more investigation: what exactly is update-grub doing, and why is it able to do it to a ceph journal device?
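A minimal libaio sketch of that completion-side pattern (my simplified stand-in, not the FileJournal code): io_getevents() hands back each completion's result in io_event.res, the byte count on success or a negative errno on failure. libaio declares res as unsigned long, so -EPERM (that is, -1) prints as 18446744073709551615, which is exactly the "wrote 18446744073709551615" in the Hammer log above.

/* Sketch only (hypothetical device path; build with: gcc sketch.c -laio).
 * Run against a scratch device, or one set read-only first as in the
 * description's sketch, so the write fails harmlessly. */
#define _GNU_SOURCE     /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/sdX1";
    int fd = open(dev, O_RDWR | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    if (io_setup(8, &ctx) < 0) { perror("io_setup"); return 1; }

    /* O_DIRECT needs an aligned buffer, as FileJournal uses. */
    void *buf;
    if (posix_memalign(&buf, 4096, 4096)) return 1;
    memset(buf, 0, 4096);

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pwrite(&cb, fd, buf, 4096, 0);
    if (io_submit(ctx, 1, cbs) != 1) { perror("io_submit"); return 1; }

    struct io_event ev;
    if (io_getevents(ctx, 1, 1, &ev, NULL) == 1) {
        long res = (long)ev.res;   /* bytes written, or a negative errno */
        if (res < 0)
            fprintf(stderr, "aio write failed: %s\n", strerror((int)-res));
        else
            printf("aio wrote %ld bytes\n", res);
        /* FileJournal instead hit assert(0 == "unexpected aio error") here. */
    }
    io_destroy(ctx);
    free(buf);
    close(fd);
    return 0;
}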

Actions #13

Updated by Ilya Dryomov about 8 years ago

  • Is duplicate of deleted (Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio)
Actions #14

Updated by Ilya Dryomov about 8 years ago

Not related to #12100 either; #12100 is half #9570, half a hardware failure.

Actions #15

Updated by Ilya Dryomov about 8 years ago

  • Related to deleted (Bug #12100: OSD crash, uexpected aio error in FileJournal.cc)
Actions #16

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New
Actions #17

Updated by 冬冬 李 over 2 years ago

Not sure what the specific environment here was, but we hit a similar problem on CentOS 7 in our own storage system. Here's our guess and solution:

update-grub invoked grub2-mkconfig, which invoked os-prober, which invoked /usr/libexec/os-probes/50mounted-tests on a partition, which called ro_partition (in /usr/share/os-prober/common.sh) to make the partition read-only, so all I/O requests to that partition returned EPERM.

The solution is to append GRUB_DISABLE_OS_PROBER=true to /etc/default/grub to disable os-prober. By the way, the ro_partition code path was removed by this commit (https://salsa.debian.org/installer-team/os-prober/-/commit/850988ef68fc7b06a41568d76f1f01dc55e243a2) 4 years ago.
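To confirm this on a live system, a small hypothetical watcher (my sketch, not part of os-prober) can poll the partition's read-only flag with the BLKROGET ioctl while update-grub runs; if an ro_partition-style probe fires, the flag flips:

/* Sketch only; /dev/sdX1 is a placeholder for the journal partition. */
#include <fcntl.h>
#include <linux/fs.h>   /* BLKROGET */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/sdX1";
    int fd = open(dev, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    for (;;) {
        int ro = 0;
        if (ioctl(fd, BLKROGET, &ro) < 0) { perror("BLKROGET"); return 1; }
        if (ro) { printf("%s is now read-only\n", dev); break; }
        usleep(100 * 1000);   /* poll every 100 ms */
    }
    close(fd);
    return 0;
}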
