Bug #1759

closed

mds/client: truncate size overflow, fails with EINVAL

Added by Sam Lang over 12 years ago. Updated about 12 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%


Description

My version of Ceph is a minor variant of 0.38, running on ext4 with ceph-fuse. It looks like my filesystem has become corrupted somehow. I've seen this assert failure on two of the OSDs I have running, and each one hits the same assertion again when it is restarted. The EINVAL appears to come from a truncate of an object, because the size passed to truncate is extremely large (18446744073709551615, which is 2^64 - 1). Is there any way to debug or correct this?

Log from one of the failed osds:

Nov 29 18:52:29 sug-chifj21 osd.5919711: 7f1dfa92b700 filestore(/srv/ceph/osd.59) error error 22: Invalid argument not handled
Nov 29 18:52:29 sug-chifj21 osd.5919711: ../../src/os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&)', in thread '7f1dfa92b700'#012../../src/os/FileStore.cc: 2407: FAILED assert(0 == "unexpected error")
Nov 29 18:52:29 sug-chifj21 osd.5919711: ceph version (commit:)#012 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x89) [0x9142d9]#012 2: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x199c) [0xa7921e]#012 3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x105) [0xa76cc1]#012 4: (FileStore::_do_op(FileStore::OpSequencer*)+0x1b9) [0xa7545f]#012 5: (FileStore::OpWQ::_process(FileStore::OpSequencer*)+0x27) [0xa8d445]#012 6: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*)+0x2e) [0xa9c5a6]#012 7: (ThreadPool::worker()+0x42c) [0x914b54]#012 8: (ThreadPool::WorkThread::entry()+0x1c) [0x8ae79e]#012 9: (Thread::_entry_func(void*)+0x23) [0x97cd09]#012 10: (()+0x6d8c) [0x7f1e05a10d8c]#012 11: (clone()+0x6d) [0x7f1e0425204d]
Nov 29 18:52:29 sug-chifj21 osd.5919711: ceph version (commit:)#012 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x89) [0x9142d9]#012 2: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x199c) [0xa7921e]#012 3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x105) [0xa76cc1]#012 4: (FileStore::_do_op(FileStore::OpSequencer*)+0x1b9) [0xa7545f]#012 5: (FileStore::OpWQ::_process(FileStore::OpSequencer*)+0x27) [0xa8d445]#012 6: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*)+0x2e) [0xa9c5a6]#012 7: (ThreadPool::worker()+0x42c) [0x914b54]#012 8: (ThreadPool::WorkThread::entry()+0x1c) [0x8ae79e]#012 9: (Thread::_entry_func(void*)+0x23) [0x97cd09]#012 10: (()+0x6d8c) [0x7f1e05a10d8c]#012 11: (clone()+0x6d) [0x7f1e0425204d]
Nov 29 18:52:29 sug-chifj21 osd.5919711: ** Caught signal (Aborted) *#012 in thread 7f1dfa92b700
Nov 29 18:52:29 sug-chifj21 osd.5919711: ceph version (commit:)#012 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x914655]#012 2: /usr/bin/ceph-osd() [0xa650ff]#012 3: (()+0xfc60) [0x7f1e05a19c60]#012 4: (gsignal()+0x35) [0x7f1e0419fd05]#012 5: (abort()+0x186) [0x7f1e041a3ab6]#012 6: (_gnu_cxx::_verbose_terminate_handler()+0x11d) [0x7f1e04a566dd]#012 7: (()+0xb9926) [0x7f1e04a54926]#012 8: (()+0xb9953) [0x7f1e04a54953]#012 9: (()+0xb9a5e) [0x7f1e04a54a5e]#012 10: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1f3) [0x914443]#012 11: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x199c) [0xa7921e]#012 12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x105) [0xa76cc1]#012 13: (FileStore::_do_op(FileStore::OpSequencer*)+0x1b9) [0xa7545f]#012 14: (FileStore::OpWQ::_process(FileStore::OpSequencer*)+0x27) [0xa8d445]#012 15: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*)+0x2e) [0xa9c5a6]#012 16: (ThreadPool::worker()+0x42c) [0x914b54]#012 17: (ThreadPool::WorkThread::entry()+0x1c) [0x8ae79e]#012 18: (Thread::_entry_func(void*)+0x23) [0x97cd09]#012 19: (()+0x6d8c) [0x7f1e05a10d8c]#012 20: (clone()+0x6d) [0x7f1e0425204d]
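
The "#012" sequences in the log lines above are escaped newlines (octal 012 is the line feed character): the OSD wrote a multi-line backtrace and the syslog daemon flattened it onto one line. A minimal sketch for expanding them when reading such a log, assuming rsyslog-style escaping of control characters as "#" plus three octal digits; the helper name expand_syslog_escapes is hypothetical and not part of Ceph or syslog:

```python
import re

# Assumption: control characters were escaped as '#' followed by exactly
# three octal digits, so "#012" stands for a newline.
_ESCAPE = re.compile(r"#([0-7]{3})")


def _decode(match: re.Match) -> str:
    code = int(match.group(1), 8)
    # Only undo escapes of control characters; leave text such as
    # "Bug #175" untouched, since 0o175 is a printable character.
    if code < 32 or code == 127:
        return chr(code)
    return match.group(0)


def expand_syslog_escapes(line: str) -> str:
    """Replace each escaped control character with the character itself."""
    return _ESCAPE.sub(_decode, line)


if __name__ == "__main__":
    sample = (
        "../../src/os/FileStore.cc: In function 'unsigned int "
        "FileStore::_do_transaction(ObjectStore::Transaction&)', in thread "
        "'7f1dfa92b700'#012../../src/os/FileStore.cc: 2407: "
        'FAILED assert(0 == "unexpected error")'
    )
    print(expand_syslog_escapes(sample))
```

Applied to the assert line, this prints the function and thread on one line and the failed assertion at FileStore.cc:2407 on the next, which makes the numbered backtrace frames easier to read.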

strace truncate error for osd.59:

truncate("/srv/ceph/osd.59/current/0.274_head/1000000001a.00000002__head_9A0B7274", 18446744073709551615) = -1 EINVAL (Invalid argument)

strace truncate error for osd.65:

truncate("/srv/ceph/osd.65/current/0.274_head/1000000001a.00000002__head_9A0B7274", 18446744073709551615) = -1 EINVAL (Invalid argument)
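
The length in both strace lines, 18446744073709551615, is 2^64 - 1, the all-ones 64-bit pattern. truncate(2) takes a signed off_t, so where off_t is 64 bits wide the same bits read as -1, and a negative length is rejected with EINVAL. The sketch below only illustrates that reinterpretation and reproduces the errno on a scratch file; it does not show where Ceph computed the value, and the helper names as_signed_64 and truncate_errno are hypothetical.

```python
import errno
import os
import struct
import tempfile

TRUNCATE_LEN = 18446744073709551615  # length seen in the strace output


def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


def truncate_errno(length: int) -> int:
    """Truncate a scratch file to `length`; return 0 on success, else errno."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.truncate(path, length)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.remove(path)


if __name__ == "__main__":
    signed = as_signed_64(TRUNCATE_LEN)
    print(TRUNCATE_LEN == 2**64 - 1)  # the value is the 64-bit all-ones pattern
    print(signed)                     # the same bits read as a signed integer
    print(errno.errorcode.get(truncate_errno(signed)))
```

A size of (uint64_t)-1 is a common "unset" or underflowed value, so one plausible line of investigation is a truncate size that was never initialized or that wrapped below zero before reaching the FileStore; the report itself does not confirm the cause.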


Files

osd.0-crash.log (256 KB): Syslog-formatted crash log of osd.0 replay. David McBride, 12/14/2011 01:53 PM

Related issues 1 (0 open, 1 closed)

Has duplicate: Ceph - Bug #1862: filestore: EINVAL on replay (Duplicate, 12/28/2011)
