Bug #10998

Multiple OSDs are down and will not restart: os/Transaction.cc: 500: FAILED assert("Unkown op" == 0)

Added by Eric Eastman over 5 years ago. Updated over 5 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

On a cluster updated from v0.92 to v0.93, I have 5 OSDs that went down on separate OSD nodes during CephFS testing, and the OSDs will not restart due to an assert failure. From one of the OSD logs:

2015-03-02 21:28:29.493210 7fdd3fed3900  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 20: 1072693248 bytes, block size 4096 bytes, directio = 1, aio = 1
2015-03-02 21:28:29.567222 7fdd3fed3900 -1 os/Transaction.cc: In function 'void ObjectStore::Transaction::_build_actions_from_tbl()' thread 7fdd3fed3900 time 2015-03-02 21:28:29.562995
os/Transaction.cc: 500: FAILED assert("Unkown op" == 0)

ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc57bb]
 2: (ObjectStore::Transaction::_build_actions_from_tbl()+0x2aaf) [0x9c036f]
 3: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x407b) [0x94682b]
 4: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x949894]
 5: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x96243b]
 6: (FileStore::mount()+0x3bb6) [0x932bf6]
 7: (OSD::init()+0x259) [0x6c42d9]
 8: (main()+0x2860) [0x651d50]
 9: (__libc_start_main()+0xf5) [0x7fdd3d010ec5]
 10: /usr/bin/ceph-osd() [0x66aa27]

I have attached 3 of the OSD log files from 3 different OSD nodes showing the same assert error.
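
To check whether other OSD logs show the same failure, a small scan for the assert line can help. This is only a sketch: the helper names `find_failed_asserts` and `scan_log` are made up for illustration, and the pattern assumes the `FAILED assert(...)` line format shown in the excerpt above.

```python
import gzip
import re

# Matches lines such as: os/Transaction.cc: 500: FAILED assert("Unkown op" == 0)
ASSERT_RE = re.compile(r'FAILED assert\((.*)\)')


def find_failed_asserts(lines):
    """Yield (location, expression) for every FAILED assert line found."""
    for line in lines:
        match = ASSERT_RE.search(line)
        if match:
            # Text before the match is the "file: line:" prefix; trim the trailing colon.
            location = line[:match.start()].strip().rstrip(":").strip()
            yield location, match.group(1)


def scan_log(path):
    """Scan a plain or gzip-compressed log file (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        return list(find_failed_asserts(handle))
```

For example, `scan_log("ceph-osd.2.log.gz")` would return the assert locations found in the first attached log, assuming the file has been downloaded to the working directory.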

OS: Trusty 14.04.1

ceph-osd.2.log.gz (125 KB) Eric Eastman, 03/03/2015 03:32 AM

ceph-osd.16.log.1.gz (314 KB) Eric Eastman, 03/03/2015 03:32 AM

ceph-osd.10.log.1.gz (261 KB) Eric Eastman, 03/03/2015 03:32 AM


Related issues

Duplicates Ceph - Bug #10985: Some OSDs don't get up after upgrade from v0.92 to v0.93 (Won't Fix, 03/02/2015)

History

#1 Updated by Samuel Just over 5 years ago

  • Status changed from New to Duplicate

This is a known bug. You need to install v0.92 on those OSDs, flush their journals, upgrade the packages to v0.93, and then start the OSDs.
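
The order of those steps matters, so here is a rough sketch that lays them out for the affected OSDs on one node. The function name `workaround_plan` is hypothetical. The only concrete command is `ceph-osd -i <id> --flush-journal`; the package downgrade, the upgrade, and the service start depend on the distribution and init system, so they are left as manual steps rather than guessed.

```python
def workaround_plan(osd_ids):
    """Return ordered (description, command) steps for the OSDs on one node.

    command is an argv list, or None where the exact command depends on the
    distribution or init system and is deliberately not guessed here.
    """
    steps = [("install the v0.92 packages on this node", None)]
    for osd_id in osd_ids:
        # Flush the journal while running the old version.
        steps.append((
            f"flush the journal of osd.{osd_id}",
            ["ceph-osd", "-i", str(osd_id), "--flush-journal"],
        ))
    steps.append(("upgrade the packages to v0.93", None))
    for osd_id in osd_ids:
        steps.append((f"start osd.{osd_id}", None))
    return steps


if __name__ == "__main__":
    # osd.2 is the OSD from the log excerpt in the description.
    for description, command in workaround_plan([2]):
        print(description, "->", " ".join(command) if command else "(manual step)")
```

This only prints the plan and does not run anything, so it can be reviewed before any command touches a node.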
