Bug #5459

closed

ceph-mon failure using wip-mon-pgmap on ARM

Added by Mark Nelson almost 11 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This happened on the mixed burnupi/calxeda cluster with wip-mon-pginfo. The leveldb cache was set to 256MB.

    -4> 2013-06-25 20:27:45.361986 bebc4420  1 -- 10.214.143.113:6789/0 >> :/0 pipe(0x642dc5a0 sd=1064 :6789 s=0 pgs=0 cs=0 l=0 c=0x4ef67700).accept sd=1064 10.214.136.24:36741/0
    -3> 2013-06-25 20:27:45.362161 bebc4420  0 -- 10.214.143.113:6789/0 >> 10.214.149.24:6812/50137 pipe(0x642dc5a0 sd=1064 :6789 s=0 pgs=0 cs=0 l=1 c=0x4ef67700).accept replacing existing (lossy) channel (new one lossy=1)
    -2> 2013-06-25 20:27:45.363413 bedc4420  1 -- 10.214.143.113:6789/0 >> :/0 pipe(0x642dc3c0 sd=1065 :6789 s=0 pgs=0 cs=0 l=0 c=0x4ef67f00).accept sd=1065 10.214.135.16:46506/0
    -1> 2013-06-25 20:27:45.363566 bedc4420  0 -- 10.214.143.113:6789/0 >> 10.214.148.16:6816/29359 pipe(0x642dc3c0 sd=1065 :6789 s=0 pgs=0 cs=0 l=1 c=0x4ef67f00).accept replacing existing (lossy) channel (new one lossy=1)
     0> 2013-06-25 20:27:45.512883 b3134420 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread b3134420 time 2013-06-25 20:27:45.408109
common/Thread.cc: 110: FAILED assert(ret == 0)

 ceph version 0.64-660-gabc0253 (abc0253bda5fed3d1fb6c882d7a70358fc9122da)
 1: (Thread::create(unsigned int)+0x4f) [0x2cbb1c]
 2: (SimpleMessenger::add_accept_pipe(int)+0x3b) [0x2c8028]
 3: (Accepter::entry()+0x1b1) [0x33756e]
 4: (Thread::_entry_func(void*)+0x7) [0x2cb9d0]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   0/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 hadoop
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/mon.a.log
--- end dump of recent events ---
2013-06-25 20:27:45.970157 b3134420 -1 *** Caught signal (Aborted) **
 in thread b3134420

 ceph version 0.64-660-gabc0253 (abc0253bda5fed3d1fb6c882d7a70358fc9122da)
 1: /usr/bin/ceph-mon() [0x25dec2]
 2: (__default_sa_restorer_v2()+0) [0xb69789d0]
 3: (()+0x171e6) [0xb696a1e6]
 4: (gsignal()+0x29) [0xb6977db2]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
     0> 2013-06-25 20:27:45.970157 b3134420 -1 *** Caught signal (Aborted) **
 in thread b3134420

 ceph version 0.64-660-gabc0253 (abc0253bda5fed3d1fb6c882d7a70358fc9122da)
 1: /usr/bin/ceph-mon() [0x25dec2]
 2: (__default_sa_restorer_v2()+0) [0xb69789d0]
 3: (()+0x171e6) [0xb696a1e6]
 4: (gsignal()+0x29) [0xb6977db2]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   0/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 hadoop
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/mon.a.log
--- end dump of recent events ---
ubuntu@saya013:/var/log/ceph$ 
Actions #1

Updated by Sage Weil almost 11 years ago

This is almost certainly the max_open_files limit. Add "max open files = 16384" to ceph.conf and restart the mons.
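
To confirm the diagnosis before or after changing ceph.conf, it can help to compare how many descriptors a monitor holds against its limit. The accept lines in the log above already show sd=1064 and sd=1065, so the monitor was holding over a thousand descriptors when it aborted. The following is only a rough sketch, not part of Ceph: the helper names parse_nofile_limits and fd_usage are hypothetical, and it assumes a Linux host where /proc/<pid>/limits and /proc/<pid>/fd are readable (same user or root).

```python
import os


def parse_nofile_limits(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    None stands for 'unlimited'. Raises ValueError if the row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files <soft> <hard> files"
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_fds, soft_limit, hard_limit) for a process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_nofile_limits(f.read())
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft, hard


if __name__ == "__main__":
    # "self" inspects this script; pass the ceph-mon PID to inspect a monitor.
    if os.path.isdir("/proc/self/fd"):
        print(fd_usage("self"))
```

If the open count is close to the soft limit, the limit is the likely culprit. After restarting the mons with the new setting, the soft limit reported for the ceph-mon PID should reflect the raised value.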

Actions #2

Updated by Sage Weil over 10 years ago

  • Status changed from New to Resolved