Bug #24237 (closed)

mon: monitors run out of space during snapshot workflow

Added by Patrick Donnelly almost 6 years ago. Updated almost 6 years ago.

Status: Duplicate
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags: -
Backport: mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

2018-05-13T19:47:41.007 INFO:tasks.ceph.mon.a.ovh032.stderr:2018-05-13 19:47:40.957 7fb58fe10700 -1 rocksdb: submit_common error: IO error: No space left on deviceWhile appending to file: /var/lib/ceph/mon/ceph-a/store.db/000085.sst: No space left on device code = 5 Rocksdb transaction:
2018-05-13T19:47:41.007 INFO:tasks.ceph.mon.a.ovh032.stderr:Put( Prefix = p key = 'xos'0x00343236'8' Value size = 24881)
2018-05-13T19:47:41.008 INFO:tasks.ceph.mon.a.ovh032.stderr:Put( Prefix = p key = 'xos'0x0070656e'ding_v' Value size = 8)
2018-05-13T19:47:41.008 INFO:tasks.ceph.mon.a.ovh032.stderr:Put( Prefix = p key = 'xos'0x0070656e'ding_pn' Value size = 8)
2018-05-13T19:47:41.010 INFO:tasks.ceph.mon.a.ovh032.stderr:/build/ceph-13.1.0-76-g1af66d5/src/mon/MonitorDBStore.h: In function 'int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef)' thread 7fb58fe10700 time 2018-05-13 19:47:40.963591
2018-05-13T19:47:41.010 INFO:tasks.ceph.mon.a.ovh032.stderr:/build/ceph-13.1.0-76-g1af66d5/src/mon/MonitorDBStore.h: 311: FAILED assert(0 == "failed to write to db")
2018-05-13T19:47:41.029 INFO:tasks.ceph.mon.a.ovh032.stderr: ceph version 13.1.0-76-g1af66d5 (1af66d54f173f933d6a26ffb617b429f2c700ac6) mimic (rc)
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fb598b2c202]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 2: (()+0x2e43c7) [0x7fb598b2c3c7]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 3: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xba4) [0x55b1b177d554]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 4: (Paxos::begin(ceph::buffer::list&)+0x3a1) [0x55b1b18dd461]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 5: (Paxos::propose_pending()+0xf6) [0x55b1b18defa6]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 6: (Paxos::trigger_propose()+0x126) [0x55b1b18e14a6]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 7: (PaxosService::propose_pending()+0x1f5) [0x55b1b18e5c55]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 8: (C_MonContext::finish(int)+0x39) [0x55b1b1785f39]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 9: (Context::complete(int)+0x9) [0x55b1b17bec19]
2018-05-13T19:47:41.030 INFO:tasks.ceph.mon.a.ovh032.stderr: 10: (SafeTimer::timer_thread()+0x18b) [0x7fb598b28b7b]
2018-05-13T19:47:41.031 INFO:tasks.ceph.mon.a.ovh032.stderr: 11: (SafeTimerThread::entry()+0xd) [0x7fb598b2a13d]
2018-05-13T19:47:41.031 INFO:tasks.ceph.mon.a.ovh032.stderr: 12: (()+0x76ba) [0x7fb59842a6ba]
2018-05-13T19:47:41.031 INFO:tasks.ceph.mon.a.ovh032.stderr: 13: (clone()+0x6d) [0x7fb59778841d]

From: /ceph/teuthology-archive/teuthology-2018-05-12_05:10:02-fs-mimic-distro-basic-ovh/2526529/teuthology.log
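
The trace shows the failure path: a Paxos proposal calls MonitorDBStore::apply_transaction(), the underlying RocksDB write fails with ENOSPC, and the monitor asserts rather than continue with a store it can no longer persist. The monitor does have free-space guards on its data directory (the mon_data_avail_warn / mon_data_avail_crit thresholds), but a fast-growing store can outrun them on a small test disk. As a rough, hypothetical sketch of that kind of check (the path and threshold values below are illustrative assumptions, not taken from this report or the Ceph source):

#!/usr/bin/env python3
# Hypothetical sketch of a mon-store free-space guard, loosely analogous to
# the mon_data_avail_warn / mon_data_avail_crit checks. The data directory
# and the thresholds are illustrative assumptions.
import os
import sys

MON_DATA_DIR = "/var/lib/ceph/mon/ceph-a"  # mon data dir seen in the log above
AVAIL_WARN_PCT = 30.0  # assumed, in the spirit of mon_data_avail_warn
AVAIL_CRIT_PCT = 5.0   # assumed, in the spirit of mon_data_avail_crit

def avail_percent(path):
    """Percent of the filesystem holding `path` that is still available."""
    st = os.statvfs(path)
    return 100.0 * st.f_bavail / st.f_blocks

def main():
    pct = avail_percent(MON_DATA_DIR)
    if pct < AVAIL_CRIT_PCT:
        print("CRIT: only %.1f%% free under %s" % (pct, MON_DATA_DIR))
        return 2
    if pct < AVAIL_WARN_PCT:
        print("WARN: %.1f%% free under %s" % (pct, MON_DATA_DIR))
        return 1
    print("OK: %.1f%% free under %s" % (pct, MON_DATA_DIR))
    return 0

if __name__ == "__main__":
    sys.exit(main())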

Does this only happen on OVH? It also happened here:

Assertion: /build/ceph-13.1.0-129-gd3a3371/src/mon/MonitorDBStore.h: 311: FAILED assert(0 == "failed to write to db")
ceph version 13.1.0-129-gd3a3371 (d3a3371798df3316c173c967bc5300119f20cd22) mimic (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fe663c7f132]
 2: (()+0x2e42f7) [0x7fe663c7f2f7]
 3: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xba4) [0x564519121584]
 4: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3aa) [0x56451927feca]
 5: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x227) [0x564519284cd7]
 6: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x102f) [0x56451915d20f]
 7: (Monitor::_ms_dispatch(Message*)+0x7a2) [0x56451915dc22]
 8: (Monitor::ms_dispatch(Message*)+0x23) [0x5645191867c3]
 9: (DispatchQueue::entry()+0xb92) [0x7fe663cf7e02]
 10: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fe663d95ffd]
 11: (()+0x76ba) [0x7fe66357d6ba]
 12: (clone()+0x6d) [0x7fe6628db41d]
6 jobs: ['2546304', '2546316', '2546387', '2546371', '2546282', '2546290']
suites intersection: ['frag_enable.yaml', 'mount/fuse.yaml', 'overrides/{debug.yaml', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['ceph-thrash/default.yaml', 'clusters/fixed-2-ucephfs.yaml', 'clusters/mds-1active-1standby.yaml', 'frag_enable.yaml', 'fs/basic_workload/{begin.yaml', 'fs/snaps/{begin.yaml', 'fs/thrash/{begin.yaml', 'inline/no.yaml', 'mount/fuse.yaml', 'msgr-failures/none.yaml', 'objectstore-ec/bluestore-comp-ec-root.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/bluestore.yaml', 'omap_limit/10000.yaml', 'overrides/{debug.yaml', 'tasks/cfuse_workunit_kernel_untar_build.yaml}', 'tasks/cfuse_workunit_snaptests.yaml}', 'tasks/snaptests.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']

From: teuthology-2018-05-18_05:10:02-fs-mimic-distro-basic-ovh
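
The "suites intersection" and "suites union" lines above are set operations over the YAML suite fragments of the six failed jobs: the intersection is what every failure has in common (the likely suspects), and the union is everything any of them touched. A minimal sketch of that computation, with a hypothetical job-to-fragments mapping standing in for the real teuthology data:

from functools import reduce

# Hypothetical input: each failed job id mapped to its suite YAML fragments.
jobs = {
    "2546304": {"frag_enable.yaml", "mount/fuse.yaml", "whitelist_health.yaml"},
    "2546316": {"frag_enable.yaml", "mount/fuse.yaml", "inline/no.yaml"},
    "2546387": {"frag_enable.yaml", "mount/fuse.yaml", "whitelist_health.yaml"},
}

fragment_sets = list(jobs.values())

# Fragments common to every failed job: candidates for the common cause.
intersection = sorted(reduce(set.intersection, fragment_sets))
# Fragments seen in any failed job: the full search space.
union = sorted(reduce(set.union, fragment_sets))

print("suites intersection:", intersection)
print("suites union:", union)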


Related issues: 1 (0 open, 1 closed)

Is duplicate of CephFS - Bug #24238: test gets ENOSPC from bluestore block device (Resolved, Sage Weil, 05/23/2018)

#1 - Updated by Patrick Donnelly almost 6 years ago

  • Is duplicate of Bug #24238: test gets ENOSPC from bluestore block device added
#2 - Updated by Patrick Donnelly almost 6 years ago

  • Status changed from New to Duplicate