Bug #5766

closed

osd: replay not closing fds? too many open fds on upgrade+restart

Added by Sage Weil over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

...
    -9> 2013-07-26 09:36:20.975163 7f12d4cd5780  2 journal read_entry 6180864 : seq 1501 2054 bytes
    -8> 2013-07-26 09:36:20.975165 7f12d4cd5780  3 journal journal_replay: applying op seq 1501
    -7> 2013-07-26 09:36:20.975803 7f12d4cd5780  3 journal journal_replay: r = 0, op_seq now 1501
    -6> 2013-07-26 09:36:20.975836 7f12d4cd5780  2 journal read_entry 6184960 : seq 1502 2054 bytes
    -5> 2013-07-26 09:36:20.975838 7f12d4cd5780  3 journal journal_replay: applying op seq 1502
    -4> 2013-07-26 09:36:20.976286 7f12d4cd5780  0 filestore(/var/lib/ceph/osd/ceph-0) write couldn't open 2.f_head/d81206bf/obj-6hP4_xTo7EjVVLY/head//2: (24) Too many open files
    -3> 2013-07-26 09:36:20.976309 7f12d4cd5780  0 filestore(/var/lib/ceph/osd/ceph-0)  error (24) Too many open files not handled on operation 10 (1502.1.0, or op 0, counting from 0)
    -2> 2013-07-26 09:36:20.976313 7f12d4cd5780  0 filestore(/var/lib/ceph/osd/ceph-0) unexpected error code
    -1> 2013-07-26 09:36:20.976314 7f12d4cd5780  0 filestore(/var/lib/ceph/osd/ceph-0)  transaction dump:
{ "ops": [
        { "op_num": 0,
          "op_name": "write",
          "collection": "2.f_head",
          "oid": "d81206bf\/obj-6hP4_xTo7EjVVLY\/head\/\/2",
          "length": 1,
          "offset": 26707,
          "bufferlist length": 1},
        { "op_num": 1,
          "op_name": "setattr",
          "collection": "2.f_head",
          "oid": "d81206bf\/obj-6hP4_xTo7EjVVLY\/head\/\/2",
          "name": "_",
          "length": 226},
        { "op_num": 2,
          "op_name": "setattr",
          "collection": "2.f_head",
          "oid": "d81206bf\/obj-6hP4_xTo7EjVVLY\/head\/\/2",
          "name": "snapset",
          "length": 31}]}
     0> 2013-07-26 09:36:20.986642 7f12d4cd5780 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int)' thread 7f12d4cd5780 time 2013-07-26 09:36:20.976373
os/FileStore.cc: 2814: FAILED assert(0 == "unexpected error")

This happened during a cuttlefish -> next upgrade, when the OSD restarted.
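
For readers unfamiliar with the error: code 24 in the log is EMFILE, which open() returns once a process already holds as many descriptors as its RLIMIT_NOFILE soft limit allows. The following is a minimal Python sketch of that failure mode, not Ceph code. The function name open_until_emfile and the limit of 64 are assumptions made up for the illustration, and it relies on the Unix-only resource module.

```python
import errno
import os
import resource


def open_until_emfile(path=os.devnull, soft_limit=64):
    """Open `path` repeatedly without closing until open() fails.

    Temporarily lowers the soft RLIMIT_NOFILE so the failure shows up
    quickly. Returns (errno of the failure, number of fds opened first).
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never try to raise the limit above what the process already has.
    if old_soft == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    held = []
    failure = None
    try:
        while True:
            # Each successful open consumes one descriptor; none are released.
            held.append(os.open(path, os.O_RDONLY))
    except OSError as exc:
        failure = exc.errno
    finally:
        # Clean up so the caller is left in its original state.
        for fd in held:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return failure, len(held)


if __name__ == "__main__":
    err, count = open_until_emfile()
    print(f"open() failed with errno {err} ({os.strerror(err)}) after {count} opens")
```

On a running daemon the same condition can be checked from outside by comparing the number of entries in /proc/<pid>/fd with the "Max open files" row of /proc/<pid>/limits (Linux only).
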
Actions #1

Updated by Samuel Just over 10 years ago

  • Status changed from New to Resolved

Fixed a leak in _check_global_replay_guard; the fix was backported to cuttlefish.
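
The ticket does not show the offending code, so the following is only a generic illustration of how a per-operation check can leak descriptors, written in Python rather than the C++ of FileStore.cc. The names check_guard_leaky, check_guard_fixed and lowest_free_fd, and the sequence-number comparison, are hypothetical and not taken from Ceph. The sketch assumes the common pattern where a function opens a file, returns early on one branch, and only closes the descriptor on the other. If such a check runs once per replayed op, each early return costs one fd until open() fails with EMFILE.

```python
import os


def check_guard_leaky(path, applied_seq, op_seq):
    """Hypothetical guard check that leaks its fd on the early-return path."""
    fd = os.open(path, os.O_RDONLY)
    if op_seq <= applied_seq:
        return False  # BUG: fd is still open when we return here
    os.close(fd)
    return True


def check_guard_fixed(path, applied_seq, op_seq):
    """Same check, but the fd is closed on every exit path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return op_seq > applied_seq
    finally:
        os.close(fd)


def lowest_free_fd():
    """Return the descriptor number the next open() would receive.

    POSIX hands out the lowest unused number, so this value only moves
    up when descriptors are being held open (single-threaded use assumed).
    """
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)
    return fd


if __name__ == "__main__":
    start = lowest_free_fd()
    for seq in range(10):
        check_guard_fixed(os.devnull, 100, seq)
    print("fixed version moved lowest free fd by", lowest_free_fd() - start)

    start = lowest_free_fd()
    for seq in range(10):
        check_guard_leaky(os.devnull, 100, seq)
    print("leaky version moved lowest free fd by", lowest_free_fd() - start)
```

In C++ the equivalent of the try/finally fix is usually a small RAII wrapper whose destructor calls close(), so that no return path can skip it.
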
