https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2011-08-31T07:22:00Z
Ceph
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=5687
2011-08-31T07:22:00Z
Damien Churchill
damoxc@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/download/296/osd.8.log">osd.8.log</a> added</li></ul>
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=5697
2011-08-31T10:03:21Z
Sage Weil
sage@newdream.net
<ul><li><strong>Category</strong> set to <i>OSD</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>4</i></li><li><strong>Assignee</strong> set to <i>Sage Weil</i></li><li><strong>Target version</strong> set to <i>v0.35</i></li></ul><p>The assert.h #includes were screwed up so there's no line number in the log and I can't tell which assert this was hitting. Can you run cosd in the foreground and reproduce (-f on command line)? Probably 'cosd -f -i 7' will do it.</p>
<p>Thanks!</p>
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=5700
2011-08-31T12:19:25Z
Damien Churchill
damoxc@gmail.com
<ul></ul><pre>
2011-08-31 20:17:13.966717 7fd5f3840760 filestore(/srv/osd7) collection_setattr /srv/osd7/current/0.2a5_head 'info' len 611
2011-08-31 20:17:13.966740 7fd5f3840760 filestore(/srv/osd7) collection_setattr /srv/osd7/current/0.2a5_head 'info' len 611 = 611
2011-08-31 20:17:13.966745 7fd5f3840760 filestore(/srv/osd7) truncate meta/pginfo_0.2a5/0 size 0
2011-08-31 20:17:13.966750 7fd5f3840760 filestore(/srv/osd7) lfn_get cid=meta oid=pginfo_0.2a5/0 pathname=/srv/osd7/current/meta/pginfo_0.2a5_0 lfn=pginfo_0.2a5_0 is_lfn=0
2011-08-31 20:17:13.966777 7fd5f3840760 filestore(/srv/osd7) truncate meta/pginfo_0.2a5/0 size 0 = 0
2011-08-31 20:17:13.966783 7fd5f3840760 filestore(/srv/osd7) write meta/pginfo_0.2a5/0 0~40
2011-08-31 20:17:13.966787 7fd5f3840760 filestore(/srv/osd7) lfn_get cid=meta oid=pginfo_0.2a5/0 pathname=/srv/osd7/current/meta/pginfo_0.2a5_0 lfn=pginfo_0.2a5_0 is_lfn=0
2011-08-31 20:17:13.966799 7fd5f3840760 filestore(/srv/osd7) queue_flusher ep 0 fd 34 0~40 qlen 20
2011-08-31 20:17:13.966803 7fd5f3840760 filestore(/srv/osd7) write meta/pginfo_0.2a5/0 0~40 = 40
2011-08-31 20:17:13.966806 7fd5f3840760 journal journal_replay: r = 0, op now seq 3733799
2011-08-31 20:17:13.966812 7fd5f3840760 journal read_entry 704397312 : seq 3733800 33 bytes
2011-08-31 20:17:13.966815 7fd5f3840760 journal journal_replay: applying op seq 3733800 (op_seq 3733799)
2011-08-31 20:17:13.966818 7fd5f3840760 filestore(/srv/osd7) _do_transaction on 0x29d0000
2011-08-31 20:17:13.966821 7fd5f3840760 journal journal_replay: r = 0, op now seq 3733800
2011-08-31 20:17:13.966825 7fd5f3840760 journal read_entry 704405504 : seq 3733801 49 bytes
2011-08-31 20:17:13.966828 7fd5f3840760 journal journal_replay: applying op seq 3733801 (op_seq 3733800)
2011-08-31 20:17:13.966831 7fd5f3840760 filestore(/srv/osd7) _do_transaction on 0x29d0000
2011-08-31 20:17:13.966835 7fd5f3840760 filestore(/srv/osd7) _destroy_collection /srv/osd7/current/0.2a5_f
2011-08-31 20:17:13.966848 7fd5f3840760 filestore(/srv/osd7) _destroy_collection /srv/osd7/current/0.2a5_f = -39
os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&)', in thread '0x7fd5f3840760'
os/FileStore.cc: 2539: FAILED assert(0 == "ENOTEMPTY suggests garbage data in osd data dir")
ceph version 0.34 (commit:2f039eeeb745622b866d80feda7afa055e15f6d6)
1: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x3f76) [0x6bc1b6]
2: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x76) [0x6bc816]
3: (JournalingObjectStore::journal_replay(unsigned long)+0x773) [0x6ccea3]
4: (FileStore::mount()+0x172a) [0x6b3f0a]
5: (OSD::init()+0x18d) [0x550c6d]
6: (main()+0x2b56) [0x4b3b06]
7: (__libc_start_main()+0xff) [0x7fd5f1ba8eff]
8: /usr/bin/cosd() [0x4b0b99]
ceph version 0.34 (commit:2f039eeeb745622b866d80feda7afa055e15f6d6)
1: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x3f76) [0x6bc1b6]
2: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x76) [0x6bc816]
3: (JournalingObjectStore::journal_replay(unsigned long)+0x773) [0x6ccea3]
4: (FileStore::mount()+0x172a) [0x6b3f0a]
5: (OSD::init()+0x18d) [0x550c6d]
6: (main()+0x2b56) [0x4b3b06]
7: (__libc_start_main()+0xff) [0x7fd5f1ba8eff]
8: /usr/bin/cosd() [0x4b0b99]
*** Caught signal (Aborted) **
in thread 0x7fd5f3840760
ceph version 0.34 (commit:2f039eeeb745622b866d80feda7afa055e15f6d6)
1: /usr/bin/cosd() [0x583642]
2: (()+0xfc60) [0x7fd5f3438c60]
3: (gsignal()+0x35) [0x7fd5f1bbdd05]
4: (abort()+0x186) [0x7fd5f1bc1ab6]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd5f2474d6d]
6: (()+0xb9f16) [0x7fd5f2472f16]
7: (()+0xb9f43) [0x7fd5f2472f43]
8: (()+0xba03e) [0x7fd5f247303e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x396) [0x5a8876]
10: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x3f76) [0x6bc1b6]
11: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x76) [0x6bc816]
12: (JournalingObjectStore::journal_replay(unsigned long)+0x773) [0x6ccea3]
13: (FileStore::mount()+0x172a) [0x6b3f0a]
14: (OSD::init()+0x18d) [0x550c6d]
15: (main()+0x2b56) [0x4b3b06]
16: (__libc_start_main()+0xff) [0x7fd5f1ba8eff]
</pre>
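<p>The <code>= -39</code> in the log above reads as a negated errno; on Linux, errno 39 is ENOTEMPTY, which matches the assert message. The sketch below only illustrates that return convention with a plain directory removal. It is not Ceph code: the helper <code>try_rmdir</code> and the file name <code>leftover_object</code> are made up for illustration, and it assumes a Linux-like platform where removing a non-empty directory reports ENOTEMPTY.</p>

```python
import errno
import os
import tempfile


def try_rmdir(path):
    """Return 0 on success or a negated errno, in the style of the log's '= -39'."""
    try:
        os.rmdir(path)
        return 0
    except OSError as e:
        return -e.errno


with tempfile.TemporaryDirectory() as base:
    # Directory name borrowed from the log; the contents are hypothetical.
    coll = os.path.join(base, "0.2a5_f")
    os.mkdir(coll)
    leftover = os.path.join(coll, "leftover_object")
    with open(leftover, "w") as f:
        f.write("x")

    # Removal fails while the directory still holds a file.
    nonempty_result = try_rmdir(coll)

    # Once the leftover file is gone, the same call succeeds.
    os.remove(leftover)
    empty_result = try_rmdir(coll)

print(nonempty_result == -errno.ENOTEMPTY, empty_result)
```

<p>Comparing against <code>errno.ENOTEMPTY</code> rather than a literal 39 keeps the check portable, since the numeric value differs between platforms.</p>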
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=5710
2011-08-31T14:43:41Z
Damien Churchill
damoxc@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/download/297/osd.7.log.1">osd.7.log.1</a> added</li></ul>
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=5716
2011-08-31T15:34:17Z
Sage Weil
sage@newdream.net
<ul><li><strong>Subject</strong> changed from <i>osd crash</i> to <i>osd: destroy_collection on non-empty dir</i></li><li><strong>Status</strong> changed from <i>4</i> to <i>In Progress</i></li></ul>
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=5719
2011-08-31T16:32:13Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul><p>I think the root bug is fixed by <a class="changeset" title="osd: flush previous operations to fs before collection list + destroy We need to flush any prior..." href="https://tracker.ceph.com/projects/ceph/repository/revisions/f1cae577e3730a73dd5478785160745150095af5">f1cae577e3730a73dd5478785160745150095af5</a> and commit:e0776761a1d6866fda73fa5af217f5be4527d798.</p>
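<p>The changeset title above ("flush previous operations to fs before collection list + destroy") suggests an ordering problem: if a queued write reaches the filesystem after the collection has been listed, the later directory removal finds a file nobody deleted. The toy model below sketches that reading. It is an assumption about the failure mode, not the actual Ceph fix; <code>ToyStore</code>, <code>queue_write</code>, <code>flush</code> and <code>remove_collection</code> are hypothetical names.</p>

```python
import os
import tempfile


class ToyStore:
    """Hypothetical store whose writes are queued before reaching the filesystem."""

    def __init__(self, root):
        self.root = root
        self.pending = []  # (collection, object name) pairs not yet on disk

    def queue_write(self, coll, name):
        self.pending.append((coll, name))

    def flush(self):
        # Apply every queued write to the filesystem.
        for coll, name in self.pending:
            with open(os.path.join(self.root, coll, name), "w") as f:
                f.write("data")
        self.pending.clear()

    def remove_collection(self, coll, flush_first):
        path = os.path.join(self.root, coll)
        if flush_first:
            self.flush()
        listed = os.listdir(path)  # snapshot of what is on disk right now
        self.flush()  # models a straggling write landing after the listing
        for name in listed:
            os.remove(os.path.join(path, name))
        try:
            os.rmdir(path)
            return True
        except OSError:
            return False  # directory still holds the file that landed late


def run(flush_first):
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "pg"))
        store = ToyStore(root)
        store.queue_write("pg", "obj")
        return store.remove_collection("pg", flush_first)


without_flush = run(False)
with_flush = run(True)
print(without_flush, with_flush)
```

<p>In this model, flushing before the listing means the listing sees every object, so the removal that follows leaves the directory empty.</p>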
Ceph - Bug #1471: osd: destroy_collection on non-empty dir
https://tracker.ceph.com/issues/1471?journal_id=6759
2011-10-22T13:01:42Z
Wido den Hollander
wido@42on.com
<ul></ul><p>I'm actually hitting the same bug with v0.37.</p>
<p>It was time to upgrade my old (and well-running!) 0.27 cluster to the latest version, and I encountered this crash:</p>
<pre>Oct 22 21:56:32 node03 osd.0[15433]: 7ff57621a720 filestore(/var/lib/ceph/osd.0) collection_setattr /var/lib/ceph/osd.0/current/3.51_head 'ondisklog' len 17 = 17
Oct 22 21:56:32 node03 osd.0[15433]: 7ff57621a720 filestore(/var/lib/ceph/osd.0) collection_setattr /var/lib/ceph/osd.0/current/3.51_head 'snap_collections' len 244
Oct 22 21:56:32 node03 osd.0[15433]: 7ff57621a720 filestore(/var/lib/ceph/osd.0) collection_setattr /var/lib/ceph/osd.0/current/3.51_head 'snap_collections' len 244 = 244
Oct 22 21:56:32 node03 osd.0[15433]: 7ff57621a720 filestore(/var/lib/ceph/osd.0) _destroy_collection /var/lib/ceph/osd.0/current/_temp
Oct 22 21:56:32 node03 osd.0[15433]: 7ff57621a720 filestore(/var/lib/ceph/osd.0) _destroy_collection /var/lib/ceph/osd.0/current/_temp = -39
Oct 22 21:56:32 node03 osd.0[15433]: os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&)', in thread '7ff57621a720'
os/FileStore.cc: 2416: FAILED assert(0 == "ENOTEMPTY suggests garbage data in osd data dir")
Oct 22 21:56:32 node03 osd.0[15433]: ceph version 0.37-96-g1b846f4 (commit:1b846f43f181ac31a1a1911c74b7cf3ae063f35d)
 1: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x410b) [0x6c181b]
 2: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x75) [0x6c1e75]
 3: (JournalingObjectStore::journal_replay(unsigned long)+0x793) [0x6ccfc3]
 4: (FileStore::mount()+0x2803) [0x6abb13]
 5: (OSD::convertfs(std::string const&, std::string const&)+0xc8) [0x524f18]
 6: (main()+0x2c8d) [0x4a55dd]
 7: (__libc_start_main()+0xfd) [0x7ff5745a9c4d]
 8: /usr/bin/ceph-osd() [0x4a2529]</pre>
<p>Indeed, the directory is not empty:</p>
<pre>root@node03:/var/lib/ceph/osd.0/current/_temp# ls
rb.0.3.000000000486_40
root@node03:/var/lib/ceph/osd.0/current/_temp#</pre>
<p>The other OSDs are still working their way through the upgrade: "Updating collection...."</p>
<p>Do I have one broken PG on this OSD?</p>