Bug #36517
client crashes osd with empty object name
Description
I found that a RADOS client can crash OSDs running BlueStore (I haven't tried FileStore). The OSD log shows the following error:
-3> 2018-10-16 21:36:33.362 7f9cb571c700 -1 bluestore(/home/nwatkins/src/ceph/build/dev/osd0) _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0)
-2> 2018-10-16 21:36:33.362 7f9cb571c700 0 bluestore(/home/nwatkins/src/ceph/build/dev/osd0) _dump_transaction transaction dump:
{
    "ops": [
        {
            "op_num": 0,
            "op_name": "remove",
            "collection": "1.5_head",
            "oid": "#1:a0000000::::head#"
        },
        {
            "op_num": 1,
            "op_name": "rmcoll",
            "collection": "1.5_head"
        }
    ]
}
-1> 2018-10-16 21:36:33.384 7f9cb571c700 -1 /home/nwatkins/src/ceph/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f9cb571c700 time 2018-10-16 21:36:33.363887
/home/nwatkins/src/ceph/src/os/bluestore/BlueStore.cc: 9991: abort()
I'm not sure if this is an exact duplicate, but it is very closely related to http://tracker.ceph.com/issues/15636
Here is a basic reproducer
#> ceph osd pool create pool 8 8

librados::ObjectWriteOperation op;
ceph::bufferlist bl;
op.setxattr("key", bl);
ioctx.operate("", &op); <== EMPTY OID

#> ceph osd pool rm pool pool --yes-i-really-really-mean-it
Full trace here:
https://paste.fedoraproject.org/paste/CeugGole8u-tlsWAPJMxyA
History
#1 Updated by Neha Ojha over 4 years ago
Noah, the paste no longer shows. Could you paste the trace in the tracker?
#2 Updated by Noah Watkins over 4 years ago
#3 Updated by Noah Watkins over 4 years ago
Attached
#4 Updated by Jesse Williamson over 4 years ago
- Assignee set to Jesse Williamson
#5 Updated by Noah Watkins over 4 years ago
Jesse,
This is still a bug in the latest master. Also, it appears to be worse than before--it seems as though raw binary data is leaking out into the log files when the bug is triggered. Here is a full reproducer. Note I'm using a vstart cluster with 3 OSDs. 2 out of the 3 OSDs crashed.
[nwatkins@smash ceph]$ git diff --cached
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index c677d4100c..826bbfc32a 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -783,3 +783,6 @@ if (IS_DIRECTORY "${PROJECT_SOURCE_DIR}/.git")
 endif()

 add_subdirectory(script)
+
+add_executable(tracker-36517 tracker_36517.cc)
+target_link_libraries(tracker-36517 librados-cxx global)
diff --git a/src/blow-up-osd.sh b/src/blow-up-osd.sh
new file mode 100644
index 0000000000..b405e58398
--- /dev/null
+++ b/src/blow-up-osd.sh
@@ -0,0 +1,3 @@
+ceph osd pool create pool 8 8
+CEPH_CONF=ceph.conf tracker-36517
+ceph osd pool rm pool pool --yes-i-really-really-mean-it
diff --git a/src/tracker_36517.cc b/src/tracker_36517.cc
new file mode 100644
index 0000000000..924942934b
--- /dev/null
+++ b/src/tracker_36517.cc
@@ -0,0 +1,28 @@
+#include <assert.h>
+#include "include/rados/librados.hpp"
+
+int main()
+{
+  librados::Rados rados;
+  librados::IoCtx ioctx;
+
+  int ret = rados.init(NULL);
+  assert(ret == 0);
+
+  ret = rados.conf_read_file(NULL);
+  assert(ret == 0);
+
+  ret = rados.connect();
+  assert(ret == 0);
+
+  ret = rados.ioctx_create("pool", ioctx);
+  assert(ret == 0);
+
+  librados::ObjectWriteOperation op;
+
+  ceph::bufferlist bl;
+  op.setxattr("key", bl);
+  ioctx.operate("", &op);
+
+  return 0;
+}
#6 Updated by Jesse Williamson over 4 years ago
Ok, thank you Noah! Much appreciated. I'll have a look at this soon.