Add error handling for leveldb or rocksdb in bluestore or filestore

Added by Xinxin Shu about 5 years ago. Updated almost 4 years ago.

Recently I tested bluestore and hit an error where rocksdb was corrupted (submit_transaction always returns -1), but Ceph does not handle this error.


#1 Updated by shasha lu about 5 years ago

I hit this error too (submit_transaction_sync always returns -1).

I added logging in RocksDBStore::submit_transaction_sync and found that rocksdb returns this error:
2016-04-08 08:46:02.064797 7f1093600700 -1 rocksdb: submit_transaction_sync IO error: /opt/ceph-bluestore/var/lib/blue/osd/osd4/db/024586.log: No such file or directory

My ceph version is 10.0.5
BlueStore_bluefs = false

The rocksdb directory includes these files:
#2 Updated by Yang Dongsheng about 5 years ago

Hi guys, I am trying to fix this problem by introducing a _txc_abort() mechanism. It will reply with -EIO to the client if we hit any problem while submitting a transaction.

Does that sound good enough?
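
To make the proposal concrete, here is a minimal sketch of the idea, not the actual BlueStore code. FakeKV, TxContext, txc_abort and submit_or_abort are hypothetical names used only for illustration; the real _txc_abort() would have to unwind much more state than this.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a key/value backend such as RocksDBStore.
struct FakeKV {
  bool corrupted = false;              // simulates a damaged database
  std::vector<std::string> committed;  // batches that were written

  // Returns 0 on success and -1 on failure, like the -1 seen in this report.
  int submit_transaction(const std::string& batch) {
    if (corrupted) return -1;
    committed.push_back(batch);
    return 0;
  }
};

// Hypothetical transaction context: the pending batch plus a completion
// callback that receives 0 or a negative errno.
struct TxContext {
  std::string batch;
  std::function<void(int)> on_complete;
};

// Illustrative abort path: drop the pending batch and report -EIO.
inline void txc_abort(TxContext& txc) {
  txc.batch.clear();
  if (txc.on_complete) txc.on_complete(-EIO);
}

// Submit the batch; on any backend error, abort instead of ignoring it.
inline int submit_or_abort(FakeKV& kv, TxContext& txc) {
  int r = kv.submit_transaction(txc.batch);
  if (r < 0) {
    txc_abort(txc);
    return -EIO;
  }
  if (txc.on_complete) txc.on_complete(0);
  return 0;
}
```

The point of the sketch is only that the return value of submit_transaction is checked and turned into an error the caller can see, instead of being dropped.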

#3 Updated by shasha lu about 5 years ago

This is a rocksdb bug.
Rocksdb reports:
2016-04-14 14:37:13.705520 7f0d4f4e3700 2 rocksdb: Waiting after background compaction error: IO error: /opt/ceph-bluestore/var/lib/blue/osd/osd1/db/407669.log: No such file or directory, Accumulated background error counts: 1

Returning -EIO to the client may not be appropriate.

#4 Updated by Yang Dongsheng about 5 years ago

shasha lu wrote:

This is a rocksdb bug.
Rocksdb reports:
2016-04-14 14:37:13.705520 7f0d4f4e3700 2 rocksdb: Waiting after background compaction error: IO error: /opt/ceph-bluestore/var/lib/blue/osd/osd1/db/407669.log: No such file or directory, Accumulated background error counts: 1

I am not sure whether this is a bug in rocksdb, but I am sure objectstore has no method to abort a transaction. That is what I am working on.

Returning -EIO to the client may not be appropriate.

Why not? Consider this scenario: something is wrong with the device backing rocksdb and we get an EIO when writing data to it. Then we return -EIO to the user. Why is that not appropriate?

#5 Updated by Yang Dongsheng almost 5 years ago

Hi, guys, does this commit solve it?

#6 Updated by shasha lu almost 5 years ago

Yes, the osd will abort when submit_transaction hits an error.

BTW, the default bluestore rocksdb options are
bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=16,min_write_buffer_number_to_merge=3,recycle_log_file_num=16

I removed the option recycle_log_file_num=16, so ceph.conf now has
bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=16,min_write_buffer_number_to_merge=3

The error no longer appears.
Maybe the option recycle_log_file_num should be removed from the defaults in config_opts.h
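
For anyone who wants to script the same workaround, the edit to the option string can be sketched as below. drop_option is a hypothetical helper written for this comment, not part of Ceph or rocksdb; it only removes one key=value entry from a comma-separated list.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: remove the entry "key=..." from a comma-separated
// "k1=v1,k2=v2" option string and return the remaining entries.
inline std::string drop_option(const std::string& opts, const std::string& key) {
  std::istringstream in(opts);
  std::string item;
  std::string out;
  const std::string prefix = key + "=";
  while (std::getline(in, item, ',')) {
    // Skip only entries whose name matches the key exactly.
    if (item.compare(0, prefix.size(), prefix) == 0) continue;
    if (!out.empty()) out += ',';
    out += item;
  }
  return out;
}
```

Calling drop_option on the default string above with the key recycle_log_file_num gives the shortened bluestore_rocksdb_options value I put in ceph.conf.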

#7 Updated by Sage Weil almost 4 years ago

  Status changed from New to Resolved

