Project

General

Profile

Actions

Bug #24051

closed

"122 - unittest_bluefs (OTHER_FAULT)" during ctest run

Added by Yuri Weinstein almost 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This was caught during PRs testing ctest run:

122:    -14> 2018-05-08 01:47:46.198016 7f16b14f9840 20 bluefs _replay 0x0:  op_jump seq 513 offset 0x30000
122:    -13> 2018-05-08 01:47:46.198022 7f16b14f9840 10 bluefs _read h 0x55638d77a900 0x1f000~11000 from file(ino 1 size 0x30000 mtime 0.000000 bdev 0 allocated 430000 extents [1:0x68d0000+30000,1:0x64d0000+400000])
122:    -12> 2018-05-08 01:47:46.198031 7f16b14f9840 20 bluefs _read left 0x11000 len 0x11000
122:    -11> 2018-05-08 01:47:46.198034 7f16b14f9840 20 bluefs _read got 69632
122:    -10> 2018-05-08 01:47:46.198036 7f16b14f9840 10 bluefs _read h 0x55638d77a900 0x30000~1000 from file(ino 1 size 0x30000 mtime 0.000000 bdev 0 allocated 430000 extents [1:0x68d0000+30000,1:0x64d0000+400000])
122:     -9> 2018-05-08 01:47:46.198061 7f16b14f9840 20 bluefs _read fetching 0x0~100000 of 1:0x64d0000+400000
122:     -8> 2018-05-08 01:47:46.198067 7f16b14f9840  5 bdev(0x55638d7aab40 ceph_test_bluefs.tmp.block.16610.12) read 0x64d0000~100000 (direct)
122:     -7> 2018-05-08 01:47:46.198070 7f16b14f9840 20 bdev(0x55638d7aab40 ceph_test_bluefs.tmp.block.16610.12) _aio_log_start 0x64d0000~100000
122:     -6> 2018-05-08 01:47:46.198704 7f16b14f9840 20 bdev(0x55638d7aab40 ceph_test_bluefs.tmp.block.16610.12) _aio_log_finish 1 0x64d0000~100000
122:     -5> 2018-05-08 01:47:46.198713 7f16b14f9840 20 bluefs _read left 0x100000 len 0x1000
122:     -4> 2018-05-08 01:47:46.198715 7f16b14f9840 20 bluefs _read got 4096
122:     -3> 2018-05-08 01:47:46.198720 7f16b14f9840 10 bluefs _replay 0x30000: stop: uuid 00000000-0000-0000-0000-000000000000 != super.uuid 2093ae5f-4054-4a90-a43b-2cdccdea6286
122:     -2> 2018-05-08 01:47:46.198727 7f16b14f9840 10 bluefs _replay log file size was 0x30000
122:     -1> 2018-05-08 01:47:46.198843 7f16b14f9840 10 bluefs _replay done
122:      0> 2018-05-08 01:47:46.201461 7f16b14f9840 -1 /home/yuriw/wip3/src/include/interval_set.h: In function 'void interval_set<T, Map>::intersection_size_asym(const interval_set<T, Map>&, const interval_set<T, Map>&) [with T = long unsigned int; Map = btree::btree_map<long unsigned int, long unsigned int, std::less<long unsigned int>, mempool::pool_allocator<(mempool::pool_index_t)1u, std::pair<const long unsigned int, long unsigned int> > >]' thread 7f16b14f9840 time 2018-05-08 01:47:46.200223
122: /home/yuriw/wip3/src/include/interval_set.h: 249: FAILED assert(ps != s.m.end())
122: 
122:  ceph version 12.2.5-44-gc9be648 (c9be64881ef8471b139d15b0570951506f4bce5b) luminous (stable)
122:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f16a7fc4092]
122:  2: (interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> >::intersection_size_asym(interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&, interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&)+0x15aa) [0x55638b87386a]
122:  3: (StupidAllocator::init_rm_free(unsigned long, unsigned long)+0x269b) [0x55638b86ba1b]
122:  4: (BlueFS::mount()+0x3fa) [0x55638b83f63a]
122:  5: (BlueFS_test_replay_Test::TestBody()+0x542) [0x55638b801b82]
122:  6: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x43) [0x55638b82b863]
122:  7: (testing::Test::Run()+0xba) [0x55638b82257a]
122:  8: (testing::TestInfo::Run()+0x118) [0x55638b8226c8]
122:  9: (testing::TestCase::Run()+0xb5) [0x55638b8227a5]
122:  10: (testing::internal::UnitTestImpl::RunAllTests()+0x24f) [0x55638b822a7f]
122:  11: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x43) [0x55638b82bd43]
122:  12: (testing::UnitTest::Run()+0x44) [0x55638b822db4]
122:  13: (main()+0x16d) [0x55638b7f862d]
122:  14: (__libc_start_main()+0xf0) [0x7f16a6ea7830]
122:  15: (_start()+0x29) [0x55638b7f9359]
122:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
122: 
122: --- logging levels ---
122:    0/ 5 none
122:    0/ 1 lockdep
122:    0/ 1 context
122:    1/ 1 crush
122:    1/ 5 mds
122:    1/ 5 mds_balancer
122:    1/ 5 mds_locker
122:    1/ 5 mds_log
122:    1/ 5 mds_log_expire
122:    1/ 5 mds_migrator
122:    0/ 1 buffer
122:    0/ 1 timer
122:    0/ 1 filer
122:    0/ 1 striper
122:    0/ 1 objecter
122:    0/ 5 rados
122:    0/ 5 rbd
122:    0/ 5 rbd_mirror
122:    0/ 5 rbd_replay
122:    0/ 5 journaler
122:    0/ 5 objectcacher
122:    0/ 5 client
122:    1/ 5 osd
122:    0/ 5 optracker
122:    0/ 5 objclass
122:    1/ 3 filestore
122:    1/ 3 journal
122:    0/ 0 ms
122:    1/ 5 mon
122:    0/10 monc
122:    1/ 5 paxos
122:    0/ 5 tp
122:    1/ 5 auth
122:    1/ 5 crypto
122:    1/ 1 finisher
122:    1/ 1 reserver
122:    1/ 5 heartbeatmap
122:    1/ 5 perfcounter
122:    1/ 5 rgw
122:    1/10 civetweb
122:    1/ 5 javaclient
122:    1/ 5 asok
122:    1/ 1 throttle
122:    0/ 0 refs
122:    1/ 5 xio
122:    1/ 5 compressor
122:    1/ 5 bluestore
122:    1/20 bluefs
122:    1/20 bdev
122:    1/ 5 kstore
122:    4/ 5 rocksdb
122:    4/ 5 leveldb
122:    4/ 5 memdb
122:    1/ 5 kinetic
122:    1/ 5 fuse
122:    1/ 5 mgr
122:    1/ 5 mgrc
122:    1/ 5 dpdk
122:    1/ 5 eventtrace
122:   -2/-2 (syslog threshold)
122:   99/99 (stderr threshold)
122:   max_recent       500
122:   max_new         1000
122:   log_file 
122: --- end dump of recent events ---
122: *** Caught signal (Aborted) **
122:  in thread 7f16b14f9840 thread_name:unittest_bluefs
122:  ceph version 12.2.5-44-gc9be648 (c9be64881ef8471b139d15b0570951506f4bce5b) luminous (stable)
122:  1: (()+0xcb014) [0x55638b88b014]
122:  2: (()+0x11390) [0x7f16b10e8390]
122:  3: (gsignal()+0x38) [0x7f16a6ebc428]
122:  4: (abort()+0x16a) [0x7f16a6ebe02a]
122:  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x7f16a7fc421e]
122:  6: (interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> >::intersection_size_asym(interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&, interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&)+0x15aa) [0x55638b87386a]
122:  7: (StupidAllocator::init_rm_free(unsigned long, unsigned long)+0x269b) [0x55638b86ba1b]
122:  8: (BlueFS::mount()+0x3fa) [0x55638b83f63a]
122:  9: (BlueFS_test_replay_Test::TestBody()+0x542) [0x55638b801b82]
122:  10: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x43) [0x55638b82b863]
122:  11: (testing::Test::Run()+0xba) [0x55638b82257a]
122:  12: (testing::TestInfo::Run()+0x118) [0x55638b8226c8]
122:  13: (testing::TestCase::Run()+0xb5) [0x55638b8227a5]
122:  14: (testing::internal::UnitTestImpl::RunAllTests()+0x24f) [0x55638b822a7f]
122:  15: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x43) [0x55638b82bd43]
122:  16: (testing::UnitTest::Run()+0x44) [0x55638b822db4]
122:  17: (main()+0x16d) [0x55638b7f862d]
122:  18: (__libc_start_main()+0xf0) [0x7f16a6ea7830]
122:  19: (_start()+0x29) [0x55638b7f9359]
122: 2018-05-08 01:47:46.205943 7f16b14f9840 -1 *** Caught signal (Aborted) **
122:  in thread 7f16b14f9840 thread_name:unittest_bluefs
122: 
122:  ceph version 12.2.5-44-gc9be648 (c9be64881ef8471b139d15b0570951506f4bce5b) luminous (stable)
122:  1: (()+0xcb014) [0x55638b88b014]
122:  2: (()+0x11390) [0x7f16b10e8390]
122:  3: (gsignal()+0x38) [0x7f16a6ebc428]
122:  4: (abort()+0x16a) [0x7f16a6ebe02a]
122:  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x7f16a7fc421e]
122:  6: (interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> >::intersection_size_asym(interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&, interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&)+0x15aa) [0x55638b87386a]
122:  7: (StupidAllocator::init_rm_free(unsigned long, unsigned long)+0x269b) [0x55638b86ba1b]
122:  8: (BlueFS::mount()+0x3fa) [0x55638b83f63a]
122:  9: (BlueFS_test_replay_Test::TestBody()+0x542) [0x55638b801b82]
122:  10: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x43) [0x55638b82b863]
122:  11: (testing::Test::Run()+0xba) [0x55638b82257a]
122:  12: (testing::TestInfo::Run()+0x118) [0x55638b8226c8]
122:  13: (testing::TestCase::Run()+0xb5) [0x55638b8227a5]
122:  14: (testing::internal::UnitTestImpl::RunAllTests()+0x24f) [0x55638b822a7f]
122:  15: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x43) [0x55638b82bd43]
122:  16: (testing::UnitTest::Run()+0x44) [0x55638b822db4]
122:  17: (main()+0x16d) [0x55638b7f862d]
122:  18: (__libc_start_main()+0xf0) [0x7f16a6ea7830]
122:  19: (_start()+0x29) [0x55638b7f9359]
122:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
122: 
122: --- begin dump of recent events ---
122:      0> 2018-05-08 01:47:46.205943 7f16b14f9840 -1 *** Caught signal (Aborted) **
122:  in thread 7f16b14f9840 thread_name:unittest_bluefs
122: 
122:  ceph version 12.2.5-44-gc9be648 (c9be64881ef8471b139d15b0570951506f4bce5b) luminous (stable)
122:  1: (()+0xcb014) [0x55638b88b014]
122:  2: (()+0x11390) [0x7f16b10e8390]
122:  3: (gsignal()+0x38) [0x7f16a6ebc428]
122:  4: (abort()+0x16a) [0x7f16a6ebe02a]
122:  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x7f16a7fc421e]
122:  6: (interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> >::intersection_size_asym(interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&, interval_set<unsigned long, btree::btree_map<unsigned long, unsigned long, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair<unsigned long const, unsigned long> >, 256> > const&)+0x15aa) [0x55638b87386a]
122:  7: (StupidAllocator::init_rm_free(unsigned long, unsigned long)+0x269b) [0x55638b86ba1b]
122:  8: (BlueFS::mount()+0x3fa) [0x55638b83f63a]
122:  9: (BlueFS_test_replay_Test::TestBody()+0x542) [0x55638b801b82]
122:  10: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x43) [0x55638b82b863]
122:  11: (testing::Test::Run()+0xba) [0x55638b82257a]
122:  12: (testing::TestInfo::Run()+0x118) [0x55638b8226c8]
122:  13: (testing::TestCase::Run()+0xb5) [0x55638b8227a5]
122:  14: (testing::internal::UnitTestImpl::RunAllTests()+0x24f) [0x55638b822a7f]
122:  15: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x43) [0x55638b82bd43]
122:  16: (testing::UnitTest::Run()+0x44) [0x55638b822db4]
122:  17: (main()+0x16d) [0x55638b7f862d]
122:  18: (__libc_start_main()+0xf0) [0x7f16a6ea7830]
122:  19: (_start()+0x29) [0x55638b7f9359]
122:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
122: 
122: --- logging levels ---
122:    0/ 5 none
122:    0/ 1 lockdep
122:    0/ 1 context
122:    1/ 1 crush
122:    1/ 5 mds
122:    1/ 5 mds_balancer
122:    1/ 5 mds_locker
122:    1/ 5 mds_log
122:    1/ 5 mds_log_expire
122:    1/ 5 mds_migrator
122:    0/ 1 buffer
122:    0/ 1 timer
122:    0/ 1 filer
122:    0/ 1 striper
122:    0/ 1 objecter
122:    0/ 5 rados
122:    0/ 5 rbd
122:    0/ 5 rbd_mirror
122:    0/ 5 rbd_replay
122:    0/ 5 journaler
122:    0/ 5 objectcacher
122:    0/ 5 client
122:    1/ 5 osd
122:    0/ 5 optracker
122:    0/ 5 objclass
122:    1/ 3 filestore
122:    1/ 3 journal
122:    0/ 0 ms
122:    1/ 5 mon
122:    0/10 monc
122:    1/ 5 paxos
122:    0/ 5 tp
122:    1/ 5 auth
122:    1/ 5 crypto
122:    1/ 1 finisher
122:    1/ 1 reserver
122:    1/ 5 heartbeatmap
122:    1/ 5 perfcounter
122:    1/ 5 rgw
122:    1/10 civetweb
122:    1/ 5 javaclient
122:    1/ 5 asok
122:    1/ 1 throttle
122:    0/ 0 refs
122:    1/ 5 xio
122:    1/ 5 compressor
122:    1/ 5 bluestore
122:    1/20 bluefs
122:    1/20 bdev
122:    1/ 5 kstore
122:    4/ 5 rocksdb
122:    4/ 5 leveldb
122:    4/ 5 memdb
122:    1/ 5 kinetic
122:    1/ 5 fuse
122:    1/ 5 mgr
122:    1/ 5 mgrc
122:    1/ 5 dpdk
122:    1/ 5 eventtrace
122:   -2/-2 (syslog threshold)
122:   99/99 (stderr threshold)
122:   max_recent       500
122:   max_new         1000
122:   log_file 
122: --- end dump of recent events ---
148/148 Test #122: unittest_bluefs .........................***Exception: Other478.98 sec

99% tests passed, 1 tests failed out of 148

Total Test time (real) = 614.48 sec

The following tests FAILED:
    122 - unittest_bluefs (OTHER_FAULT)
Errors while running CTest


Files

LastTest.log (93 KB) LastTest.log Yuri Weinstein, 05/11/2018 09:37 PM

Related issues 1 (0 open1 closed)

Copied to bluestore - Backport #24132: luminous: "122 - unittest_bluefs (OTHER_FAULT)" during ctest runResolvedKefu ChaiActions
Actions #1

Updated by Josh Durgin almost 6 years ago

  • Project changed from RADOS to bluestore
  • Priority changed from Normal to High
Actions #2

Updated by Igor Fedotov almost 6 years ago

Mine isn't crashing.

Wondering if this is reproducible?

And is standalone bin/unittest_bluefs run failing?

Actions #3

Updated by Yuri Weinstein almost 6 years ago

I see it every time on luminous PRs testing (see on smithi123)

It's reproducible and also failed as standalone (as ctest -R unittest_bluefs ) =>

@smithi123:/home/yuriw/wip1/build$ vi Testing/Temporary/LastTest.log

see attached

Actions #4

Updated by Kefu Chai almost 6 years ago

  • Copied to Backport #24132: luminous: "122 - unittest_bluefs (OTHER_FAULT)" during ctest run added
Actions #5

Updated by Kefu Chai almost 6 years ago

  • Status changed from New to Pending Backport
  • Assignee set to Kefu Chai
  • Backport set to luminous
Actions #6

Updated by Kefu Chai almost 6 years ago

  • Status changed from Pending Backport to Resolved
Actions

Also available in: Atom PDF