Bug #47310

enoent when reading osdmap

Added by Kefu Chai over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

INFO  2020-09-05 06:01:17,250 [shard 0] osd - start
DEBUG 2020-09-05 06:01:17,250 [shard 0] filestore - mount
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - list_collections
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
WARN  2020-09-05 06:01:18,575 [shard 0] osd - ignoring unrecognized collection: meta
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,575 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - omap_get_values
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - omap_get_values
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - omap_get_values
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - omap_get_values
...
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - omap_get_values
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,576 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,577 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,577 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,577 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,577 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,578 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - read
DEBUG 2020-09-05 06:01:18,579 [shard 0] osd -  pg_epoch 34 pg[3.d( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) [] r=-1 lpr=0 crt=0'0 mlcod 0'0 unknown enter Initial
INFO  2020-09-05 06:01:18,579 [shard 0] osd - Entering state: Initial
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - open_collection
DEBUG 2020-09-05 06:01:18,579 [shard 0] filestore - open_collection
terminate called after throwing an instance of 'std::runtime_error'
  what():  read gave enoent on #-1:37b8e826:::osdmap.37:0#
Aborting on shard 0.
Backtrace:
  0x00000000012e16dc
  0x00000000012a46f0
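
For illustration only, a minimal standalone C++ sketch of the failure mode (this is not crimson's actual read path; meta_store, read_osdmap and the stored epochs are hypothetical): the meta object for osdmap.37 is not present, so the read yields ENOENT, which is surfaced as an uncaught std::runtime_error and aborts the shard.

// Hypothetical stand-in for the meta collection: epoch -> encoded osdmap.
// Only epochs 33 and 34 are "on disk", mirroring the situation below.
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>

std::map<unsigned, std::string> meta_store = {{33, "..."}, {34, "..."}};

std::string read_osdmap(unsigned epoch) {
  auto it = meta_store.find(epoch);
  if (it == meta_store.end()) {
    // mirrors "read gave enoent on #-1:...:::osdmap.37:0#"
    throw std::runtime_error("read gave enoent on osdmap." + std::to_string(epoch));
  }
  return it->second;
}

int main() {
  try {
    read_osdmap(37);                    // epoch newer than anything stored
  } catch (const std::exception& e) {
    std::cerr << e.what() << '\n';      // in the OSD this propagates and aborts shard 0
  }
}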

Files

ceph-osd.0.log.bz2 (670 KB), Kefu Chai, 09/05/2020 07:12 AM
#2

Updated by Kefu Chai over 3 years ago

Backtrace:
[Backtrace #0]
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/include/seastar/util/backtrace.hh:59
non-virtual thunk to std::_Nested_exception<std::runtime_error>::~_Nested_exception() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:730
 (inlined by) print_with_backtrace at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:751
print_with_backtrace at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:758
non-virtual thunk to std::_Nested_exception<std::runtime_error>::~_Nested_exception() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:3517
 (inlined by) operator() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:3499
 (inlined by) _FUN at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:3495
.annobin_funlockfile.c_end at funlockfile.c:?
__GI___libc_sigaction at :?
__isoc99_fscanf.cold.0 at isoc99_fscanf.c:?
?? ??:0
std::rethrow_exception(std::__exception_ptr::exception_ptr) at ??:?
__cxa_throw_bad_array_new_length at ??:?
__gxx_personality_v0 at ??:?
_Unwind_GetTextRelBase at ??:?
_Unwind_Resume at ??:?
?? at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/crimson/common/errorator.h:222
 (inlined by) ?? at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/crimson/common/errorator.h:417
operator()<seastar::future<ceph::buffer::v15_2_0::list> > at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/crimson/common/errorator.h:653
 (inlined by) operator() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/include/seastar/core/future.hh:1640
 (inlined by) ?? at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/include/seastar/core/future.hh:2038
 (inlined by) operator() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/include/seastar/core/future.hh:1639
 (inlined by) run_and_dispose at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/include/seastar/core/future.hh:648
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:2192
non-virtual thunk to std::_Nested_exception<std::runtime_error>::~_Nested_exception() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:2608
non-virtual thunk to std::_Nested_exception<std::runtime_error>::~_Nested_exception() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:2591
 (inlined by) seastar::reactor::run() at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/reactor.cc:2763
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/seastar/src/core/app-template.cc:207
main at /usr/src/debug/ceph-16.0.0-5167.g0f0addb4d7.el8.x86_64/src/crimson/osd/main.cc:147
check_one_fd at check_fds.c:?
_start at ??:?
#3

Updated by Kefu Chai over 3 years ago

$ grep '=== osd_map' ceph-osd.0.log
DEBUG 2020-09-05 05:38:43,594 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #5 === osd_map(7..7 src has 1..7) v4 (41)
DEBUG 2020-09-05 05:38:43,594 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #7 === osd_map(1..7 src has 1..7) v4 (41)
DEBUG 2020-09-05 05:38:44,588 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #8 === osd_map(8..8 src has 1..8) v4 (41)
DEBUG 2020-09-05 05:38:45,591 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #10 === osd_map(9..9 src has 1..9) v4 (41)
DEBUG 2020-09-05 05:38:49,943 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #12 === osd_map(10..11 src has 1..11) v4 (41)
DEBUG 2020-09-05 05:38:50,946 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #13 === osd_map(12..12 src has 1..12) v4 (41)
DEBUG 2020-09-05 05:38:50,947 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #15 === osd_map(11..12 src has 1..12) v4 (41)
DEBUG 2020-09-05 05:38:53,635 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #16 === osd_map(13..13 src has 1..13) v4 (41)
DEBUG 2020-09-05 05:38:59,971 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #18 === osd_map(14..14 src has 1..14) v4 (41)
DEBUG 2020-09-05 05:39:00,971 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #19 === osd_map(14..15 src has 1..15) v4 (41)
DEBUG 2020-09-05 05:39:00,977 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #20 === osd_map(15..15 src has 1..15) v4 (41)
DEBUG 2020-09-05 05:39:00,978 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #22 === osd_map(14..15 src has 1..15) v4 (41)
DEBUG 2020-09-05 05:39:03,020 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #23 === osd_map(16..17 src has 1..17) v4 (41)
DEBUG 2020-09-05 05:39:06,962 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #24 === osd_map(18..18 src has 1..18) v4 (41)
DEBUG 2020-09-05 05:39:09,080 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #25 === osd_map(19..19 src has 1..19) v4 (41)
DEBUG 2020-09-05 05:39:14,968 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #26 === osd_map(20..20 src has 1..20) v4 (41)
DEBUG 2020-09-05 05:39:15,008 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #27 === osd_map(19..20 src has 1..20) v4 (41)
DEBUG 2020-09-05 05:39:15,047 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #28 === osd_map(21..21 src has 1..21) v4 (41)
DEBUG 2020-09-05 05:39:22,969 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #29 === osd_map(22..23 src has 1..23) v4 (41)
DEBUG 2020-09-05 05:39:24,040 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #30 === osd_map(23..24 src has 1..24) v4 (41)
DEBUG 2020-09-05 05:39:30,065 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #31 === osd_map(25..26 src has 1..26) v4 (41)
DEBUG 2020-09-05 05:39:33,072 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #32 === osd_map(26..27 src has 1..27) v4 (41)
DEBUG 2020-09-05 05:39:38,083 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #33 === osd_map(28..29 src has 1..29) v4 (41)
DEBUG 2020-09-05 05:39:39,092 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #34 === osd_map(29..30 src has 1..30) v4 (41)
DEBUG 2020-09-05 05:39:39,103 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #35 === osd_map(29..30 src has 1..30) v4 (41)
DEBUG 2020-09-05 05:39:42,625 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #36 === osd_map(31..31 src has 1..31) v4 (41)
DEBUG 2020-09-05 05:39:45,105 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #37 === osd_map(32..32 src has 1..32) v4 (41)
DEBUG 2020-09-05 05:39:48,109 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #38 === osd_map(33..33 src has 1..33) v4 (41)
DEBUG 2020-09-05 05:39:49,108 [shard 0] ms - [osd.0(client) v2:172.21.15.36:6800/18918@56360 >> mon.0 v2:172.21.15.36:3300/0] <== #39 === osd_map(34..34 src has 1..34) v4 (41)

So the latest osdmap that osd.0 should have possessed was osdmap.34.
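
For reference, a small C++ sketch of how the grep output above translates into that conclusion (newest_received and the sample lines are hypothetical, not a real tool): fold the "osd_map(first..last ...)" ranges and keep the largest "last", which for the messages above is 34.

#include <algorithm>
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

// Scan log lines for "osd_map(first..last" and return the highest "last"
// epoch that was ever delivered to the OSD.
unsigned newest_received(const std::vector<std::string>& lines) {
  unsigned newest = 0;
  for (const auto& l : lines) {
    unsigned first = 0, last = 0;
    auto pos = l.find("osd_map(");
    if (pos != std::string::npos &&
        std::sscanf(l.c_str() + pos, "osd_map(%u..%u", &first, &last) == 2) {
      newest = std::max(newest, last);
    }
  }
  return newest;
}

int main() {
  std::vector<std::string> lines = {
    "... osd_map(33..33 src has 1..33) v4 (41)",
    "... osd_map(34..34 src has 1..34) v4 (41)",
  };
  std::cout << newest_received(lines) << '\n';   // prints 34
}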

DEBUG 2020-09-05 05:39:50,471 [shard 0] osd - replicated_request(id=1966, detail=RepRequest(from=3 req=osd_repop(client.4225.0:6820 3.b e35/14 3:d9683550:::smithi03620586-884:head v 35'239, mlcod=35'239) v3)) start

DEBUG 2020-09-05 05:39:50,471 [shard 0] ms - [osd.0(cluster) v2:172.21.15.36:6801/18918 >> osd.3 v2:172.21.15.189:6805/18782@49592] <== #1400 === osd_repop(client.4225.0:6820 3.b e35/14) v3 (112)

/////

DEBUG 2020-09-05 05:39:52,118 [shard 0] ms - [osd.0(cluster) v2:172.21.15.36:6801/18918 >> osd.3 v2:172.21.15.189:6805/18782@49592] <== #1416 === osd_repop(client.4225.0:6905 3.1 e37/14) v3 (112)

DEBUG 2020-09-05 05:39:52,118 [shard 0] osd - replicated_request(id=1986, detail=RepRequest(from=3 req=osd_repop(client.4225.0:6905 3.1 e37/14 3:8e8db84e:::smithi03620586-669:head v 37'315, mlcod=37'315) v3)) start

DEBUG 2020-09-05 05:39:52,118 [shard 0] osd -  pg_epoch 34 pg[3.1( v 37'315 (0'0,37'315] local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=14) [3,1,0] r=2 lpr=14 luod=0'0 lua=0'0 crt=36'314 lcod 36'314 mlcod 36'314 active add_log_entry 37'315 (36'314) modify   3:8e8db84e:::smithi03620586-669:head by client.4225.0:6905 2020-09-05T05:39:52.113436+0000 0 ObjectCleanRegions clean_offsets: [0~1442323,1961464~18446744073707590151], clean_omap: 1, new_object: 0

The primary was sending osd.0 replicated requests at e35 and later e37, but the "min_epoch" of MOSDRepOp was 14, so osd.0 proceeded with these operations even though it only had osdmaps up to epoch 34.
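
A minimal C++ sketch of the suspected logic (assumed, not the actual crimson code path; RepOp, handle_rep_op, osd_epoch and newest_stored are hypothetical): the admission check only requires the OSD's map to reach the message's min_epoch (14), but servicing the op later needs the map at the op's own epoch (37), which osd.0, still at 34, has never stored.

#include <iostream>
#include <stdexcept>
#include <string>

struct RepOp { unsigned epoch; unsigned min_epoch; };

constexpr unsigned osd_epoch    = 34;   // newest map osd.0 has applied
constexpr unsigned newest_stored = 34;  // newest osdmap object on disk

void handle_rep_op(const RepOp& op) {
  // the admission check only asks whether our map reaches min_epoch...
  if (osd_epoch >= op.min_epoch) {
    // ...but handling the op later needs the map at op.epoch itself
    if (op.epoch > newest_stored) {
      throw std::runtime_error("read gave enoent on osdmap." +
                               std::to_string(op.epoch));
    }
  }
}

int main() {
  try {
    handle_rep_op(RepOp{37, 14});       // e37/14, as in the log above
  } catch (const std::exception& e) {
    std::cerr << e.what() << '\n';      // matches the abort in the description
  }
}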

#4

Updated by Kefu Chai over 3 years ago

  • Status changed from New to Resolved
  • Assignee set to Xuehan Xu
  • Pull request ID set to 37003