Bug #21149

closed

SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())

Added by shangzhong zhu over 6 years ago. Updated about 5 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS): Hadoop/Java
Labels (FS): Java/Hadoop
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When I run the Hadoop write test, the following assertion failure occurs intermittently (not on every run):

/clove/vm/renhw/ceph/rpmbuild/BUILD/ceph-12.1.0.3/src/log/SubsystemMap.h: In function 'bool ceph::logging::SubsystemMap::should_gather(unsigned int, int)' thread 7f8d54f93700 time 2017-08-26 09:56:14.921849
/clove/vm/renhw/ceph/rpmbuild/BUILD/ceph-12.1.0.3/src/log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
ceph version 12.1.2-593-gb3caae4 (b3caae4223d6182e56ae979497e76c21cdad0f86) luminous (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f8d210db420]
2: (()+0x4310) [0x7f8d35283310]
3: (Java_com_ceph_fs_CephMount_native_1ceph_1conf_1read_1file()+0x3bc) [0x7f8d3528565c]
4: [0x7f8d3d017774]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted

Actions #1

Updated by shangzhong zhu over 6 years ago

I think the assertion is triggered because a debug message is written before the Ceph config has been read.
PR https://github.com/ceph/ceph/pull/17157
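
To illustrate the suspected failure mode, here is a minimal sketch in C++. It is a toy model, not the real ceph::logging::SubsystemMap and not the change made in the PR: ToySubsystemMap, add(), is_registered() and should_gather_checked() are hypothetical names, and the level comparison is an assumption. It only shows how a bounds assert of the form sub < m_subsys.size() fires when a subsystem index is queried before it has been registered, and one possible defensive check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a logging subsystem table.
// This is NOT the real ceph::logging::SubsystemMap.
struct Subsystem {
    std::string name;
    int log_level;
    int gather_level;
};

class ToySubsystemMap {
public:
    // Register a subsystem at a fixed index, growing the table as needed.
    void add(unsigned sub, const std::string& name, int log, int gather) {
        if (sub >= m_subsys.size()) {
            m_subsys.resize(sub + 1);
        }
        m_subsys[sub] = Subsystem{name, log, gather};
    }

    bool is_registered(unsigned sub) const {
        return sub < m_subsys.size();
    }

    // Same shape as the failing check: an unknown index aborts the process.
    bool should_gather(unsigned sub, int level) const {
        assert(sub < m_subsys.size());
        return level <= m_subsys[sub].gather_level ||
               level <= m_subsys[sub].log_level;
    }

    // Defensive variant (an assumption, not the PR's approach):
    // an unregistered subsystem simply produces no log output.
    bool should_gather_checked(unsigned sub, int level) const {
        return is_registered(sub) && should_gather(sub, level);
    }

private:
    std::vector<Subsystem> m_subsys;
};
```

In this model, calling should_gather(3, 20) on an empty map aborts, while should_gather_checked(3, 20) returns false until add(3, "javaclient", 20, 30) has been called.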

Actions #2

Updated by shangzhong zhu over 6 years ago

[hadoop@ceph149 hadoop]$ cat /etc/ceph/ceph.conf
[global]
fsid = 99bf903b-b1e9-49de-afd6-2d7897bfd3c5
mon_initial_members = ceph147
mon_host = 192.9.9.147
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
mon_allow_pool_delete = true

mon_osd_min_up_ratio = 0
mon_osd_min_in_ratio = 0

enable experimental unrecoverable data corrupting features = bluestore rocksdb
osd objectstore = bluestore

bluestore_allocator = stupid
bluefs_allocator = stupid

[osd]
bluestore = true

[client]
debug client = 20/30
debug javaclient = 20/30
#log_file = /tmp/client.log
#client_trace = /tmp/client_trace.log

#[mds]
#debug mds = 20/30
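
For readers unfamiliar with the "debug client = 20/30" style values above: as I understand the Ceph convention, the first number is the log level and the second is the in-memory (gather) level. A minimal C++ sketch of splitting such a value follows. parse_debug_level is a hypothetical helper, not a Ceph function, and treating a bare "20" as the same value for both levels is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not part of Ceph: split "20/30" into
// (log_level, gather_level). A bare "20" is used for both levels.
std::pair<int, int> parse_debug_level(const std::string& value) {
    assert(!value.empty());
    const std::size_t slash = value.find('/');
    if (slash == std::string::npos) {
        const int v = std::stoi(value);
        return {v, v};
    }
    return {std::stoi(value.substr(0, slash)),
            std::stoi(value.substr(slash + 1))};
}
```

With the config above, parse_debug_level("20/30") yields a log level of 20 and a gather level of 30 for both the client and javaclient subsystems.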

With this config, the assertion can be reproduced by running the Hadoop write test:

[hadoop@ceph149 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha4-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 1
2017-09-20 16:12:56,318 INFO fs.TestDFSIO: TestDFSIO.1.8
2017-09-20 16:12:56,323 INFO fs.TestDFSIO: nrFiles = 10
2017-09-20 16:12:56,323 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
2017-09-20 16:12:56,323 INFO fs.TestDFSIO: bufferSize = 1000000
2017-09-20 16:12:56,323 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
2017-09-20 16:12:57,116 INFO fs.TestDFSIO: creating control file: 1048576 bytes, 10 files
2017-09-20 16:12:57,530 INFO fs.TestDFSIO: created control files for: 10 files
2017-09-20 16:12:57,750 INFO client.RMProxy: Connecting to ResourceManager at /192.9.9.148:8032
2017-09-20 16:12:58,119 INFO client.RMProxy: Connecting to ResourceManager at /192.9.9.148:8032
2017-09-20 16:12:58,552 INFO mapred.FileInputFormat: Total input files to process : 10
2017-09-20 16:12:58,604 INFO mapreduce.JobSubmitter: number of splits:10
2017-09-20 16:12:58,807 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2017-09-20 16:12:58,807 INFO Configuration.deprecation: dfs.permissions is deprecated. Instead, use dfs.permissions.enabled
2017-09-20 16:12:58,808 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2017-09-20 16:12:59,033 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1505892902374_0010
/clove/vm/renhw/ceph/rpmbuild/BUILD/ceph-12.1.0.3/src/log/SubsystemMap.h: In function 'bool ceph::logging::SubsystemMap::should_gather(unsigned int, int)' thread 7f7d7edb9700 time 2017-09-20 16:12:59.225384
/clove/vm/renhw/ceph/rpmbuild/BUILD/ceph-12.1.0.3/src/log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
ceph version 12.1.2-593-gb3caae4 (b3caae4223d6182e56ae979497e76c21cdad0f86) luminous (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f7d4bd69420]
2: (()+0x4310) [0x7f7d54a7a310]
3: (Java_com_ceph_fs_CephMount_native_1ceph_1conf_1read_1file()+0x3bc) [0x7f7d54a7c65c]
4: [0x7f7d68e0e774]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted

Actions #3

Updated by Patrick Donnelly over 6 years ago

  • Project changed from Ceph to CephFS
  • Category changed from 129 to 48
  • Assignee deleted (Jos Collin)
  • Source changed from Development to Community (user)
  • Component(FS) Hadoop/Java added

Actions #4

Updated by Patrick Donnelly about 5 years ago

  • Status changed from New to Rejected

Java/Hadoop testing is no longer a priority.

Actions #5

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (48)
  • Labels (FS) Java/Hadoop added