Bug #12815

closed

Stopping the rbd map command crashes the Linux kernel.

Added by jack ma over 8 years ago. Updated over 8 years ago.

Status: Closed
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: krbd
Crash signature (v1):
Crash signature (v2):

Description

My Ceph version is 0.87.

[root@song ~]# uname -a
Linux song 3.10.0-123.el7.x86_64 #1 SMP Thu Jun 4 17:17:49 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

I cloned a snapshot into a new pool, rbd.

Then I used the rbd map command to map the clone image in the rbd pool, but the command did not return. I stopped it with Ctrl+C, which led to a Linux kernel crash.

The system log:

Aug 28 18:39:46 song systemd1: Started Collect Read-Ahead Data.
Aug 28 18:39:46 song systemd1: Started Drop Read-Ahead Data.
Aug 28 18:39:48 song kdumpctl42874: No memory reserved for crash kernel.
Aug 28 18:39:48 song kdumpctl42874: Starting kdump: [FAILED]
Aug 28 18:39:48 song systemd1: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 28 18:39:48 song systemd1: Failed to start Crash recovery kernel arming.
Aug 28 18:39:48 song systemd1: Unit kdump.service entered failed state.
Aug 28 18:39:48 song openstack-guard1354: Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.
Aug 28 18:39:51 song kernel: [ 614.716802] Key type dns_resolver registered
Aug 28 18:39:51 song kernel: [ 614.731813] Key type ceph registered
Aug 28 18:39:51 song kernel: [ 614.733494] libceph: loaded (mon/osd proto 15/24)
Aug 28 18:39:51 song kernel: [ 614.739723] rbd: loaded rbd (rados block device)
Aug 28 18:39:51 song kernel: [ 614.744911] libceph: client5549 fsid 04164f07-5a2a-44f2-be6f-810d3c83f448
Aug 28 18:39:51 song kernel: [ 614.751753] libceph: mon0 10.118.202.189:6789 session established
Aug 28 18:39:51 song kernel: [ 614.770606] rbd: image im3_clone0: WARNING: kernel layering is EXPERIMENTAL!
Aug 28 18:39:51 song kernel: [ 614.798066] libceph: read_partial_message skipping long message (4796 > 4092)
Aug 28 18:40:01 song systemd1: Starting Session 5 of user root.
Aug 28 18:40:01 song systemd1: Started Session 5 of user root.
Aug 28 18:40:07 song systemd1: Starting Crash recovery kernel arming...
Aug 28 18:40:08 song systemd1: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Aug 28 18:40:08 song systemd1: Started Collect Read-Ahead Data.
Aug 28 18:40:08 song systemd1: Started Drop Read-Ahead Data.
Aug 28 18:40:10 song kdumpctl44292: No memory reserved for crash kernel.
Aug 28 18:40:10 song kdumpctl44292: Starting kdump: [FAILED]
Aug 28 18:40:10 song systemd1: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 28 18:40:10 song systemd1: Failed to start Crash recovery kernel arming.
Aug 28 18:40:10 song systemd1: Unit kdump.service entered failed state.
Aug 28 18:40:10 song openstack-guard1354: Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.
Aug 28 18:40:29 song systemd1: Starting Crash recovery kernel arming...
Aug 28 18:40:30 song systemd1: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Aug 28 18:40:30 song systemd1: Started Collect Read-Ahead Data.
Aug 28 18:40:30 song systemd1: Started Drop Read-Ahead Data.
Aug 28 18:40:32 song kdumpctl45690: No memory reserved for crash kernel.
Aug 28 18:40:32 song kdumpctl45690: Starting kdump: [FAILED]
Aug 28 18:40:32 song systemd1: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 28 18:40:32 song systemd1: Failed to start Crash recovery kernel arming.
Aug 28 18:40:32 song systemd1: Unit kdump.service entered failed state.
Aug 28 18:40:32 song openstack-guard1354: Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.
Aug 28 18:40:51 song systemd1: Starting Crash recovery kernel arming...
Aug 28 18:40:52 song systemd1: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Aug 28 18:40:52 song systemd1: Started Collect Read-Ahead Data.
Aug 28 18:40:52 song systemd1: Started Drop Read-Ahead Data.
Aug 28 18:40:54 song kdumpctl47088: No memory reserved for crash kernel.
Aug 28 18:40:54 song kdumpctl47088: Starting kdump: [FAILED]
Aug 28 18:40:54 song systemd1: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 28 18:40:54 song systemd1: Failed to start Crash recovery kernel arming.
Aug 28 18:40:54 song systemd1: Unit kdump.service entered failed state.
Aug 28 18:40:54 song openstack-guard1354: Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.
Aug 28 18:41:13 song systemd1: Starting Crash recovery kernel arming...
Aug 28 18:41:13 song systemd1: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Aug 28 18:41:13 song systemd1: Started Collect Read-Ahead Data.
Aug 28 18:41:13 song systemd1: Started Drop Read-Ahead Data.
Aug 28 18:41:15 song kdumpctl48485: No memory reserved for crash kernel.
Aug 28 18:41:15 song kdumpctl48485: Starting kdump: [FAILED]
Aug 28 18:41:15 song systemd1: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 28 18:41:15 song systemd1: Failed to start Crash recovery kernel arming.
Aug 28 18:41:15 song systemd1: Unit kdump.service entered failed state.
Aug 28 18:41:15 song openstack-guard1354: Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.
Aug 28 18:41:18 song kernel: [ 701.523636] rbd: image im3_clone0: unable to tear down watch request (-512)
Aug 28 18:41:18 song kernel: [ 701.523636]
Aug 28 18:41:18 song kernel: [ 701.527379] ------------[ cut here ]------------
Aug 28 18:41:18 song kernel: [ 701.528529] kernel BUG at net/ceph/osd_client.c:969!
Aug 28 18:41:18 song kernel: [ 701.529591] invalid opcode: 0000 [#1] SMP
Aug 28 18:41:18 song kernel: [ 701.530633] Modules linked in: rbd libceph dns_resolver ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables scsi_transport_iscsi openvswitch(OF) gre libcrc32c sg snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm ppdev snd_page_alloc snd_timer vmw_balloon snd coretemp serio_raw soundcore crc32c_intel pcspkr parport_pc i2c_piix4 shpchp parport vmw_vmci mperf sppdh(OF) binfmt_misc nfsd spp_rr(OF) dm_spp(OF) auth_rpcgss nfs_acl lockd vhost_net sunrpc tun macvtap macvlan ext4 mbcache jbd2 sd_mod sr_mod cdrom crc_t10dif crct10dif_common ata_generic pata_acpi vmwgfx ttm drm mptspi ata_piix scsi_transport_spi e1000 mptscsih libata mptbase i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: openvswitch]
Aug 28 18:41:18 song kernel: [ 701.539290] CPU: 2 PID: 43850 Comm: rbd Tainted: GF O-------------- 3.10.0-123.el7.x86_64 #1
Aug 28 18:41:18 song kernel: [ 701.541346] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
Aug 28 18:41:18 song kernel: [ 701.543404] task: ffff8800729b8000 ti: ffff880072f98000 task.ti: ffff880072f98000
Aug 28 18:41:18 song kernel: [ 701.544437] RIP: 0010:[<ffffffffa04cb68c>] [<ffffffffa04cb68c>] __remove_osd+0xbc/0xc0 [libceph]
Aug 28 18:41:18 song kernel: [ 701.545499] RSP: 0018:ffff880072f99d68 EFLAGS: 00010212
Aug 28 18:41:18 song kernel: [ 701.546516] RAX: ffff88006ecbdc98 RBX: ffff88006ecbd800 RCX: ffff880072f99fd8
Aug 28 18:41:18 song kernel: [ 701.547528] RDX: 0000000000000000 RSI: ffff88006ecbd800 RDI: ffff880036828750
Aug 28 18:41:18 song kernel: [ 701.548520] RBP: ffff880072f99d80 R08: ffff88006d9ee000 R09: 0000000180200016
Aug 28 18:41:18 song kernel: [ 701.549494] R10: ffffea0001b67b80 R11: ffffffffa04d220d R12: ffff880036828750
Aug 28 18:41:18 song kernel: [ 701.550450] R13: ffff8800368287a8 R14: ffff880073e26000 R15: ffff880072fe7800
Aug 28 18:41:18 song kernel: [ 701.551390] FS: 00007f9f71756880(0000) GS:ffff880077640000(0000) knlGS:0000000000000000
Aug 28 18:41:18 song kernel: [ 701.552342] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 28 18:41:18 song kernel: [ 701.553307] CR2: 00000000006dde20 CR3: 000000006ea7d000 CR4: 00000000000007e0
Aug 28 18:41:18 song kernel: [ 701.554334] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 28 18:41:18 song kernel: [ 701.555359] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 28 18:41:18 song kernel: [ 701.556318] Stack:
Aug 28 18:41:18 song kernel: [ 701.557287] ffff880036828750 ffff880036828750 ffff8800368287d0 ffff880072f99db8
Aug 28 18:41:18 song kernel: [ 701.558276] ffffffffa04d01bc 0000000000000001 ffff880072f99e00 ffff880036828000
Aug 28 18:41:18 song kernel: [ 701.559260] ffff88007465ba60 0000000000000000 ffff880072f99dd8 ffffffffa04bf530
Aug 28 18:41:18 song kernel: [ 701.560264] Call Trace:
Aug 28 18:41:18 song kernel: [ 701.561220] [<ffffffffa04d01bc>] ceph_osdc_stop+0x9c/0x130 [libceph]
Aug 28 18:41:18 song kernel: [ 701.562263] [<ffffffffa04bf530>] ceph_destroy_client+0x30/0x110 [libceph]
Aug 28 18:41:18 song kernel: [ 701.563225] [<ffffffff812d044d>] ? list_del+0xd/0x30
Aug 28 18:41:18 song kernel: [ 701.564171] [<ffffffffa050f9da>] rbd_client_release+0x4a/0xb0 [rbd]
Aug 28 18:41:18 song kernel: [ 701.565099] [<ffffffffa0510a25>] rbd_dev_destroy+0x65/0x70 [rbd]
Aug 28 18:41:18 song kernel: [ 701.566058] [<ffffffffa0516fc2>] rbd_add+0x9a2/0xbe8 [rbd]
Aug 28 18:41:18 song kernel: [ 701.567014] [<ffffffff813b4597>] bus_attr_store+0x27/0x30
Aug 28 18:41:18 song kernel: [ 701.567947] [<ffffffff81225716>] sysfs_write_file+0xc6/0x140
Aug 28 18:41:18 song kernel: [ 701.568866] [<ffffffff811afb0d>] vfs_write+0xbd/0x1e0
Aug 28 18:41:18 song kernel: [ 701.569754] [<ffffffff811b0558>] SyS_write+0x58/0xb0
Aug 28 18:41:18 song kernel: [ 701.570647] [<ffffffff815f20d9>] system_call_fastpath+0x16/0x1b
Aug 28 18:41:18 song kernel: [ 701.571490] Code: 89 c0 41 b9 c8 03 00 00 48 c7 c1 cc db 4d a0 31 d2 48 c7 c6 78 2e 4e a0 48 c7 c7 a8 bf 4e a0 31 c0 e8 89 da e0 e0 e9 63 ff ff ff <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 56 41 55 49 89 fd 41
Aug 28 18:41:18 song kernel: [ 701.574086] RIP [<ffffffffa04cb68c>] __remove_osd+0xbc/0xc0 [libceph]
Aug 28 18:41:18 song kernel: [ 701.574918] RSP <ffff880072f99d68>
Aug 28 18:41:18 song kernel: [ 701.575837] ---[ end trace 706c78aa94a3651d ]---


Related issues 1 (1 open, 0 closed)

Related to Linux kernel client - Feature #12874: krbd: get rid of RBD_MAX_SNAP_COUNT (New, Ilya Dryomov, 08/31/2015)

#1

Updated by Ilya Dryomov over 8 years ago

  • Status changed from New to Closed
  • Assignee set to Ilya Dryomov

There are two problems here. The first one is

Aug 28 18:39:51 song kernel: [ 614.798066] libceph: read_partial_message skipping long message (4796 > 4092)

and I think it happened because you tried to map a clone of an image with a lot of (598?) snapshots. There is currently an artificial limit of 510 snapshots - once you create a 511th snapshot, you can no longer map the image or any of its snapshots or clones. The failure mode is unfortunate, but yeah, Ctrl-C is the only way out of it. I'll file a ticket to get that limit lifted.
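The two numbers in that log line can be checked by hand. The sketch below assumes the snapshot-context reply is laid out as an 8-byte sequence number, a 4-byte snapshot count, and one 8-byte ID per snapshot. That layout is an assumption made for illustration, but it reproduces both figures: 510 snapshots give exactly 4092 bytes, and a 4796-byte reply corresponds to 598 snapshots. The function and constant names are hypothetical, not taken from the kernel source.

```python
# Assumed layout of the snapshot-context reply (illustrative, see lead-in):
#   8 bytes  sequence number (highest snapshot id)
#   4 bytes  snapshot count
#   8 bytes  per snapshot id
SEQ_BYTES = 8
COUNT_BYTES = 4
SNAP_ID_BYTES = 8
HEADER_BYTES = SEQ_BYTES + COUNT_BYTES


def reply_size(snap_count: int) -> int:
    """Size in bytes of a reply carrying `snap_count` snapshot ids."""
    if snap_count < 0:
        raise ValueError("snapshot count must be non-negative")
    return HEADER_BYTES + snap_count * SNAP_ID_BYTES


def snaps_in_reply(size: int) -> int:
    """Number of snapshot ids implied by a reply of `size` bytes."""
    payload = size - HEADER_BYTES
    if payload < 0 or payload % SNAP_ID_BYTES != 0:
        raise ValueError(f"{size} bytes does not fit the assumed layout")
    return payload // SNAP_ID_BYTES


if __name__ == "__main__":
    # Buffer the client is prepared to receive, for the 510-snapshot limit.
    print(reply_size(510))       # 4092
    # Snapshot count implied by the oversized message in the log.
    print(snaps_in_reply(4796))  # 598
```

Under these assumptions, the "4796 > 4092" in the log means the reply was 704 bytes (88 snapshot IDs) larger than the buffer sized for 510 snapshots.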

The second problem is the BUG_ON you hit after you interrupted rbd map. In newer kernels that BUG_ON was changed to a WARN_ON and the surrounding code was fixed so that it no longer triggers. This happened around kernel 3.17-3.18, if memory serves. I'm not sure, but I think newer Red Hat kernels have those fixes backported.
