Bug #5301
mon: leveldb crash in tcmalloc
Status: Closed
Description
Hello
I have replaced my crushmap:
# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 pool

# buckets
host localhost {
    id -2    # do not change unnecessarily
    # weight 2.000
    alg straw
    hash 0   # rjenkins1
    item osd.0 weight 1.000
    item osd.1 weight 1.000
    item osd.2 weight 1.000
}
rack localrack {
    id -3    # do not change unnecessarily
    # weight 2.000
    alg straw
    hash 0   # rjenkins1
    item localhost weight 3.000
}
pool default {
    id -1    # do not change unnecessarily
    # weight 3.000
    alg straw
    hash 0   # rjenkins1
    item localrack weight 3.000
}

# rules
rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}
rule metadata {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}
rule rbd {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}

# end crush map
with:
# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host n11c1 {
    id -4    # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0   # rjenkins1
    item osd.0 weight 1.000
}
host n14c1 {
    id -5    # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0   # rjenkins1
    item osd.1 weight 1.000
}
host n18c1 {
    id -6    # do not change unnecessarily
    # weight 1.000
    alg straw
    hash 0   # rjenkins1
    item osd.2 weight 1.000
}
root default {
    id -1    # do not change unnecessarily
    # weight 3.000
    alg straw
    hash 0   # rjenkins1
    item n11c1 weight 1.000
    item n14c1 weight 1.000
    item n18c1 weight 1.000
}

# rules
rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type host
    step emit
}
rule metadata {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type host
    step emit
}
rule rbd {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type host
    step emit
}

# end crush map
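For reference, the usual workflow for swapping in a crush map like this looks roughly as follows; the file names are just placeholders:

# ceph osd getcrushmap -o crushmap.bin        (dump the current binary map)
# crushtool -d crushmap.bin -o crushmap.txt   (decompile it to the text form shown above)
  ... edit crushmap.txt ...
# crushtool -c crushmap.txt -o crushmap.new   (recompile)
# ceph osd setcrushmap -i crushmap.new        (inject the new map)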
Cluster status after applying new crushmap:
# ceph -s
   health HEALTH_WARN 394 pgs stale; 394 pgs stuck stale
   monmap e7: 3 mons at {cc2=10.1.128.1:6789/0,n11c1=10.1.128.11:6789/0,n14c1=10.1.128.14:6789/0}, election epoch 110452, quorum 0,1,2 cc2,n11c1,n14c1
   osdmap e1862: 3 osds: 3 up, 3 in
   pgmap v8016076: 1104 pgs: 710 active+clean, 394 stale+active+clean; 24020 MB data, 50278 MB used, 616 GB / 670 GB avail
   mdsmap e7839: 1/1/1 up {0=n11c1=up:active}
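A quick way to see exactly which PGs are stuck is the pg dump_stuck command; the pg id in the second command is only an example:

# ceph pg dump_stuck stale        (list the 394 stale PGs)
# ceph pg map 2.1f                (show which OSDs a given pg currently maps to; 2.1f is a made-up pgid)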
Kernel rbd clients started to crash with the following error:
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.688107] libceph: osd2 10.1.128.18:6801 socket closed (con state OPEN)
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.688727] ------------[ cut here ]------------
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.689348] kernel BUG at /build/buildd-linux_3.8.13-1-amd64-YudJGj/linux-3.8.13/net/ceph/osd_client.c:601!
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.690389] invalid opcode: 0000 [#1] SMP
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.691060] Modules linked in: xt_owner xt_comment iptable_mangle ip_tables x_tables xfs cbc rbd libceph loop nfsd auth_rpcgss nfs_acl nfs lockd dns_resolver fscache sunrpc fuse coretemp crc32c_intel ghash_clmulni_intel snd_pcm aesni_intel aes_x86_64 snd_page_alloc ablk_helper snd_timer cryptd snd xts soundcore lrw gf128mul evdev pcspkr joydev ext4 crc16 jbd2 mbcache btrfs zlib_deflate crc32c libcrc32c xen_blkfront xen_netfront
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] CPU 0
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] Pid: 37, comm: kworker/0:1 Not tainted 3.8-2-amd64 #1 Debian 3.8.13-1
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] RIP: e030:[<ffffffffa029619d>] [<ffffffffa029619d>] osd_reset+0x105/0x195 [libceph]
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] RSP: e02b:ffff880003521ce8 EFLAGS: 00010206
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] RAX: ffff88007c9c0850 RBX: ffff88000375b748 RCX: 0000000000000000
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] RDX: ffff88007c9c0820 RSI: ffff88007fc19e05 RDI: ffff88007c6f64a8
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] RBP: ffff88007c9c0800 R08: ffff88000375b800 R09: 00000000fffffff8
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] R10: ffff88007b9f5e00 R11: ffff88007b9f5e00 R12: ffff88007cb9a420
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] R13: ffff88000375b758 R14: ffff88000375b7a0 R15: ffff88007c6f6468
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] FS: 00007f5631331700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] CR2: ffffffffff600400 CR3: 000000007b5f9000 CR4: 0000000000002660
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] Process kworker/0:1 (pid: 37, threadinfo ffff880003520000, task ffff88007aadf180)
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] Stack:
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] 0000000000000000 ffff88007c6f64b8 ffff88007c6f6030 ffff88007c6f6060
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] 0000000000000002 ffff88007c6f6428 ffffffffa02a25d0 ffff88007c6f6030
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] 0000000000000000 ffffffffa02925b7 ffffffff810040cf ffffffff81004273
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] Call Trace:
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffffa02925b7>] ? con_work+0x1b62/0x1c48 [libceph]
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff810040cf>] ? arch_local_irq_restore+0x7/0x8
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff81004273>] ? xen_mc_flush+0x11e/0x161
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff81003159>] ? xen_end_context_switch+0xe/0x1c
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff8100d02f>] ? load_TLS+0x7/0xa
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff8105fd0a>] ? mmdrop+0xd/0x1c
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff81061581>] ? finish_task_switch+0x83/0xb3
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff810532b1>] ? process_one_work+0x18d/0x2d3
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff810536bc>] ? worker_thread+0x118/0x1b2
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff810535a4>] ? rescuer_thread+0x187/0x187
Jun 11 14:25:08 i-10-1-73-197 kernel: [850269.692047] [<ffffffff81057358>] ? kthread+0x81/0x89
# ceph -v
ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532)
Updated by Sage Weil almost 11 years ago
- Status changed from New to 4
Hi-
The 3.8.y kernel is EOL, but I pushed a branch that has the patch that (I believe) fixes this problem: linux-3.8.y-ceph. There is also a 3.9.y-ceph branch that does the same for 3.9.x.
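If anyone wants to test that branch, fetching and building it would look roughly like this; I am assuming the branch lives in the usual ceph-client kernel repository, which I have not verified:

$ git clone git://github.com/ceph/ceph-client.git
$ cd ceph-client
$ git checkout linux-3.8.y-ceph               (or the 3.9.y branch mentioned above for 3.9.x kernels)
$ cp /boot/config-"$(uname -r)" .config       (start from the running kernel's config)
$ make oldconfig
$ make -j"$(nproc)" deb-pkg                   (builds installable .deb packages on Debian)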
Is this a repeatable problem, or something you saw once?
Updated by Maciej Galkiewicz almost 11 years ago
Well, I could try to reproduce it, but I am not going to because it is my production cluster. I have also experienced a mon crash after injecting the new crushmap:
-45> 2013-06-11 14:16:27.177075 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1137 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (573516434 0 0) 0x23ca500 con 0x26749a0
-44> 2013-06-11 14:16:27.177298 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.2 10.1.128.14:6789/0 331 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (3053160810 0 0) 0x18aef00 con 0x2674840
-43> 2013-06-11 14:16:28.460724 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.25111 10.1.40.80:0/591195587 11 ==== mon_subscribe({monmap=0+,osdmap=1858}) ==== 58+0+0 (430888968 0 0) 0x2d70700 con 0x2675ce0
-42> 2013-06-11 14:16:28.460780 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.40.80:0/591195587 -- mon_map v1 -- ?+0 0x3414000 con 0x2675ce0
-41> 2013-06-11 14:16:28.460797 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.25111 10.1.40.80:0/591195587 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2be9a40
-40> 2013-06-11 14:16:29.483944 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.25720 10.1.90.111:0/2843846930 7 ==== mon_subscribe({monmap=0+}) ==== 31+0+0 (4141960948 0 0) 0x2eca1c0 con 0x1ed58c0
-39> 2013-06-11 14:16:29.484000 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.90.111:0/2843846930 -- mon_map v1 -- ?+0 0x2fef860 con 0x1ed58c0
-38> 2013-06-11 14:16:29.484017 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.25720 10.1.90.111:0/2843846930 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2d70700
-37> 2013-06-11 14:16:30.176344 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.1 10.1.128.11:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x25c2280
-36> 2013-06-11 14:16:30.176390 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.2 10.1.128.14:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x1b41900
-35> 2013-06-11 14:16:30.177000 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.2 10.1.128.14:6789/0 332 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (750709856 0 0) 0x1b41900 con 0x2674840
-34> 2013-06-11 14:16:30.177193 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1138 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (3609491739 0 0) 0x25c2280 con 0x26749a0
-33> 2013-06-11 14:16:30.911412 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1139 ==== forward(mdsbeacon(27298/n11c1 up:active seq 193 v7839) v2) to leader v1 ==== 555+0+0 (188608702 0 0) 0x27dcf00 con 0x26749a0
-32> 2013-06-11 14:16:30.911572 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.128.11:6789/0 -- route(mdsbeacon(27298/n11c1 up:active seq 193 v7839) v2 tid 298) v2 -- ?+0 0x2013b00 con 0x26749a0
-31> 2013-06-11 14:16:32.226239 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.25985 10.1.109.87:0/1578239815 7 ==== mon_subscribe({monmap=0+}) ==== 31+0+0 (4141960948 0 0) 0x2be9dc0 con 0x2d6e420
-30> 2013-06-11 14:16:32.226292 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.109.87:0/1578239815 -- mon_map v1 -- ?+0 0x2fef2c0 con 0x2d6e420
-29> 2013-06-11 14:16:32.226307 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.25985 10.1.109.87:0/1578239815 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2eca1c0
-28> 2013-06-11 14:16:33.176416 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.1 10.1.128.11:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x2072780
-27> 2013-06-11 14:16:33.176462 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.2 10.1.128.14:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x2138f00
-26> 2013-06-11 14:16:33.177177 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1140 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (3114351694 0 0) 0x2072780 con 0x26749a0
-25> 2013-06-11 14:16:33.177381 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.2 10.1.128.14:6789/0 333 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (3498565493 0 0) 0x2138f00 con 0x2674840
-24> 2013-06-11 14:16:33.927586 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.25870 10.1.0.31:0/3237252543 12 ==== mon_subscribe({monmap=0+}) ==== 31+0+0 (4141960948 0 0) 0x2eca700 con 0x2d6e000
-23> 2013-06-11 14:16:33.927650 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.0.31:0/3237252543 -- mon_map v1 -- ?+0 0x2e74000 con 0x2d6e000
-22> 2013-06-11 14:16:33.927669 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.25870 10.1.0.31:0/3237252543 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2be9dc0
-21> 2013-06-11 14:16:34.570686 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.19948 10.1.0.17:0/2103056858 9 ==== mon_subscribe({monmap=0+,osdmap=1858}) ==== 58+0+0 (430888968 0 0) 0x2be9a40 con 0x2d6eb00
-20> 2013-06-11 14:16:34.570745 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.0.17:0/2103056858 -- mon_map v1 -- ?+0 0x2e750e0 con 0x2d6eb00
-19> 2013-06-11 14:16:34.570776 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.19948 10.1.0.17:0/2103056858 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2eca700
-18> 2013-06-11 14:16:34.626154 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.17599 10.1.0.175:0/2078230490 9 ==== mon_subscribe({monmap=0+,osdmap=1858}) ==== 58+0+0 (3019450415 0 0) 0x2d71500 con 0x2675600
-17> 2013-06-11 14:16:34.626210 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.0.175:0/2078230490 -- mon_map v1 -- ?+0 0x2e754a0 con 0x2675600
-16> 2013-06-11 14:16:34.626230 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.17599 10.1.0.175:0/2078230490 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2be9a40
-15> 2013-06-11 14:16:34.911589 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1141 ==== forward(mdsbeacon(27298/n11c1 up:active seq 194 v7839) v2) to leader v1 ==== 555+0+0 (3914741050 0 0) 0x27dd800 con 0x26749a0
-14> 2013-06-11 14:16:34.911749 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.128.11:6789/0 -- route(mdsbeacon(27298/n11c1 up:active seq 194 v7839) v2 tid 299) v2 -- ?+0 0x27d2900 con 0x26749a0
-13> 2013-06-11 14:16:35.403335 7f09c6710700 1 -- 10.1.128.1:6789/0 <== client.26201 10.1.73.197:0/3158215080 10 ==== mon_subscribe({monmap=0+,osdmap=1858}) ==== 58+0+0 (4177322872 0 0) 0x2ecae00 con 0x1ed42c0
-12> 2013-06-11 14:16:35.403392 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.73.197:0/3158215080 -- mon_map v1 -- ?+0 0x2e752c0 con 0x1ed42c0
-11> 2013-06-11 14:16:35.403411 7f09c6710700 1 -- 10.1.128.1:6789/0 --> client.26201 10.1.73.197:0/3158215080 -- mon_subscribe_ack(300s) v1 -- ?+0 0x2d71500
-10> 2013-06-11 14:16:36.176489 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.1 10.1.128.11:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x2138500
-9> 2013-06-11 14:16:36.176535 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.2 10.1.128.14:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x2139180
-8> 2013-06-11 14:16:36.177201 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.2 10.1.128.14:6789/0 334 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (1996422648 0 0) 0x2139180 con 0x2674840
-7> 2013-06-11 14:16:36.177396 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1142 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (3255419649 0 0) 0x2138500 con 0x26749a0
-6> 2013-06-11 14:16:38.911767 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1143 ==== forward(mdsbeacon(27298/n11c1 up:active seq 195 v7839) v2) to leader v1 ==== 555+0+0 (2625110559 0 0) 0x27ddb00 con 0x26749a0
-5> 2013-06-11 14:16:38.911940 7f09c6710700 1 -- 10.1.128.1:6789/0 --> 10.1.128.11:6789/0 -- route(mdsbeacon(27298/n11c1 up:active seq 195 v7839) v2 tid 300) v2 -- ?+0 0x2013b00 con 0x26749a0
-4> 2013-06-11 14:16:39.176560 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.1 10.1.128.11:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x1a1ef00
-3> 2013-06-11 14:16:39.176607 7f09c6f11700 1 -- 10.1.128.1:6789/0 --> mon.2 10.1.128.14:6789/0 -- paxos(lease lc 15584196 fc 15583846 pn 0 opn 0) v3 -- ?+0 0x2138a00
-2> 2013-06-11 14:16:39.177238 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.2 10.1.128.14:6789/0 335 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (731610213 0 0) 0x2138a00 con 0x2674840
-1> 2013-06-11 14:16:39.177441 7f09c6710700 1 -- 10.1.128.1:6789/0 <== mon.1 10.1.128.11:6789/0 1144 ==== paxos(lease_ack lc 15584196 fc 15583846 pn 0 opn 0) v3 ==== 80+0+0 (123262281 0 0) 0x1a1ef00 con 0x26749a0
0> 2013-06-11 14:16:39.179209 7f09c6710700 -1 *** Caught signal (Segmentation fault) **
in thread 7f09c6710700

ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532)
1: /usr/bin/ceph-mon() [0x5a08b2]
2: (()+0xf030) [0x7f09ce2dd030]
3: (tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)+0xeb) [0x7f09cd6f9c3b]
4: (tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long)+0x23) [0x7f09cd6f9cc3]
5: (operator delete(void*)+0x2db) [0x7f09cd70b2bb]
6: (std::vector<leveldb::Iterator*, std::allocator<leveldb::Iterator*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<leveldb::Iterator**, std::vector<leveldb::Iterator*, std::allocator<leveldb::Iterator*> > >, leveldb::Iterator* const&)+0x15b) [0x7f09cd4a21eb]
7: (leveldb::Version::AddIterators(leveldb::ReadOptions const&, std::vector<leveldb::Iterator*, std::allocator<leveldb::Iterator*> >*)+0x12d) [0x7f09cd4aea0d]
8: (leveldb::DBImpl::NewInternalIterator(leveldb::ReadOptions const&, unsigned long*)+0x10f) [0x7f09cd49cc5f]
9: (leveldb::DBImpl::NewIterator(leveldb::ReadOptions const&)+0x14) [0x7f09cd49cd94]
10: (LevelDBStore::_get_iterator()+0x41) [0x58dc31]
11: (LevelDBStore::get(std::string const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> > const&, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >*)+0x32) [0x58cca2]
12: (MonitorDBStore::get(std::string const&, std::string const&, ceph::buffer::list&)+0xbd) [0x4a9bcd]
13: (MonitorDBStore::get(std::string const&, std::string const&)+0x99) [0x4a9d19]
14: (PaxosService::get_last_committed()+0x30) [0x4e9970]
15: (PGMonitor::update_from_paxos()+0x19) [0x555909]
16: (Monitor::_ms_dispatch(Message*)+0xba1) [0x4d6891]
17: (Monitor::ms_dispatch(Message*)+0x32) [0x4f0fb2]
18: (DispatchQueue::entry()+0x393) [0x6b05d3]
19: (DispatchQueue::DispatchThread::entry()+0xd) [0x6444cd]
20: (()+0x6b50) [0x7f09ce2d4b50]
21: (clone()+0x6d) [0x7f09cca33a7d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
0/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 hadoop
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mon.cc2.log
--- end dump of recent events ---
Is there something wrong with the prepared crushmap? Why did the cluster not rebalance the PGs properly?
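For what it's worth, a compiled map can be sanity-checked before injection with crushtool's test mode, which simulates how each rule maps inputs to OSDs (exact flag availability varies a little between versions):

# crushtool -i crushmap.new --test --rule 0 --num-rep 2 --show-statistics

If every rule maps its inputs to the expected number of distinct OSDs, the map itself compiles and distributes as intended.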
Updated by Joao Eduardo Luis almost 11 years ago
- Status changed from New to 4
Okay, regarding the crash: although I've been unable to figure out what or who (us or leveldb) may be causing it, these are my findings so far (keeping them for posterity):
- The crash is obviously happening in tcmalloc
- We've hit that crash before (see #3678), and it was apparently fixed by dealing with bufferlists and messages under a lock to keep other threads from messing with their pointers (see commits a058f16113efa8f32eb5503d5443aa139754d479 and d16ad9263d7b1d3c096f56c56e9631fae8509651)
- There was another bug report on another project's list (source) that mentions this happening during a free call after some state becomes corrupt -- I suppose some multithreading may be involved, which would make sense given what I perceive to be the cause of #3678 above.
So, my current theory is that there may be some multithreading issue with leveldb, given that this time the delete is coming directly from leveldb, whereas in #3678 it was coming from ReplicatedPG.
Ian, should I look further into this?
Updated by Sage Weil almost 11 years ago
- Subject changed from inconsistent ceph cluster state kills rbd clients to mon: leveldb crash in tcmalloc
- Priority changed from Normal to High
What distro are you using? This sounds a bit like #5239.
Updated by Sage Weil almost 11 years ago
Maciej Galkiewicz wrote:
debian wheezy (7.0)
OK, now it sounds a lot like #5239. I'm not able to reproduce this on my wheezy boxes (running either a gitbuilder or a release build).
If you see this again, can you grab the 'thread apply all bt full' output from the core file?
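For reference, something like this should produce that output, assuming debug symbols are installed (on Debian that would be the ceph-dbg package, if one is available for this build):

# gdb --batch -ex 'thread apply all bt full' /usr/bin/ceph-mon /path/to/core > mon-bt.txt

Replace /path/to/core with wherever the monitor's core file landed.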
Updated by Sage Weil almost 11 years ago
- Status changed from 4 to Can't reproduce