Bug #729

closed

weird kernel BUG on metropolis

Added by Colin McCabe over 13 years ago. Updated over 13 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%


Description

I'm not sure if this is our issue or an ext3 issue. Anyway, it came up on metropolis, and I need to write down the info before I lose it.

Basically, the following cryptic message appeared on syslog (at priority LOG_EMERG, so it overwrote my screen):

Message from syslogd@metropolis at Jan 19 10:32:
kernel:[503895.715738] Code: 0b 48 8b 44 24 40 48 39 45 28 73 23 49 8b bf 90 02 00 00 48 c7 c2 00 1b 12 a0 be 01 00 00 00 48 81 c7 50 01 00 00 e8 cd f0 ff ff <0f> 0b eb fe 48 8b 54 24 38 48 8b 4c 24 18 4c 8d 8c 24 80 00 00

After that, any kind of file I/O started to hang forever: saving files, running sudo (which reads /etc/sudoers), /bin/ls, etc. I had to power-cycle the machine to even find any logs.

When it came back, I found this in /var/log/syslog:

Jan 19 10:32:01 metropolis kernel: [503895.713852] Block Allocation Reservation Windows Map (ext3_try_to_allocate_with_rsv):
Jan 19 10:32:01 metropolis kernel: [503895.713855] reservation window 0xffff88023c707158 start: 0, end: 0
Jan 19 10:32:01 metropolis kernel: [503895.713858] reservation window 0xffff88023dcce480 start: 7154280, end: 7154295
Jan 19 10:32:01 metropolis kernel: [503895.713860] reservation window 0xffff880185c74c00 start: 7182882, end: 7182889
Jan 19 10:32:01 metropolis kernel: [503895.713862] reservation window 0xffff88023c4b1e00 start: 8144526, end: 8144541
Jan 19 10:32:01 metropolis kernel: [503895.713865] reservation window 0xffff8801cc074740 start: 8160080, end: 8160087
Jan 19 10:32:01 metropolis kernel: [503895.713868] reservation window 0xffff88021e218940 start: 9121927, end: 9121990
Jan 19 10:32:01 metropolis kernel: [503895.713870] reservation window 0xffff88023b73c240 start: 9121999, end: 9122014
Jan 19 10:32:01 metropolis kernel: [503895.713872] reservation window 0xffff88023b73ccc0 start: 9123848, end: 9123863
Jan 19 10:32:01 metropolis kernel: [503895.713875] reservation window 0xffff8802171f4340 start: 9134086, end: 9134117
Jan 19 10:32:01 metropolis kernel: [503895.713877] reservation window 0xffff8802171f4f40 start: 9134126, end: 9134141
Jan 19 10:32:01 metropolis kernel: [503895.713880] reservation window 0xffff8802171f4c00 start: 9136152, end: 9136183
Jan 19 10:32:01 metropolis kernel: [503895.713882] reservation window 0xffff88023d357400 start: 9161992, end: 9162119
Jan 19 10:32:01 metropolis kernel: [503895.713885] reservation window 0xffff8801cc0da9c0 start: 9175910, end: 9175917
Jan 19 10:32:01 metropolis kernel: [503895.713887] reservation window 0xffff880238fc2a80 start: 9175918, end: 9175925
Jan 19 10:32:01 metropolis kernel: [503895.713890] reservation window 0xffff8801cc1461c0 start: 9175948, end: 9175955
Jan 19 10:32:01 metropolis kernel: [503895.713892] reservation window 0xffff880238fc28c0 start: 9175958, end: 9175965
Jan 19 10:32:01 metropolis kernel: [503895.713894] reservation window 0xffff880238fc2ac0 start: 9175966, end: 9175973
Jan 19 10:32:01 metropolis kernel: [503895.713897] reservation window 0xffff88023cb40080 start: 9175976, end: 9175983
Jan 19 10:32:01 metropolis kernel: [503895.713899] reservation window 0xffff88023e386540 start: 9176075, end: 9176082
Jan 19 10:32:01 metropolis kernel: [503895.713901] reservation window 0xffff88023c997900 start: 9176113, end: 9176120
Jan 19 10:32:01 metropolis kernel: [503895.713904] reservation window 0xffff880238fc2240 start: 9176153, end: 9176160
Jan 19 10:32:01 metropolis kernel: [503895.713907] reservation window 0xffff88023b521cc0 start: 9176242, end: 9176249
Jan 19 10:32:01 metropolis kernel: [503895.713910] reservation window 0xffff88023d2c1400 start: 9176250, end: 9176257
Jan 19 10:32:01 metropolis kernel: [503895.713912] reservation window 0xffff8801cc0da000 start: 9176258, end: 9176265
Jan 19 10:32:01 metropolis kernel: [503895.713914] reservation window 0xffff8801cc17ac00 start: 9176287, end: 9176294
Jan 19 10:32:01 metropolis kernel: [503895.713917] reservation window 0xffff880238fc2000 start: 9176303, end: 9176310
Jan 19 10:32:01 metropolis kernel: [503895.713920] reservation window 0xffff88023c9974c0 start: 9176311, end: 9176318
Jan 19 10:32:01 metropolis kernel: [503895.713922] reservation window 0xffff88023c9979c0 start: 9176319, end: 9176326
Jan 19 10:32:01 metropolis kernel: [503895.713925] reservation window 0xffff880238fc20c0 start: 9176327, end: 9176334
Jan 19 10:32:01 metropolis kernel: [503895.713928] reservation window 0xffff880238fc2100 start: 9176335, end: 9176342
Jan 19 10:32:01 metropolis kernel: [503895.713930] reservation window 0xffff880238fc2140 start: 9176343, end: 9176350
Jan 19 10:32:01 metropolis kernel: [503895.713933] reservation window 0xffff880238fc2180 start: 9176351, end: 9176358
Jan 19 10:32:01 metropolis kernel: [503895.713935] reservation window 0xffff88023cb40000 start: 9176359, end: 9176366
Jan 19 10:32:01 metropolis kernel: [503895.713938] reservation window 0xffff8801cc0da200 start: 9176400, end: 9176407
Jan 19 10:32:01 metropolis kernel: [503895.713941] reservation window 0xffff880238fc2440 start: 9176408, end: 9176415
Jan 19 10:32:01 metropolis kernel: [503895.713943] reservation window 0xffff880238fc2480 start: 9176416, end: 9176423
Jan 19 10:32:01 metropolis kernel: [503895.713946] reservation window 0xffff8801cc146140 start: 9176424, end: 9176431
Jan 19 10:32:01 metropolis kernel: [503895.713948] reservation window 0xffff8801cc0da340 start: 9176432, end: 9176439
Jan 19 10:32:01 metropolis kernel: [503895.713951] reservation window 0xffff880238fc2b00 start: 9176536, end: 9176543
Jan 19 10:32:01 metropolis kernel: [503895.713953] reservation window 0xffff880238fc2b40 start: 9176544, end: 9176551
Jan 19 10:32:01 metropolis kernel: [503895.713956] reservation window 0xffff880238fc2c80 start: 9176552, end: 9176559
Jan 19 10:32:01 metropolis kernel: [503895.713959] reservation window 0xffff8801cc146200 start: 9176560, end: 9176567
Jan 19 10:32:01 metropolis kernel: [503895.713961] reservation window 0xffff8801cc0daa80 start: 9176568, end: 9176575
Jan 19 10:32:01 metropolis kernel: [503895.713964] reservation window 0xffff880238fc2cc0 start: 9176576, end: 9176583
Jan 19 10:32:01 metropolis kernel: [503895.713967] reservation window 0xffff8801cc0dab00 start: 9176584, end: 9176591
Jan 19 10:32:01 metropolis kernel: [503895.713969] reservation window 0xffff88023cff61c0 start: 9179287, end: 9179294
Jan 19 10:32:01 metropolis kernel: [503895.713971] reservation window 0xffff88012d8c0a00 start: 9183232, end: 9183239
Jan 19 10:32:01 metropolis kernel: [503895.713974] reservation window 0xffff88012d8c0600 start: 9183240, end: 9183247
Jan 19 10:32:01 metropolis kernel: [503895.713976] reservation window 0xffff88012d8c0e40 start: 9183248, end: 9183255
Jan 19 10:32:01 metropolis kernel: [503895.713978] reservation window 0xffff88012d8c0100 start: 9191424, end: 9191431
Jan 19 10:32:01 metropolis kernel: [503895.713981] reservation window 0xffff88012d8c0800 start: 9191432, end: 9191439
Jan 19 10:32:01 metropolis kernel: [503895.713983] reservation window 0xffff88012d8c0d00 start: 9191440, end: 9191447
Jan 19 10:32:01 metropolis kernel: [503895.713985] reservation window 0xffff88023ddfa400 start: 9201748, end: 9201755
Jan 19 10:32:01 metropolis kernel: [503895.713988] reservation window 0xffff88023ddfad80 start: 9201756, end: 9201763
Jan 19 10:32:01 metropolis kernel: [503895.713990] reservation window 0xffff88023ddfaf80 start: 9201764, end: 9201771
Jan 19 10:32:01 metropolis kernel: [503895.713993] reservation window 0xffff88023cff6040 start: 9206526, end: 9206653
Jan 19 10:32:01 metropolis kernel: [503895.713994] reservation window 0xffff88023cff6e80 start: 9208377, end: 9208384
Jan 19 10:32:01 metropolis kernel: [503895.713996] reservation window 0xffff88023c830040 start: 9208385, end: 9208392
Jan 19 10:32:01 metropolis kernel: [503895.713998] reservation window 0xffff8802171f4d00 start: 9208397, end: 9208404
Jan 19 10:32:01 metropolis kernel: [503895.714000] reservation window 0xffff88023c8303c0 start: 9208405, end: 9208412
Jan 19 10:32:01 metropolis kernel: [503895.714002] reservation window 0xffff88023ddfa980 start: 9208413, end: 9208420
Jan 19 10:32:01 metropolis kernel: [503895.714004] reservation window 0xffff88023cff6780 start: 9208421, end: 9208428
Jan 19 10:32:01 metropolis kernel: [503895.714025] reservation window 0xffff88023d357c00 start: 9208429, end: 9208436
Jan 19 10:32:01 metropolis kernel: [503895.714027] reservation window 0xffff88023d357140 start: 9208437, end: 9208444
Jan 19 10:32:01 metropolis kernel: [503895.714029] reservation window 0xffff880238fc29c0 start: 9912813, end: 9912820
Jan 19 10:32:01 metropolis kernel: [503895.714032] reservation window 0xffff88023bce2740 start: 10224189, end: 10224196
Jan 19 10:32:01 metropolis kernel: [503895.714035] reservation window 0xffff88023bce2780 start: 10224197, end: 10224204
Jan 19 10:32:01 metropolis kernel: [503895.714037] reservation window 0xffff88023bce2380 start: 10224205, end: 10224212
Jan 19 10:32:01 metropolis kernel: [503895.714040] reservation window 0xffff88023ced6340 start: 10224213, end: 10224220
Jan 19 10:32:01 metropolis kernel: [503895.714042] reservation window 0xffff88022b04bdc0 start: 10224221, end: 10224228
Jan 19 10:32:01 metropolis kernel: [503895.714045] reservation window 0xffff88022b04b100 start: 10224229, end: 10224236
Jan 19 10:32:01 metropolis kernel: [503895.714047] reservation window 0xffff88023ced6dc0 start: 10224237, end: 10224244
Jan 19 10:32:01 metropolis kernel: [503895.714050] reservation window 0xffff88023bce2300 start: 10224245, end: 10224252
Jan 19 10:32:01 metropolis kernel: [503895.714052] reservation window 0xffff88023ced6c00 start: 10224253, end: 10224260
Jan 19 10:32:01 metropolis kernel: [503895.714054] reservation window 0xffff880238e137c0 start: 10224261, end: 10224268
Jan 19 10:32:01 metropolis kernel: [503895.714057] reservation window 0xffff880238e133c0 start: 10224269, end: 10224276
Jan 19 10:32:01 metropolis kernel: [503895.714060] reservation window 0xffff8801c37fc680 start: 10224277, end: 10224284
Jan 19 10:32:01 metropolis kernel: [503895.714062] reservation window 0xffff880238e138c0 start: 10224285, end: 10224292
Jan 19 10:32:01 metropolis kernel: [503895.714065] reservation window 0xffff88021ea988c0 start: 10224293, end: 10224300
Jan 19 10:32:01 metropolis kernel: [503895.714067] reservation window 0xffff880238e13a00 start: 10224301, end: 10224308
Jan 19 10:32:01 metropolis kernel: [503895.714070] reservation window 0xffff880238e13b80 start: 10224309, end: 10224316
Jan 19 10:32:01 metropolis kernel: [503895.714072] reservation window 0xffff880176caf000 start: 10224317, end: 10224324
Jan 19 10:32:01 metropolis kernel: [503895.714075] reservation window 0xffff880238e13640 start: 10224325, end: 10224332
Jan 19 10:32:01 metropolis kernel: [503895.714077] reservation window 0xffff88023bfa49c0 start: 10224333, end: 10224340
Jan 19 10:32:01 metropolis kernel: [503895.714079] reservation window 0xffff8801c37fc300 start: 10224341, end: 10224348
Jan 19 10:32:01 metropolis kernel: [503895.714081] reservation window 0xffff88021ea98c40 start: 10224357, end: 10224364
Jan 19 10:32:01 metropolis kernel: [503895.714083] reservation window 0xffff88023bfa4900 start: 10224365, end: 10224372
Jan 19 10:32:01 metropolis kernel: [503895.714084] reservation window 0xffff8801cc0dbd80 start: 10224373, end: 10224380
Jan 19 10:32:01 metropolis kernel: [503895.714086] reservation window 0xffff88023bfa4100 start: 10224381, end: 10224388
Jan 19 10:32:01 metropolis kernel: [503895.714088] reservation window 0xffff88023bfa4800 start: 10224389, end: 10224396
Jan 19 10:32:01 metropolis kernel: [503895.714090] reservation window 0xffff88023bfa48c0 start: 10224397, end: 10224404
Jan 19 10:32:01 metropolis kernel: [503895.714092] reservation window 0xffff8801cc0db1c0 start: 10224405, end: 10224412
Jan 19 10:32:01 metropolis kernel: [503895.714094] reservation window 0xffff88023bfa4140 start: 10224413, end: 10224420
Jan 19 10:32:01 metropolis kernel: [503895.714096] reservation window 0xffff88023cac6400 start: 10224421, end: 10224428
Jan 19 10:32:01 metropolis kernel: [503895.714097] Window map complete.
Jan 19 10:32:01 metropolis kernel: [503895.714115] ------------[ cut here ]------------
Jan 19 10:32:01 metropolis kernel: [503895.714142] kernel BUG at /build/buildd-linux-2.6_2.6.32-29-amd64-xcs37n/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/ext3/balloc.c:1384!
Jan 19 10:32:01 metropolis kernel: [503895.714196] invalid opcode: 0000 [#1] SMP
Jan 19 10:32:01 metropolis kernel: [503895.714225] last sysfs file: /sys/devices/virtual/bdi/0:17/uevent
Jan 19 10:32:01 metropolis kernel: [503895.714254] CPU 5
Jan 19 10:32:01 metropolis kernel: [503895.714276] Modules linked in: fuse kvm_intel kvm loop snd_pcm snd_timer snd soundcore snd_page_alloc psmouse pcspkr serio_raw i2c_i801 button processor joydev evdev i2c_core ext3 jbd mbcache sd_mod crc_t10dif usbhid hid ahci libata scsi_mod ehci_hcd e1000e usbcore nls_base thermal thermal_sys [last unloaded: scsi_wait_scan]
Jan 19 10:32:01 metropolis kernel: [503895.714476] Pid: 5470, comm: cosd Not tainted 2.6.32-5-amd64 #1 X8SIE
Jan 19 10:32:01 metropolis kernel: [503895.714504] RIP: 0010:[<ffffffffa0110332>] [<ffffffffa0110332>] ext3_try_to_allocate_with_rsv+0x4b1/0x5c1 [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.714560] RSP: 0018:ffff88023d3459f8 EFLAGS: 00010246
Jan 19 10:32:01 metropolis kernel: [503895.714586] RAX: 0000000000000028 RBX: 00000000008c827d RCX: 00000000000012ad
Jan 19 10:32:01 metropolis kernel: [503895.714630] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
Jan 19 10:32:01 metropolis kernel: [503895.714673] RBP: ffff88023cff6d80 R08: ffff88023c707150 R09: ffffffff813ae533
Jan 19 10:32:01 metropolis kernel: [503895.714718] R10: 0000000000000000 R11: 00000000000186a0 R12: ffff88023d357140
Jan 19 10:32:01 metropolis kernel: [503895.714762] R13: ffff88023c707000 R14: 00000000ffffffff R15: ffff88023be57400
Jan 19 10:32:01 metropolis kernel: [503895.714807] FS: 00007fd127594710(0000) GS:ffff880008b40000(0000) knlGS:0000000000000000
Jan 19 10:32:01 metropolis kernel: [503895.714853] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 19 10:32:01 metropolis kernel: [503895.714882] CR2: 00007fd11ff7ffd8 CR3: 000000023c995000 CR4: 00000000000026e0
Jan 19 10:32:01 metropolis kernel: [503895.714926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 19 10:32:01 metropolis kernel: [503895.714970] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 19 10:32:01 metropolis kernel: [503895.715027] Process cosd (pid: 5470, threadinfo ffff88023d344000, task ffff88018216c6a0)
Jan 19 10:32:01 metropolis kernel: [503895.715073] Stack:
Jan 19 10:32:01 metropolis kernel: [503895.715093] ffff88023efe45e8 008c800000008000 0000000000000008 ffff8801b558a770
Jan 19 10:32:01 metropolis kernel: [503895.715128] <0> 000001193be57400 ffff88023efe45e8 0000000000000119 ffff88023cff6da0
Jan 19 10:32:01 metropolis kernel: [503895.715202] <0> 00000000008c8000 00000000008cffff ffff88023c707148 000000083b0a8320
Jan 19 10:32:01 metropolis kernel: [503895.715274] Call Trace:
Jan 19 10:32:01 metropolis kernel: [503895.715302] [<ffffffffa0110650>] ? ext3_new_blocks+0x20e/0x5e6 [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.715335] [<ffffffffa0110a45>] ? ext3_new_block+0x1d/0x24 [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.715367] [<ffffffffa012026d>] ? ext3_xattr_block_set+0x522/0x6ec [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.715400] [<ffffffffa0120713>] ? ext3_xattr_set_handle+0x2dc/0x44c [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.715450] [<ffffffffa0120904>] ? ext3_xattr_set+0x81/0xc9 [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.715484] [<ffffffff81106544>] ? __vfs_setxattr_noperm+0x3d/0xb1
Jan 19 10:32:01 metropolis kernel: [503895.715514] [<ffffffff8110662c>] ? vfs_setxattr+0x74/0x8c
Jan 19 10:32:01 metropolis kernel: [503895.715543] [<ffffffff811066eb>] ? setxattr+0xa7/0xdc
Jan 19 10:32:01 metropolis kernel: [503895.715570] [<ffffffff810f6f20>] ? path_walk+0xc0/0xc9
Jan 19 10:32:01 metropolis kernel: [503895.715598] [<ffffffff810e5825>] ? virt_to_head_page+0x9/0x2a
Jan 19 10:32:01 metropolis kernel: [503895.715626] [<ffffffff810f981c>] ? user_path_at+0x52/0x79
Jan 19 10:32:01 metropolis kernel: [503895.715656] [<ffffffff81073d06>] ? sys_futex+0x113/0x131
Jan 19 10:32:01 metropolis kernel: [503895.715683] [<ffffffff81106875>] ? sys_setxattr+0x59/0x80
Jan 19 10:32:01 metropolis kernel: [503895.715711] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Jan 19 10:32:01 metropolis kernel: [503895.715738] Code: 0b 48 8b 44 24 40 48 39 45 28 73 23 49 8b bf 90 02 00 00 48 c7 c2 00 1b 12 a0 be 01 00 00 00 48 81 c7 50 01 00 00 e8 cd f0 ff ff <0f> 0b eb fe 48 8b 54 24 38 48 8b 4c 24 18 4c 8d 8c 24 80 00 00
Jan 19 10:32:01 metropolis kernel: [503895.715943] RIP [<ffffffffa0110332>] ext3_try_to_allocate_with_rsv+0x4b1/0x5c1 [ext3]
Jan 19 10:32:01 metropolis kernel: [503895.715992] RSP <ffff88023d3459f8>
Jan 19 10:32:01 metropolis kernel: [503895.716304] ---[ end trace 828b9738b241c1c6 ]---
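
For anyone wanting to inspect a dump like the one above, here is a minimal sketch that pulls the reservation windows out of the syslog text and looks for overlapping ranges. The helper names (parse_windows, find_overlaps) are made up for illustration, and checking for overlaps is only an assumed sanity check, not necessarily the condition that balloc.c tests before calling BUG().

```python
import re

# Matches syslog lines such as:
#   "... reservation window 0xffff88023dcce480 start: 7154280, end: 7154295"
WINDOW_RE = re.compile(r"reservation window (0x[0-9a-f]+) start: (\d+), end: (\d+)")


def parse_windows(log_text):
    """Return a list of (address, start, end) tuples in log order."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in WINDOW_RE.finditer(log_text)
    ]


def find_overlaps(windows):
    """Return pairs of neighbouring windows (sorted by start) whose ranges overlap."""
    ordered = sorted(windows, key=lambda w: w[1])
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if a[2] >= b[1]]


# Two lines copied from the log above, used as sample input.
sample = """\
Jan 19 10:32:01 metropolis kernel: [503895.713858] reservation window 0xffff88023dcce480 start: 7154280, end: 7154295
Jan 19 10:32:01 metropolis kernel: [503895.713860] reservation window 0xffff880185c74c00 start: 7182882, end: 7182889
"""

windows = parse_windows(sample)
print(windows)
print(find_overlaps(windows))
```

Feeding the full /var/log/syslog excerpt to parse_windows instead of the two sample lines works the same way.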

Previously, I had done:

mount -o remount,user_xattrs /

I had to do this because the Ceph object store requires extended attributes.
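
A quick way to confirm that user extended attributes actually work on a given filesystem is to set one on a scratch file and read it back. The sketch below does that with Python's os.setxattr and os.getxattr (Linux only); the function name supports_user_xattr and the attribute name user.probe are made up for illustration, and this is not how Ceph itself checks.

```python
import os
import tempfile


def supports_user_xattr(directory):
    """Return True if a user.* xattr can be set and read back under `directory`."""
    if not hasattr(os, "setxattr"):
        # os.setxattr / os.getxattr are only available on Linux.
        return False
    # The scratch file is deleted automatically when the with-block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as probe:
        try:
            os.setxattr(probe.name, "user.probe", b"1")
            return os.getxattr(probe.name, "user.probe") == b"1"
        except OSError:
            # Typically raised when the filesystem is mounted without
            # user xattr support, or does not implement xattrs at all.
            return False


print(supports_user_xattr(tempfile.gettempdir()))
```

Pass the directory that will hold the object store rather than the temp directory to test the filesystem that matters.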

cmccabe@metropolis:~$ cat /etc/issue
Debian GNU/Linux 6.0 \n \l
cmccabe@metropolis:~$ uname -a
Linux metropolis 2.6.32-5-amd64 #1 SMP Fri Dec 10 15:35:08 UTC 2010 x86_64 GNU/Linux

Colin

Actions #1

Updated by Sage Weil over 13 years ago

  • Status changed from New to Closed

This is a known problem with ext3 and xattrs on 2.6.32. Either upgrading to a newer kernel (2.6.34 or later IIRC?) or (Yehuda seems to recall) switching to ext4 will do the trick. Update your fstab and reboot!
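
To check whether a machine is still on a kernel older than the one suggested above, the release string from uname -r can be compared numerically. This is only a sketch: the 2.6.34 threshold is taken from the comment above, which itself is hedged with "IIRC", and the function names are made up for illustration.

```python
import re

# Threshold taken from the comment above; treat it as approximate.
SUGGESTED_MINIMUM = (2, 6, 34)


def kernel_version_tuple(release):
    """Parse the leading numeric part of a `uname -r` string, e.g. '2.6.32-5-amd64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing third component (e.g. '5.10') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_older_than_suggested(release):
    """Return True if `release` is older than SUGGESTED_MINIMUM."""
    return kernel_version_tuple(release) < SUGGESTED_MINIMUM


print(is_older_than_suggested("2.6.32-5-amd64"))  # the kernel from this report -> True
```

On a live system, platform.release() from the standard library returns the same string as uname -r and can be passed in directly.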
