Bug #40481
osdmap->osd_addr allocation is susceptible to memory fragmentation
Status:
Closed
% Done:
0%
Regression:
No
Severity:
3 - minor
Description
    addr = krealloc(map->osd_addr, max*sizeof(*addr), GFP_NOFS);

In a cluster with several thousand OSDs this becomes a ~1M allocation, and krealloc() currently requires it to be physically contiguous.
Updated by Ilya Dryomov almost 5 years ago
kworker/15:1: page allocation failure: order:7, mode:0x104050
CPU: 15 PID: 6313 Comm: kworker/15:1 Kdump: loaded Tainted: P OE ------------ 3.10.0-862.6.3.el7.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Workqueue: ceph-msgr ceph_con_workfn [libceph]
Call Trace:
 [<ffffffffb570e80e>] dump_stack+0x19/0x1b
 [<ffffffffb519a5b0>] warn_alloc_failed+0x110/0x180
 [<ffffffffb519f134>] __alloc_pages_nodemask+0x9b4/0xbb0
 [<ffffffffb51e8ce8>] alloc_pages_current+0x98/0x110
 [<ffffffffb519934e>] __get_free_pages+0xe/0x40
 [<ffffffffb51f4f9e>] kmalloc_order_trace+0x2e/0xa0
 [<ffffffffb51fa511>] ? __kmalloc_track_caller+0x221/0x240
 [<ffffffffb51fa511>] __kmalloc_track_caller+0x221/0x240
 [<ffffffffc0aa4056>] ? osdmap_set_max_osd+0x76/0x1d0 [libceph]
 [<ffffffffb51b395f>] krealloc+0x4f/0xa0
 [<ffffffffc0aa4056>] osdmap_set_max_osd+0x76/0x1d0 [libceph]
 [<ffffffffc0aa7255>] ceph_osdmap_decode+0x195/0x860 [libceph]
 [<ffffffffc0a9daf4>] handle_one_map+0x224/0x250 [libceph]
 [<ffffffffc0aa233c>] ceph_osdc_handle_map+0x7dc/0x8c0 [libceph]
 [<ffffffffc0a96490>] dispatch+0x350/0x790 [libceph]
 [<ffffffffc0a91ff4>] try_read+0x4e4/0x1210 [libceph]
 [<ffffffffb50d972e>] ? dequeue_task_fair+0x41e/0x660
 [<ffffffffb502959e>] ? __switch_to+0xce/0x580
 [<ffffffffb50c9e50>] ? finish_task_switch+0x50/0x170
 [<ffffffffc0a92dd9>] ceph_con_workfn+0xb9/0x670 [libceph]
 [<ffffffffb50b35ef>] process_one_work+0x17f/0x440
 [<ffffffffb50b4686>] worker_thread+0x126/0x3c0
 [<ffffffffb50b4560>] ? manage_workers.isra.24+0x2a0/0x2a0
 [<ffffffffb50bb621>] kthread+0xd1/0xe0
 [<ffffffffb50bb550>] ? insert_kthread_work+0x40/0x40
 [<ffffffffb57205f7>] ret_from_fork_nospec_begin+0x21/0x21
 [<ffffffffb50bb550>] ? insert_kthread_work+0x40/0x40
Mem-Info:
active_anon:196776 inactive_anon:164248 isolated_anon:0
 active_file:615418 inactive_file:6479656 isolated_file:0
 unevictable:0 dirty:15 writeback:0 unstable:0
 slab_reclaimable:154079 slab_unreclaimable:107384
 mapped:82394 shmem:114925 pagetables:4453 bounce:0
 free:165613 free_pcp:0 free_cma:0
Node 0 DMA free:15892kB min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 2829 31995 31995
Node 0 DMA32 free:169232kB min:5972kB low:7464kB high:8956kB active_anon:99092kB inactive_anon:127424kB active_file:235612kB inactive_file:2067288kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129288kB managed:2897832kB mlocked:0kB dirty:4kB writeback:0kB mapped:33796kB shmem:44528kB slab_reclaimable:64572kB slab_unreclaimable:37140kB kernel_stack:992kB pagetables:2248kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 29165 29165
Node 0 Normal free:477328kB min:61572kB low:76964kB high:92356kB active_anon:688012kB inactive_anon:529568kB active_file:2226060kB inactive_file:23851336kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:30408704kB managed:29865560kB mlocked:0kB dirty:56kB writeback:0kB mapped:295780kB shmem:415172kB slab_reclaimable:551744kB slab_unreclaimable:392380kB kernel_stack:13792kB pagetables:15564kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Node 0 DMA32: 3980*4kB (UEM) 3460*8kB (UEM) 1895*16kB (UEM) 610*32kB (UEM) 277*64kB (UEM) 314*128kB (UEM) 72*256kB (UEM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 169792kB
Node 0 Normal: 8145*4kB (UEM) 44558*8kB (UEM) 2521*16kB (UM) 516*32kB (UM) 549*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 481028kB
Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
7209999 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 4194300kB
Total swap = 4194300kB
8388496 pages RAM
0 pages HighMem/MovableOnly
193671 pages reserved
libceph: corrupt full osdmap (-12) epoch 721078 off 13481 (ffff9fe419c414c9 of ffff9fe419c3e020-ffff9fe419da13d1)
Updated by Ilya Dryomov almost 5 years ago
- Has duplicate Bug #40482: [libceph] page allocation failure when parsing new OSD map added
Updated by Ilya Dryomov over 4 years ago
- Status changed from 12 to In Progress
- Priority changed from Normal to Urgent
Updated by Ilya Dryomov over 4 years ago
- Status changed from In Progress to Fix Under Review
[PATCH] libceph: use ceph_kvmalloc() for osdmap arrays
Updated by Ilya Dryomov over 4 years ago
- Status changed from Fix Under Review to Resolved
Updated by Ilya Dryomov almost 2 years ago
- Related to Bug #55408: libceph: corrupt inc osdmap (-12) epoch 409760 off 60 (ffffacad17925058 of ffffacad1792501c-ffffacad179edf02) added