Bug #4052
OSD: high memory usage (8-12 GB) right after start
Description
Hi,
some of our osds need 8-12 GB of RAM right after startup.
Sage mentioned wip_bobtail_f might fix it, but this branch's osd is crashing right after start:
Feb 8 10:37:42 xyz01 ceph-osd: 2013-02-08 10:37:42.834665 7fcdb6cb9780 -1 *** Caught signal (Aborted) **
 in thread 7fcdb6cb9780
 ceph version 0.56.2-17-g200d5e2 (200d5e2da5ab7a6292f3174b5a38510630e2c91f)
 1: /usr/local/bin/ceph-osd() [0x855032]
 2: (()+0xf030) [0x7fcdb66be030]
 3: (gsignal()+0x35) [0x7fcdb4c3c475]
 4: (abort()+0x180) [0x7fcdb4c3f6f0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fcdb548e89d]
 6: (()+0x63996) [0x7fcdb548c996]
 7: (()+0x639c3) [0x7fcdb548c9c3]
 8: (()+0x63bee) [0x7fcdb548cbee]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x127) [0x8fc7b7]
 10: (PG::peek_map_epoch(ObjectStore*, coll_t, ceph::buffer::list*)+0x97) [0x74abf7]
 11: (OSD::load_pgs()+0x735) [0x700ce5]
 12: (OSD::init()+0x81e) [0x703d3e]
 13: (main()+0x2046) [0x63c316]
 14: (__libc_start_main()+0xfd) [0x7fcdb4c28ead]
 15: /usr/local/bin/ceph-osd() [0x63e699]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
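As the NOTE says, interpreting the raw addresses needs the matching binary. One way to resolve individual frames with binutils (a sketch; the addresses are taken from the trace above, and the binary must still carry its symbols):

  # full annotated disassembly, as the NOTE suggests
  objdump -rdS /usr/local/bin/ceph-osd > ceph-osd.asm
  # or resolve single frames: -C demangles C++, -f prints the enclosing function
  addr2line -C -f -e /usr/local/bin/ceph-osd 0x74abf7 0x700ce5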
History
#1 Updated by Simon Frerichs about 11 years ago
As requested, current cluster status:
2013-02-08 10:48:40.125733 mon.0 [INF] pgmap v25369962: 2112 pgs: 1460 active+clean, 22 active+remapped+wait_backfill, 10 active+degraded+wait_backfill, 34 active+recovery_wait, 4 active+recovering+remapped, 1 active+recovering+degraded, 10 active+recovering+degraded+remapped+wait_backfill, 111 active+remapped, 25 down+peering, 35 active+remapped+backfilling, 267 active+degraded, 1 active+degraded+backfilling, 21 active+degraded+remapped+wait_backfill, 37 active+recovery_wait+remapped, 4 stale+down+peering, 12 active+recovery_wait+degraded, 12 active+degraded+remapped+backfilling, 1 active+clean+inconsistent, 7 active+recovery_wait+degraded+remapped, 5 active+clean+scrubbing+deep, 7 active+recovering, 20 active+recovering+remapped+wait_backfill, 6 active+recovering+degraded+wait_backfill; 5158 GB data, 15564 GB used, 29015 GB / 44580 GB avail; 460836/4425044 degraded (10.414%); 3/1403584 unfound (0.000%)
The high memory usage was already there before we set two osds out. We had some degraded pgs for several days. The unfound objects are due to the crashed osd.
#2 Updated by Joao Eduardo Luis about 11 years ago
- Description updated (diff)
#3 Updated by Simon Frerichs about 11 years ago
health HEALTH_WARN 102 pgs backfill; 346 pgs degraded; 29 pgs down; 1 pgs inconsistent; 29 pgs peering; 61 pgs recovering; 4 pgs stale; 29 pgs stuck inactive; 4 pgs stuck stale; 641 pgs stuck unclean; recovery 457509/4423717 degraded (10.342%); 3/1403592 unfound (0.000%); 1/30 in osds are down
monmap e8: 5 mons at {a=46.19.94.1:6789/0,b=46.19.94.2:6789/0,c=46.19.94.11:6789/0,d=46.19.94.12:6789/0,e=46.19.94.13:6789/0}, election epoch 184, quorum 0,1,2,3,4 a,b,c,d,e
osdmap e13498: 34 osds: 29 up, 30 in
pgmap v25370548: 2112 pgs: 1451 active+clean, 28 active+remapped+backfill, 15 active+degraded+backfill, 40 active, 9 active+recovering+remapped, 3 active+recovering+degraded, 10 active+recovering+degraded+remapped+backfill, 105 active+remapped, 31 active+remapped, 25 down+peering, 264 active+degraded, 1 active+degraded, 2 active+clean+scrubbing, 23 active+degraded+remapped+backfill, 35 active+remapped, 4 stale+down+peering, 7 active+degraded, 1 active+recovering+degraded+remapped, 9 active+degraded+remapped, 1 active+clean+inconsistent, 7 active+degraded+remapped, 3 active+clean+scrubbing, 12 active+recovering, 20 active+recovering+remapped+backfill, 6 active+recovering+degraded+backfill; 5158 GB data, 15561 GB used, 29019 GB / 44580 GB avail; 457509/4423717 degraded (10.342%); 3/1403592 unfound (0.000%)
mdsmap e597864: 0/0/1 up
#4 Updated by Ian Colle about 11 years ago
- Assignee set to Josh Durgin
- Priority changed from Normal to Urgent
#5 Updated by Josh Durgin about 11 years ago
- Priority changed from Urgent to High
I haven't been able to reproduce this locally.
#6 Updated by Simon Frerichs about 11 years ago
Josh Durgin wrote:
I haven't been able to reproduce this locally.
Do you need more information / log output?
#7 Updated by Josh Durgin about 11 years ago
Yeah, if you can still reproduce it, a heap profile of an osd that's using excessive memory would be great.
You can start a heap profile on osd N with `ceph osd tell N heap start_profiler`, and dump the collected profile with `ceph osd tell N heap dump`. The dumps should show up in the osd log directory. Assuming the heap profiler is working correctly, you can inspect a dump with pprof from google-perftools.
If you could attach the output from pprof, that'd be ideal.
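For example, the whole sequence might look like this (a sketch; the osd id, log path, and dump filename below are placeholders — the actual profile filename is reported in the osd log when you dump):

  # start the tcmalloc heap profiler on, say, osd.11
  ceph osd tell 11 heap start_profiler
  # let it run while memory climbs, then dump what was collected
  ceph osd tell 11 heap dump
  # analyze the dump from the osd log directory with pprof (google-perftools)
  pprof --text /usr/local/bin/ceph-osd /var/log/ceph/osd.11.profile.0001.heap > osd.11.pprof.txt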
#8 Updated by Tamilarasi muthamizhan about 11 years ago
Hi Josh, burnupi57, which is running the wip-f branch, might help.
We've had it running since last week for the memory leak testing.
#9 Updated by Simon Frerichs about 11 years ago
I'll add a heap dump soon.
I just restarted another osd with wip_bobtail_f; it's also crashing:
Feb 9 01:59:44 fcstore01 ceph-osd: -8> 2013-02-09 01:59:43.568604 7f75f6412780 -1 osd.11 pg_epoch: 13136 pg[0.4( v 3774'2 (0'0,3774'2] local-les=13129 n=2 ec=1 les/c 13129/13136 13128/13128/12376) [] r=0 lpr=0 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -7> 2013-02-09 01:59:43.628057 7f75f6412780 -1 osd.11 pg_epoch: 13802 pg[0.7( v 3767'3 (0'0,3767'3] local-les=13799 n=3 ec=1 les/c 13799/13802 13618/13618/12376) [] r=0 lpr=0 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -6> 2013-02-09 01:59:43.668497 7f75f6412780 -1 osd.11 pg_epoch: 13624 pg[0.d( v 3767'1 (0'0,3767'1] local-les=13619 n=1 ec=1 les/c 13619/13624 13618/13618/12345) [] r=0 lpr=0 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -5> 2013-02-09 01:59:43.726633 7f75f6412780 -1 osd.11 pg_epoch: 12423 pg[0.16( v 3774'5 (0'0,3774'5] local-les=12377 n=5 ec=1 les/c 12377/12423 12376/12376/12311) [] r=0 lpr=0 pi=12311-12375/1 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -4> 2013-02-09 01:59:43.812364 7f75f6412780 -1 osd.11 pg_epoch: 12346 pg[0.32( v 3767'3 (0'0,3767'3] local-les=12346 n=3 ec=1 les/c 12346/12346 12345/12345/12275) [] r=0 lpr=0 pi=12258-12344/3 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -3> 2013-02-09 01:59:43.880802 7f75f6412780 -1 osd.11 pg_epoch: 13102 pg[0.3b( v 3767'4 (0'0,3767'4] local-les=13099 n=4 ec=1 les/c 13099/13102 13094/13094/12345) [] r=0 lpr=0 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -2> 2013-02-09 01:59:43.983465 7f75f6412780 -1 osd.11 pg_epoch: 13091 pg[0.40( v 842'3 (0'0,842'3] local-les=13078 n=3 ec=1 les/c 13078/13091 13076/13076/12376) [] r=0 lpr=0 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: -1> 2013-02-09 01:59:44.029989 7f75f6412780 -1 osd.11 pg_epoch: 13355 pg[0.42( v 3767'3 (0'0,3767'3] local-les=13166 n=3 ec=1 les/c 13166/13170 13165/13165/13165) [] r=0 lpr=0 pi=13154-13164/1 lcod 0'0 mlcod 0'0 inactive] ondisk_snapcolls: [0~1] does not match snap_collections [] repairing.
Feb 9 01:59:44 fcstore01 ceph-osd: 0> 2013-02-09 01:59:44.097010 7f75f6412780 -1 *** Caught signal (Aborted) **
 in thread 7f75f6412780
 ceph version 0.56.2-17-g200d5e2 (200d5e2da5ab7a6292f3174b5a38510630e2c91f)
 1: /usr/local/bin/ceph-osd() [0x855032]
 2: (()+0xf030) [0x7f75f5e17030]
 3: (gsignal()+0x35) [0x7f75f4395475]
 4: (abort()+0x180) [0x7f75f43986f0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f75f4be789d]
 6: (()+0x63996) [0x7f75f4be5996]
 7: (()+0x639c3) [0x7f75f4be59c3]
 8: (()+0x63bee) [0x7f75f4be5bee]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x127) [0x8fc7b7]
 10: (PG::peek_map_epoch(ObjectStore*, coll_t, ceph::buffer::list*)+0x97) [0x74abf7]
 11: (OSD::load_pgs()+0x735) [0x700ce5]
 12: (OSD::init()+0x81e) [0x703d3e]
 13: (main()+0x2046) [0x63c316]
 14: (__libc_start_main()+0xfd) [0x7f75f4381ead]
 15: /usr/local/bin/ceph-osd() [0x63e699]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
#10 Updated by Sage Weil about 11 years ago
Hi Simon-
Is the osd crashing on startup every time? From the trace it looks like there is an invalid xattr set on the pg directory.
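If you want to eyeball the xattrs on a pg collection directory, getfattr from the attr package works (a sketch; the data path below assumes a default /var/lib/ceph layout, and the exact attribute names depend on the filestore version):

  # dump every xattr (-m - matches all namespaces) on the pg 0.4 collection of osd.11
  getfattr -d -m - /var/lib/ceph/osd/ceph-11/current/0.4_head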
#11 Updated by Simon Frerichs about 11 years ago
Hi Sage,
I checked two osds, one start each with this branch, and they both crashed.
I'll do another check later. Our cluster has been recovering/backfilling for 24-36 hours now, and our users don't like more downtime/lagging.
#12 Updated by Sage Weil about 11 years ago
Simon Frerichs wrote:
Hi Sage,
I checked two osds, one start each with this branch, and they both crashed.
I'll do another check later. Our cluster has been recovering/backfilling for 24-36 hours now, and our users don't like more downtime/lagging.
Thanks. I just reproduced this locally; there is definitely something wrong with wip_bobtail_f. Thanks!
#13 Updated by Sage Weil about 11 years ago
Quick update: we're chasing a related issue; hold off on upgrading a bit longer.
#14 Updated by Sage Weil about 11 years ago
Just to clarify: please try the latest bobtail branch. Ignore wip_bobtail_f. Thanks!
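If you're building from source as in this report, picking up the latest bobtail might look like the following (a sketch assuming the autotools build used by 0.56-era Ceph; adjust to your environment):

  git clone https://github.com/ceph/ceph.git
  cd ceph
  git checkout bobtail
  ./autogen.sh && ./configure && make
  make install   # default prefix is /usr/local, matching the paths in this report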
#15 Updated by Sage Weil about 11 years ago
- Status changed from New to Can't reproduce
Closing out this bug. The wip_bobtail_f issues were limited to that branch. We'll need to reproduce the problem on current bobtail or master to get any further.