Bug #12638

build_incremental() could take 40678 ms to finish

Added by Kefu Chai about 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
08/06/2015
Due date:
% Done:

0%

Source:
other
Tags:
Backport:
firefly,hammer
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:

Description

ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)

It is reported that a monitor with over 800 osdmap subscribers can take 90 seconds to return from update_from_paxos(), and that the majority of that time is spent in build_incremental(). So we had better be able to load the full/incremental maps quickly.

2015-08-06 01:00:58.294782 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x1c4b1280 next 528892 (onetime)
2015-08-06 01:00:58.294790 7fd57c264700  5 mon-018@2(peon).osd e528892 send_incremental [528892..528892] to client.81113180 10.202.48.22:0/1302109
2015-08-06 01:00:58.294815 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]
2015-08-06 01:00:58.335493 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x72c20c0 next 528892 (onetime)
2015-08-06 01:00:58.335508 7fd57c264700  5 mon-018@2(peon).osd e528892 send_incremental [528892..528892] to client.81122956 10.202.48.29:0/1040905
2015-08-06 01:00:58.335512 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]

Related issues

Copied to Ceph - Backport #12816: build_incremental() could take 40678 ms to finish Resolved 08/06/2015
Copied to Ceph - Backport #12817: build_incremental() could take 40678 ms to finish Resolved 08/06/2015

Associated revisions

Revision 5e99a578 (diff)
Added by Kefu Chai almost 3 years ago

mon: add a cache layer over MonitorDBStore

the cache of leveldb does not perform well under some conditions,
so we need a cache in our own stack.

  • add an option "mon_osd_cache_size" to control the size of the
    cache of MonitorDBStore.

Fixes: #12638
Signed-off-by: Kefu Chai <>

Revision 411769c1 (diff)
Added by Kefu Chai almost 3 years ago

mon: add a cache layer over MonitorDBStore

the cache of leveldb does not perform well under some conditions,
so we need a cache in our own stack.

  • add an option "mon_osd_cache_size" to control the size of the
    cache of MonitorDBStore.

Fixes: #12638
Signed-off-by: Kefu Chai <>
(cherry picked from commit 5e99a578050976ca22b549812ac80d494fe7041d)

Conflicts:
src/mon/OSDMonitor.h
minor differences in OSDMonitor ctor

Revision 0bc909e5 (diff)
Added by Kefu Chai almost 3 years ago

mon: add a cache layer over MonitorDBStore

the cache of leveldb does not perform well under some conditions,
so we need a cache in our own stack.

  • add an option "mon_osd_cache_size" to control the size of the
    cache of MonitorDBStore.

Fixes: #12638
Signed-off-by: Kefu Chai <>
(cherry picked from commit 5e99a578050976ca22b549812ac80d494fe7041d)

Conflicts:
src/mon/OSDMonitor.h
minor differences in OSDMonitor ctor

History

#1 Updated by Kefu Chai about 3 years ago

  • Assignee set to Kefu Chai

#2 Updated by Kefu Chai about 3 years ago

2015-08-06 01:00:58.294782 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x1c4b1280 next 528892 (onetime)
2015-08-06 01:00:58.294790 7fd57c264700  5 mon-018@2(peon).osd e528892 send_incremental [528892..528892] to client.81113180 10.202.48.22:0/1302109
2015-08-06 01:00:58.294815 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]
2015-08-06 01:00:58.335493 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x72c20c0 next 528892 (onetime)
2015-08-06 01:00:58.335508 7fd57c264700  5 mon-018@2(peon).osd e528892 send_incremental [528892..528892] to client.81122956 10.202.48.29:0/1040905
2015-08-06 01:00:58.335512 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]

40678 ms from build_incremental() to the next check_sub(); in this case, a lease timeout followed.

2015-08-07 02:07:07.063950 7f475f98d700 10 mon-018@2(peon).osd e529993 check_sub 0x17c94340 next 529993 (onetime)
2015-08-07 02:07:07.063961 7f475f98d700  5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.84840059 10.202.48.34:0/2331792
2015-08-07 02:07:07.063974 7f475f98d700 10 mon-018@2(peon).osd e529993 build_incremental [529993..529993]
2015-08-07 02:07:07.064150 7f475f98d700 10 mon-018@2(peon).osd e529993 check_sub 0x1bf26a00 next 529993 (onetime)
2015-08-07 02:07:07.064159 7f475f98d700  5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.80986206 10.202.48.26:0/1364503
2015-08-07 02:07:07.064167 7f475f98d700 10 mon-018@2(peon).osd e529993 build_incremental [529993..529993]
2015-08-07 02:07:07.064268 7f475f98d700 10 mon-018@2(peon).osd e529993 check_sub 0x1baae100 next 529993 (onetime)

176 ms from build_incremental() to the next check_sub().

ubuntu@teuthology $ grep '29993..529993' mon.18.log | grep send_incremental |  head -n1
2015-08-07 02:07:07.063961 7f475f98d700  5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.84840059 10.202.48.34:0/2331792
ubuntu@teuthology $ grep '29993..529993' mon.18.log | grep send_incremental |  tail -n1
2015-08-07 02:07:07.270799 7f475f98d700  5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to osd.411 10.202.49.16:6821/29473
ubuntu@teuthology$ grep '29993..529993' mon.18.log | grep send_incremental | awk '{print $10}'  |less | cut -d'.' -f1|grep client | wc -l
729

And it replied to 728 osdmap subscriptions in 206.838 ms, i.e. roughly 0.2 s.

$ grep '29993..529993' mon.18.log | grep send_incremental | awk '{print $11}'  | cut -d':' -f1 | sort | uniq | wc -l
79

They come from 79 IP addresses, so the monitor performed well under the same load.

#3 Updated by Kefu Chai about 3 years ago

Sam suggested we could also change:

- mon_leveldb_cache_size // increase this value to see if it remedies the situation
- mon_leveldb_size_warn
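Both options are set in the [mon] section of ceph.conf and take byte counts. The values below are purely illustrative, not recommendations from this ticket:

```ini
[mon]
; illustrative values only (both options are byte counts)
mon leveldb cache size = 536870912    ; 512 MB leveldb block cache
mon leveldb size warn = 16106127360   ; warn when the mon store exceeds ~15 GB
```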

#4 Updated by Greg Farnum about 3 years ago

These numbers look very strange to me. Do we have any more details about what's happening here, or why things might be so slow?

#5 Updated by Kefu Chai about 3 years ago

Do we have any more details about what's happening here

Per Tupper, this happens when an OSD host restarts.

or why things might be so slow?

One reason could be that leveldb was compacting at that moment.

#6 Updated by Kefu Chai almost 3 years ago

  • Status changed from Verified to Need Review

#7 Updated by Kefu Chai almost 3 years ago

  • Description updated (diff)
  • Affected Versions v0.80.10 added

#8 Updated by Kefu Chai almost 3 years ago

  • Status changed from Need Review to Pending Backport
  • Backport set to firefly

#9 Updated by Loic Dachary almost 3 years ago

  • Backport changed from firefly to firefly,hammer

#10 Updated by Loic Dachary almost 3 years ago

  • Status changed from Pending Backport to Resolved
