Bug #20601

closed

mon commands time out due to pool create backlog w/ valgrind

Added by Sage Weil almost 7 years ago. Updated almost 7 years ago.

Status: Duplicate
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor

Description

This isn't wrong per se, but it does mean that workloads with lots of pool creates (parallel rados api tests) and slow mons (valgrind) can be slow enough to make real requests time out (e.g. ceph osd unset noscrub).

see /a/sage-2017-07-12_02:32:14-rados-wip-sage-testing-distro-basic-smithi/1390087


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #20602: mon crush smoke test can time out under valgrind (Resolved, Sage Weil, 07/12/2017)

Actions #1

Updated by Sage Weil almost 7 years ago

  • Priority changed from Normal to Urgent

Another failure with the same cause but a different symptom: this time an 'osd out 0' timed out due to a bunch of pool creates.

/a/sage-2017-07-13_20:38:15-rados-wip-sage-testing-distro-basic-smithi/1397165

Actions #2

Updated by Sage Weil almost 7 years ago

  • Subject changed from mon pool create forces paxos commit to mon commands time out due to pool create backlog w/ valgrind

It isn't that pool creations are serialized, actually; they are already batched. Maybe valgrind is just making it slow enough that they end up that way, though.

Actions #3

Updated by Sage Weil almost 7 years ago

  • Related to Bug #20602: mon crush smoke test can time out under valgrind added

Actions #4

Updated by Sage Weil almost 7 years ago

  • Status changed from 12 to Duplicate

OK, the problem is that the fork-based crushtool test is very slow under valgrind (valgrind has to do init/cleanup on the forked process). This is a dup of #20602.
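
For illustration only, here is a minimal Python sketch of the general fork-and-wait-with-deadline pattern, showing how a child that is slow to finish turns into a timeout for the caller. This is not the Ceph monitor code: the function name run_forked_with_timeout, the poll interval, and the use of sleep to stand in for valgrind overhead are all assumptions made for the example.

```python
import os
import signal
import time


def run_forked_with_timeout(work, timeout_s):
    """Run work() in a forked child (hypothetical helper, not Ceph code).

    Returns "ok" if the child exits 0 before the deadline, "failed" if it
    exits any other way before the deadline, and "timeout" otherwise.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the work, then exit immediately without parent cleanup.
        try:
            work()
        except BaseException:
            os._exit(1)
        os._exit(0)

    # Parent: poll for the child until the deadline passes.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
            return "ok" if exited_cleanly else "failed"
        time.sleep(0.01)

    # Deadline passed: kill the child and reap it so no zombie is left behind.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return "timeout"


if __name__ == "__main__":
    # A fast child finishes well inside the deadline.
    print(run_forked_with_timeout(lambda: None, 2.0))
    # A child slowed down (here simulated with sleep) misses the deadline.
    print(run_forked_with_timeout(lambda: time.sleep(5), 0.2))
```

In this sketch the work itself is identical in both calls to the caller; only the child's run time differs, which is the same shape as a check that passes normally but times out once per-process overhead is added.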
