Bug #9052

ceph-mon crashes with *** Caught signal (Floating point exception) **

Added by Jamin Collins over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: Sage Weil
Category: Monitor
Target version: 0.84
% Done: 0%
Source: Community (user)
Backport: firefly
Severity: 3 - minor

Description

I've found that I can crash ceph-mon by attempting to change pool values (such as pg_num) before any OSDs have been added to the cluster. Examples of the crash output and the triggering command:

Crash: http://pastebin.com/LpF0gHNY
Command: http://pastebin.com/8jJ80MK2

Actions #1

Updated by Dan Mick over 9 years ago

  • Category set to Monitor
  • Priority changed from Normal to High
  • Target version set to 0.84
  • Source changed from other to Community (user)
  • Backport set to firefly
Actions #2

Updated by Dan Mick over 9 years ago

With no OSDs in the cluster, the pgs_per_osd calculation can divide by zero (it's an integer division, but that still raises SIGFPE).

    int expected_osds = MIN(p.get_pg_num(), osdmap.get_num_osds());
    int64_t new_pgs = n - p.get_pg_num();
    int64_t pgs_per_osd = new_pgs / expected_osds;

expected_osds can be zero.
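
As a sketch of the kind of guard that would avoid the crash (the standalone function, its name, and the -EINVAL return are illustrative, not the actual fix):

    #include <algorithm>
    #include <cerrno>
    #include <cstdint>

    // Hypothetical standalone version of the calculation above; pg_num,
    // num_osds, and new_pg_num stand in for p.get_pg_num(),
    // osdmap.get_num_osds(), and n from the snippet.
    int pgs_per_osd_checked(int pg_num, int num_osds, int64_t new_pg_num,
                            int64_t *pgs_per_osd_out) {
      int expected_osds = std::min(pg_num, num_osds);
      if (expected_osds <= 0)
        return -EINVAL;  // no OSDs yet: reject the request instead of SIGFPE
      int64_t new_pgs = new_pg_num - pg_num;
      *pgs_per_osd_out = new_pgs / expected_osds;  // divisor now non-zero
      return 0;
    }

Whatever the real fix in OSDMonitor looks like, the shape is the same: validate the divisor and return an error to the client before dividing.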

Looking briefly, there are a few other places in OSDMonitor where /0 looks possible:

float up_ratio = (float)up / (float)osdmap.get_num_osds();

float in_ratio = (float)in / (float)osdmap.get_num_osds();

two instances of:

double halflife = (double)g_conf->mon_osd_laggy_halflife;
double decay_k = ::log(.5) / halflife;
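
Under default floating-point settings the float cases produce inf/nan rather than a signal, but the results are still wrong. A small helper along these lines (safe_div is a hypothetical name, not existing Ceph code) would cover all the sites listed above:

    #include <cmath>

    // Hypothetical helper: return a caller-chosen fallback when the divisor
    // is zero. Call sites with integer operands also avoid SIGFPE, since
    // the division happens in double.
    static inline double safe_div(double num, double den, double fallback) {
      return den != 0.0 ? num / den : fallback;
    }

    // e.g. float up_ratio = safe_div(up, osdmap.get_num_osds(), 0.0);
    //      double decay_k = safe_div(::log(.5), halflife, 0.0);
    // (with mon_osd_laggy_halflife == 0, decay_k would otherwise be -inf)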

It might be good to review the Coverity results and perhaps raise the priority assigned to divide-by-zero warnings like these.

Actions #3

Updated by Sage Weil over 9 years ago

  • Priority changed from High to Urgent
Actions #4

Updated by Sage Weil over 9 years ago

  • Assignee set to Sage Weil
Actions #5

Updated by Sage Weil over 9 years ago

  • Status changed from New to Resolved