Bug #12466
Status: Closed
Init script bug with two clusters with the same osd ID on the same host
Description
I installed two Ceph clusters (CentOS 7, Ceph 9.0.1), one named "ceph" and the other "ceph-prod". Both clusters number their OSDs independently, so I ended up with the same osd IDs on some machines but belonging to different clusters; e.g. on one host I have osd.1 from "ceph" and osd.1 from "ceph-prod".
The problem is that by default Ceph stores OSD PID files in a single directory (/var/run/ceph) and names them without honoring the cluster name: osd.1 has the PID file "osd.1.pid" regardless of the cluster it belongs to. So once osd.1 from "ceph" is started, I can no longer start osd.1 from "ceph-prod" using the /etc/init.d/ceph script, because it always reports:
=== osd.1 ===
Starting Ceph osd.1 on node1...already running
The error comes from the init script, where this function checks whether an OSD is running:
daemon_is_running() {
    name=$1
    daemon=$2
    daemon_id=$3
    pidfile=$4
    do_cmd "[ -e $pidfile ] || exit 1   # no pid, presumably not running
        pid=\`cat $pidfile\`
        [ -e /proc/\$pid ] && grep -q $daemon /proc/\$pid/cmdline && grep -qwe -i.$daemon_id /proc/\$pid/cmdline && exit 0 # running
        exit 1   # pid is something else" "" "okfail"
}
And the PID file name is generated like this:
get_conf pid_file "$run_dir/$type.$id.pid" "pid file"
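To make the collision concrete, here is a minimal shell sketch (using a stand-in directory instead of /var/run/ceph so it can run anywhere): because the default pattern contains no cluster component, both clusters derive the identical pid file path.

```shell
#!/bin/sh
# Stand-in for /var/run/ceph (hypothetical path, for demonstration only).
run_dir=/tmp/demo-run-ceph
mkdir -p "$run_dir"

type=osd
id=1

# Default pid file name as the init script builds it: "$run_dir/$type.$id.pid".
# The cluster name never enters the pattern, so both clusters get the same path.
pidfile_ceph="$run_dir/$type.$id.pid"      # for cluster "ceph"
pidfile_prod="$run_dir/$type.$id.pid"      # for cluster "ceph-prod"

[ "$pidfile_ceph" = "$pidfile_prod" ] && echo "collision: $pidfile_ceph"
```

Whichever cluster starts osd.1 first writes its PID there, and daemon_is_running() then reports "already running" for the other cluster's osd.1.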
So the cluster name is not honored anywhere here. It means I simply cannot start an OSD from the second cluster using the init script, or at least I cannot find a way to do it.
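A possible workaround (my own assumption, untested on this setup) might be to override "pid file" in each cluster's configuration file using Ceph's $cluster metavariable, so the two clusters no longer share the default name that get_conf falls back to:

```
[global]
    pid file = /var/run/ceph/$cluster-$type.$id.pid
```

With this in both /etc/ceph/ceph.conf and /etc/ceph/ceph-prod.conf, osd.1 would get "ceph-osd.1.pid" and "ceph-prod-osd.1.pid" respectively; the real fix would still be for the init script's default to include the cluster name.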