Bug #5495
closed
ceph-mon and minus character in hostname
Description
It looks like ceph-mon does not cope with a - in the hostname:
- /usr/bin/ceph-mon --cluster=office -i test-uplink-mesh
[2584]: (33) Numerical argument out of domain
- /usr/bin/ceph-mon --cluster=office -i testuplinkmesh
The second invocation does not print the error, but it does not start the mon either. That may have a different cause, though.
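For reference, the "(33)" in the error line looks like a plain errno value followed by its message. A minimal sketch for decoding such a number is shown here; the helper name `describe_errno` is made up for illustration and is not part of Ceph. On Linux, 33 maps to EDOM.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value.

    Hypothetical helper for illustration only; it just consults the
    errno tables of the platform this script runs on.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 33 is the number printed in the ceph-mon error line above.
name, message = describe_errno(33)
print(name, "-", message)
```

This only identifies the error code. It does not say which call inside ceph-mon returned it.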
Files
Updated by Sage Weil almost 11 years ago
- Assignee set to Sage Weil
- Priority changed from Normal to Urgent
Updated by Sage Weil almost 11 years ago
- Status changed from New to Need More Info
what version is this?
can you strace -f ceph-mon and attach that output? that'll give a better hint as to where things are going wrong..
Updated by Robert Sander almost 11 years ago
Sage Weil wrote:
what version is this?
These are the official 0.61.4 packages from ceph.com.
can you strace -f ceph-mon and attach that output? that'll give a better hint as to where things are going wrong..
I am sorry but I already purged the last installation.
Updated by Sage Weil almost 11 years ago
- Status changed from Need More Info to Can't reproduce
Updated by Joao Eduardo Luis almost 11 years ago
- File error-numerical-value-out-of-domain.txt added
- Status changed from Can't reproduce to 4
A user was able to reproduce this reliably enough to get an strace out of it. Attached.
Updated by Joao Eduardo Luis almost 11 years ago
Forgot to mention that this user was attempting to upgrade from bobtail to cuttlefish.
Updated by Sage Weil almost 11 years ago
that second strace shows it hitting an unrelated assert on db->create_and_open()... joao?
Updated by Sage Weil almost 11 years ago
- Status changed from 4 to Need More Info
Updated by Joao Eduardo Luis almost 11 years ago
Sage Weil wrote:
that second strace shows it hitting an unrelated assert on db->create_and_open()... joao?
We've seen that happening on these guys' cluster too while trying to upgrade. I've tried to reproduce it to no avail.
Looks like LevelDB's Open() was unable to lock store.db/LOCK, stating it was already locked. There were no other monitors running on that machine though, and lsof didn't report anything holding a lock on that file. Is it worth opening a bug for this scenario, moving the strace there, and automatically marking it as Can't Reproduce?
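A lock like this can also be probed from outside the monitor. The sketch below assumes LevelDB guards store.db/LOCK with a POSIX fcntl advisory lock, which is my understanding of its POSIX environment but not something confirmed in this ticket. The function name `lock_is_held_elsewhere` is hypothetical.

```python
import errno
import fcntl
import os


def lock_is_held_elsewhere(path):
    """Return True if another process holds a conflicting fcntl lock on path.

    Hypothetical probe: it tries to take a non-blocking exclusive lock and
    releases it right away. fcntl locks are per process, so this only detects
    locks held by *other* processes, and it briefly holds the lock itself.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # A conflicting lock is reported as EACCES or EAGAIN.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return True
            raise
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

Calling it with the path to the monitor's store.db/LOCK file, while the monitor is stopped, would show whether some other process still holds the lock. Because the probe takes the lock for a moment, it should not be run while a monitor is starting up.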
Updated by Sage Weil almost 11 years ago
- Status changed from Need More Info to Can't reproduce
Joao Luis wrote:
Sage Weil wrote:
that second strace shows it hitting an unrelated assert on db->create_and_open()... joao?
We've seen that happening on these guys' cluster too while trying to upgrade. I've tried to reproduce it to no avail.
Looks like LevelDB's Open() was unable to lock store.db/LOCK, stating it was already locked. There were no other monitors running on that machine though, and lsof didn't report anything holding a lock on that file. Is it worth opening a bug for this scenario, moving the strace there, and automatically marking it as Can't Reproduce?
yeah. closing this one as can't reproduce.. i'm able to do dashes just fine. Robert, if this problem persists, let us know!