Bug #3647
Status: closed
forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclient: hunting for new mon
% Done: 0%
Source: Development
Description
Seeing errors when setting up Ceph from scratch with the options in the ceph.conf file. I forgot the auth options for Cephx, added them later, and then did a restart.
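For context, the cephx settings the reporter refers to normally live in the [global] section of ceph.conf (the conversation below notes that no [global] section is visible in the pasted file). A minimal sketch of what enabling cephx looks like; the values are illustrative, not the reporter's actual file:

```ini
# Sketch only: enable cephx authentication cluster-wide.
# These options belong under [global] so every daemon sees them.
[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
```

Conversely, disabling authentication sets all three options to `none`, as in the pasted file. Changing these values without restarting every daemon with an identical ceph.conf on each node can leave daemons disagreeing about the wire protocol, which is consistent with the "can't decode unknown message type" errors reported here.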
(04:47:39 PM) pat.beadles@newdream.net: here is the errors:
2012-12-19 00:37:18.366578 7ff9faaad700 monclient: hunting for new mon
2012-12-19 00:37:18.377881 7ff9faaad700 monclient: hunting for new mon
2012-12-19 00:37:18.405659 7ff9faaad700 -- 10.1.10.21:0/12730 send_message dropped message observe(3 v0) v1 because of no pipe on con 0x7ff9ec0084a0
2012-12-19 00:37:18.405864 7ff9faaad700 -- 10.1.10.21:0/12730 send_message dropped message observe(4 v0) v1 because of no pipe on con 0x7ff9ec0084a0
2012-12-19 00:37:18.405979 7ff9faaad700 -- 10.1.10.21:0/12730 send_message dropped message observe(5 v0) v1 because of no pipe on con 0x7ff9ec0084a0
2012-12-19 00:37:18.406078 7ff9faaad700 monclient: hunting for new mon
(04:47:52 PM) pat.beadles@newdream.net: on ceph1
(04:48:27 PM) pat.beadles@newdream.net: the errors on the other 2 nodes are:
2012-12-19 00:37:18.508030 7fdee4e67700 0 can't decode unknown message type 54 MSG_AUTH=17
(04:48:36 PM) pat.beadles@newdream.net: have you seen this?
(04:48:41 PM) tamil.muthamizhan@newdream.net: please paste ceph.conf file
(04:48:54 PM) pat.beadles@newdream.net: this happens when I do "ceph -s"
(04:49:18 PM) tamil.muthamizhan@newdream.net: interesting
(04:49:23 PM) tamil.muthamizhan@newdream.net: i have not seen this error
(04:49:42 PM) pat.beadles@newdream.net:
auth cluster required = none
auth service required = none
auth client required = none

[osd]
osd journal size = 1000
filestore xattr use omap = true

# Execute $ hostname to retrieve the name of your host,
# and replace {hostname} with the name of your host.
# For the monitor, replace {ip-address} with the IP
# address of your host.
[mon.a]
host = ceph1
mon addr = 10.1.10.21:6789

[mon.b]
host = ceph2
mon addr = 10.1.10.26:6789

[mon.c]
host = ceph3
mon addr = 10.1.10.22:6789

[osd.0]
host = ceph1
osd data = /var/lib/ceph/osd/ceph-0
devs = /dev/sdc1
osd mkfs type = ext4
osd mount options ext4 = "rw,noatime,user_xattr"

[osd.1]
host = ceph1
osd data = /var/lib/ceph/osd/ceph-1
devs = /dev/sdd1
osd mkfs type = ext4
osd mount options ext4 = "rw,noatime,user_xattr"

[osd.2]
host = ceph2
osd data = /var/lib/ceph/osd/ceph-2
devs = /dev/sdc1
osd mkfs type = ext4
osd mount options ext4 = "rw,noatime,user_xattr"

[osd.3]
host = ceph2
osd data = /var/lib/ceph/osd/ceph-3
devs = /dev/sdd1
osd mkfs type = ext4
osd mount options ext4 = "rw,noatime,user_xattr"

[osd.4]
host = ceph3
osd data = /var/lib/ceph/osd/ceph-4
#devs = /dev/sdc1
#osd mkfs type = ext4
#osd mount options ext4 = "rw,noatime,user_xattr"

[osd.5]
host = ceph3
osd data = /var/lib/ceph/osd/ceph-5
devs = /dev/sdd1
osd mkfs type = ext4
#osd mount options ext4 = "rw,noatime,user_xattr"

[mds.a]
host = ceph1

(04:50:16 PM) tamil.muthamizhan@newdream.net: are you putting the cephx options under [global]?
(04:50:20 PM) tamil.muthamizhan@newdream.net: i dont see global here
(04:50:46 PM) pat.beadles@newdream.net:
ubuntu@ceph1:/var/log/ceph$ sudo service ceph -a status
=== mon.a ===
mon.a: running {"version":"0.55.1-294-g0dd1302"}
=== mon.b ===
mon.b: running {"version":"0.55.1-294-g0dd1302"}
=== mon.c ===
mon.c: running {"version":"0.55.1-294-g0dd1302"}
=== mds.a ===
mds.a: running {"version":"0.55.1-294-g0dd1302"}
=== osd.0 ===
osd.0: running {"version":"0.55.1-294-g0dd1302"}
=== osd.1 ===
osd.1: running {"version":"0.55.1-294-g0dd1302"}
=== osd.2 ===
osd.2: running {"version":"0.55.1-294-g0dd1302"}
=== osd.3 ===
osd.3: running {"version":"0.55.1-294-g0dd1302"}
=== osd.4 ===
osd.4: running {"version":"0.55.1-294-g0dd1302"}
=== osd.5 ===
osd.5: running {"version":"0.55.1-294-g0dd1302"}
(04:51:12 PM) tamil.muthamizhan@newdream.net: sudo ceph -s
(04:51:15 PM) tamil.muthamizhan@newdream.net: ?
(04:52:06 PM) tamil.muthamizhan@newdream.net: ok, so all daemons are running file?
(04:52:08 PM) tamil.muthamizhan@newdream.net: fine*?
(04:54:30 PM) pat.beadles@newdream.net: yes the auth is at the top of the file
(04:54:54 PM) tamil.muthamizhan@newdream.net: ok
(04:55:07 PM) tamil.muthamizhan@newdream.net: shall I take a look into your cluster?
(04:55:41 PM) pat.beadles@newdream.net: If I do ceph health, it gives me an ok.
(04:55:46 PM) pat.beadles@newdream.net:
ubuntu@ceph1:/var/log/ceph$ ceph health
2012-12-19 00:55:01.177928 mon <- [health]
2012-12-19 00:55:01.178756 mon.0 -> 'HEALTH_OK' (0)
(04:56:18 PM) tamil.muthamizhan@newdream.net: so when you issue "sudo ceph -s" on the other nodes, you see that starnge auth message?
(04:56:26 PM) tamil.muthamizhan@newdream.net: strange*
(04:57:07 PM) tamil.muthamizhan@newdream.net: is there any logs at /var/log/ceph/osd/ceph-0
(04:57:09 PM) tamil.muthamizhan@newdream.net: ?
(04:57:18 PM) pat.beadles@newdream.net: the other 2 nodes respond correctly
(04:57:32 PM) tamil.muthamizhan@newdream.net: is there any logs at /var/log/ceph/osd/ceph-<osd_no>
(04:57:36 PM) pat.beadles@newdream.net:
ubuntu@ceph3:/var/log/ceph$ ceph -s
health HEALTH_OK
monmap e1: 3 mons at {a=10.1.10.21:6789/0,b=10.1.10.26:6789/0,c=10.1.10.22:6789/0}, election epoch 24, quorum 0,1,2 a,b,c
osdmap e78: 6 osds: 6 up, 6 in
pgmap v868: 1344 pgs: 1344 active+clean; 8730 bytes data, 7712 MB used, 50457 MB / 61241 MB avail
mdsmap e21: 1/1/1 up {0=a=up:active}
(04:58:34 PM) pat.beadles@newdream.net: yes. would you be able to get to my VMs if I give you the IPs?
(04:58:42 PM) tamil.muthamizhan@newdream.net: you mentioned the errors on the other 2 nodes are:
2012-12-19 00:37:18.508030 7fdee4e67700 0 can't decode unknown message type 54 MSG_AUTH=17