Bug #20475

EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0x800000000200000

Added by Sage Weil over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
Start date: 07/01/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: kraken, jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:

Description

2017-07-01T08:45:05.466 INFO:tasks.workunit.client.0.smithi187.stdout:1
2017-07-01T08:45:05.469 INFO:tasks.workunit.client.0.smithi187.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:1365: test_mon_osd:  expect_false ceph osd set-require-min-compat-client dumpling
2017-07-01T08:45:05.472 INFO:tasks.workunit.client.0.smithi187.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:47: expect_false:  set -x
2017-07-01T08:45:05.475 INFO:tasks.workunit.client.0.smithi187.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:48: expect_false:  ceph osd set-require-min-compat-client dumpling
2017-07-01T08:45:05.656 INFO:tasks.workunit.client.0.smithi187.stderr:Error EPERM: osdmap current utilizes features that require hammer; cannot set require_min_compat_client below that to dumpling
2017-07-01T08:45:05.666 INFO:tasks.workunit.client.0.smithi187.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:48: expect_false:  return 0
2017-07-01T08:45:05.668 INFO:tasks.workunit.client.0.smithi187.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:1366: test_mon_osd:  ceph osd set-require-min-compat-client luminous
2017-07-01T08:45:05.847 INFO:tasks.workunit.client.0.smithi187.stderr:Error EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0x800000000200000); add --yes-i-really-mean-it to do it anyway
2017-07-01T08:45:05.858 INFO:tasks.workunit.client.0.smithi187.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:1: test_mon_osd:  rm -fr /tmp/cephtool.iTI
2017-07-01T08:45:05.861 INFO:tasks.workunit:Stopping ['cephtool'] on client.0...
2017-07-01T08:45:05.863 INFO:teuthology.orchestra.run.smithi187:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0'

/a/sage-2017-07-01_06:55:28-rados-wip-sage-testing2-distro-basic-smithi/1351411
rados/singleton-bluestore/{all/cephtool.yaml msgr-failures/few.yaml msgr/random.yaml objectstore/bluestore.yaml rados.yaml}

Related issues

Copied to RADOS - Backport #20638: kraken: EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0x800000000200000 Resolved
Copied to RADOS - Backport #20639: jewel: EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0x800000000200000 Resolved

History

#1 Updated by Sage Weil over 1 year ago

I've seen this at least twice now. This is not an upgrade test, so either stray unauthenticated clients in the test lab are causing it (we should fix that if so), or something is wrong with the feature tracking.

#2 Updated by Greg Farnum over 1 year ago

Those look to be 22 and 60, which are DEFINE_CEPH_FEATURE_RETIRED(22, 1, BACKFILL_RESERVATION, JEWEL, LUMINOUS) and DEFINE_CEPH_FEATURE(60, 1, BLKIN_TRACING).

I suspect there's plenty of room for different systems to be handling optional (or newly retired) features like that incorrectly.
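
For reference, the mask in the error message can be decoded into bit positions mechanically. The snippet below is only an illustrative sketch: the helper name set_bits is made up here, and positions are counted from zero. Counted that way the mask has bits 21 and 59 set; the "22 and 60" quoted above would match if the bits were counted from one, but that reading is an assumption and should be checked against the feature definitions in the tree under test.

```python
# Decode the "missing" feature mask from the EPERM message into bit positions.
# set_bits is a hypothetical helper for illustration, not part of Ceph.

def set_bits(mask):
    """Return the zero-based positions of all bits set in mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

MISSING = 0x800000000200000  # value copied from the error message

bits = set_bits(MISSING)
print(bits)
```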

#3 Updated by Sage Weil over 1 year ago

/a/sage-2017-07-03_15:41:59-rados-wip-sage-testing-distro-basic-smithi/1356174

rados/singleton-bluestore/{all/cephtool.yaml msgr-failures/few.yaml msgr/simple.yaml objectstore/bluestore.yaml rados.yaml}

#4 Updated by Sage Weil over 1 year ago

  • Status changed from Verified to Need More Info

#5 Updated by Greg Farnum over 1 year ago

What info do we need if this is reproducing with nightly logging?

#6 Updated by Sage Weil over 1 year ago

The offending client is:

2017-07-13 22:47:17.400808 7f545ae21700 10 mon.a@0(leader) e1 ms_verify_authorizer 172.21.15.83:0/822331465 client protocol 0
2017-07-13 22:47:17.400836 7f545ae21700 10 In get_auth_session_handler for protocol 0
2017-07-13 22:47:17.401010 7f545df28700  1 -- 172.21.15.192:6789/0 <== client.4116 172.21.15.83:0/822331465 1 ==== auth(proto 0 31 bytes epoch 2) v1 ==== 61+0+0 (330027742 0 0) 0x5567c6297c00 con 0x5567c68760a0
2017-07-13 22:47:17.401060 7f545df28700 10 mon.a@0(leader) e1 _ms_dispatch new session 0x5567c6991d40 MonSession(client.4116 172.21.15.83:0/822331465 is open ) features 0xffddff8ee84fffb
2017-07-13 22:47:17.401073 7f545df28700 20 mon.a@0(leader) e1  caps 
2017-07-13 22:47:17.401077 7f545df28700 10 mon.a@0(leader).paxosservice(auth 1..5) dispatch 0x5567c6297c00 auth(proto 0 31 bytes epoch 2) v1 from client.4116 172.21.15.83:0/822331465 con 0x5567c68760a0
2017-07-13 22:47:17.401085 7f545df28700  5 mon.a@0(leader).paxos(paxos active c 1..10) is_readable = 1 - now=2017-07-13 22:47:17.401086 lease_expire=2017-07-13 22:47:22.145861 has v0 lc 10
2017-07-13 22:47:17.401103 7f545df28700 10 mon.a@0(leader).auth v5 preprocess_query auth(proto 0 31 bytes epoch 2) v1 from client.4116 172.21.15.83:0/822331465
2017-07-13 22:47:17.401109 7f545df28700 10 mon.a@0(leader).auth v5 prep_auth() blob_size=31
2017-07-13 22:47:17.401146 7f545df28700 10 cephx server client.mirror: start_session server_challenge ef75ec0c924a5012
2017-07-13 22:47:17.401158 7f545df28700  2 mon.a@0(leader) e1 send_reply 0x5567c6a02f80 0x5567c6296300 auth_reply(proto 2 0 (0) Success) v1
2017-07-13 22:47:17.401171 7f545df28700  1 -- 172.21.15.192:6789/0 --> 172.21.15.83:0/822331465 -- auth_reply(proto 2 0 (0) Success) v1 -- ?+0 0x5567c6296300 con 0x5567c68760a0
2017-07-13 22:47:17.401577 7f545df28700  1 -- 172.21.15.192:6789/0 <== client.4116 172.21.15.83:0/822331465 2 ==== auth(proto 2 128 bytes epoch 0) v1 ==== 158+0+0 (2555326981 0 0) 0x5567c6295400 con 0x5567c68760a0
2017-07-13 22:47:17.401594 7f545df28700 20 mon.a@0(leader) e1 _ms_dispatch existing session 0x5567c6991d40 for client.4116 172.21.15.83:0/822331465
2017-07-13 22:47:17.401600 7f545df28700 20 mon.a@0(leader) e1  caps 
2017-07-13 22:47:17.401602 7f545df28700 10 mon.a@0(leader).paxosservice(auth 1..5) dispatch 0x5567c6295400 auth(proto 2 128 bytes epoch 0) v1 from client.4116 172.21.15.83:0/822331465 con 0x5567c68760a0
2017-07-13 22:47:17.401609 7f545df28700  5 mon.a@0(leader).paxos(paxos active c 1..10) is_readable = 1 - now=2017-07-13 22:47:17.401609 lease_expire=2017-07-13 22:47:22.145861 has v0 lc 10
2017-07-13 22:47:17.401620 7f545df28700 10 mon.a@0(leader).auth v5 preprocess_query auth(proto 2 128 bytes epoch 0) v1 from client.4116 172.21.15.83:0/822331465
2017-07-13 22:47:17.401626 7f545df28700 10 mon.a@0(leader).auth v5 prep_auth() blob_size=128
2017-07-13 22:47:17.401630 7f545df28700 10 cephx server client.mirror: handle_request get_auth_session_key for client.mirror
2017-07-13 22:47:17.401637 7f545df28700  0 cephx server client.mirror: couldn't find entity name: client.mirror
2017-07-13 22:47:17.401642 7f545df28700  2 mon.a@0(leader) e1 send_reply 0x5567c6a02f80 0x5567c6297c00 auth_reply(proto 2 -1 (1) Operation not permitted) v1
2017-07-13 22:47:17.401657 7f545df28700  1 -- 172.21.15.192:6789/0 --> 172.21.15.83:0/822331465 -- auth_reply(proto 2 -1 (1) Operation not permitted) v1 -- ?+0 0x5567c6297c00 con 0x5567c68760a0

client.mirror on smithi083
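
As a rough cross-check, the features value on the "new session" line above can be compared against the mask from the error message. This is a sketch under stated assumptions: the regular expression only matches the log line format shown in this comment, the function name missing_features is hypothetical, and the plain mask comparison is a guess at the kind of check involved, not the monitor's actual code.

```python
import re

REQUIRED_MASK = 0x800000000200000  # mask copied from the EPERM message

# Matches the monitor's "_ms_dispatch new session" debug line as quoted above.
SESSION_RE = re.compile(
    r"_ms_dispatch new session \S+ MonSession\((?P<entity>\S+) (?P<addr>\S+) .*?\) "
    r"features 0x(?P<features>[0-9a-fA-F]+)"
)

def missing_features(line, required=REQUIRED_MASK):
    """Return (entity, addr, missing_bits) for a session line, or None if no match."""
    m = SESSION_RE.search(line)
    if m is None:
        return None
    features = int(m.group("features"), 16)
    # Bits that are required but not advertised by this session.
    return m.group("entity"), m.group("addr"), required & ~features

# Session line copied from the log excerpt above.
line = (
    "2017-07-13 22:47:17.401060 7f545df28700 10 mon.a@0(leader) e1 "
    "_ms_dispatch new session 0x5567c6991d40 MonSession(client.4116 "
    "172.21.15.83:0/822331465 is open ) features 0xffddff8ee84fffb"
)

result = missing_features(line)
entity, addr, missing = result
print(entity, addr, hex(missing))
```

For this session the only bit of the mask that is not advertised is 0x200000 (zero-based bit 21); the high bit of the mask is present in the session's features.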

#7 Updated by Sage Weil over 1 year ago

ok, smithi083 was (is!) locked by

/home/teuthworker/archive/teuthology-2017-07-13_05:10:02-fs-kraken-distro-basic-smithi/1395474

So that explains the kraken part. What doesn't make sense is why it is contacting the mon(s) on smithi192. Unfortunately there are no logs except teuthology.log: shutdown didn't finish, and something already cleaned up /var/log/ceph and the packages on the machine.

#8 Updated by Sage Weil over 1 year ago

  • Status changed from Need More Info to In Progress

#9 Updated by Sage Weil over 1 year ago

  • Status changed from In Progress to Pending Backport

#10 Updated by Nathan Cutler over 1 year ago

  • Backport set to kraken, jewel

#11 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #20638: kraken: EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0x800000000200000 added

#12 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #20639: jewel: EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0x800000000200000 added

#13 Updated by Nathan Cutler over 1 year ago

  • Status changed from Pending Backport to Resolved
