Feature #13301

closed

Request for method to track client versions which have connected to Ceph cluster

Added by Michael Hackett over 8 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: Monitor
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

Proposed title of this feature request
Method to track client Ceph versions that have connected to cluster.

What is the nature and description of the request?

The customer would like to begin using vary_r tunables on their cluster and is aware that they have over one thousand RBD clients connected to their cluster, running a mix of dumpling and firefly versions. vary_r tunables are not supported below firefly, so to identify which clients are still running dumpling and need upgrading, they are looking for a way to track which Ceph versions their clients are running. They would like the cluster to store this information, possibly on a monitor.

Why does the customer need this?
If a monitor could store the Ceph version of clients that connect to the cluster, it would be much easier for the customer to plan upgrades across thousands of clients for issues like the one described above.

How would the customer like to achieve this?
The customer would like something stored or logged on a monitor that records the versions of clients that have connected to the cluster.

Actions #1

Updated by Samuel Just over 8 years ago

Well, we could probably log each time a new client authenticates with the mon?

Actions #2

Updated by Dan van der Ster almost 8 years ago

That would be good enough, IMHO.

Actions #3

Updated by Kefu Chai almost 8 years ago

  • Assignee set to Kefu Chai

Actions #4

Updated by Vikhyat Umrao almost 8 years ago

- One more enhancement would be great here.
- We recommend running the Ceph server-side daemons and the clients on the same version, so if they are not on the same version, log a warning in ceph.log (the cluster log).

- It will help a lot with the upgrade procedure for Nova/KVM client instances. For example, suppose the Nova instances (qemu-kvm processes) are running firefly and the Ceph cluster daemons are running hammer.

- Now we upgrade our nova-compute nodes to hammer but do not stop and start the Nova instances (qemu-kvm processes), which needs downtime, or live-migrate them to other nova-compute nodes, which does not.

- As a result, the in-memory qemu-kvm processes are still running firefly code while the servers are running hammer code. If we then change the tunables to hammer (and the bucket algorithm to straw2), all instances still running firefly code in memory will crash, with the following in the instance logs:

terminate called after throwing an instance of 'ceph::buffer::malformed_input'
  what():  buffer::malformed_input: *unsupported bucket algorithm: 5*

- This feature will help to avoid such mistakes.
- If we still see a version mismatch warning in the cluster log (ceph.log) after upgrading the clients, we will know that we still need to either stop and start the instances or live-migrate them.

Actions #5

Updated by Vikhyat Umrao almost 8 years ago

  • Subject changed from Request for method to track client versions which have connected to Ceph cluster to Request for method to track client versions which have connected to Ceph cluster and if not same log a warning in ceph.log

Actions #6

Updated by Vikhyat Umrao almost 8 years ago

- While discussing with Kefu and Brad, we came up with one more idea: adding a log message if the tunables are not compatible between the Ceph cluster and its clients.

- All of these log messages could be controlled by new options in 'config_opts.h', enabled by default.

Actions #7

Updated by Dan van der Ster almost 8 years ago

Why not log all clients' versions when they connect, regardless of whether they're older than, newer than, or equal to the server? (That was the original request.)

(You mentioned one specific problem about version incompatibilities, but there are other use cases for logging the version, e.g. monitoring client statistics).

Actions #8

Updated by Vikhyat Umrao almost 8 years ago

Sorry Dan if I have made things confusing.

My new note is an additional request along the same lines as the enhancement you suggested. Your original request is still valid, and it will not be a warning message: it will be an INFO message, as Sam suggested, logged each time a new client authenticates with the mon.

My suggestion is one more addition: if the client and server are not on the same version, log a WRN message in ceph.log (the cluster log), and log an ERR if the tunables are not compatible.
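
The three levels proposed in this comment can be sketched as a small decision function. This is only an illustration in Python, not Ceph code: connection_log_level and its parameters are hypothetical names, and the standard logging levels stand in for the cluster log's INFO/WRN/ERR.

```python
import logging


def connection_log_level(client_version, server_version, tunables_compatible):
    """Pick a log level for a newly connected client.

    ERR if the tunables are incompatible, WRN on a version mismatch,
    INFO otherwise (the plain "client connected" record).
    """
    if not tunables_compatible:
        return logging.ERROR
    if client_version != server_version:
        return logging.WARNING
    return logging.INFO
```

Checking tunables first means an incompatible client is always reported at the highest level, even when its version also differs.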

Actions #9

Updated by Vikhyat Umrao almost 8 years ago

  • Subject changed from Request for method to track client versions which have connected to Ceph cluster and if not same log a warning in ceph.log to Request for method to track client versions which have connected to Ceph cluster

Actions #10

Updated by Vikhyat Umrao almost 8 years ago

I should not have changed the title, as the title covers all three requests!

Actions #11

Updated by Dan van der Ster almost 8 years ago

Cool, sounds good.

Actions #12

Updated by Greg Farnum almost 8 years ago

I think the ceph-mgr PR that John put in has some of the rudiments to enable connected-client version tracking. You may want to coordinate with that, Kefu.

I can't imagine we'll add a persistent warning about old clients, since we work hard to preserve cross-version compatibility. Warnings about that should be the responsibility of things layered on top of Ceph, like whatever GUIs get built that talk to ceph-mgr. (also, kernel clients don't have anything like the same versions that userspace does ;)

Actions #13

Updated by Kefu Chai over 7 years ago

thanks Greg, see the discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1267636

the ceph-mgr can only take care of the client accepted by messenger, but i think that would be good enough. i agree we should not build this functionality in ceph. i will take ceph-mgr into consideration, it's a perfect fit of this use case.

Actions #15

Updated by Dan van der Ster over 7 years ago

My 2 cents re ceph-mgr: it would be nice if this feature was simple enough to be easily backported to jewel, otherwise OP (me) won't be able to enable the vary_r tunable until the L release :-/

Really, all we need is to log a client's IP when it connects with feature flags less than a configured value. You can even log it at level 3 or something so normal clusters aren't spammed.
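
The check described here can be sketched in a few lines. This is an illustration in Python, not Ceph code. Feature flags are a bitmask, so "less than a configured value" is interpreted here as "missing at least one required bit", which is an assumption about the intent. REQUIRED_FEATURES, its value, and the function names are hypothetical.

```python
import logging

logger = logging.getLogger("mon.client_features")

# Hypothetical mask of feature bits the operator wants every client to have.
REQUIRED_FEATURES = 0b1011


def missing_features(client_features, required=REQUIRED_FEATURES):
    """Return the required bits that the client did not advertise."""
    return required & ~client_features


def log_if_old_client(addr, client_features, required=REQUIRED_FEATURES):
    """Log the client's address at DEBUG level when required bits are missing.

    Returns True if a line was logged, False otherwise.
    """
    missing = missing_features(client_features, required)
    if missing:
        logger.debug("client %s lacks features 0x%x (has 0x%x)",
                     addr, missing, client_features)
        return True
    return False
```

Logging at DEBUG mirrors the suggestion to keep the message at a low level so that normal clusters are not spammed.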

Actions #16

Updated by Josh Durgin over 6 years ago

  • Status changed from New to Resolved

There's a bunch of discussion here. This is what's merged in luminous:

1) the ability to set a minimum required release for clients, to prevent new connections from older clients ('ceph osd set-require-min-compat-client jewel') - this defaults to jewel in new clusters, and can be viewed as part of 'ceph osd dump'.

2) 'ceph features' to report the total number of clients and daemons at given featuresets and releases, e.g.:

{
    "mon": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 3
        }
    },
    "mds": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 3
        }
    },
    "osd": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 3
        }
    },
    "client": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 1
        }
    }
}

3) logging at debug mon = 10 in the monitor logs for connecting/disconnecting client address and features
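
The 'ceph features' report in item 2 can be post-processed to answer the original question of which client releases are connected. Below is a minimal Python sketch, assuming the JSON has the shape shown above (one "group" object per entity type); as a further assumption it also tolerates a plain list of groups, in case other releases print that instead. RELEASE_ORDER, iter_groups, and clients_older_than are hypothetical names, not part of Ceph, and the release list is illustrative rather than exhaustive.

```python
import json

# Assumed ordering of release names, oldest first (illustrative, not exhaustive).
RELEASE_ORDER = ["dumpling", "firefly", "hammer", "jewel", "kraken", "luminous"]


def iter_groups(section):
    """Yield group dicts from one entity section of the features report.

    Handles the {"group": {...}} shape shown in this ticket and, as an
    assumption, a plain list of group dicts.
    """
    if isinstance(section, dict):
        for value in section.values():
            if isinstance(value, dict):
                yield value
    elif isinstance(section, list):
        for value in section:
            if isinstance(value, dict):
                yield value


def clients_older_than(report, min_release):
    """Return (release, features, num) for client groups older than min_release."""
    threshold = RELEASE_ORDER.index(min_release)
    old = []
    for group in iter_groups(report.get("client", {})):
        release = group.get("release", "unknown")
        # Unknown release names are reported too, since they cannot be ranked.
        if release not in RELEASE_ORDER or RELEASE_ORDER.index(release) < threshold:
            old.append((release, group.get("features"), group.get("num", 0)))
    return old


# The "client" section from the example above.
sample = json.loads("""
{
  "client": {
    "group": {"features": "0x1ffddff8eea4fffb", "release": "luminous", "num": 1}
  }
}
""")
print(clients_older_than(sample, "jewel"))  # prints []
```

In practice the text printed by 'ceph features' would be passed to json.loads in place of the sample string. An empty result for the intended release suggests it is safe to raise the minimum with 'ceph osd set-require-min-compat-client'.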

Since this bug is pretty cluttered already, I'll close it referencing the relevant PRs:

https://github.com/ceph/ceph/pull/15371
https://github.com/ceph/ceph/pull/16128

Further changes can be handled in new tickets.
