Feature #53050

closed

Support blocklisting a CIDR range

Added by Greg Farnum over 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy,pacific
Reviewed:
Affected Versions:
Component(RADOS):
Pull request ID:

Description

Disaster recovery use cases want to be able to fence off entire IP ranges, rather than needing to specify individual IPs. This is for several important reasons:
1) They may not know the individual IPs involved.
2) We do not want to inflate the OSD map with every IP in a potentially large cluster.
3) Nobody wants to have to issue 1000 or 10000 blocklist commands, and we don't want to have to commit that many OSDMap updates that quickly.

Extend the "osd blocklist" command (or add a similar one?) to work with CIDR ranges in addition to single IPs, and update the code that works with blocklists to deal with that change.
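
As a rough illustration of the request above, the following Python sketch shows how a blocklist lookup might treat single IPs and CIDR ranges side by side. It is not Ceph's implementation: the `Blocklist` class and its method names are hypothetical, and the addresses are made up. It only demonstrates that one range entry can stand in for many per-IP entries.

```python
import ipaddress


class Blocklist:
    """Hypothetical blocklist holding single addresses and CIDR ranges."""

    def __init__(self):
        self._addrs = set()   # exact-match entries
        self._ranges = []     # CIDR range entries

    def add(self, entry):
        # Treat anything with a prefix length as a range, otherwise as one IP.
        if "/" in entry:
            self._ranges.append(ipaddress.ip_network(entry))
        else:
            self._addrs.add(ipaddress.ip_address(entry))

    def is_blocklisted(self, addr):
        ip = ipaddress.ip_address(addr)
        if ip in self._addrs:
            return True
        # A range matches every address it contains.
        return any(ip in net for net in self._ranges)

    def entry_count(self):
        # Number of entries that would have to be stored and distributed.
        return len(self._addrs) + len(self._ranges)


bl = Blocklist()
bl.add("10.1.0.0/22")   # one entry covering 10.1.0.0 through 10.1.3.255
bl.add("192.168.5.7")   # a single-IP entry still works
print(bl.is_blocklisted("10.1.3.255"), bl.is_blocklisted("10.1.4.0"))
print(bl.entry_count(), ipaddress.ip_network("10.1.0.0/22").num_addresses)
```

In this sketch a single /22 entry covers 1024 addresses, which is the kind of saving reasons 2) and 3) are after: one stored entry and one update instead of one per IP.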


Related issues 3 (0 open, 3 closed)

Related to CephFS - Bug #55516: qa: fs suite tests failing with "json.decoder.JSONDecodeError: Extra data: line 2 column 82 (char 82)" (Resolved, Jos Collin)
Copied to RADOS - Backport #55746: quincy: Support blocklisting a CIDR range (Resolved, Greg Farnum)
Copied to RADOS - Backport #55747: pacific: Support blocklisting a CIDR range (Resolved, Greg Farnum)

Actions #1

Updated by Patrick Donnelly over 2 years ago

So we're going to put a huge asterisk here that the CIDR range of machines must be hard-rebooted, right? Otherwise whenever the unblocklist event comes in, any kernel mounts with dirty data will happily flush that out.

That was the rationale for the new generation number we discussed in various places including:

https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/3FTOBHPZQGTEY3RHMO2EXKOLSP3SJGNW/

Also, I think compound blocklist commands should be supported as there will probably be several CIDR ranges involved.

Actions #2

Updated by Greg Farnum over 2 years ago

Patrick Donnelly wrote:

> So we're going to put a huge asterisk here that the CIDR range of machines must be hard-rebooted, right? Otherwise whenever the unblocklist event comes in, any kernel mounts with dirty data will happily flush that out.

I mean, that’s only if you have force reconnect on, which is a mount option, right? So anybody who wants to plan to use range blocklisting can just not use bad remounts. Or, they can arrange to reboot, which I get the impression is going to happen anyway since other layers need to do updates and fallbacks.

> That was the rationale for the new generation number we discussed in various places including:
>
> https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/3FTOBHPZQGTEY3RHMO2EXKOLSP3SJGNW/

Yeah. That proposal isn’t gonna happen any time soon, if ever. I can knock this ticket out this or next week.

> Also, I think compound blocklist commands should be supported as there will probably be several CIDR ranges involved.

Meh; we can handle three osdmap commits if needed, but multiple ranges was specifically not part of the request, just a cluster with a single fixed range.

Actions #3

Updated by Patrick Donnelly over 2 years ago

Greg Farnum wrote:

> Patrick Donnelly wrote:
>
>> So we're going to put a huge asterisk here that the CIDR range of machines must be hard-rebooted, right? Otherwise whenever the unblocklist event comes in, any kernel mounts with dirty data will happily flush that out.
>
> I mean, that’s only if you have force reconnect on, which is a mount option, right?

You mean recover_session=clean?

https://docs.ceph.com/en/pacific/man/8/mount.ceph/#basic

Not just in that scenario, no. The kernel has gotten better about not flushing dirty data when blocklisted. I'm not sure how safe it is at this time; Jeff would need to comment.

> So anybody who wants to plan to use range blocklisting can just not use bad remounts.

You can't (yet) skip blocklisted superblocks. They stick around and prevent new mounts from even connecting. Jeff is working on a fix for that presently which will be upstream soon.

> Or, they can arrange to reboot, which I get the impression is going to happen anyway since other layers need to do updates and fallbacks.
>
>> That was the rationale for the new generation number we discussed in various places including:
>>
>> https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/3FTOBHPZQGTEY3RHMO2EXKOLSP3SJGNW/
>
> Yeah. That proposal isn’t gonna happen any time soon, if ever. I can knock this ticket out this or next week.
>
>> Also, I think compound blocklist commands should be supported as there will probably be several CIDR ranges involved.
>
> Meh; we can handle three osdmap commits if needed, but multiple ranges was specifically not part of the request, just a cluster with a single fixed range.

ACK.

Actions #4

Updated by Greg Farnum almost 2 years ago

  • Status changed from New to Resolved
  • Pull request ID set to 44151
Actions #5

Updated by Greg Farnum almost 2 years ago

  • Status changed from Resolved to Pending Backport
Actions #6

Updated by Jos Collin almost 2 years ago

  • Related to Bug #55516: qa: fs suite tests failing with "json.decoder.JSONDecodeError: Extra data: line 2 column 82 (char 82)" added
Actions #7

Updated by Neha Ojha almost 2 years ago

  • Backport set to quincy,pacific

The Backport field was empty, therefore no backport tickets were created.

Actions #8

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #55746: quincy: Support blocklisting a CIDR range added
Actions #9

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #55747: pacific: Support blocklisting a CIDR range added
Actions #10

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #11

Updated by Greg Farnum over 1 year ago

  • Status changed from Pending Backport to Resolved