Project

General

Profile

Feature #40909

Feature #40907: mgr/dashboard: REST API improvements

mgr/dashboard: REST API versioning

Added by Ernesto Puerta over 1 year ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
dashboard/backend
Target version:
% Done:

0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

In order to get an stable API we can either:
- Freeze existing API and allow only additive modifications (i.e.: new Resources, and extensions to the existing Resources under the assumption they don't break anything backwards).
- Use versioning for API.

The first approach (freezing API + additive changes) results in an ever growing API, where no refactoring is possible and the impact of changes is hard to determine.

Most APIs rely on versioning:
  • URI versioning: /api/v2/resource
  • Query string versioning: /api/resource?version=2
  • HTTP Headers:
    • Custom Header: X-Dashboard-Version: 2
    • MIME-compliant Vendor-specific Media Definition (VND): Accept: application/vnd.ceph.dashboard.v1+json
There is NO need to increment version on ANY change, but only in breaking ones (from [ref]):
  • Removing or renaming APIs or API parameters
  • Changes in behavior for an existing API (e.g.: changing defaults)
  • Changes in Error Codes and Fault Contracts
  • Anything that would violate the Principle of Least Astonishment

References:
Versioning

History

#1 Updated by Ernesto Puerta over 1 year ago

  • Description updated (diff)

#2 Updated by Ernesto Puerta over 1 year ago

  • Description updated (diff)

#3 Updated by Ernesto Puerta over 1 year ago

  • Description updated (diff)

#4 Updated by Ernesto Puerta over 1 year ago

  • Category set to dashboard/backend

#5 Updated by Sebastian Wagner over 1 year ago

API versioning is controversial.

In my opinion, there is just no good way to do API versioning and API versioning it itself doesn't really solve all issues.

Like https://apisyouwonthate.com/blog/api-versioning-has-no-right-way

or https://www.mnot.net/blog/2012/12/04/api-evolution.html

More important than API versioning is API design.

Like https://static.googleusercontent.com/media/research.google.com/de//pubs/archive/32713.pdf

Especially things like

  • Implementation details should not leak to the API, as it is often the cause for backward incompatible changes.
  • The API should be extensible to not force anyone to introduce backward incompatible changes
  • Try to keep API compatibility to -2 Ceph versions.

And you have to be very careful to not introduce bad API design with API versioning.

One thing is going to be mandatory when introducing API versioning, which is supposed to mean anything:
We can no longer pass thru any objects directly to the API. Which means a lot of boilerplate.

#6 Updated by Ernesto Puerta over 1 year ago

Yes, I'm aware of it. Thanks for the brain food, I really like reading about this stuff.

I think we all agree on that API versioning adds complexity and does not magically solves everything. That said, I think we have already crossed the point where we can live without API versioning (or any alternative solution). In the last 2 weeks, if I recall it properly, we have experienced 2-3 issues that were more or less related with mismatches between API methods and data structures (the Rook caching issue was triggered by a change in front-end/back-end auth/user model, the change in RGW default group/zonegroup that made NFS Ganesha QA test fail, and... what was the other :S). And versioned things surround us: Ceph 14.2.2, HTTP 1.1/2.0, HTML5, Python 2.7/3.4/3.6, Angular 2.x-7, Firefox/Chrome, OpenSUSE/Fedora, ...

The statement on not having breaking changes and making the API additive is great on the paper, but adding things forever tend to make them bloat, and bloated code is more prone to cause issues. In fact, most versioning recommendations don't enforce version increase when a change is additive and a default keeps it backward-compatible.

Having API versioning wouldn't have avoided those issues by itself, but:
  • It'd have made the errors more explicit: if a new back-end only provides Auth model v2, but front-end is calling v1/unversioned API, we'll get an API mismatch error.
  • It'd raise errors that might not have surfaced: even worse that breaking things due to backward-compatibility issues is having everything working apparently normal, while you're corrupting data.
  • It forces to think on backward compatibility: when you modify a data structure, which has a clearly visible version number, you tend to wonder if you might be breaking something.

As we don't have any team/individual fully dedicated to polish the API, I'd suggest taking a minimalistic approach (model-based versioning, which fits great with Mime-type versioning) and some defensive code/fencing (testing, change detection).

#7 Updated by Sebastian Wagner over 1 year ago

Ernesto Puerta wrote:

Yes, I'm aware of it. Thanks for the brain food, I really like reading about this stuff.

I think we all agree on that API versioning adds complexity and does not magically solves everything. That said, I think we have already crossed the point where we can live without API
versioning (or any alternative solution). In the last 2 weeks, if I recall it properly, we have experienced 2-3 issues that were more or less related with mismatches between API methods
and data structures (
  • the Rook caching issue was triggered by a change in front-end/back-end auth/user model,

That might be interesting. do you have details?

  • the change in RGW default group/zonegroup that made NFS Ganesha QA test fail,

Right now I'm not seeing how incasing anA PI version would have helped here. Do you have details?

and... what was the other :S).

And versioned things surround us: Ceph 14.2.2, HTTP 1.1/2.0, HTML5, Python 2.7/3.4/3.6, Angular 2.x-7, Firefox/Chrome, OpenSUSE/Fedora, ...

Sure, let's take those examples apart:

  • Ceph has in fact very little versioning. Maybe except for the v1 vs v2 messaging protocol and straw vs straw2. but each of those cases are typically an exception. actually it's more like the opposite: We are not calling cpeh osd pool2 create etc. The APIs or Ceph are typically not versioned at all.
  • And of course, by being part of a Ceph release, API users can already adapt their clients to cope with differeent Dashboard API versions.
  • HTTP/2.0: Indeed the on-wire protocl changed drastically. But this sounds more like a change form ceph-rest-api to restful.
  • Python is a great example of a evolutionaly API. Things get changes in Python and every new versions might come with incompatible changes that one has to cope with. We are not importing `import sys2` etc! APIs in Python might get deprecated and removed, if it turs out they don't work well. Just imaging for a moment if new Python versions would constantly add incompatible changes and would do so by providing versioned Python modules. That would be a total mess! And we're now about to do exactly that!
  • (I don't know how Angular / Chrome+Firefox are versioning their Javascrit APIs.)
  • AFAIK OpenSUSE/Fedora don't provide an API?

The statement on not having breaking changes and making the API additive is great on the paper, but adding things forever tend to make them bloat, and bloated code is more prone to cause
issues.

Actually no. API versioning won't relieve us from providing a stable and well thought-out API. If we're using API versioning to relieve us from building a proper and user-friendly API, the effort will fail.

In fact, most versioning recommendations don't enforce version increase when a change is additive and a default keeps it backward-compatible.

Again, if we misuse API versioning to make backward incompatible API changes, then we shound't announce the Dashboard API as the replacement for restful.

Having API versioning wouldn't have avoided those issues by itself, but:
  • It'd have made the errors more explicit: if a new back-end only provides Auth model v2, but front-end is calling v1/unversioned API, we'll get an API mismatch error.

interesting. Can you point me to a PR / issue?

  • It'd raise errors that might not have surfaced: even worse that breaking things due to backward-compatibility issues is having everything working apparently normal, while you're corrupting data.

We don't need API versioning to think that this sounds like a genue backend bug. The API should never allow the corruption of data.

  • It forces to think on backward compatibility: when you modify a data structure, which has a clearly visible version number, you tend to wonder if you might be breaking something.

Only thinking about backward-compatible changes for versioned classes sounds very dangerous to me! I don't think we should go in this direction.

As we don't have any team/individual fully dedicated to polish the API

Introducing API versioning is no mitigation for bad API design.

I'd suggest taking a minimalistic approach (model-based versioning, which fits great with Mime-type versioning) and some
defensive code/fencing (testing, change detection).

Do you really think Mime-type versioning is the right approach?

Edit:

As we don't have any team/individual fully dedicated to polish the API, I'd suggest [adding] versioning.

I'd strongly suggest that having a polished API is a necessity for adding any form of versioning.

#8 Updated by Ernesto Puerta over 1 year ago

Sebastian Wagner wrote:

Ernesto Puerta wrote:

  • the Rook caching issue was triggered by a change in front-end/back-end auth/user model,

That might be interesting. do you have details?

https://github.com/rook/rook/issues/3424

  • the change in RGW default group/zonegroup that made NFS Ganesha QA test fail,

Right now I'm not seeing how incasing anA PI version would have helped here. Do you have details?

Before dashboard's RGW multisite work by Alfonso, buckets were assumed to be created in the default zone + zonegroup. By extending the API to support specifying those 2 parameters, we have broken an "existing" API contract.

Versioning is adding an extra parameter to the API contract. If there's an API version mismatch, you'll catch that earlier (and in a more controlled way) than any formal or semantic mismatch.

  • Ceph has in fact very little versioning. Maybe except for the v1 vs v2 messaging protocol and straw vs straw2. but each of those cases are typically an exception. actually it's more like the opposite: We are not calling cpeh osd pool2 create etc. The APIs or Ceph are typically not versioned at all.
I wouldn't say that. Just the opposite, Ceph is full of versioning:
  • All major libraries (librados, librbd, librgw, etc.) have versioning macros and version-based deprecations.
  • All Ceph wire messages are versioned and have a compat version.
  • "Ceph Features" is in fact a more complex way of versioning (a manifest of features supported by version).
  • And of course, by being part of a Ceph release, API users can already adapt their clients to cope with differeent Dashboard API versions.

The truth is that being part of Ceph does nothing automagically for dashboard. As I mentioned above, even within Ceph there's per-component versioning, which helps API producers & consumers detect/understand interface & behavioral changes without code/doc inspection.

The statement on not having breaking changes and making the API additive is great on the paper, but adding things forever tend to make them bloat, and bloated code is more prone to cause
issues.

Actually no. API versioning won't relieve us from providing a stable and well thought-out API. If we're using API versioning to relieve us from building a proper and user-friendly API, the effort will fail.

Best practices don't mean to exempt you from anything else. They're just objective & measurable goals. Having a "proper and user-friendly" API is a subjective statement. Having a versioned API is an objective one. We should aim at having a "proper and user-friendly API", whatever that means, but in the meantime we should adopt API best practices.

In fact, most versioning recommendations don't enforce version increase when a change is additive and a default keeps it backward-compatible.

Again, if we misuse API versioning to make backward incompatible API changes, then we shound't announce the Dashboard API as the replacement for restful.

The truth is that WE ARE currently doing backward-incompatible changes. By using versioning/change-control mechanisms we make those changes easier to handle. And most APIs have: versioning + breaking changes.

Having API versioning wouldn't have avoided those issues by itself, but:
  • It'd have made the errors more explicit: if a new back-end only provides Auth model v2, but front-end is calling v1/unversioned API, we'll get an API mismatch error.

interesting. Can you point me to a PR / issue?

Yes, I was explicitly referring to the Rook issue above mentioned.

If you call /api/auth with stay_signed_in parameter in Nautilus (as it was in Mimic), you'll get the same issue. If we, instead, used /api/v2/auth or /api/auth with Mime-Type application/vnd-ceph.dashboard.auth-v2+json against a v1 endpoint, we could simply report a "version mistmatch v2 - v1", which makes quicker to debug the issue: as I don't have to inspect controller/auth.py code to discover that stay_signed_in param has been dropped.

  • It'd raise errors that might not have surfaced: even worse that breaking things due to backward-compatibility issues is having everything working apparently normal, while you're corrupting data.

We don't need API versioning to think that this sounds like a genue backend bug. The API should never allow the corruption of data.

Yes, bad things should never happen, but they DO happen. So we better adopt proven best-practices.

  • It forces to think on backward compatibility: when you modify a data structure, which has a clearly visible version number, you tend to wonder if you might be breaking something.

Only thinking about backward-compatible changes for versioned classes sounds very dangerous to me! I don't think we should go in this direction.

What's versioning about if not about changes?

As we don't have any team/individual fully dedicated to polish the API

Introducing API versioning is no mitigation for bad API design.

Having no API versioning at all with an API that's changing (with backward-incompatible changes) is by itself a smell of an improvable design.

I'd suggest taking a minimalistic approach (model-based versioning, which fits great with Mime-type versioning) and some
defensive code/fencing (testing, change detection).

Do you really think Mime-type versioning is the right approach?

It's one of the approaches enumerated. IMHO some versioning is better that none.

Edit:

As we don't have any team/individual fully dedicated to polish the API, I'd suggest [adding] versioning.

I'd strongly suggest that having a polished API is a necessity for adding any form of versioning.

You don't need to wait to have a perfect application to start pushing changes to Github. Why should we wait to have a "polished" API to have some change control?

#9 Updated by anurag bandhu 9 months ago

  • Assignee set to anurag bandhu

#10 Updated by Lenz Grimmer 9 months ago

Related to this conversation: https://pad.ceph.com/p/rest-api-stability-requirements - a proposal to define some test criteria that stable branches need to meet for every PR merged.

#12 Updated by Ernesto Puerta 9 months ago

Example (proposed reference): BlueJeans API (2 year old/quite stable API):
  • URL-versioned. Example: /v1/{resource}/
  • Per-component version. Example: Only enterprise resource is v2 (/v2/enterprise/), but everything else is v1.

Proposed Stability Contract: Prometheus API.

#13 Updated by Alfonso Martínez 7 months ago

  • Status changed from New to In Progress
  • Assignee changed from anurag bandhu to Christopher Odom

#14 Updated by Ernesto Puerta 7 months ago

Draft Proposal

Versioning workflow

  • An API client must always state the version they aim to consume for a given endpoint and method (GET, POST, PUT, ...): this constitutes the tuple (endpoint, method, version)
    • The version for that endpoint is specified in the Accept HTTP Header
    • For encoding the version, a vendor-specific MIME subtype is used:
      • Header: application/vnd.ceph.api.v[version]+json
    • version will potentially support a major.minor definition:
      • version = 1*DIGIT [ "." *DIGIT]
      • Examples of valid versions: "1", "1.0", "1.0000", "1.1"
  • In case of a successful response, the server will respond with a Content-Type including the matched version.
  • A given endpoint can support more than a single version:
    • By design, all previous minor versions are supported (endpoint v2.3 will also support requests made for endpoints v2.0, v2.1 and v2.2)
    • Additionally, several major versions might be supported if the legacy implementations are easy to maintain.
  • A client may check the versions supported by an endpoint using HTTP OPTIONS method and specifying multiple media for each version (e.g.: Accept: application/vnd.ceph.api.v2.0+json; application/vnd.ceph.api.v1.0+json;)

Versioning semantics and practices

The purpose of the explicit versioning is not to promote breaking-changes in the API, but to allow client-servers to handle gracefully interface mismatches.

Rules

  • All existing and new endpoints will start with the version v1.0.
  • Changes to the existing interfaces should always be additive:
    • Added request parameters should be optional and default to the legacy values.
    • Added response data should not alter the existing data hierarchies (users should still be able to traverse the data in the same way).
    • Additive changes will increment the minor version. That means that clients can consume endpoints providing higher minor versions, but a client requiring a higher minor version won't work with an endpoint providing a lower minor version.
  • If breaking changes are required on an endpoint, this will increment the major version (and set the minor to 0).

Example

Let's figure out how this whole thing would work with some real-world examples.

GET /rbd

Current syntax only allows to retrieve all RBD images in the cluster or in a specific pool (the following decorator is just an example):

# Endpoint 1 - First version
@RESTController.Collection(version=Version(1,0))
def list(self, pool_name=None):
  ...

Let's assume that we wanted to increase the granularity of this endpoint and filter RBD images by namespace. This is an additive change so this result in a new minor version:

# Endpoint 2 - Minor additive change: namespace (minor version increase: 1.0 -> 1.1)
@RESTController.Collection(version=Version(1,1))
def list(self, pool_name=None, namespace=None):
  ...

What if we find that allowing users to query all images across all pools is a heavy-weight operation and hence dangerous, and we decide to force users to ALWAYS define the pool (breaking change)? This is a breaking change, so it results in a new major version:

# Endpoint 3 - Major change: forcing pool_name to be defined (breaking changes, major version increase: 1.1 -> 2.0)

@RESTController.Collection(version=Version(2,0))
def list(self, pool_name, namespace=None):
  ...

How hard would it be to keep backward compatibility here? If we use a multimethod pattern, we can easily keep multiple implementations for the same endpoint:

# Endpoint 4 - Major change BUT keeping the legacy implementation
@RESTController.Collection(version=Version(1,1))
def list(self, pool_name=None, namespace=None):
  ...

@RESTController.Collection(version=Version(2,0))
def list(self, pool_name, namespace=None):
  return self.list[Version(1,1)](pool_name, namespace)

Results:

Client \ Endpoint 1 (v1.0) 2 (v1.1) 3 (v2.0) 4 (v2.0 + v1.1)
GET /rbd?pool=rbd with Accept: application/vnd.ceph.api.v1 200 200 415 200
GET /rbd?pool=rbd with Accept: application/vnd.ceph.api.v1.1 415 200 415 200
GET /rbd?pool=rbd with Accept: application/vnd.ceph.api.v2 415 415 200 200

Implementation Details

WIP

#15 Updated by Volker Theile 7 months ago

Looks good, but i have one suggestion. If the version is not defined via Accept HTTP Header, i would like to see that the latest version is used automatically.

Update:
When you write a script for Ceph X without versioning and Ceph Y changes the used REST API endpoint this will finally break your script. With explicit versioning this will not happen. So requiring versioning seems to be good.

#16 Updated by Alfonso Martínez 6 months ago

  • Pull request ID set to 35769

#17 Updated by Avan Thakkar 4 months ago

  • Assignee changed from Christopher Odom to Avan Thakkar

#18 Updated by Lenz Grimmer 3 months ago

  • Status changed from In Progress to Resolved
  • Target version set to v16.0.0

Also available in: Atom PDF