Project

General

Profile

Actions

Cleanup #44530

open

mgr/dashboard: speed-up Angular unit tests

Added by Ernesto Puerta about 4 years ago. Updated about 3 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
General
Target version:
-
% Done:

100%

Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

Currently dashboard Angular unit tests take, depending on the environment, ~10-20 mins (and lots of CPU power/memory due to jest parallelization) to complete. This is mostly caused by the lack of isolation between components (fakes/mocks/stubs), which leads to most dependencies (children component, services, pipes, etc.) being instantiated and tested on every individual test-case (and there are ~100s).

The following strategies could be followed:
  • Use of mocking frameworks (e.g.: ng-mocks, ts-mocks). Currently jasmine has (namespace/clashing) issues with jest (e.g.: jasmine.createObjSpy) and jest provides very minimal mocking features.
  • Tune Angular default testing framework: (e.g.: https://www.npmjs.com/package/ng-bullet) to avoid repetitive creating/destroying infraestructure objects.
  • Use a unit testing helper framework: (like spectator). This would aid in the refactoring process by reducing the amount of boilerplate code.
  • Refactor tests (and components) to increase overall testability and reduce dependencies/side-effects (e.g.: using Directives or Pipes instead of services & DI for specifying behaviors). This should be backed by some agreed guidelines on the expectations from devels. regarding unit tests.
Converting "current unit tests" to "strict unit tests" could lead to:
  • ~700% speed-ups and decrease in resource utilization: as an example, navigation.component.spec.ts test took 138.995s (~3 secs./unit test) with the legacy approach (see https://github.com/ceph/ceph/pull/32419), and with the strict unit-testing it only takes 19.47s to finish (250 ms/unit-test, and much less peaks of CPU). As a result of this, the whole Angular unit test suite could be run in 2-3 mins max.
  • Better coverage: currently part of the coverage is 'accidental' (100% packages, classes or files), as it's caused by the subsequent running of dependent components. However, when looking at method or branch coverage, we may see that the average coverage is 70-76%, and in some components that figure plunges to 0% (https://jenkins.ceph.com/job/ceph-pull-requests/46781/cobertura/). Proper branch coverage would require matrix-like/combinatorial testing, but with the current overhead per-unit test, that's unaffordable.
  • Reduce 'flappiness' of tests: currently, some services are not mocked, and perform polling to the backend (e.g.: SummaryService in navigation.component.ts) or asynchronous behaviours/spurious race conditions.

Subtasks 1 (0 open1 closed)

Cleanup #45433: mgr/dashboard: Don't have two different unit test mechanicsResolvedStephan Müller

Actions
Actions #1

Updated by Ernesto Puerta about 4 years ago

  • Description updated (diff)
Actions #2

Updated by Tiago Melo about 4 years ago

Regarding ng-bullet:
We already use this technique in our project, although only during dev mode.
The reason for only using it during dev is that in some situations we get different results than when we run in prod.

I notice that ng-bullet has a few small differences from our code that we might want to implement.
If those changes fix the problem mentioned before, we could also use this in prod.

Our version:
https://github.com/ceph/ceph/blob/4c9c4229edb4dc20b6c357c9298a9c63a528e582/src/pybind/mgr/dashboard/frontend/src/testing/unit-test-helper.ts#L23
ng-bullet:
https://github.com/topnotch48/ng-bullet-workspace/blob/master/projects/ng-bullet/src/testing.ts#L13

Actions #3

Updated by Ernesto Puerta about 4 years ago

Tiago Melo wrote:

Regarding ng-bullet:
We already use this technique in our project, although only during dev mode.
The reason for only using it during dev is that in some situations we get different results than when we run in prod.

I notice that ng-bullet has a few small differences from our code that we might want to implement.
If those changes fix the problem mentioned before, we could also use this in prod.

Our version:
https://github.com/ceph/ceph/blob/4c9c4229edb4dc20b6c357c9298a9c63a528e582/src/pybind/mgr/dashboard/frontend/src/testing/unit-test-helper.ts#L23
ng-bullet:
https://github.com/topnotch48/ng-bullet-workspace/blob/master/projects/ng-bullet/src/testing.ts#L13

Thanks for pointing that out, Tiago. I didn't realize that both things were so close, but I see some subtle differences. Nevertheless, if you check this ng-bullet author's blog post you may see the rationale behind those.

Actions #4

Updated by Stephan Müller almost 4 years ago

Hi, just tested it on my current branch with the pool form (currently 8 test are failing an 99 tests are passing).

I ran 5 tests to compare them and I will not count the first one as the first test allways took a bit longer (with our solution 24s 390ms with the ng-bullet solution 1m 9s 540ms).

Our solution => 21s 501ms / 20s 475ms / 20s 422ms / 20s 571ms = 20s 742ms

ng-bullet solution => 23s 25ms / 22s 507ms / 21s 850ms / 22s 396ms = 22s 501 ms

without any solution => 6m 46s 482ms (oO)

I tested the new solution with the telemetry spec as I know our solution didn't pass for a test and it passed! I took a closer look and tried out what exactly made it pass it seems that setting testBedApi._instantiated to false solved the issue!

With the combined solution its 20s 922ms / 22s 852ms / 20s 838ms / 20s 687ms / 20s 710ms - so it's slightly slower than our soultion with 21s 201ms.

I've just run all suites with the combined solution and all suites passed :) It took 7m 45s 214ms with out using it for every suite it took 9m 28s 467ms

I'll provide a PR to enhance our solution.

To the other points mentioned above they are all valid and should be addressed. I also would recommend to create sub issues to discuss all points separately.

Actions #5

Updated by Ernesto Puerta almost 4 years ago

Interesting! Thanks @Stephan Hohn (and @Tiago, I read your response long time ago but forgot to ack).

On the differences between both approaches, could it be because ours is not destroying fixtures after tests or restoring the default resetTestinModule() (I don't get the rationale of everything here, but if we start seeing flapping failures, perhaps it's some of this).

Actions #6

Updated by Ernesto Puerta about 3 years ago

  • Project changed from mgr to Dashboard
  • Category changed from 132 to General
Actions

Also available in: Atom PDF