Bug #63029

open

Upmap balancer: output of "osdmaptool <file> --upmap" says no optimizations, even though there are change recommendations in the out file

Added by Laura Flores 7 months ago. Updated 6 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):


Related issues 1 (1 open, 0 closed)

Copied to RADOS - Backport #63375: reef: Upmap balancer: output of "osdmaptool <file> --upmap" says no optimizations, even though there are change recommendations in the out file (In Progress, Laura Flores)
Actions #1

Updated by Laura Flores 7 months ago

  • Affected Versions v18.2.0 added
Actions #2

Updated by Laura Flores 7 months ago

I have looked into the issue. The problem exists on Reef (18.2.0), but does not reproduce on Quincy (17.2.6).

When I ran the upmap balancer on Reef (18.2.0), I noticed a log line from check_pg_upmaps that corresponds to the PG mapping added to the output file (ceph osd pg-upmap-items 8.e 11 19).

Note the line "check_pg_upmaps simplifying partially no-op pg_upmap_items 8.e [11,19] -> [11,19]": the mapping before and after the "simplification" is identical.

Reef v18.2.0:

$ ./bin/osdmaptool ~/bz_2241104/osdmap_cluster --upmap out.txt --debug_osd=10
./bin/osdmaptool: osdmap file '/home/lflores/bz_2241104/osdmap_cluster'
writing upmap command output to: out.txt
checking for upmap cleanups
2023-09-28T16:58:57.043+0000 7f8c9f6c9100 10 clean_pg_upmaps
2023-09-28T16:58:57.043+0000 7f8c9f6c9100 10 check_pg_upmaps pg 8.e weight_map {0=0.05,1=0.05,2=0.05,3=0.05,4=0.05,5=0.05,6=0.05,7=0.05,8=0.05,9=0.05,10=0.05,11=0.05,12=0.05,13=0.05,14=0.05,15=0.05,16=0.05,17=0.05,18=0.05,19=0.05}
2023-09-28T16:58:57.043+0000 7f8c9f6c9100 10 check_pg_upmaps simplifying partially no-op pg_upmap_items 8.e [11,19] -> [11,19]
upmap, max-count 10, max deviation 5

Quincy v17.2.6, on the other hand, does not have that log line:

$ ./bin/osdmaptool ~/bz_2241104/osdmap_cluster --upmap out.txt --debug_osd=10
./bin/osdmaptool: osdmap file '/home/lflores/bz_2241104/osdmap_cluster'
writing upmap command output to: out.txt
checking for upmap cleanups
upmap, max-count 10, max deviation 5
2023-09-28T16:45:46.997+0000 7f75787c4080 10 clean_pg_upmaps
2023-09-28T16:45:46.997+0000 7f75787c4080 10 check_pg_upmaps pg 8.e weight_map {0=0.05,1=0.05,2=0.05,3=0.05,4=0.05,5=0.05,6=0.05,7=0.05,8=0.05,9=0.05,10=0.05,11=0.05,12=0.05,13=0.05,14=0.05,15=0.05,16=0.05,17=0.05,18=0.05,19=0.05}
pools cephfs.cephfs.data ecpool_2 test4 test1 test2 cephfs.cephfs.meta .mgr test3
2023-09-28T16:45:46.997+0000 7f75787c4080 10 calc_pg_upmaps pools 3
2023-09-28T16:45:46.997+0000 7f75787c4080 10  osd_weight_total 1
2023-09-28T16:45:46.997+0000 7f75787c4080 10  pgs_per_weight 96

The extra line in Reef v18.2.0 comes from this part of the code:
check_pg_upmaps: https://github.com/ceph/ceph/blob/5dd24139a1eada541a3bc16b6941c5dde975e26d/src/osd/OSDMap.cc#L2148-L2156

      } else {
        //Josh--check partial no-op here.
        ldout(cct, 10) << __func__ << " simplifying partially no-op pg_upmap_items " 
                       << j->first << " " << j->second
                       << " -> " << newmap
                       << dendl;
        to_remap->insert({pg, newmap});
        any_change = true;

In Quincy v17.2.6, the same branch is guarded. Note the "else if (newmap != j->second)" where Reef v18.2.0 has a bare "else".
check_pg_upmaps: https://github.com/ceph/ceph/blob/d7ff0d10654d2280e08f1ab989c7cdf3064446a5/src/osd/OSDMap.cc#L2109-L2115

      } else if (newmap != j->second) {
        ldout(cct, 10) << " simplifying partially no-op pg_upmap_items " 
                       << j->first << " " << j->second
                       << " -> " << newmap
                       << dendl;
        to_remap->insert({pg, newmap});
        any_change = true;
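
To make the difference between the two branches concrete, here is a minimal standalone sketch. It is not the Ceph code: UpmapItems, Action, and classify are hypothetical names, and the assumption that the preceding to_cancel branch fires when newmap is empty is mine. It only models how an unconditional "else" queues a remap even when newmap equals the existing entry.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for one PG's pg_upmap_items entry:
// a list of (from OSD, to OSD) pairs.
using UpmapItems = std::vector<std::pair<int, int>>;

enum class Action { Keep, Cancel, Remap };

// Simplified model of the final branch in check_pg_upmaps.
// guarded == true  models the Quincy-style "else if (newmap != existing)".
// guarded == false models the Reef v18.2.0-style unconditional "else".
Action classify(const UpmapItems& existing, const UpmapItems& newmap,
                bool guarded) {
  if (newmap.empty()) {
    // Assumption: the to_cancel branch is taken when nothing is left.
    return Action::Cancel;
  }
  if (!guarded || newmap != existing) {
    // Corresponds to to_remap->insert({pg, newmap}); any_change = true;
    return Action::Remap;
  }
  // Entry is unchanged, so nothing is queued.
  return Action::Keep;
}
```

With the PG from the logs above, classify({{11, 19}}, {{11, 19}}, false) returns Action::Remap, which matches the redundant "[11,19] -> [11,19]" line, while passing true for the last argument returns Action::Keep.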

I changed the line in Reef v18.2.0 back to the previous "else if" statement, and that fixed this particular scenario:

--- a/src/osd/OSDMap.cc
+++ b/src/osd/OSDMap.cc
@@ -2145,7 +2145,7 @@ bool OSDMap::check_pg_upmaps(
                        << j->first << " " << j->second
                        << dendl;
         to_cancel->push_back(pg);
-      } else {
+      } else if (newmap != j->second) {
         //Josh--check partial no-op here.
         ldout(cct, 10) << __func__ << " simplifying partially no-op pg_upmap_items " 
                        << j->first << " " << j->second

$ ./bin/osdmaptool ~/bz_2241104/osdmap_cluster --upmap out.txt --debug_osd=10
./bin/osdmaptool: osdmap file '/home/lflores/bz_2241104/osdmap_cluster'
writing upmap command output to: out.txt
checking for upmap cleanups
upmap, max-count 10, max deviation 5
2023-09-28T17:41:36.931+0000 7f94ac20f100 10 clean_pg_upmaps
2023-09-28T17:41:36.931+0000 7f94ac20f100 10 check_pg_upmaps pg 8.e weight_map {0=0.05,1=0.05,2=0.05,3=0.05,4=0.05,5=0.05,6=0.05,7=0.05,8=0.05,9=0.05,10=0.05,11=0.05,12=0.05,13=0.05,14=0.05,15=0.05,16=0.05,17=0.05,18=0.05,19=0.05}
pools .mgr test2 cephfs.cephfs.meta ecpool_2 test3 test4 test1 cephfs.cephfs.data 
2023-09-28T17:41:36.931+0000 7f94ac20f100 10 calc_pg_upmaps pools 1
2023-09-28T17:41:36.931+0000 7f94ac20f100 10  osd_weight_total 1
2023-09-28T17:41:36.931+0000 7f94ac20f100 10  pgs_per_weight 3
2023-09-28T17:41:36.931+0000 7f94ac20f100 10 calc_pg_upmaps distribution is almost perfect
2023-09-28T17:41:36.931+0000 7f94ac20f100 10 calc_pg_upmaps pools 9
2023-09-28T17:41:36.931+0000 7f94ac20f100 10  osd_weight_total 1

I tested this on other osdmaps and saw the same behavior: the out file contains many more suggestions than it should.

Actions #3

Updated by Laura Flores 7 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 53720
Actions #4

Updated by Laura Flores 7 months ago

  • Backport set to reef
Actions #5

Updated by Laura Flores 6 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Laura Flores 6 months ago

  • Copied to Backport #63375: reef: Upmap balancer: output of "osdmaptool <file> --upmap" says no optimizations, even though there are change recommendations in the out file added
Actions #7

Updated by Laura Flores 6 months ago

  • Tags set to backport_processed