Bug #19119

pre-jewel "osd rm" incrementals are misinterpreted

Added by Ilya Dryomov 3 months ago. Updated 3 months ago.

Status: Pending Backport
Priority: Immediate
Assignee:
Category: OSDMap
Target version: -
Start date: 03/01/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: kraken,jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release: master
Needs Doc: No

Description

I have a bunch of misdirected requests from a recent kernel client to a hammer cluster, triggered by osd rm:

2017-02-27 15:37:56.976845 osd.190 10.115.1.133:6808/3914 97 : cluster [WRN] client.9450549 10.115.1.35:0/1493770383 misdirected client.9450549.1:1379645861 pg 2.ec640804 to osd.190 in e241865, client e241865 pg 2.804 features 288863570635346

e241864 -> e241865 incremental:

{
    "epoch": 241865,
    "fsid": "9e3e9015-f626-4a44-83f7-0a939ef7ec02",
    "modified": "2017-02-27 11:07:56.497658",
    "new_pool_max": -1,
    "new_flags": -1,
    "new_max_osd": -1,
    "new_pools": [],
    "new_pool_names": [],
    "old_pools": [],
    "new_up_osds": [],
    "new_weight": [],
    "osd_state_xor": [
        {
            "osd": 204,
            "state_xor": [
                "autoout",
                "exists" 
            ]
        },
        [],
        [],
        [],
        [],
        [],
        [],
        [],
        [],
        [
            {
                "osd": 204,
                "uuid": "00000000-0000-0000-0000-000000000000" 
            }
        ],
        {},
        []
    ]

On master:

$ bin/osdmaptool --test-map-pg 2.ec640804 /tmp/map-241864.bin 
bin/osdmaptool: osdmap file '/tmp/map-241864.bin'
 parsed '2.ec640804' -> 2.ec640804
2.ec640804 raw ([197,201,1], p197) up ([197,201,1], p197) acting ([197,201,1], p197)
$ bin/osdmaptool --test-map-pg 2.ec640804 /tmp/map-241865.bin 
bin/osdmaptool: osdmap file '/tmp/map-241865.bin'
 parsed '2.ec640804' -> 2.ec640804
2.ec640804 raw ([197,201,1], p197) up ([197,201,1], p197) acting ([197,201,1], p197)

but applying the incremental to the e241864 full map (with osdmaptool patched to accept and apply incrementals) yields a different mapping:

$ bin/osdmaptool --test-map-pg 2.ec640804 /tmp/map-241864.bin /tmp/inc-241865.bin 
bin/osdmaptool: osdmap file '/tmp/map-241864.bin'
bin/osdmaptool: incremental file '/tmp/inc-241865.bin'
 parsed '2.ec640804' -> 2.ec640804
2.ec640804 raw ([190,1], p190) up ([190,1], p190) acting ([190,1], p190)

which is where the misdirected request was sent.
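
The comparison above can be scripted. The following is a minimal sketch, not part of the attached patch, that parses the mapping lines exactly as they appear in this report and checks whether two of them agree. The function names (`parse_mapping`, `mappings_agree`) are hypothetical, and the regular expression assumes the `raw (...) up (...) acting (...)` layout shown here rather than any documented output format.

```python
import re

# Matches one mapping line as printed by `osdmaptool --test-map-pg` in this
# report. The layout is assumed from the transcripts above, not from a spec.
LINE_RE = re.compile(
    r"^(?P<pg>\S+) raw \(\[(?P<raw>[\d,]*)\], p(?P<raw_p>-?\d+)\) "
    r"up \(\[(?P<up>[\d,]*)\], p(?P<up_p>-?\d+)\) "
    r"acting \(\[(?P<acting>[\d,]*)\], p(?P<acting_p>-?\d+)\)$"
)


def _osds(text):
    """Turn '197,201,1' into [197, 201, 1]; an empty string gives []."""
    return [int(x) for x in text.split(",") if x]


def parse_mapping(line):
    """Parse one mapping line into a dict of OSD lists and the acting primary."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized mapping line: %r" % line)
    return {
        "pg": m.group("pg"),
        "raw": _osds(m.group("raw")),
        "up": _osds(m.group("up")),
        "acting": _osds(m.group("acting")),
        "primary": int(m.group("acting_p")),
    }


def mappings_agree(line_a, line_b):
    """True if both lines give the same acting set and acting primary."""
    a, b = parse_mapping(line_a), parse_mapping(line_b)
    return a["acting"] == b["acting"] and a["primary"] == b["primary"]


# Lines copied from the master transcripts above.
FULL_MAP = ("2.ec640804 raw ([197,201,1], p197) up ([197,201,1], p197) "
            "acting ([197,201,1], p197)")
INCREMENTAL = ("2.ec640804 raw ([190,1], p190) up ([190,1], p190) "
               "acting ([190,1], p190)")

if __name__ == "__main__":
    print("full map vs. itself agree:", mappings_agree(FULL_MAP, FULL_MAP))
    print("full map vs. incremental agree:", mappings_agree(FULL_MAP, INCREMENTAL))
```

Feeding it the full-map line and the incremental-applied line from master flags the disagreement (primary 197 versus 190) that corresponds to the misdirected request.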

On hammer:

$ ./osdmaptool --test-map-pg 2.ec640804 /tmp/map-241864.bin
./osdmaptool: osdmap file '/tmp/map-241864.bin'
 parsed '2.ec640804' -> 2.ec640804
2.ec640804 raw ([197,201,1], p197) up ([197,201,1], p197) acting ([197,201,1], p197)
$ ./osdmaptool --test-map-pg 2.ec640804 /tmp/map-241865.bin
./osdmaptool: osdmap file '/tmp/map-241865.bin'
 parsed '2.ec640804' -> 2.ec640804
2.ec640804 raw ([197,201,1], p197) up ([197,201,1], p197) acting ([197,201,1], p197)

and with the incremental applied (same osdmaptool patch), the mapping is unchanged:

$ ./osdmaptool --test-map-pg 2.ec640804 /tmp/map-241864.bin /tmp/inc-241865.bin
./osdmaptool: osdmap file '/tmp/map-241864.bin'
./osdmaptool: incremental file '/tmp/inc-241865.bin'
 parsed '2.ec640804' -> 2.ec640804
2.ec640804 raw ([197,201,1], p197) up ([197,201,1], p197) acting ([197,201,1], p197)

osdmaptool.diff (2.08 KB) Ilya Dryomov, 03/01/2017 06:45 PM


Related issues

Related to Bug #13988: new OSD re-using old OSD id fails to boot Resolved 12/05/2015
Copied to Backport #19209: kraken: pre-jewel "osd rm" incrementals are misinterpreted In Progress
Copied to Backport #19210: jewel: pre-jewel "osd rm" incrementals are misinterpreted Resolved

History

#1 Updated by Ilya Dryomov 3 months ago

It looks like Sage's commit in https://github.com/ceph/ceph/pull/6900 is the culprit. That "set weight to 1" was carried over into the kernel client in 4.7.

https://github.com/idryomov/ceph/commits/wip-osd-rm-incremental fixes it for me, but I'm not sure what to do with all the jewel and kraken maps...
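
To illustrate the suspected divergence, here is a toy model in Python. It is not Ceph's OSDMap code. It assumes the flag and weight constants mirror the `CEPH_OSD_*` values, reads "set weight to 1" as resetting the weight of a removed OSD to the "in" value, and assumes osd.204 was marked out before the removal (suggested by the `autoout` bit in the incremental, but not confirmed). The function name and the `reset_weight_on_destroy` switch are hypothetical.

```python
# Toy model only: NOT Ceph's OSDMap code. Constants are assumed to mirror the
# CEPH_OSD_* values; the function and its parameters are hypothetical.
CEPH_OSD_EXISTS = 1 << 0
CEPH_OSD_UP = 1 << 1
CEPH_OSD_AUTOOUT = 1 << 2
CEPH_OSD_IN = 0x10000   # weight 1.0 in fixed point
CEPH_OSD_OUT = 0


def apply_state_xor(state, weight, osd, xor_bits, reset_weight_on_destroy):
    """Apply one state-XOR entry and return new (state, weight) dicts.

    reset_weight_on_destroy=False models the pre-jewel reading: only the
    state bits are toggled. True models the suspected newer reading, where
    removing an existing OSD also sets its weight back to "in".
    """
    state, weight = dict(state), dict(weight)
    destroyed = bool(state[osd] & CEPH_OSD_EXISTS) and bool(xor_bits & CEPH_OSD_EXISTS)
    state[osd] ^= xor_bits
    if destroyed and reset_weight_on_destroy:
        weight[osd] = CEPH_OSD_IN
    return state, weight


# Assumed pre-removal view of osd.204: exists, auto-marked out, weight "out".
state = {204: CEPH_OSD_EXISTS | CEPH_OSD_AUTOOUT}
weight = {204: CEPH_OSD_OUT}
xor_bits = CEPH_OSD_EXISTS | CEPH_OSD_AUTOOUT  # the "autoout", "exists" entry

old_state, old_weight = apply_state_xor(state, weight, 204, xor_bits, False)
new_state, new_weight = apply_state_xor(state, weight, 204, xor_bits, True)

if __name__ == "__main__":
    print("states equal:", old_state == new_state)
    print("weights equal:", old_weight == new_weight)
```

In this model both readings end with the same state bits for osd.204 but a different weight vector. Because OSD weights feed into placement, two parties applying the same incremental under different readings could compute different mappings, which would be consistent with the osdmaptool results in the description.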

#2 Updated by Ilya Dryomov 3 months ago

  • Related to Bug #13988: new OSD re-using old OSD id fails to boot added

#3 Updated by Sage Weil 3 months ago

  • Status changed from New to Verified
  • Priority changed from Urgent to Immediate

#4 Updated by Ilya Dryomov 3 months ago

  • Status changed from Verified to Need Review
  • Backport set to kraken,jewel

#5 Updated by Ilya Dryomov 3 months ago

  • Subject changed from hammer "osd rm" incrementals are misinterpreted to pre-jewel "osd rm" incrementals are misinterpreted

#6 Updated by Ilya Dryomov 3 months ago

#7 Updated by Sage Weil 3 months ago

  • Subject changed from pre-jewel "osd rm" incrementals are misinterpreted to hammer "osd rm" incrementals are misinterpreted
  • Description updated (diff)
  • Status changed from Need Review to Verified

#8 Updated by Ilya Dryomov 3 months ago

  • Subject changed from hammer "osd rm" incrementals are misinterpreted to pre-jewel "osd rm" incrementals are misinterpreted
  • Description updated (diff)

#9 Updated by Ilya Dryomov 3 months ago

  • Status changed from Verified to Pending Backport

#10 Updated by Jan Fajerski 3 months ago

  • Copied to Backport #19209: kraken: pre-jewel "osd rm" incrementals are misinterpreted added

#11 Updated by Jan Fajerski 3 months ago

  • Copied to Backport #19210: jewel: pre-jewel "osd rm" incrementals are misinterpreted added
