Bug #16176

objectmap does not show object existence correctly

Added by Xinxin Shu about 1 year ago. Updated 7 months ago.

Status: Resolved
Priority: High
Target version: -
Start date: 06/07/2016
Due date:
% Done: 0%
Source: other
Tags:
Backport: jewel,kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No

Description

In the latest master, I created a 10 GB rbd image and used rbd bench-write to fill objects, then terminated the bench-write process with Ctrl+C. When I check the objects afterwards, 'rados ls' shows 26 objects:

[root@ctrl src]# ./rados -c /mnt/data/devs/ceph.conf -p rbd ls | grep rbd_data.113a74b0dc51 | wc -l
2016-06-07 10:02:45.681910 7f438e50ba40 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-06-07 10:02:45.682067 7f438e50ba40 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-06-07 10:02:45.713863 7f438e50ba40 -1 WARNING: the following dangerous and experimental features are enabled: *
26
[root@ctrl src]#

But when I use 'rbd du' to check disk usage, it shows that only 6 objects exist (24576k at 4096k per object), and the object map is flagged as valid:
NAME PROVISIONED USED
test 10240M 24576k

[root@ctrl src]# ./rbd -c /mnt/data/devs/ceph.conf -p rbd info test
2016-06-07 10:03:55.902227 7f6963a84d80 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-06-07 10:03:55.902758 7f6963a84d80 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-06-07 10:03:55.935078 7f6963a84d80 -1 WARNING: the following dangerous and experimental features are enabled: *
rbd image 'test':
size 10240 MB in 2560 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.113a74b0dc51
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:

bug-16176-log.zip (749 KB) Xinxin Shu, 11/22/2016 08:26 AM


Related issues

Copied to rbd - Backport #18289: kraken: objectmap does not show object existence correctly Closed
Copied to rbd - Backport #18290: jewel: objectmap does not show object existence correctly Resolved

History

#1 Updated by Samuel Just about 1 year ago

  • Project changed from Ceph to rbd
  • Category deleted (librbd)

#2 Updated by Jason Dillaman about 1 year ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
  • Priority changed from High to Normal

#3 Updated by Jason Dillaman about 1 year ago

  • Status changed from In Progress to Need More Info

@Xinxin: I've tried to repeat your findings without success. Can you repeat with "debug rbd = 20" in your ceph client configuration and then attach the associated bench-write log files?

Also, did your image have a snapshot? If it did, that would potentially explain the issue.

#4 Updated by Jason Dillaman about 1 year ago

  • Assignee deleted (Jason Dillaman)

#5 Updated by Jason Dillaman 10 months ago

  • Needs Doc set to No

@Xinxin: ping -- any additional information?

#6 Updated by Xinxin Shu 9 months ago

Hi Jason, sorry for the late response. I can reproduce this on the latest master. It seems to happen only with the async messenger, where the 'rados ls' object count is greater than what 'rbd du' reports. With the simple messenger I cannot reproduce it: the 'rados ls' count is lower than or equal to what 'rbd du' reports, which I think is reasonable since librbd always updates the object map before writing the data to the Ceph cluster.

However, in both cases an inconsistent state can still occur. My first thought is to update the object map and mark it as invalid (or some other transient state), and only mark it as valid once the write completes. Does this make sense to you?

#7 Updated by Jason Dillaman 9 months ago

I am still not following the issue and will require more information. To answer your question, you cannot set the object map to valid after the block is updated because that would mean the object map doesn't reflect reality (i.e. if you were to crash after adding the object but before updating the object map, you would have an orphaned object).

#8 Updated by Xinxin Shu 9 months ago

My thought is as follows:

1. update object map, set object map state to invalid or transient state
2. write data to rados
3. clear invalid state or change transient state to valid state

If rbd crashes after step 2, the object map has been updated but is not in a valid state. Since we provide an object map rebuild API, the user should rebuild the object map whenever it is invalid.
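The three steps above can be sketched as a toy in-memory model. This is only an illustration of the proposal, not librbd code: `ToyImage`, `MapState`, `crash_after_step` and `rebuild_object_map` are hypothetical names, and the model ignores concurrent in-flight IO.

```python
from enum import Enum


class MapState(Enum):
    VALID = "valid"
    INVALID = "invalid"


class ToyImage:
    """Hypothetical model of the proposed write flow (not the librbd API)."""

    def __init__(self):
        self.object_map = set()      # object numbers the map says exist
        self.map_state = MapState.VALID
        self.rados_objects = set()   # object numbers actually written

    def write(self, object_no, crash_after_step=None):
        # Step 1: update the object map and mark it invalid/transient.
        self.object_map.add(object_no)
        self.map_state = MapState.INVALID
        if crash_after_step == 1:
            return
        # Step 2: write the data to rados.
        self.rados_objects.add(object_no)
        if crash_after_step == 2:
            return
        # Step 3: clear the invalid state.
        self.map_state = MapState.VALID

    def rebuild_object_map(self):
        # Stand-in for an object map rebuild: rescan what really exists.
        self.object_map = set(self.rados_objects)
        self.map_state = MapState.VALID


image = ToyImage()
image.write(0)
image.write(1, crash_after_step=2)   # simulated crash before step 3
needs_rebuild = image.map_state is MapState.INVALID
image.rebuild_object_map()
```

In this model a crash at any point leaves the map marked invalid, so the user knows a rebuild is needed before trusting it.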

#9 Updated by Jason Dillaman 9 months ago

You really need to explain to me the problem you are attempting to solve because I am not understanding it.

#10 Updated by Xinxin Shu 9 months ago

I want to fix the inconsistency between 'rbd du' and 'rados ls'. As this tracker states, 'rados ls' shows that there are 26 objects in this rbd image, so 'rbd du' should return 106496K, but it actually shows 24576K, meaning only 6 objects have been written. This is the inconsistency between the two commands. I investigated the write flow and found that librbd always updates the object map before writing data; if librbd crashes after updating the object map but before writing the data, the output of the two commands is inconsistent.

#11 Updated by Jason Dillaman 9 months ago

OK, I think I am still not understanding the issue. If `rados ls` shows 26 objects (104MB of used space) but `rbd du` shows 24MB of used space (6 objects), how is what you are suggesting going to address this issue? Since the object map is updated before the object is written, if anything, `rbd du` should say 104MB of space is used and `rados ls` would only show 24MB.
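The sizes quoted in this thread follow directly from the object count and the image's object size (order 22, i.e. 4096 kB objects, per the `rbd info` output in the description). A minimal sketch of that arithmetic, where `used_kb` is a hypothetical helper name:

```python
def used_kb(object_count, order=22):
    """Space in kB covered by `object_count` full objects of size 2**order bytes."""
    return object_count * (1 << order) // 1024


rados_ls_kb = used_kb(26)   # 26 objects listed by `rados ls`
rbd_du_kb = used_kb(6)      # 6 objects implied by `rbd du`
print(rados_ls_kb, rados_ls_kb // 1024)   # kB and MB for 26 objects
print(rbd_du_kb, rbd_du_kb // 1024)       # kB and MB for 6 objects
```

This assumes every object is counted at its full size, which may not hold for partially written objects.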

#12 Updated by Xiaoxi Chen 9 months ago

It looks to me like the only possibility is that random writes, rather than sequential writes, were used to fill the drive. 'rbd du' reporting 24MB doesn't necessarily mean 6 objects; it just iterates over the diff and sums it up.

The easiest way to make this "issue" clear: @Xinxin, could you please add output at https://github.com/ceph/ceph/blob/master/src/tools/rbd/action/DiskUsage.cc#L27? Just log <offset, len, exists>, which will be clear enough to show your disk structure to everybody.

#13 Updated by Xinxin Shu 9 months ago

Hi Jason, I reproduced this bug and attached detailed logs; please check them. "./bin/rbd -c ceph.conf -p rbd bench-write test" was used to fill data.

[root@ceph01 build]# ./bin/rados -p rbd -c ceph.conf ls | grep rbd_data
2016-11-22 15:31:30.455290 7f6a0078ca00 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-11-22 15:31:30.455679 7f6a0078ca00 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-11-22 15:31:30.488535 7f6a0078ca00 -1 WARNING: the following dangerous and experimental features are enabled: *
rbd_data.100b74b0dc51.0000000000000594
rbd_data.100b74b0dc51.0000000000000683
rbd_data.100b74b0dc51.00000000000000b6
rbd_data.100b74b0dc51.000000000000052a
rbd_data.100b74b0dc51.0000000000000682
rbd_data.100b74b0dc51.0000000000000856
rbd_data.100b74b0dc51.00000000000002a7
rbd_data.100b74b0dc51.000000000000010a
rbd_data.100b74b0dc51.00000000000001a5
rbd_data.100b74b0dc51.000000000000068b
rbd_data.100b74b0dc51.00000000000001ed
rbd_data.100b74b0dc51.0000000000000279
rbd_data.100b74b0dc51.000000000000027a
rbd_data.100b74b0dc51.000000000000020d
rbd_data.100b74b0dc51.00000000000008b3
rbd_data.100b74b0dc51.000000000000083f

[root@ceph01 build]# ./bin/rbd -p rbd -c ceph.conf du test
2016-11-22 15:32:25.802697 7f2b35252ec0 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-11-22 15:32:25.802915 7f2b35252ec0 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-11-22 15:32:25.839371 7f2b35252ec0 -1 WARNING: the following dangerous and experimental features are enabled: *
NAME PROVISIONED USED
test 10240M 53248k
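To quantify the mismatch in the output above, the object names can be decoded and compared with the `rbd du` figure. The sketch below assumes this image also uses 4096k objects (as in the `rbd info` output in the description) and that the suffix of each data object name is the hexadecimal object number; the variable names are illustrative only.

```python
OBJECT_SIZE_KB = 4096   # assumption: order 22, as in the earlier `rbd info`
REPORTED_KB = 53248     # USED column from `rbd du` above

RADOS_LS = """
rbd_data.100b74b0dc51.0000000000000594
rbd_data.100b74b0dc51.0000000000000683
rbd_data.100b74b0dc51.00000000000000b6
rbd_data.100b74b0dc51.000000000000052a
rbd_data.100b74b0dc51.0000000000000682
rbd_data.100b74b0dc51.0000000000000856
rbd_data.100b74b0dc51.00000000000002a7
rbd_data.100b74b0dc51.000000000000010a
rbd_data.100b74b0dc51.00000000000001a5
rbd_data.100b74b0dc51.000000000000068b
rbd_data.100b74b0dc51.00000000000001ed
rbd_data.100b74b0dc51.0000000000000279
rbd_data.100b74b0dc51.000000000000027a
rbd_data.100b74b0dc51.000000000000020d
rbd_data.100b74b0dc51.00000000000008b3
rbd_data.100b74b0dc51.000000000000083f
""".split()


def object_number(name):
    """Decode the hexadecimal object number at the end of a data object name."""
    return int(name.rsplit(".", 1)[1], 16)


object_numbers = sorted(object_number(n) for n in RADOS_LS)
listed_kb = len(RADOS_LS) * OBJECT_SIZE_KB
missing_objects = (listed_kb - REPORTED_KB) // OBJECT_SIZE_KB

print(len(RADOS_LS), "objects listed,", listed_kb, "kB if each is counted in full")
print(missing_objects, "objects unaccounted for by rbd du")
```

The scattered object numbers also suggest the benchmark did not write the image sequentially from the start.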

#14 Updated by Jason Dillaman 9 months ago

@Xinxin: Thanks for the debug logs. I will try to take a look at this today or tomorrow.

#15 Updated by Jason Dillaman 9 months ago

  • Backport set to kraken,jewel

OK, I see the issue now. In the face of a crash during in-flight IO, where more than one in-flight IO maps to the same backing object, while the first IO to the object is blocked waiting for the object map update, the subsequent IOs to the same object are not blocked and can proceed to create the backing object potentially before the object map update is committed. This bug was introduced in the Infernalis release due to updating the in-memory object map before the on-disk object map was committed.
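A toy model of that sequence, purely for illustration: `ToyObjectMap` and its methods are hypothetical and do not correspond to librbd classes. It shows how updating the in-memory map before the on-disk commit lets a second IO create the object while the map update is still pending.

```python
class ToyObjectMap:
    """Hypothetical model: in-memory view plus the committed on-disk view."""

    def __init__(self):
        self.in_memory = set()
        self.on_disk = set()
        self.pending = []

    def request_update(self, object_no):
        # Ordering described above: memory is updated before the commit.
        self.in_memory.add(object_no)
        self.pending.append(object_no)

    def commit_pending(self):
        self.on_disk.update(self.pending)
        self.pending.clear()


def simulate_crash_during_inflight_io(object_no=5):
    object_map = ToyObjectMap()
    rados_objects = set()

    # IO 1: object not yet in the map, so it requests an update and
    # blocks until the on-disk commit finishes.
    if object_no not in object_map.in_memory:
        object_map.request_update(object_no)

    # IO 2 (same object): the in-memory map already says "exists",
    # so it is not blocked and creates the backing object.
    if object_no in object_map.in_memory:
        rados_objects.add(object_no)

    # Crash here: commit_pending() never runs.
    return rados_objects, object_map.on_disk


rados_objects, committed_map = simulate_crash_during_inflight_io()
print(rados_objects - committed_map)   # objects present but absent from the map
```

In this model the object exists in the pool while the committed object map does not record it, which matches `rados ls` listing more objects than `rbd du` accounts for.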

#16 Updated by Jason Dillaman 9 months ago

  • Status changed from Need More Info to In Progress
  • Assignee set to Jason Dillaman
  • Priority changed from Normal to High

#17 Updated by Jason Dillaman 9 months ago

  • Status changed from In Progress to Need Review

#18 Updated by Mykola Golub 8 months ago

  • Status changed from Need Review to Pending Backport

#19 Updated by Nathan Cutler 8 months ago

  • Copied to Backport #18289: kraken: objectmap does not show object existence correctly added

#20 Updated by Nathan Cutler 8 months ago

  • Copied to Backport #18290: jewel: objectmap does not show object existence correctly added

#21 Updated by Jason Dillaman 8 months ago

  • Backport changed from kraken,jewel to jewel

#22 Updated by Nathan Cutler 8 months ago

  • Backport changed from jewel to jewel,kraken

#23 Updated by Nathan Cutler 7 months ago

  • Status changed from Pending Backport to Resolved
