Project

General

Profile

Actions

Bug #15312

closed

stray lfn object after split

Added by Samuel Just about 8 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
infernalis
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2016-03-29 17:23:24.887554 7f5155037700 10 osd.2 267 Splitting pg[1.2s1( v 216'341 lc 216'340 (0'0,216'341] local-les=219 n=12 ec=7 les/c/f 219/201/0 237/252/59) [0,2147483647,4]/[0,1,4] r=-1 lpr=252 pi=199-251/5 crt=216'341 lcod 0'0 remapped NOTIFY m=1] into 1.42s1
...
2016-03-29 17:23:24.890658 7f5155037700 10 journal op_submit_start 14633
2016-03-29 17:23:24.890663 7f5155037700 5 filestore(/var/lib/ceph/osd/ceph-2) queue_transactions (writeahead) 14633 [Transaction(0x7f51832cfb80)]
...
2016-03-29 17:23:24.895759 7f5164dac700 20 LFNIndex(/var/lib/ceph/osd/ceph-2/current/1.2s1_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-2/current/1.2s1_head/smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_0_long
...
2016-03-29 17:23:24.896374 7f5164dac700 20 LFNIndex(/var/lib/ceph/osd/ceph-2/current/1.2s1_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-2/current/1.2s1_head/smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_0_long
...
2016-03-29 17:23:24.901297 7f5164dac700 10 filestore(/var/lib/ceph/osd/ceph-2) _finish_op 0x7f51804d7c20 seq 14633 osr(1.2s1 0x7f517fec03b0)/0x7f517fec03b0 lat 0.011272
...
2016-03-29 17:23:24.919000 7f51625a7700 10 osd.2 pg_epoch: 267 pg[1.2s1( v 216'341 lc 216'340 (0'0,216'341] local-les=219 n=6 ec=7 les/c/f 219/201/0 257/258/59) [0,5,4] r=-1 lpr=258 pi=199-257/7 crt=216'341 lcod 0'0 inactive NOTIFY] flushed
...
2016-03-29 17:23:25.654527 7f515c846700 10 osd.2 pg_epoch: 268 pg[1.2s1( v 216'341 lc 216'340 (0'0,216'341] local-les=219 n=6 ec=7 les/c/f 219/201/0 257/258/59) [0,5,4] r=-1 lpr=258 pi=199-257/7 crt=216'341 lcod 0'0 inactive NOTIFY] on_removal
...
2016-03-29 17:23:25.654672 7f515c846700 10 osd.2 pg_epoch: 268 pg[1.2s1( v 216'341 lc 216'340 (0'0,216'341] lb MIN (bitwise) local-les=219 n=6 ec=7 les/c/f 219/201/0 257/258/59) [0,5,4] r=-1 lpr=258 pi=199-257/7 crt=216'341 lcod 0'0 inactive NOTIFY] on_shutdown
...
2016-03-29 17:23:25.662119 7f514e82a700 20 snap_mapper.remove_oid 1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145
14614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527627727827928028128228328428528628
7288289290291292293294295296297298299:b9

434de076 should have been mapped into 1.42.

sjust@teuthology:/a/samuelj-2016-03-28_22:15:42-rados-wip-sam-working-distro-basic-smithi/93689/remote


Related issues 1 (0 open1 closed)

Copied to Ceph - Backport #15488: infernalis: stray lfn object after splitRejectedActions
Actions #1

Updated by Samuel Just about 8 years ago

Here's a hint:

2016-03-29 17:22:23.948967 7fe9c5937700 10 journal op_apply_start 14216 open_ops 0 -> 1
2016-03-29 17:22:23.948979 7fe9c5937700 5 filestore(/var/lib/ceph/osd/ceph-2) _do_op 0x7fe9ea543660 seq 14216 osr(1.2s1 0x7fe9df7639f0)/0x7fe9df7639f0 start
2016-03-29 17:22:23.948985 7fe9c5937700 10 filestore(/var/lib/ceph/osd/ceph-2) _do_transaction on 0x7fe9e6091f80
2016-03-29 17:22:23.948967 7fe9c5136700 10 osd.2 pg_epoch: 220 pg[1.2s1( v 216'341 lc 216'340 (0'0,216'341] local-les=219 n=12 ec=7 les/c/f 219/201/0 217/218/59) [0,2,4] r=1 lpr=219 pi=199-217/2 luod=0'0 crt=216'341 lcod 215'339 active m=1] _committed_pushed_object last_complete 216'340 now ondisk
2016-03-29 17:22:23.949000 7fe9c5937700 15 filestore(/var/lib/ceph/osd/ceph-2) write 1.2s1_TEMP/1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head# 1048576~73728
2016-03-29 17:22:23.949134 7fe9c5937700 10 filestore(/var/lib/ceph/osd/ceph-2) write 1.2s1_TEMP/1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head# 1048576~73728 = 73728
2016-03-29 17:22:23.949148 7fe9c5937700 15 filestore(/var/lib/ceph/osd/ceph-2) remove 1.2s1_head/1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134
13513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527
6277278279280281282283284285286287288289290291292293294295296297298299:b9#
2016-03-29 17:22:23.949305 7fe9c5937700 10 filestore(/var/lib/ceph/osd/ceph-2) remove 1.2s1_head/1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134
13513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527
6277278279280281282283284285286287288289290291292293294295296297298299:b9# = 0
2016-03-29 17:22:23.949327 7fe9c5937700 15 filestore(/var/lib/ceph/osd/ceph-2) _collection_move_rename 1.2s1_head/1#1:434de076:::smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051061071081091101111121131141151161171181191201211221231241251261271281
29130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270
271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# from 1.2s1_TEMP/1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head#
2016-03-29 17:22:23.949349 7fe9c5937700 10 filestore(/var/lib/ceph/osd/ceph-2) _set_replay_guard 14216.0.2 START
2016-03-29 17:22:23.950041 7fe9c5937700 20 filestore dbobjectmap: seq is 6616
2016-03-29 17:22:23.950286 7fe9c5937700 10 filestore(/var/lib/ceph/osd/ceph-2) _set_replay_guard 14216.0.2 done

and that thread was never...heard...from...again

because the process restarted on the very next line

2016-03-29 17:23:24.501179 7f51754df800 0 ceph version 10.0.5-2740-ge2d50a0 (e2d50a01f29a13f0ba8b51257f009639bd46e746), process ceph-osd, pid 8820

A bit later:

2016-03-29 17:23:24.515008 7f51754df800 10 journal journal_replay fs op_seq 14220

We are replaying from 14220? That's rather odd since we stopped right in the middle of 14216...

Actions #2

Updated by Samuel Just about 8 years ago

objectstore_tool.8490.log has 2016-03-29 17:22:30.342519 7f0f0bee4800 5 filestore(/var/lib/ceph/osd/ceph-2) mount op_seq is 14218

objectstore_tool.8471.log has 2016-03-29 17:22:30.282678 7f0cd4145800 5 filestore(/var/lib/ceph/osd/ceph-2) mount op_seq is 14216

objectstore_tool.8434.log has 2016-03-29 17:22:29.987954 7ffa99db4800 5 filestore(/var/lib/ceph/osd/ceph-2) mount op_seq is 13865, so that's the one that replayed our incomplete op.

2016-03-29 17:22:30.182810 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) do_transaction on 0x7ffaa4be3180
2016-03-29 17:22:30.182816 7ffa99db4800 20 filestore(/var/lib/ceph/osd/ceph-2) _check_global_replay_guard no xattr
2016-03-29 17:22:30.182828 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _check_replay_guard object has 14216.0.2 > current pos 14216.0.0, now or in future, SKIPPING REPLAY
2016-03-29 17:22:30.182833 7ffa99db4800 20 filestore(/var/lib/ceph/osd/ceph-2) _check_global_replay_guard no xattr
2016-03-29 17:22:30.182899 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) error opening file /var/lib/ceph/osd/ceph-2/current/1.2s1_head/smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_1_long with flags=2: (2) No such file or directory
2016-03-29 17:22:30.182906 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _check_replay_guard 1.2s1_head 1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# dne
2016-03-29 17:22:30.182912 7ffa99db4800 15 filestore(/var/lib/ceph/osd/ceph-2) remove 1.2s1_head/1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9#
2016-03-29 17:22:30.182998 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) remove 1.2s1_head/1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# = 0
2016-03-29 17:22:30.183008 7ffa99db4800 15 filestore(/var/lib/ceph/osd/ceph-2) _collection_move_rename 1.2s1_head/1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# from 1.2s1_TEMP/1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head#
2016-03-29 17:22:30.183014 7ffa99db4800 15 filestore(/var/lib/ceph/osd/ceph-2) collection_stat /var/lib/ceph/osd/ceph-2/current/1.2s1_head
2016-03-29 17:22:30.183016 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) collection_stat /var/lib/ceph/osd/ceph-2/current/1.2s1_head = 0
2016-03-29 17:22:30.183021 7ffa99db4800 20 filestore(/var/lib/ceph/osd/ceph-2) _check_global_replay_guard no xattr
2016-03-29 17:22:30.183075 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) error opening file /var/lib/ceph/osd/ceph-2/current/1.2s1_head/smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_1_long with flags=2: (2) No such file or directory
2016-03-29 17:22:30.183081 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _check_replay_guard 1.2s1_head 1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# dne
2016-03-29 17:22:30.183089 7ffa99db4800 20 filestore(/var/lib/ceph/osd/ceph-2) _check_global_replay_guard no xattr
2016-03-29 17:22:30.183100 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _check_replay_guard object has 14216.0.2 == current pos 14216.0.2, in_progress=true, CONDITIONAL REPLAY
2016-03-29 17:22:30.183112 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _set_replay_guard 14216.0.2 START
2016-03-29 17:22:30.183155 7ffa99db4800 20 filestore dbobjectmap: seq is 6676
2016-03-29 17:22:30.184493 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _set_replay_guard 14216.0.2 done
2016-03-29 17:22:30.184590 7ffa99db4800 20 LFNIndex(/var/lib/ceph/osd/ceph-2/current/1.2s1_head) lfn_created smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_1_long moving old name to alt attr smithi0465973-421 012345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697989910010110210310410510610710810911011111211311411511611711811912012112212312412512612712812913013113213313413513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527627727827928, new name is smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299
_b9_6E07B2C2__1_ffffffffffffffff_1
2016-03-29 17:22:30.184682 7ffa99db4800 20 filestore dbobjectmap: seq is 6677
2016-03-29 17:22:30.184699 7ffa99db4800 20 filestore dbobjectmap: seq is 6678
2016-03-29 17:22:30.184704 7ffa99db4800 20 filestore set_header: setting seq 6615
2016-03-29 17:22:30.184709 7ffa99db4800 20 filestore set_map_header: setting 6676 oid 1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head# parent seq 6615
2016-03-29 17:22:30.184717 7ffa99db4800 20 filestore set_map_header: setting 6677 oid 1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# parent seq 6615
2016-03-29 17:22:30.184800 7ffa99db4800 20 filestore(/var/lib/ceph/osd/ceph-2) lfn_unlink: clearing omap on 1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head# in cid 1.2s1_TEMP
2016-03-29 17:22:30.184809 7ffa99db4800 10 filestore oid: 1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head# not skipping op, *spos 14216.0.2
2016-03-29 17:22:30.184810 7ffa99db4800 10 filestore > header.spos 0.0.0
2016-03-29 17:22:30.184811 7ffa99db4800 20 filestore remove_map_header: removing 6676 oid 1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head#
2016-03-29 17:22:30.184815 7ffa99db4800 20 filestore clear_header: clearing seq 6676
2016-03-29 17:22:30.184839 7ffa99db4800 20 filestore lookup_parent: parent 6615 for seq 6676
2016-03-29 17:22:30.184848 7ffa99db4800 20 filestore lookup_parent: parent seq is 6615 with parent 0
2016-03-29 17:22:30.184852 7ffa99db4800 20 filestore set_header: setting seq 6615
2016-03-29 17:22:30.184942 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _close_replay_guard 14216.0.2
2016-03-29 17:22:30.185077 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _close_replay_guard 14216.0.2 done
2016-03-29 17:22:30.185082 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) _collection_move_rename 1.2s1_head/1#1:434de076:::smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:b9# from 1.2s1_TEMP/1#-3:40000000:::temp_recovering_1.2s1_216'340_218_b9:head# = 0

Actions #3

Updated by Samuel Just about 8 years ago

2016-03-29 17:22:30.183075 7ffa99db4800 10 filestore(/var/lib/ceph/osd/ceph-2) error opening file /var/lib/ceph/osd/ceph-2/current/1.2s1_head/smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_1_long with flags=2: (2) No such file or directory
is interesting because it suggests that we collided with something in the 0 slot -- probably a link created the last time but not selected for some reason.

Actions #4

Updated by Samuel Just about 8 years ago

2016-03-29 17:22:30.184590 7ffa99db4800 20 LFNIndex(/var/lib/ceph/osd/ceph-2/current/1.2s1_head) lfn_created smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_1_long moving old name to alt attr smithi0465973-421 012345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697989910010110210310410510610710810911011111211311411511611711811912012112212312412512612712812913013113213313413513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527627727827928, new name is smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299__b9_6E07B2C2__1_ffffffffffffffff_1

If you look closely, the "old name" is a truncation of the new name. This is ext4, are we hitting an xattr limit?

Actions #5

Updated by Samuel Just about 8 years ago

len('smithi0465973-421 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299__b9_6E07B2C2__1_ffffffffffffffff_1')

843

It doesn't seem like we should be...

Actions #6

Updated by Samuel Just about 8 years ago

Of course, if the xattr is chained over multiple attrs, then setting it wouldn't be atomic... Bluestore cannot come soon enough.

Actions #7

Updated by Samuel Just about 8 years ago

It seems like the problem sequence is:

filestore: collection_move_rename a/b -> c/b
  LFNIndex: lookup a/b (short, temp recovery)
  LFNIndex: create name mangled(b, 0) (lfn, actual name)
  filestore: link a/b -> mangled(b, 0)
  LFNIndex: created mangled(b, 0), start writing full name into xattr
*restart daemon*
filestore: replay starting at collection_move_rename a/b -> c/b (in progress)
  LFNIndex: lookup a/b (short, temp recovery)
  LFNIndex: create name mangled(b, 1) (lfn, actual name) -- mangled(b, 0) still exists, but has an existing lfn attr which does not match (truncated)
  filestore: link a/b -> mangled(b, 1)
  LFNIndex: created mangled(b, 1), move truncated name to alt attr since it doesn't match, write full name into lfn attr
  filestore: remove a/b (does not do anything to the alt attr since it takes the short name path)
filestore: finish collection_move_rename

I think this leaves us with two links to the same object with a valid lfn attr and a truncate alt attr, mangled(b, 0) and mangled(b, 1)

smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_1_long
and
smithi0465973-421 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051_63111dd55de62cce8abd_0_long

Later, we do a collection_split 1.2s1->1.42s1 and list the collection.

I don't yet see why we get 'lfn_unlink removing alt attr from' on _0_long twice.

Actions #8

Updated by Samuel Just about 8 years ago

It does look like col_split_level doesn't handle the case where we have an lfn chain where some move and some don't correctly (move_objects can change short names for elements of objs_to_move).

Actions #9

Updated by Samuel Just about 8 years ago

Ah, LFNIndex::move_object when it calls remove_object passes the object rather than the short name, so it'll always find 0 in this case. So, how does either one survive?

Actions #10

Updated by Samuel Just about 8 years ago

  • Priority changed from Urgent to Immediate
Actions #11

Updated by Samuel Just about 8 years ago

Ok, this became visible now because of the change to stripe attrs over smaller value after hammer in an attempt to keep xattrs in the inode (c6cdb408). We'll have to fix it before jewel is released. Fortunately, the fix is to just assume we can write an lfn attr in one big attr. I'm also going to update the do_op logic to let the objectstore validate an hobject so that ext4 can have a lower limit than xfs and can take into account the escaping done inside of filestore.

Actions #12

Updated by Samuel Just about 8 years ago

  • Status changed from New to 7
Actions #13

Updated by Samuel Just about 8 years ago

  • Status changed from 7 to Pending Backport
  • Priority changed from Immediate to High
  • Backport set to infernalis
Actions #14

Updated by Nathan Cutler about 8 years ago

  • Copied to Backport #15488: infernalis: stray lfn object after split added
Actions #15

Updated by Loïc Dachary over 7 years ago

  • Status changed from Pending Backport to Resolved
Actions

Also available in: Atom PDF