Bug #63099

Updated by Jos Collin 7 months ago

The ceph_getxattr() call always returns -61 (ENODATA) in PeerReplayer::synchronize, which results in the 'prev' snapshot always coming in as boost::none in the PeerReplayer::do_synchronize function. Even though ceph_fsetxattr appears to set 'ceph.mirror.dirty_snap_id' successfully (it returns 0) in the first pass, the ceph_getxattr() call always returns ENODATA in the following passes (and on further snapshot creations) for the same dir_root. It never retrieves the xattr value from the dir_root.
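
To illustrate why a persistent ENODATA leaves 'prev' empty, here is a minimal sketch. It is not the PeerReplayer code: the helper name prev_snap_from_xattr is hypothetical, std::optional stands in for boost::optional, and it assumes the xattr value is a decimal string, which the report does not state.

```cpp
#include <cerrno>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the Ceph source): map the result of a
// getxattr-style call to an optional snapshot id.
// rc    : return value of the call (negative errno on failure, else value length)
// value : buffer contents, assumed here to hold a decimal number
std::optional<uint64_t> prev_snap_from_xattr(int rc, const std::string& value) {
  if (rc < 0) {
    // Any failure, including -ENODATA, leaves the previous snapshot unknown.
    return std::nullopt;
  }
  try {
    // Only the first rc bytes of the buffer are valid.
    return std::stoull(value.substr(0, static_cast<std::size_t>(rc)));
  } catch (const std::exception&) {
    // Empty or non-numeric value: treat it as "no previous snapshot".
    return std::nullopt;
  }
}
```

With a helper of this shape, a call that always fails with -ENODATA can never yield a previous snapshot id, which matches the symptom described above.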

 As the prev snapshot is always none, this issue blocks the implementation of https://tracker.ceph.com/issues/61334. 

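 The ceph_getxattr() call eventually hits Server::handle_client_getvxattr(). However, debugging shows that the is_ceph_vxattr call in it doesn't check for 'ceph.mirror.dirty_snap_id', and so it responds with -CEPHFS_ENODATA. In addition to that, Server::handle_client_getvxattr() does nothing for the 'ceph.mirror.dirty_snap_id' attribute. So these two things need to be fixed. Also see the "No such file or directory" errors below in the osd logs, which are suspicious.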

 <pre> 
 2023-10-04T16:27:03.132+0530 7f11eda57580 10 bluestore(/home/user/ceph/build/dev/osd1/block) _read_bdev_label                        
 2023-10-04T16:27:03.132+0530 7f11eda57580 -1 bluestore(/home/user/ceph/build/dev/osd1/block) _read_bdev_label failed to open /home/user/ceph/build/dev/osd1/block: (2) No such file or directory
 2023-10-04T16:27:03.132+0530 7f11eda57580 10 bluestore(/home/user/ceph/build/dev/osd1/block) _read_bdev_label                        
 2023-10-04T16:27:03.132+0530 7f11eda57580 -1 bluestore(/home/user/ceph/build/dev/osd1/block) _read_bdev_label failed to open /home/user/ceph/build/dev/osd1/block: (2) No such file or directory                                                                        
 2023-10-04T16:27:03.132+0530 7f11eda57580 10 bluestore(/home/user/ceph/build/dev/osd1/block) _read_bdev_label                        
 2023-10-04T16:27:03.132+0530 7f11eda57580 -1 bluestore(/home/user/ceph/build/dev/osd1/block) _read_bdev_label failed to open /home/user/ceph/build/dev/osd1/block: (2) No such file or directory  
 
 ................ 
 ................ 
 ................ 
 
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 20 osd.1 pg_epoch: 16 pg[1.0( v 16'2 (0'0,16'2] local-lis/les=15/16 n=1 ec=15/15 lis/c=15/15 les/c/f=16/16/0 sis=15) [1,0,2] r=0 lpr=15 crt=16'2 lcod 16'1 mlcod 16'1 active+clean] do_op: op osd_op(client.4163.0:6 1.0 1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head [getxattr striper.excl in=12b,getxattr striper.size in=12b,getxattr striper.allocated in=17b,getxattr striper.version in=15b] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e16) v8                                              
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 20 osd.1 pg_epoch: 16 pg[1.0( v 16'2 (0'0,16'2] local-lis/les=15/16 n=1 ec=15/15 lis/c=15/15 les/c/f=16/16/0 sis=15) [1,0,2] r=0 lpr=15 crt=16'2 lcod 16'1 mlcod 16'1 active+clean] op_has_sufficient_caps session=0x55960e17cf00 pool=1 (.mgrdevicehealth) pool_app_metadata={mgr={}} need_read_cap=1 need_write_cap=0 classes=[] -> yes
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 10 osd.1 pg_epoch: 16 pg[1.0( v 16'2 (0'0,16'2] local-lis/les=15/16 n=1 ec=15/15 lis/c=15/15 les/c/f=16/16/0 sis=15) [1,0,2] r=0 lpr=15 crt=16'2 lcod 16'1 mlcod 16'1 active+clean] do_op osd_op(client.4163.0:6 1.0 1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head [getxattr striper.excl in=12b,getxattr striper.size in=12b,getxattr striper.allocated in=17b,getxattr striper.version in=15b] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e16) v8 may_read -> read-ordered flags ondisk+read+known_if_redirected+supports_pool_eio
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 10 osd.1 pg_epoch: 16 pg[1.0( v 16'2 (0'0,16'2] local-lis/les=15/16 n=1 ec=15/15 lis/c=15/15 les/c/f=16/16/0 sis=15) [1,0,2] r=0 lpr=15 crt=16'2 lcod 16'1 mlcod 16'1 active+clean] get_object_context: obc NOT found in cache: 1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 15 bluestore(/home/user/ceph/build/dev/osd1) getattr 1.0_head #1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head# _                                                                                                   
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 20 bluestore(/home/user/ceph/build/dev/osd1).collection(1.0_head 0x55960f1dbb00) get_onode oid #1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head# key 0x7F80000000000000011B0E96AE'devicehealth!main.db-journal.0000000000000000!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F                                                                                          
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 20 bluestore(/home/user/ceph/build/dev/osd1).collection(1.0_head 0x55960f1dbb00)    r -2 v.len 0                                                                                                                                     
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 10 bluestore(/home/user/ceph/build/dev/osd1) getattr 1.0_head #1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head# _ = -2                                                                                              
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 10 osd.1 pg_epoch: 16 pg[1.0( v 16'2 (0'0,16'2] local-lis/les=15/16 n=1 ec=15/15 lis/c=15/15 les/c/f=16/16/0 sis=15) [1,0,2] r=0 lpr=15 crt=16'2 lcod 16'1 mlcod 16'1 active+clean] get_object_context: no obc for soid 1:1b0e96ae:devicehealth::main.db-journal.0000000000000000:head and !can_create
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0 20 osd.1 pg_epoch: 16 pg[1.0( v 16'2 (0'0,16'2] local-lis/les=15/16 n=1 ec=15/15 lis/c=15/15 les/c/f=16/16/0 sis=15) [1,0,2] r=0 lpr=15 crt=16'2 lcod 16'1 mlcod 16'1 active+clean] do_op: find_object_context got error -2                     
 2023-10-04T16:27:23.077+0530 7f3b7d9cb6c0    1 -- [v2:127.0.0.1:6810/3240622459,v1:127.0.0.1:6811/3240622459] --> 127.0.0.1:0/1693054178 -- osd_op_reply(6 main.db-journal.0000000000000000 [getxattr,getxattr,getxattr,getxattr] v0'0 uv0 ondisk = -2 ((2) No such file or directory)) v8-- 0x55960f080900 con 0x55960e73ed00 
 </pre>
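
The name-check problem described in this report can be sketched as follows. This is a simplified model, not the MDS code: is_known_vxattr and handle_getvxattr are hypothetical stand-ins for is_ceph_vxattr and Server::handle_client_getvxattr(), the list of known names is passed in as a parameter purely for illustration, and plain ENODATA is used in place of CEPHFS_ENODATA.

```cpp
#include <algorithm>
#include <cerrno>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a name check such as is_ceph_vxattr():
// returns true only if the requested name is in the list of known names.
bool is_known_vxattr(const std::vector<std::string_view>& known,
                     std::string_view name) {
  return std::find(known.begin(), known.end(), name) != known.end();
}

// Hypothetical stand-in for the getvxattr handler: a name that the check
// does not recognise is answered with -ENODATA, no matter what was set before.
int handle_getvxattr(const std::vector<std::string_view>& known,
                     std::string_view name) {
  if (!is_known_vxattr(known, name)) {
    return -ENODATA;
  }
  return 0;  // a real handler would also have to produce the value here
}
```

In this model, 'ceph.mirror.dirty_snap_id' keeps returning -ENODATA until the name is added to the checked list, and even then the handler still needs a branch that actually returns the value. These correspond to the two fixes the report calls for.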
