Bug #18980

rgw: "cluster [WRN] bad locator @X on object @X...." in cluster log

Added by Ali Maredia 2 months ago. Updated about 2 months ago.

Status: Pending Backport
Priority: High
Assignee: -
Target version: -
Start date: 02/17/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: jewel kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rgw
Release: master
Needs Doc: No

Description

During the teuthology cleanup phase (run with master as the --ceph-suite), an egrep of the logs reveals warnings similar to:
"cluster [WRN] bad locator @9 on object @9 op osd_op(client.4146.0:79 9.5 9:b08b92bd::::head [delete] snapc 0=[] ondisk+write+known_if_redirected e30) v8"

This does not happen on runs with kraken.

Run on master
http://pulpito.ceph.com/amaredia-2017-02-17_19:57:25-rgw:singleton-master---basic-smithi/

Run on kraken
http://pulpito.ceph.com/amaredia-2017-02-17_17:27:21-rgw:singleton-kraken---basic-smithi/


Related issues

Copied to Backport #19211: jewel: rgw: "cluster [WRN] bad locator @X on object @X...." in cluster log (Resolved)
Copied to Backport #19212: kraken: rgw: "cluster [WRN] bad locator @X on object @X...." in cluster log (In Progress)

History

#1 Updated by Yehuda Sadeh about 2 months ago

  • Project changed from Ceph to rgw
  • Subject changed from "cluster [WRN] bad locator @X on object @X...." in cluster log to rgw: "cluster [WRN] bad locator @X on object @X...." in cluster log
  • Priority changed from Normal to High

Seems like a problem where we send a delete with an empty object name. Maybe on radosgw-admin user rm, but not 100% sure yet.

#2 Updated by Casey Bodley about 2 months ago

Yeah, I managed to track this down in one of my runs: http://qa-proxy.ceph.com/teuthology/cbodley-2017-03-01_14:55:21-rgw-wip-rgw-encryption---basic-mira/871984/teuthology.log

2017-03-01T23:00:34.776 INFO:teuthology.orchestra.run.mira027:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin --log-to-stderr --format json -n client.0 user rm --uid fud'
...
2017-03-01T23:00:35.081 INFO:teuthology.orchestra.run.mira027.stderr:2017-03-01 23:00:34.980947 7f836966c600 10 removing email index:
...
2017-03-01T23:00:35.082 INFO:teuthology.orchestra.run.mira027.stderr:2017-03-01 23:00:34.983733 7f836966c600  1 -- 172.21.4.124:0/814352763 --> 172.21.8.116:6800/15679 -- osd_op(unknown.0.0:79 10.5 10:b08b92bd::::head [delete] snapc 0=[] ondisk+write+known_if_redirected e32) v8 -- 0x556337151260 con 0

which leads to:

2017-03-01T23:05:37.686 INFO:teuthology.orchestra.run.mira102.stdout:2017-03-01 23:00:35.123793 osd.0 172.21.8.116:6800/15679 1 : cluster [WRN] bad locator @10 on object @10 op osd_op(client.4146.0:79 10.5 10:b08b92bd::::head [delete] snapc 0=[] ondisk+write+known_if_redirected e32) v8

If the user's email address is empty, we try to remove an object with an empty name.
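
For context, a minimal sketch of the shape of the fix, under the assumption that user removal calls a "remove the email index object" helper: skip the index removal entirely when the user has no email, so no rados delete is ever issued against an empty object name. The function names below are hypothetical stand-ins for illustration, not the actual rgw code or the diff in the PR referenced later.

// Illustrative sketch only -- remove_index_object() and remove_email_index()
// are hypothetical stand-ins, not the actual rgw functions touched by the fix.
#include <iostream>
#include <string>

// Stand-in for the rados delete of a user-index object. With an empty oid,
// the OSD logs "bad locator @N on object @N" for the resulting osd_op.
static int remove_index_object(const std::string& oid)
{
    if (oid.empty()) {
        std::cerr << "refusing to delete an object with an empty name\n";
        return -22;  // -EINVAL
    }
    std::cout << "removing index object: " << oid << '\n';
    return 0;
}

// The shape of the guard: if the user record has no email, there is no email
// index object to clean up, so don't send a delete op at all.
static int remove_email_index(const std::string& user_email)
{
    if (user_email.empty()) {
        return 0;  // nothing to do
    }
    return remove_index_object(user_email);
}

int main()
{
    remove_email_index("");                 // no-op once the guard is in place
    remove_email_index("fud@example.com");  // normal cleanup path
    return 0;
}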

#3 Updated by Casey Bodley about 2 months ago

  • Status changed from New to Need Review
  • Backport set to jewel kraken

https://github.com/ceph/ceph/pull/13783

The teuthology failures don't occur on kraken or earlier, but I still tagged for backport to be safe. It's possible that whatever osd changes led to the 'bad locator' warnings may be backported as well.

#4 Updated by Yehuda Sadeh about 2 months ago

  • Status changed from Need Review to Resolved

#5 Updated by Yehuda Sadeh about 2 months ago

  • Status changed from Resolved to Pending Backport

#6 Updated by Jan Fajerski about 2 months ago

  • Copied to Backport #19211: jewel: rgw: "cluster [WRN] bad locator @X on object @X...." in cluster log added

#7 Updated by Jan Fajerski about 2 months ago

  • Copied to Backport #19212: kraken: rgw: "cluster [WRN] bad locator @X on object @X...." in cluster log added
