Bug #16121

rgw multisite "ERROR" messages during "normal operation"

Added by Jan Klare almost 8 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

I already commented on a bug here https://bugzilla.redhat.com/show_bug.cgi?id=1327142 since I thought the ERROR messages I see were related to "real" sync ERRORs. I have configured a multisite test setup and am using COSBench. Whenever I create a bucket, create an object, or delete an object, I get ERRORs like these:

(the usual logs from civetweb are filtered out, just showing logs containing "ERROR")

create bucket1:
-> logs in master zone:
2016-05-31 16:58:52.343411 7f5850ff1700 0 ERROR: failed to wait for op, ret=-22: POST http://rgw1.de/admin/log?type=metadata&notify&rgwx-zonegroup=ce98cb46-10d3-4b0f-827a-cc19774057af
-> no logs in secondary zone

create bucket2:
-> logs in master zone:
2016-05-31 16:59:13.205150 7f5850ff1700 0 ERROR: failed to wait for op, ret=-22: POST http://rgw1.de/admin/log?type=metadata&notify&rgwx-zonegroup=ce98cb46-10d3-4b0f-827a-cc19774057af
-> no logs in secondary zone

put one 4k file into bucket1:
-> no logs in master zone
-> logs in secondary zone:
2016-05-31 17:00:25.341013 7f8488fe9700 0 ERROR: lease cr failed, done early
2016-05-31 17:00:25.341037 7f8488fe9700 0 ERROR: full sync on bucket1 bucket_id=79d864f5-3c4b-48da-8f9e-b24d82c1aacd.770892.103 shard_id=1 failed, retcode=-16

put one 4k file into bucket2:
-> no logs in master zone
-> logs in secondary zone:
2016-05-31 17:00:03.020905 7f8488fe9700 0 ERROR: lease cr failed, done early
2016-05-31 17:00:03.020932 7f8488fe9700 0 ERROR: full sync on bucket2 bucket_id=79d864f5-3c4b-48da-8f9e-b24d82c1aacd.770892.104 shard_id=1 failed, retcode=-16

delete one 4k file from bucket1:
-> no logs in master zone
-> logs in secondary zone:
2016-05-31 17:07:09.891998 7f8488fe9700 0 ERROR: lease cr failed, done early
2016-05-31 17:07:09.892024 7f8488fe9700 0 ERROR: incremental sync on bucket1 bucket_id=79d864f5-3c4b-48da-8f9e-b24d82c1aacd.770892.103 shard_id=1 failed, retcode=-16

The same happens when deleting the file in bucket2.

I am seeing no errors when deleting the bucket.
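
For anyone triaging similar output, here is a minimal sketch that filters log lines containing "ERROR" and decodes the numeric return codes. It assumes the values are negated Linux errno codes, which is the usual convention in Ceph but is not stated in the logs themselves. The function name `decode_error_line` and the regular expression are illustrative only and are not part of any Ceph tool.

```python
import errno
import re

# Matches "ret=-22" or "retcode=-16" as they appear in the log lines above.
RET_RE = re.compile(r"\bret(?:code)?=(-?\d+)")


def decode_error_line(line):
    """Return (code, errno_name) for an ERROR line carrying a return code.

    Returns None for lines without "ERROR" or without a ret/retcode field.
    Assumption: the code is a negated Linux errno value.
    """
    if "ERROR" not in line:
        return None
    match = RET_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name = errno.errorcode.get(abs(code), "UNKNOWN")
    return code, name


if __name__ == "__main__":
    sample = [
        "2016-05-31 16:58:52.343411 7f5850ff1700 0 ERROR: failed to wait for op, "
        "ret=-22: POST http://rgw1.de/admin/log?type=metadata&notify",
        "2016-05-31 17:00:25.341013 7f8488fe9700 0 ERROR: lease cr failed, done early",
        "2016-05-31 17:00:25.341037 7f8488fe9700 0 ERROR: full sync on bucket1 "
        "shard_id=1 failed, retcode=-16",
    ]
    for entry in sample:
        print(decode_error_line(entry))
```

Under that assumption, ret=-22 on the master corresponds to EINVAL and retcode=-16 on the secondary corresponds to EBUSY, which would fit a sync attempt backing off because the lease is already held.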

As Casey already said, these ERRORs seem to be part of the "normal operation". I am opening this bug to track the status of this, since ERROR messages should not be part of a fully functional setup (https://bugzilla.redhat.com/show_bug.cgi?id=1327142#c28).

Cheers,
Jan


Related issues

Copied to rgw - Backport #17147: jewel: rgw multisite "ERROR" messages during "normal operation" Resolved

History

#1 Updated by Loïc Dachary over 7 years ago

https://github.com/ceph/ceph/pull/9786 claims to fix this issue, is it real ?

#2 Updated by Casey Bodley over 7 years ago

Loic Dachary wrote:

https://github.com/ceph/ceph/pull/9786 claims to fix this issue, is it real ?

yes

#3 Updated by Loïc Dachary over 7 years ago

  • Status changed from New to Pending Backport
  • Backport set to jewel

#4 Updated by Loïc Dachary over 7 years ago

  • Copied to Backport #17147: jewel: rgw multisite "ERROR" messages during "normal operation" added

#5 Updated by Loïc Dachary over 7 years ago

  • Status changed from Pending Backport to Resolved
