Bug #21331

closed

pg recovery priority inversion

Added by Sage Weil over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26] execute_ctx 0:6c035d59:::10033c72719.00000000:head [create,setxattr parent (36
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26] do_osd_op 0:6c035d59:::10033c72719.00000000:head [create,setxattr parent (369)
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26] do_osd_op  create
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26] do_osd_op  setxattr parent (369)
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26] do_osd_op  setxattr layout (30)
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26]  using newer snapc 1=[]
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26]  set mtime to 2017-09-09 17:48:37.925036
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26]  final snapset 1=[]:{} in 0:6c035d59:::10033c72719.00000000:head
32387/832388 n=10607 ec=1/1 lis/c 832387/827296 les/c/f 832388/827299/819533 832378/832387/832379) [92,18,26]/[92,18,16] r=0 lpr=832387 pi=[827296,832387)/3 bft=26 crt=832403'3721335 mlcod 0'0 active+recovery_wait+degraded+remapped m=26] new_repop rep_tid 2041 on osd_op(mds.0.308850:3887948 0.36 0:6c035d59:::10033c

Notably, the PG does not trim its log in calc_trim_to. That is because mlcod (min_last_complete_ondisk) is 0'0, so limit is also 0'0 and the first check below fails:
<pre>
eversion_t limit = MIN(
  min_last_complete_ondisk,
  pg_log.get_can_rollback_to());
if (limit != eversion_t() &&
    limit != pg_trim_to &&
    pg_log.get_log().approx_size() > target) {
</pre>

This PG has 54,000 log entries and is eating all the RAM on the box.
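
To make the failure mode concrete, here is a minimal standalone sketch of the trim condition quoted above. It is not Ceph code: `EVersion` is a simplified stand-in for `eversion_t`, `would_trim` is a hypothetical helper that only mirrors the three-part check, and the target of 10000 entries used in the usage note is an assumed illustrative value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>

// Simplified stand-in for Ceph's eversion_t (printed as epoch'version).
// Assumption: only equality and ordering matter for this illustration.
struct EVersion {
  std::uint32_t epoch;
  std::uint64_t version;
  EVersion() : epoch(0), version(0) {}  // default is 0'0
  EVersion(std::uint32_t e, std::uint64_t v) : epoch(e), version(v) {}
};

inline bool operator==(const EVersion& a, const EVersion& b) {
  return a.epoch == b.epoch && a.version == b.version;
}
inline bool operator!=(const EVersion& a, const EVersion& b) {
  return !(a == b);
}
inline bool operator<(const EVersion& a, const EVersion& b) {
  return std::tie(a.epoch, a.version) < std::tie(b.epoch, b.version);
}

// Hypothetical helper mirroring the quoted condition: returns true only if
// the limit is non-zero, differs from the current trim point, and the log
// is larger than the target size.
inline bool would_trim(const EVersion& min_last_complete_ondisk,
                       const EVersion& can_rollback_to,
                       const EVersion& pg_trim_to,
                       std::size_t log_size,
                       std::size_t target) {
  const EVersion limit = std::min(min_last_complete_ondisk, can_rollback_to);
  return limit != EVersion() && limit != pg_trim_to && log_size > target;
}
```

With mlcod at 0'0 and crt at 832403'3721335 (the values in the log lines above), `would_trim(EVersion(), EVersion(832403, 3721335), EVersion(), 54000, 10000)` is false no matter how large the log grows, because the minimum of the two versions is 0'0. Once mlcod advances to any non-zero version, the same call with a 54,000-entry log returns true.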

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #21761: ceph-osd consumes way too much memory during recovery (Can't reproduce, 10/11/2017)
