Bug #13801

closed

Deep-scrub will crash osd if osd backend is newstore

Added by Zhi Zhang over 8 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Because of this change (https://github.com/ceph/ceph/pull/6076), hobject_t now contains the pool id, so a ghobject_t that wraps such an hobject_t no longer compares equal to ghobject_t().
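To illustrate the comparison problem, here is a minimal sketch. The types SketchHObject and SketchGHObject, the sentinel values, and the helper names are hypothetical simplifications for this example, not the real Ceph definitions. It only shows why a check against a default-constructed object stops matching once the pool id is filled in.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical, simplified stand-in for hobject_t (NOT the real Ceph type).
struct SketchHObject {
    int64_t pool = INT64_MIN;  // assumed "no pool" sentinel for this sketch
    uint32_t hash = 0;
    std::string name;

    bool operator==(const SketchHObject& o) const {
        return std::tie(pool, hash, name) == std::tie(o.pool, o.hash, o.name);
    }
};

// Hypothetical, simplified stand-in for ghobject_t (NOT the real Ceph type).
struct SketchGHObject {
    SketchHObject hobj;
    int8_t shard = -1;  // assumed "no shard" sentinel for this sketch

    bool operator==(const SketchGHObject& o) const {
        return hobj == o.hobj && shard == o.shard;
    }
};

// A check that treats "equal to the default object" as "list from the very start".
bool is_default_start(const SketchGHObject& start) {
    return start == SketchGHObject();
}

// A scrub start bound whose embedded object now carries the pool id.
SketchGHObject start_with_pool(int64_t pool) {
    SketchGHObject g;
    g.hobj.pool = pool;
    return g;
}
```

In this sketch, is_default_start(SketchGHObject()) holds, while is_default_start(start_with_pool(3)) does not, which mirrors the mismatch described above.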

In newstore, this causes the assertion failure shown in the logs below.

2015-11-12 18:59:32.045862 7fa22797a700 10 osd.6 1006 dequeue_op 0x7fa2473adb00 prio 127 cost 0 latency 0.000174 replica scrub(pg: 3.1e5,from:0'0,to:0'0,epoch:1006,start:MIN,end:MAX,chunky:1,deep:1,seed:4294967295,version:6) v6 pg pg[3.1e5( empty local-les=1004 n=0 ec=836 les/c/f 1004/1004/0 1003/1003/996) [0,6] r=1 lpr=1003 pi=930-1002/11 crt=0'0 active]
2015-11-12 18:59:32.045925 7fa22797a700 10 osd.6 pg_epoch: 1006 pg[3.1e5( empty local-les=1004 n=0 ec=836 les/c/f 1004/1004/0 1003/1003/996) [0,6] r=1 lpr=1003 pi=930-1002/11 crt=0'0 active] handle_message: 0x7fa2473adb00
2015-11-12 18:59:32.046018 7fa22797a700 10 osd.6 pg_epoch: 1006 pg[3.1e5( empty local-les=1004 n=0 ec=836 les/c/f 1004/1004/0 1003/1003/996) [0,6] r=1 lpr=1003 pi=930-1002/11 crt=0'0 active] build_scrub_map_chunk [3/00000000//0,MAX)  seed 4294967295
2015-11-12 18:59:32.046046 7fa22797a700 15 newstore(/var/lib/ceph/osd/ceph-6) collection_list 3.1e5_head start 3/00000000//0 end MAX max 2147483647
2015-11-12 18:59:32.046071 7fa22797a700 20 newstore(/var/lib/ceph/osd/ceph-6) collection_list range --.7ffffffffffffffb.a7800000. to --.7ffffffffffffffb.a8000000. and --.8000000000000003.a7800000. to --.8000000000000003.a8000000. start 3/00000000//0
2015-11-12 18:59:32.049697 7fa22797a700 -1 os/newstore/NewStore.cc: In function 'virtual int NewStore::collection_list(coll_t, ghobject_t, ghobject_t, bool, int, std::vector<ghobject_t>*, ghobject_t*)' thread 7fa22797a700 time 2015-11-12 18:59:32.046131
os/newstore/NewStore.cc: 1591: FAILED assert(k >= start_key && k < end_key)
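The failed assertion is a half-open range check on keys. A minimal sketch of that kind of check follows. The helper name key_in_range and the use of std::string keys are assumptions for illustration and are not taken from NewStore.cc.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when k lies in the half-open range [start_key, end_key).
// std::string comparison is lexicographic, which is what this sketch relies on.
bool key_in_range(const std::string& k,
                  const std::string& start_key,
                  const std::string& end_key) {
    return k >= start_key && k < end_key;
}
```

A key that sorts before start_key, or at or after end_key, makes the check false. The assertion in the log rejects exactly that condition.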

The fix is to make newstore compatible with that earlier change by creating the ghobject_t() object with the pool id and shard id set.
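The idea behind the fix can be sketched as follows. This is not the actual patch: MiniGHObject, min_object_for, is_start_old, and is_start_fixed are hypothetical names, and the sentinel values are assumptions. The sketch compares the start bound against a minimum object built with the pool id and shard id, rather than against a bare default object.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical, flattened stand-in for ghobject_t (NOT the real Ceph type).
struct MiniGHObject {
    int64_t pool = INT64_MIN;  // assumed "no pool" sentinel
    int8_t shard = -1;         // assumed "no shard" sentinel
    uint32_t hash = 0;

    bool operator==(const MiniGHObject& o) const {
        return std::tie(pool, shard, hash) == std::tie(o.pool, o.shard, o.hash);
    }
};

// Hypothetical helper: the minimum object for a given pool and shard.
MiniGHObject min_object_for(int64_t pool, int8_t shard) {
    MiniGHObject g;
    g.pool = pool;
    g.shard = shard;
    return g;
}

// Old-style check: only a bare default object counts as "start of collection".
bool is_start_old(const MiniGHObject& start) {
    return start == MiniGHObject();
}

// Pool- and shard-aware check: compare against the minimum object for this collection.
bool is_start_fixed(const MiniGHObject& start, int64_t pool, int8_t shard) {
    return start == min_object_for(pool, shard);
}
```

With a start bound such as min_object_for(3, -1), the old-style check fails to recognize the start of the collection, while the pool- and shard-aware check does.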


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #13668: OSD always crashes while doing scrub with new store as the backend (Rejected, 11/02/2015)

Actions #2

Updated by Kefu Chai over 8 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Sage Weil over 8 years ago

  • Assignee set to Sage Weil
Actions #4

Updated by Nathan Cutler over 8 years ago

  • Related to Bug #13668: OSD always crashes while doing scrub with new store as the backend added
Actions #5

Updated by Nathan Cutler over 8 years ago

#13668 looks related, or possibly a duplicate.

#13827 describes a scrub-related OSD crash in Hammer.

Actions #6

Updated by Sage Weil over 7 years ago

  • Status changed from Fix Under Review to Resolved