Bug #53227
osdc: bh split will lost error number, maybe cause client crash
Status: Pending Backport
Priority: Normal
Assignee:
Category: ObjectCacher
Target version:
% Done: 100%
Source: Community (dev)
Tags: backport_processed
Backport: pacific,octopus
Regression: No
Severity: 3 - minor
Reviewed:
Description
GDB:
#12 0x00007f72626352ab in ObjectCacher::C_RetryRead::finish (this=0x7f71dc001f80, r=<optimized out>) at /var/ws/ivan/nautilus-ceph-14.2.5/src/osdc/ObjectCacher.cc:82
82        r = oc->_readx(rd, oset, onfinish, false, &trace);
(gdb) p rd
$6 = (ObjectCacher::OSDRead *) 0x7f6a1a8ac7b0
(gdb) pmap hits loff_t BufferHead*
elem[0].left: $60 = 0
elem[0].right: $61 = (ObjectCacher::BufferHead *) 0x7f6a38fbaf60
elem[1].left: $62 = 131072
elem[1].right: $63 = (ObjectCacher::BufferHead *) 0x7f6a3d80fce0
elem[2].left: $64 = 262144
elem[2].right: $65 = (ObjectCacher::BufferHead *) 0x7f6a4e6e86c0
elem[3].left: $66 = 393216
elem[3].right: $67 = (ObjectCacher::BufferHead *) 0x7f6a4a43c370
elem[4].left: $68 = 524288
elem[4].right: $69 = (ObjectCacher::BufferHead *) 0x7f6a3d80f940
elem[5].left: $70 = 655360
elem[5].right: $71 = (ObjectCacher::BufferHead *) 0x7f6a571cfe70
elem[6].left: $72 = 786432
elem[6].right: $73 = (ObjectCacher::BufferHead *) 0x7f6a274e5570
elem[7].left: $74 = 917504
elem[7].right: $75 = (ObjectCacher::BufferHead *) 0x7f6a4ce46510
elem[8].left: $76 = 1048576
elem[8].right: $77 = (ObjectCacher::BufferHead *) 0x7f6a216f1a00
elem[9].left: $78 = 1179648
elem[9].right: $79 = (ObjectCacher::BufferHead *) 0x7f6a57fffcb0
elem[10].left: $80 = 1441792
elem[10].right: $81 = (ObjectCacher::BufferHead *) 0x7f6a33e64180
elem[11].left: $82 = 1572864
elem[11].right: $83 = (ObjectCacher::BufferHead *) 0x7f6a313543b0
elem[12].left: $84 = 1703936
elem[12].right: $85 = (ObjectCacher::BufferHead *) 0x7f6a19960540
elem[13].left: $86 = 1835008
elem[13].right: $87 = (ObjectCacher::BufferHead *) 0x7f6a2a7b05d0
elem[14].left: $88 = 1966080
elem[14].right: $89 = (ObjectCacher::BufferHead *) 0x7f6a47ac5ce0
elem[15].left: $90 = 2097152
elem[15].right: $91 = (ObjectCacher::BufferHead *) 0x7f6a37b8c840
elem[16].left: $92 = 2228224
elem[16].right: $93 = (ObjectCacher::BufferHead *) 0x7f6a5464e3d0
elem[17].left: $94 = 2490368
elem[17].right: $95 = (ObjectCacher::BufferHead *) 0x7f6f923edca0
elem[18].left: $96 = 2752512
elem[18].right: $97 = (ObjectCacher::BufferHead *) 0x7f6a24ad4300
elem[19].left: $98 = 2883584
elem[19].right: $99 = (ObjectCacher::BufferHead *) 0x7f6a351c37d0
Map size = 20
(gdb) p (*(BufferHead *)0x7f6a24ad4300).state
$310 = 5 //STATE_TX
(gdb) p (*(BufferHead *)0x7f6a24ad4300).error
$311 = 0
(gdb) p (*(BufferHead *)0x7f6a351c37d0).state
$312 = 6 //STATE_ERROR
(gdb) p (*(BufferHead *)0x7f6a351c37d0).error
$313 = 0 //should be eio 5
In conclusion, bh 0x7f6a351c37d0 is in STATE_ERROR, but its error number is 0.
The bh before it (0x7f6a24ad4300) is in STATE_TX; that write triggered the bh split.
How the problem arises:
During readahead, something makes the OSD reply with EIO (error number 5).
EIO is assigned to bh->error, and then read_cond.Signal is called.
Before the waiting read runs, however, a write triggers a split of that bh.
split allocates a new BufferHead and initializes it with error(0), so the EIO is lost.
When the read is awakened, it continues down the !error branch, and the client crashes.
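The sequence above can be illustrated with a small standalone model. This is only a sketch: the BufferHead struct, split_losing_error, error_seen_by_reader and the offsets are hypothetical stand-ins, not the ObjectCacher code. The state value 6 and the error number 5 are taken from the gdb session and the description.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Simplified stand-in for ObjectCacher::BufferHead (hypothetical, not the Ceph class).
struct BufferHead {
  static constexpr int STATE_ERROR = 6;  // value annotated in the gdb session
  int state = 0;
  int error = 0;          // error number; 0 means "no error"
  uint64_t start = 0;
  uint64_t length = 0;
};

constexpr int kEio = 5;   // error number quoted in the report

// Split as described in the report: the new right-hand bh inherits the
// state, but its error keeps the default value 0.
std::unique_ptr<BufferHead> split_losing_error(BufferHead &left, uint64_t off) {
  assert(off > left.start && off < left.start + left.length);
  auto right = std::make_unique<BufferHead>();
  right->state = left.state;
  right->start = off;
  right->length = left.start + left.length - off;
  left.length = off - left.start;
  return right;
}

// Error number that a woken reader would find on the right-hand bh.
int error_seen_by_reader() {
  BufferHead bh;
  bh.state = BufferHead::STATE_ERROR;
  bh.error = kEio;        // OSD replied EIO during readahead
  bh.start = 0;           // illustrative offsets, not taken from the dump
  bh.length = 262144;
  auto right = split_losing_error(bh, 131072);  // a write splits the bh
  return right->error;    // the right half is STATE_ERROR but carries no error
}
```

In this model the right-hand bh ends up in STATE_ERROR with error 0, which matches what the gdb session shows for bh 0x7f6a351c37d0.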
How to resolve:
Add a set_error function to BufferHead, and have split call set_error on the new bh in the same way it uses set_state.
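A minimal sketch of the proposed resolution, assuming a simplified BufferHead with plain integer state and error fields. The class layout, get_error and split_keeping_error are hypothetical names used for illustration and are not claimed to match the actual patch.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Simplified stand-in for ObjectCacher::BufferHead (hypothetical layout).
class BufferHead {
 public:
  static constexpr int STATE_ERROR = 6;  // value annotated in the gdb session
  uint64_t start = 0;
  uint64_t length = 0;

  void set_state(int s) { state_ = s; }
  int get_state() const { return state_; }
  void set_error(int e) { error_ = e; }  // setter proposed in the report
  int get_error() const { return error_; }  // assumed accessor

 private:
  int state_ = 0;
  int error_ = 0;
};

// Split that copies both the state and the error number to the new bh.
std::unique_ptr<BufferHead> split_keeping_error(BufferHead &left, uint64_t off) {
  assert(off > left.start && off < left.start + left.length);
  auto right = std::make_unique<BufferHead>();
  right->set_state(left.get_state());
  right->set_error(left.get_error());  // the step the report says is missing
  right->start = off;
  right->length = left.start + left.length - off;
  left.length = off - left.start;
  return right;
}
```

With the error copied during the split, a reader that wakes up after the split still sees the original error number on either half.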
Updated by Patrick Donnelly over 2 years ago
- Subject changed from bh split will lost error number, maybe cause client crash to osdc: bh split will lost error number, maybe cause client crash
- Status changed from New to Fix Under Review
- Assignee set to wendong jia
- Target version set to v17.0.0
- Source set to Community (dev)
- Backport set to pacific,octopus
- Pull request ID set to 43881
Updated by Patrick Donnelly over 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot over 2 years ago
- Copied to Backport #53703: pacific: osdc: bh split will lost error number, maybe cause client crash added
Updated by Backport Bot over 2 years ago
- Copied to Backport #53704: octopus: osdc: bh split will lost error number, maybe cause client crash added