Bug #37431

open

msg/async: crash in the case STATE_OPEN_MESSAGE_READ_DATA

Added by shen hang over 5 years ago. Updated about 5 years ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: AsyncMessenger
Target version:
% Done: 0%
Source: Community (dev)
Tags:
Backport: luminous, mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This bug was found in version 12.2.5.
The MDS cannot be started.
When the MDS was restarted, it crashed during the rejoin phase. The MDS log is listed below.
It seems client x.x.x.16 sent a large message of size 2869776384 bytes to the MDS, which caused the crash.
2869776384 exceeds the maximum value of a 32-bit signed int (2147483647). When 2869776384 is converted to the int parameter of advance(int), the value no longer fits, and the call throws buffer::end_of_buffer.
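
The conversion described above can be reproduced outside Ceph. The following is only a minimal sketch, assuming a 32-bit int; fits_in_int and as_int_parameter are hypothetical helpers written for illustration and are not part of the Ceph source.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not Ceph code): true if a length can be passed
// to a function taking an int parameter without changing its value.
constexpr bool fits_in_int(std::uint64_t len) {
  return len <= static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// Hypothetical helper (not Ceph code): the value an int parameter receives
// when it is handed an unsigned 32-bit length. For out-of-range inputs the
// result is implementation-defined before C++20 and wraps modulo 2^32 since
// C++20. With a 32-bit two's-complement int, 2869776384 becomes -1425190912.
inline int as_int_parameter(std::uint32_t len) {
  return static_cast<int>(len);
}
```

With the sizes from the log, fits_in_int(2869776384) is false while fits_in_int(509288) is true, and as_int_parameter(2869776384u) is negative on common platforms, which is consistent with advance(int) running off the buffer.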

2018-11-01 18:25:45.859359 7f87d9c34700 20 -- x.x.x.42:6800/778237716 >> x.x.x.16:0/3867854018 conn(0x558b5f74f000 :6800 s=STATE_OPEN_MESSAGE_READ_DATA pgs=548 cs=1 l=0).process prev state is STATE_OPEN_MESSAGE_READ_DATA
2018-11-01 18:25:45.859362 7f87d9c34700 25 -- x.x.x.42:6800/778237716 >> x.x.x.x:0/3867854018 conn(0x558b5f74f000 :6800 s=STATE_OPEN_MESSAGE_READ_DATA pgs=548 cs=1 l=0).read_until len is 2869776384 state_offset is 2869267096
2018-11-01 18:25:45.859635 7f87d9c34700 25 -- x.x.x.42:6800/778237716 >> x.x.x.16:0/3867854018 conn(0x558b5f74f000 :6800 s=STATE_OPEN_MESSAGE_READ_DATA pgs=548 cs=1 l=0).read_until read_bulk left is 509288 got 509288
2018-11-01 18:25:45.860316 7f87d9c34700 -1 ** Caught signal (Aborted) *
in thread 7f87d9c34700 thread_name:msgr-worker-1

In /var/log/messages, I found the following output:

Nov 1 18:25:40 xxx ceph-mds: tcmalloc: large alloc 2869780480 bytes == 0x558b87dd0000 @ 0x7f87ddd8f4ef 0x7f87dddb0010 0x558b44c5ee94 0x558b44fc57ff 0x558b44d2f5a9 0x558b44d3216e 0x7f87dbe6d2b0 0x7f87dc4f2e25 0x7f87db5d534d
Nov 1 18:25:45 xxx ceph-mds: terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
Nov 1 18:25:45 xxx ceph-mds: what(): buffer::end_of_buffer
Nov 1 18:25:45 xxx ceph-mds: ** Caught signal (Aborted) *

Actions #1

Updated by Kefu Chai over 5 years ago

  • Status changed from New to Fix Under Review
Actions #2

Updated by Patrick Donnelly over 5 years ago

  • Subject changed from "msg/async:Crash in the case STATE_OPEN_MESSAGE_READ_DATA of process(),When read is big unsigned int and will be converted to unsigned indata_blp.advance(), it will exceed the boundary of int." to "msg/async: crash in the case STATE_OPEN_MESSAGE_READ_DATA"
  • Assignee set to shen hang
  • Target version set to v14.0.0
  • Start date deleted (11/28/2018)
  • Source set to Community (dev)
  • Pull request ID set to 25315
  • Affected Versions v12.2.10 added
  • Affected Versions deleted (v12.2.5, v12.2.6, v12.2.7, v12.2.8, v12.2.9)
  • ceph-qa-suite deleted (fs)
Actions #3

Updated by Nathan Cutler over 5 years ago

  • Backport set to luminous, mimic
Actions #4

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)
Actions #5

Updated by Greg Farnum about 5 years ago

  • Category set to AsyncMessenger