Project

General

Profile

Bug #24327

osd: segv in pg_log_entry_t::encode()

Added by frank lin almost 6 years ago. Updated almost 6 years ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
FileStore, OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The affected osd restarted itself and everything seems fine then.But what is the cause of the crash?


-105> 2018-05-29 14:01:43.126619 b5dc18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.32:0/5768 conn(0xef2c700 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=13 cs=1 l=1). rx osd.15 seq 2816 0xfe12000 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.117993) v4
  -104> 2018-05-29 14:01:43.127136 b5dc18e0  1 -- 10.0.2.45:6801/5358 <== osd.15 10.0.2.32:0/5768 2816 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.117993) v4 ==== 2004+0+0 (1886851389 0 0) 0xfe12000 con 0xef2c700
  -103> 2018-05-29 14:01:43.127201 b5dc18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.32:0/5768 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.117993) v4 -- 0x11ed42c0 con 0
  -102> 2018-05-29 14:01:43.129037 b55c18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.35:0/5038 conn(0xeada000 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx osd.21 seq 2889 0xf70d1e0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.126816) v4
  -101> 2018-05-29 14:01:43.129259 b55c18e0  1 -- 10.0.2.45:6801/5358 <== osd.21 10.0.2.35:0/5038 2889 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.126816) v4 ==== 2004+0+0 (832584264 0 0) 0xf70d1e0 con 0xeada000
  -100> 2018-05-29 14:01:43.129383 b55c18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.35:0/5038 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.126816) v4 -- 0xeea67e0 con 0
   -99> 2018-05-29 14:01:43.130012 b55c18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.35:0/5038 conn(0xeba6000 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx osd.21 seq 2889 0xeea67e0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.126816) v4
   -98> 2018-05-29 14:01:43.130110 b55c18e0  1 -- 10.0.1.45:6801/5358 <== osd.21 10.0.1.35:0/5038 2889 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.126816) v4 ==== 2004+0+0 (832584264 0 0) 0xeea67e0 con 0xeba6000
   -97> 2018-05-29 14:01:43.130152 b55c18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.35:0/5038 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.126816) v4 -- 0x1248e940 con 0
   -96> 2018-05-29 14:01:43.155868 b5dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.30:0/5826 conn(0xdf42700 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx osd.10 seq 2859 0x11ed42c0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.152177) v4
   -95> 2018-05-29 14:01:43.156011 b5dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.10 10.0.1.30:0/5826 2859 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.152177) v4 ==== 2004+0+0 (3706048477 0 0) 0x11ed42c0 con 0xdf42700
   -94> 2018-05-29 14:01:43.156093 b5dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.30:0/5826 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.152177) v4 -- 0x14644f20 con 0
   -93> 2018-05-29 14:01:43.158013 b5dc18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.30:0/5826 conn(0xf10c100 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=15 cs=1 l=1). rx osd.10 seq 2859 0x14644f20 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.152177) v4
   -92> 2018-05-29 14:01:43.158117 b5dc18e0  1 -- 10.0.2.45:6801/5358 <== osd.10 10.0.2.30:0/5826 2859 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.152177) v4 ==== 2004+0+0 (3706048477 0 0) 0x14644f20 con 0xf10c100
   -91> 2018-05-29 14:01:43.158179 b5dc18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.30:0/5826 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.152177) v4 -- 0x146367e0 con 0
   -90> 2018-05-29 14:01:43.241053 b5dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.36:0/5830 conn(0xeba6d00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=1). rx osd.22 seq 2833 0x146367e0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.238670) v4
   -89> 2018-05-29 14:01:43.241190 b5dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.22 10.0.1.36:0/5830 2833 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.238670) v4 ==== 2004+0+0 (409500267 0 0) 0x146367e0 con 0xeba6d00
   -88> 2018-05-29 14:01:43.241243 b5dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.36:0/5830 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.238670) v4 -- 0x11eb0000 con 0
   -87> 2018-05-29 14:01:43.241564 b4dc18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.36:0/5830 conn(0xe11fa00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=10 cs=1 l=1). rx osd.22 seq 2833 0x143f1600 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.238670) v4
   -86> 2018-05-29 14:01:43.241633 b4dc18e0  1 -- 10.0.2.45:6801/5358 <== osd.22 10.0.2.36:0/5830 2833 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.238670) v4 ==== 2004+0+0 (409500267 0 0) 0x143f1600 con 0xe11fa00
   -85> 2018-05-29 14:01:43.241668 b4dc18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.36:0/5830 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.238670) v4 -- 0x11e96ec0 con 0
   -84> 2018-05-29 14:01:43.274040 b5dc18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.25:0/5865 conn(0xf109a00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=3 cs=1 l=1). rx osd.1 seq 2867 0x11eb0000 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.272861) v4
   -83> 2018-05-29 14:01:43.274133 b5dc18e0  1 -- 10.0.2.45:6801/5358 <== osd.1 10.0.2.25:0/5865 2867 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.272861) v4 ==== 2004+0+0 (314573887 0 0) 0x11eb0000 con 0xf109a00
   -82> 2018-05-29 14:01:43.274168 b5dc18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.25:0/5865 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.272861) v4 -- 0x14463a20 con 0
   -81> 2018-05-29 14:01:43.359762 b4dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.25:0/5865 conn(0xf108d00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=9 cs=1 l=1). rx osd.1 seq 2867 0x11e96ec0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.272861) v4
   -80> 2018-05-29 14:01:43.359893 b4dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.1 10.0.1.25:0/5865 2867 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.272861) v4 ==== 2004+0+0 (314573887 0 0) 0x11e96ec0 con 0xf108d00
   -79> 2018-05-29 14:01:43.359953 b4dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.25:0/5865 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.272861) v4 -- 0x144e4b00 con 0
   -78> 2018-05-29 14:01:43.498766 b5dc18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.30:6802/5826 conn(0xf088700 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=5 cs=1 l=0). rx osd.10 seq 4322 0xff48500 osd_repop(client.171686.0:25013805 5.7c4 e2690/2688) v2
   -77> 2018-05-29 14:01:43.498868 b5dc18e0  1 -- 10.0.2.45:6800/5358 <== osd.10 10.0.2.30:6802/5826 4322 ==== osd_repop(client.171686.0:25013805 5.7c4 e2690/2688) v2 ==== 1030+0+4195055 (2804727904 0 312576923) 0xff48500 con 0xf088700
   -76> 2018-05-29 14:01:43.499508 9c3b38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'90642, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -75> 2018-05-29 14:01:43.532780 b55c18e0  5 -- 10.0.1.45:6800/5358 >> 10.0.1.253:0/2113151122 conn(0xf11ad00 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=398 cs=1 l=1). rx client.171686 seq 201482 0x122f5000 osd_op(client.171686.0:25013829 5.31 5.42082831 (undecoded) ondisk+write+known_if_redirected e2690) v8
   -74> 2018-05-29 14:01:43.532867 b55c18e0  1 -- 10.0.1.45:6800/5358 <== client.171686 10.0.1.253:0/2113151122 201482 ==== osd_op(client.171686.0:25013829 5.31 5.42082831 (undecoded) ondisk+write+known_if_redirected e2690) v8 ==== 277+0+4194304 (703813657 0 1152518466) 0x122f5000 con 0xf11ad00
   -73> 2018-05-29 14:01:43.548080 b55c18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.32:6802/5839 conn(0xeb44e00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=16 cs=1 l=0). rx osd.14 seq 8581 0x142d0800 osd_repop(client.171686.0:25013082 5.56f e2690/2686) v2
   -72> 2018-05-29 14:01:43.548163 b55c18e0  1 -- 10.0.2.45:6800/5358 <== osd.14 10.0.2.32:6802/5839 8581 ==== osd_repop(client.171686.0:25013082 5.56f e2690/2686) v2 ==== 1030+0+4195055 (2472916882 0 3874645697) 0x142d0800 con 0xeb44e00
   -71> 2018-05-29 14:01:43.548346 9d3b38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'90704, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -70> 2018-05-29 14:01:43.607572 b5dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.45:0/5430 conn(0xeb40000 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=4 cs=1 l=1). rx osd.40 seq 2909 0x11eb0000 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.602238) v4
   -69> 2018-05-29 14:01:43.607632 b5dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.40 10.0.1.45:0/5430 2909 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.602238) v4 ==== 2004+0+0 (3944437170 0 0) 0x11eb0000 con 0xeb40000
   -68> 2018-05-29 14:01:43.607666 b5dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.45:0/5430 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.602238) v4 -- 0x1449b080 con 0
   -67> 2018-05-29 14:01:43.611604 abbad8e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.30:6802/5826 -- osd_repop_reply(client.171686.0:25013805 5.7c4 e2690/2688 ondisk, result = 0) v2 -- 0x1593afc0 con 0
   -66> 2018-05-29 14:01:43.612546 b55c18e0  5 -- 10.0.1.45:6800/5358 >> 10.0.1.253:0/2113151122 conn(0xf11ad00 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=398 cs=1 l=1). rx client.171686 seq 201483 0xdb93a00 osd_op(client.171686.0:25013842 5.5df 5.981e25df (undecoded) ondisk+write+known_if_redirected e2690) v8
   -65> 2018-05-29 14:01:43.612678 b55c18e0  1 -- 10.0.1.45:6800/5358 <== client.171686 10.0.1.253:0/2113151122 201483 ==== osd_op(client.171686.0:25013842 5.5df 5.981e25df (undecoded) ondisk+write+known_if_redirected e2690) v8 ==== 277+0+4194304 (275264676 0 1211141430) 0xdb93a00 con 0xf11ad00
   -64> 2018-05-29 14:01:43.614216 b55c18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.45:0/5430 conn(0xeb40d00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx osd.40 seq 2909 0x14444100 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.602238) v4
   -63> 2018-05-29 14:01:43.614310 b55c18e0  1 -- 10.0.2.45:6801/5358 <== osd.40 10.0.2.45:0/5430 2909 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.602238) v4 ==== 2004+0+0 (3944437170 0 0) 0x14444100 con 0xeb40d00
   -62> 2018-05-29 14:01:43.614403 b55c18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.45:0/5430 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.602238) v4 -- 0x14384c60 con 0
   -61> 2018-05-29 14:01:43.615683 9c3b38e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.28:6802/5845 -- osd_repop(client.171686.0:25013842 5.5df e2690/2688 5:fba47819:::rbd_data.8a6b74b0dc51.0000000000009540:head v 2690'89317) v2 -- 0x123c0300 con 0
   -60> 2018-05-29 14:01:43.615862 9c3b38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'89317, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -59> 2018-05-29 14:01:43.645829 abbad8e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.32:6802/5839 -- osd_repop_reply(client.171686.0:25013082 5.56f e2690/2686 ondisk, result = 0) v2 -- 0x120f0e00 con 0
   -58> 2018-05-29 14:01:43.708451 b55c18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.32:6802/5839 conn(0xeb44e00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=16 cs=1 l=0). rx osd.14 seq 8582 0x12d0cd00 osd_repop(client.171686.0:25013108 5.56f e2690/2686) v2
   -57> 2018-05-29 14:01:43.708568 b55c18e0  1 -- 10.0.2.45:6800/5358 <== osd.14 10.0.2.32:6802/5839 8582 ==== osd_repop(client.171686.0:25013108 5.56f e2690/2686) v2 ==== 1030+0+4195055 (3517417442 0 766499828) 0x12d0cd00 con 0xeb44e00
   -56> 2018-05-29 14:01:43.708806 9d3b38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'90705, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -55> 2018-05-29 14:01:43.735227 9bbb38e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.33:6800/5767 -- osd_repop(client.171686.0:25013829 5.31 e2690/2684 5:8c141042:::rbd_data.8a6b74b0dc51.00000000000098c7:head v 2690'90942) v2 -- 0xdf76f00 con 0
   -54> 2018-05-29 14:01:43.735418 9bbb38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'90942, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -53> 2018-05-29 14:01:43.768294 b55c18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.25:0/5934 conn(0xdf41a00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=15 cs=1 l=1). rx osd.0 seq 2836 0x14444100 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.733191) v4
   -52> 2018-05-29 14:01:43.768405 b55c18e0  1 -- 10.0.2.45:6801/5358 <== osd.0 10.0.2.25:0/5934 2836 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.733191) v4 ==== 2004+0+0 (2907866159 0 0) 0x14444100 con 0xdf41a00
   -51> 2018-05-29 14:01:43.768461 b55c18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.25:0/5934 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.733191) v4 -- 0x11e97180 con 0
   -50> 2018-05-29 14:01:43.775675 b4dc18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.28:6800/5757 conn(0xf086000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=0). rx osd.6 seq 6379 0xb090f00 osd_repop(client.171686.0:25013848 5.236 e2690/2688) v2
   -49> 2018-05-29 14:01:43.775786 b4dc18e0  1 -- 10.0.2.45:6800/5358 <== osd.6 10.0.2.28:6800/5757 6379 ==== osd_repop(client.171686.0:25013848 5.236 e2690/2688) v2 ==== 1030+0+4195055 (2141852154 0 1798471273) 0xb090f00 con 0xf086000
   -48> 2018-05-29 14:01:43.776004 9d3b38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'90011, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -47> 2018-05-29 14:01:43.782549 b4dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.25:0/5934 conn(0xea48000 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=1). rx osd.0 seq 2836 0x11e96ec0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.733191) v4
   -46> 2018-05-29 14:01:43.782672 b4dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.0 10.0.1.25:0/5934 2836 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.733191) v4 ==== 2004+0+0 (2907866159 0 0) 0x11e96ec0 con 0xea48000
   -45> 2018-05-29 14:01:43.782731 b4dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.25:0/5934 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.733191) v4 -- 0x1449d700 con 0
   -44> 2018-05-29 14:01:43.852087 b4dc18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.25:6800/5865 conn(0xeb41a00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=0). rx osd.1 seq 6946 0xff4b200 osd_repop(client.171686.0:25013874 5.6b8 e2690/2688) v2
   -43> 2018-05-29 14:01:43.852207 b4dc18e0  1 -- 10.0.2.45:6800/5358 <== osd.1 10.0.2.25:6800/5865 6946 ==== osd_repop(client.171686.0:25013874 5.6b8 e2690/2688) v2 ==== 1030+0+4195055 (1072725382 0 835456317) 0xff4b200 con 0xeb41a00
   -42> 2018-05-29 14:01:43.852471 9dbb38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'87999, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -41> 2018-05-29 14:01:43.872468 abbad8e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.32:6802/5839 -- osd_repop_reply(client.171686.0:25013108 5.56f e2690/2686 ondisk, result = 0) v2 -- 0x15981100 con 0
   -40> 2018-05-29 14:01:43.872642 abbad8e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.28:6800/5757 -- osd_repop_reply(client.171686.0:25013848 5.236 e2690/2688 ondisk, result = 0) v2 -- 0x12d9b6c0 con 0
   -39> 2018-05-29 14:01:43.884797 b55c18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.33:6800/5767 conn(0xeeba000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=23 cs=1 l=0). rx osd.17 seq 6863 0x15981100 osd_repop_reply(client.171686.0:25013829 5.31 e2690/2684) v2
   -38> 2018-05-29 14:01:43.884877 b55c18e0  1 -- 10.0.2.45:6800/5358 <== osd.17 10.0.2.33:6800/5767 6863 ==== osd_repop_reply(client.171686.0:25013829 5.31 e2690/2684) v2 ==== 111+0+0 (2759373061 0 0) 0x15981100 con 0xeeba000
   -37> 2018-05-29 14:01:43.885098 9bbb38e0  1 -- 10.0.1.45:6800/5358 --> 10.0.1.253:0/2113151122 -- osd_op_reply(25013829 rbd_data.8a6b74b0dc51.00000000000098c7 [set-alloc-hint object_size 4194304 write_size 4194304,writefull 0~4194304] v2690'90942 uv90942 ondisk = 0) v8 -- 0x12e53e00 con 0
   -36> 2018-05-29 14:01:43.892501 abbad8e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.25:6800/5865 -- osd_repop_reply(client.171686.0:25013874 5.6b8 e2690/2688 ondisk, result = 0) v2 -- 0x11d3d6c0 con 0
   -35> 2018-05-29 14:01:43.930810 b4dc18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.33:0/5839 conn(0xeba7a00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=5 cs=1 l=1). rx osd.16 seq 2832 0x11e96ec0 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.926770) v4
   -34> 2018-05-29 14:01:43.930872 b4dc18e0  1 -- 10.0.2.45:6801/5358 <== osd.16 10.0.2.33:0/5839 2832 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.926770) v4 ==== 2004+0+0 (350033554 0 0) 0x11e96ec0 con 0xeba7a00
   -33> 2018-05-29 14:01:43.930929 b4dc18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.33:0/5839 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.926770) v4 -- 0xf8f6580 con 0
   -32> 2018-05-29 14:01:43.931687 b55c18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.33:0/5839 conn(0xebaa100 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=17 cs=1 l=1). rx osd.16 seq 2832 0x14444100 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.926770) v4
   -31> 2018-05-29 14:01:43.931741 b55c18e0  1 -- 10.0.1.45:6801/5358 <== osd.16 10.0.1.33:0/5839 2832 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.926770) v4 ==== 2004+0+0 (350033554 0 0) 0x14444100 con 0xebaa100
   -30> 2018-05-29 14:01:43.931774 b55c18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.33:0/5839 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.926770) v4 -- 0xec24c00 con 0
   -29> 2018-05-29 14:01:43.935452 b5dc18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.35:6800/5038 conn(0xeadad00 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=13 cs=1 l=0). rx osd.21 seq 5966 0x123c0300 osd_repop(client.171686.0:25013902 5.3dd e2690/2682) v2
   -28> 2018-05-29 14:01:43.935659 b5dc18e0  1 -- 10.0.2.45:6800/5358 <== osd.21 10.0.2.35:6800/5038 5966 ==== osd_repop(client.171686.0:25013902 5.3dd e2690/2682) v2 ==== 1030+0+4195055 (4014797479 0 976628961) 0x123c0300 con 0xeadad00
   -27> 2018-05-29 14:01:43.935960 9bbb38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'89973, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -26> 2018-05-29 14:01:43.986613 b5dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.46:0/5361 conn(0xea4c100 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=4 cs=1 l=1). rx osd.42 seq 2935 0x11eb0000 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.982983) v4
   -25> 2018-05-29 14:01:43.986779 b5dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.42 10.0.1.46:0/5361 2935 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.982983) v4 ==== 2004+0+0 (1087633563 0 0) 0x11eb0000 con 0xea4c100
   -24> 2018-05-29 14:01:43.986832 b5dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.46:0/5361 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.982983) v4 -- 0x12239080 con 0
   -23> 2018-05-29 14:01:43.991798 b55c18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.46:0/5361 conn(0xe122100 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx osd.42 seq 2935 0x14444100 osd_ping(ping e2690 stamp 2018-05-29 14:01:43.982983) v4
   -22> 2018-05-29 14:01:43.991875 b55c18e0  1 -- 10.0.2.45:6801/5358 <== osd.42 10.0.2.46:0/5361 2935 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:43.982983) v4 ==== 2004+0+0 (1087633563 0 0) 0x14444100 con 0xe122100
   -21> 2018-05-29 14:01:43.991914 b55c18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.46:0/5361 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:43.982983) v4 -- 0xf786260 con 0
   -20> 2018-05-29 14:01:44.003950 abbad8e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.35:6800/5038 -- osd_repop_reply(client.171686.0:25013902 5.3dd e2690/2682 ondisk, result = 0) v2 -- 0x105f4300 con 0
   -19> 2018-05-29 14:01:44.014252 b5dc18e0  5 -- 10.0.2.45:6800/5358 >> 10.0.2.28:6802/5845 conn(0xf08ae00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=9 cs=1 l=0). rx osd.7 seq 3762 0x105f4300 osd_repop_reply(client.171686.0:25013842 5.5df e2690/2688) v2
   -18> 2018-05-29 14:01:44.014348 b5dc18e0  1 -- 10.0.2.45:6800/5358 <== osd.7 10.0.2.28:6802/5845 3762 ==== osd_repop_reply(client.171686.0:25013842 5.5df e2690/2688) v2 ==== 111+0+0 (3341532712 0 0) 0x105f4300 con 0xf08ae00
   -17> 2018-05-29 14:01:44.014495 9c3b38e0  1 -- 10.0.1.45:6800/5358 --> 10.0.1.253:0/2113151122 -- osd_op_reply(25013842 rbd_data.8a6b74b0dc51.0000000000009540 [set-alloc-hint object_size 4194304 write_size 4194304,writefull 0~4194304] v2690'89317 uv89317 ondisk = 0) v8 -- 0x1220e800 con 0
   -16> 2018-05-29 14:01:44.040575 b55c18e0  5 -- 10.0.1.45:6800/5358 >> 10.0.1.253:0/2113151122 conn(0xf11ad00 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=398 cs=1 l=1). rx client.171686 seq 201484 0x121fb200 osd_op(client.171686.0:25013924 5.48f 5.4e057c8f (undecoded) ondisk+write+known_if_redirected e2690) v8
   -15> 2018-05-29 14:01:44.040717 b55c18e0  1 -- 10.0.1.45:6800/5358 <== client.171686 10.0.1.253:0/2113151122 201484 ==== osd_op(client.171686.0:25013924 5.48f 5.4e057c8f (undecoded) ondisk+write+known_if_redirected e2690) v8 ==== 277+0+4194304 (2863078870 0 1674244990) 0x121fb200 con 0xf11ad00
   -14> 2018-05-29 14:01:44.088007 b55c18e0  5 -- 10.0.1.45:6800/5358 >> 10.0.1.253:0/2113151122 conn(0xf11ad00 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=398 cs=1 l=1). rx client.171686 seq 201485 0x1220e800 osd_op(client.171686.0:25013928 5.11d 5.f333811d (undecoded) ondisk+write+known_if_redirected e2690) v8
   -13> 2018-05-29 14:01:44.088093 b55c18e0  1 -- 10.0.1.45:6800/5358 <== client.171686 10.0.1.253:0/2113151122 201485 ==== osd_op(client.171686.0:25013928 5.11d 5.f333811d (undecoded) ondisk+write+known_if_redirected e2690) v8 ==== 277+0+4194304 (680551522 0 1315291934) 0x1220e800 con 0xf11ad00
   -12> 2018-05-29 14:01:44.088693 9dbb38e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.33:6800/5767 -- osd_repop(client.171686.0:25013928 5.11d e2690/2684 5:b881cccf:::rbd_data.8a6b74b0dc51.00000000000096a0:head v 2690'89285) v2 -- 0x14322000 con 0
   -11> 2018-05-29 14:01:44.089006 9dbb38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'89285, trimmed: , trimmed_dups: , clear_divergent_priors: 0
   -10> 2018-05-29 14:01:44.115232 b5dc18e0  5 -- 10.0.1.45:6801/5358 >> 10.0.1.45:0/5430 conn(0xeb40000 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=4 cs=1 l=1). rx osd.40 seq 2910 0x11eb0000 osd_ping(ping e2690 stamp 2018-05-29 14:01:44.105205) v4
    -9> 2018-05-29 14:01:44.115373 b5dc18e0  1 -- 10.0.1.45:6801/5358 <== osd.40 10.0.1.45:0/5430 2910 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:44.105205) v4 ==== 2004+0+0 (646435965 0 0) 0x11eb0000 con 0xeb40000
    -8> 2018-05-29 14:01:44.115431 b5dc18e0  1 -- 10.0.1.45:6801/5358 --> 10.0.1.45:0/5430 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:44.105205) v4 -- 0xf810dc0 con 0
    -7> 2018-05-29 14:01:44.133383 b55c18e0  5 -- 10.0.2.45:6801/5358 >> 10.0.2.45:0/5430 conn(0xeb40d00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx osd.40 seq 2910 0xf6814a0 osd_ping(ping e2690 stamp 2018-05-29 14:01:44.105205) v4
    -6> 2018-05-29 14:01:44.133513 b55c18e0  1 -- 10.0.2.45:6801/5358 <== osd.40 10.0.2.45:0/5430 2910 ==== osd_ping(ping e2690 stamp 2018-05-29 14:01:44.105205) v4 ==== 2004+0+0 (646435965 0 0) 0xf6814a0 con 0xeb40d00
    -5> 2018-05-29 14:01:44.133579 b55c18e0  1 -- 10.0.2.45:6801/5358 --> 10.0.2.45:0/5430 -- osd_ping(ping_reply e2690 stamp 2018-05-29 14:01:44.105205) v4 -- 0x144f2100 con 0
    -4> 2018-05-29 14:01:44.158649 b55c18e0  5 -- 10.0.1.45:6800/5358 >> 10.0.1.253:0/2113151122 conn(0xf11ad00 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=398 cs=1 l=1). rx client.171686 seq 201486 0x12e53e00 osd_op(client.171686.0:25013938 5.31 5.8c0ce031 (undecoded) ondisk+write+known_if_redirected e2690) v8
    -3> 2018-05-29 14:01:44.158766 b55c18e0  1 -- 10.0.1.45:6800/5358 <== client.171686 10.0.1.253:0/2113151122 201486 ==== osd_op(client.171686.0:25013938 5.31 5.8c0ce031 (undecoded) ondisk+write+known_if_redirected e2690) v8 ==== 277+0+4194304 (3375979554 0 3889757252) 0x12e53e00 con 0xf11ad00
    -2> 2018-05-29 14:01:44.159152 9bbb38e0  1 -- 10.0.2.45:6800/5358 --> 10.0.2.33:6800/5767 -- osd_repop(client.171686.0:25013938 5.31 e2690/2684 5:8c073031:::rbd_data.8a6b74b0dc51.0000000000009448:head v 2690'90943) v2 -- 0x1438d400 con 0
    -1> 2018-05-29 14:01:44.159312 9bbb38e0  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2690'90943, trimmed: , trimmed_dups: , clear_divergent_priors: 0
     0> 2018-05-29 14:01:44.215803 9cbb38e0 -1 *** Caught signal (Segmentation fault) **

in thread 9cbb38e0 thread_name:tp_osd_tp

 ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
 1: (()+0x99e9a8) [0xe6f9a8]
 2: (()+0x26aa0) [0xb6668aa0]
 3: (std::enable_if<boost::is_same<osd_reqid_t const, osd_reqid_t>::value||boost::is_same<osd_reqid_t const, osd_reqid_t const>::value, void>::type _denc_friend<osd_reqid_t const, ceph::buf
fer::list::contiguous_appender>(osd_reqid_t const&, ceph::buffer::list::contiguous_appender&)+0x60) [0xb79dcc]
 4: (pg_log_entry_t::encode(ceph::buffer::list&) const+0x134) [0xb5e674]
 5: (ReplicatedBackend::generate_subop(hobject_t const&, eversion_t const&, unsigned long long, osd_reqid_t, eversion_t, eversion_t, hobject_t, hobject_t, std::vector<pg_log_entry_t, std::a
llocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, ObjectStore::Transaction&, pg_shard_t, pg_info_t const&)+0x698) [0xbf2008]
 6: (ReplicatedBackend::issue_op(hobject_t const&, eversion_t const&, unsigned long long, osd_reqid_t, eversion_t, eversion_t, hobject_t, hobject_t, std::vector<pg_log_entry_t, std::allocat
or<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, ReplicatedBackend::InProgressOp*, ObjectStore::Transaction&)+0x324) [0xbf2978]
 7: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t c
onst&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, unsigned long long, osd_
reqid_t, boost::intrusive_ptr<OpRequest>)+0x968) [0xbffd8c]
 8: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x708) [0xa90600]
 9: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x1134) [0xad927c]
 10: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2bb4) [0xadc5b4]
 11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xf8c) [0xa9c11c]
 12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3b4) [0x92bea0]
 13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x88) [0xb9875c]
 14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xd64) [0x95489c]
 15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x7dc) [0xebba84]
 16: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0xec08fc]
 17: (Thread::entry_wrapper()+0x98) [0xfbec04]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

History

#1 Updated by Sage Weil almost 6 years ago

  • Subject changed from OSD crashed due to segmentation fault to osd: segv in pg_log_entry_t::encode()
  • Status changed from New to Need More Info

This crash doesn't look familiar, and it's not clear to me what might cause segfault here. Do you have a core file?

#2 Updated by frank lin almost 6 years ago

Sage Weil wrote:

This crash doesn't look familiar, and it's not clear to me what might cause segfault here. Do you have a core file?

Nope how could I get a core file? For now I can't reproduce the crash.

Also available in: Atom PDF