Project

General

Profile

Bug #6371 ยป ceph-segfault.txt

rados bench and segfault dump - Lincoln Bryant, 09/23/2013 05:08 PM

 
# rados bench -p test 120 write -t 32 --block-size=16777216 --no-cleanup
Maintaining 32 concurrent writes of 16777216 bytes for up to 120 seconds or 0 objects
Object prefix: benchmark_data_faxbox2.mwt2.org_16829
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
0 0 0 0 0 0 - 0
1 32 49 17 271.926 272 0.796911 0.732304
2 32 71 39 311.933 352 1.09599 0.859511
3 31 92 61 325.27 352 1.4874 1.07978
4 32 124 92 367.931 496 0.91507 1.12363
5 32 159 127 406.327 560 0.932131 1.0944
6 31 198 167 445.257 640 0.754153 1.04486
7 32 224 192 438.782 400 1.14419 1.04838
8 32 264 232 463.92 640 0.784376 1.0252
9 32 293 261 463.92 464 1.02077 1.01426
10 32 331 299 478.32 608 0.860362 1.00452
11 32 366 334 485.735 560 0.871194 0.994232
12 31 405 374 498.582 640 0.820231 0.973497
13 32 440 408 502.069 544 0.834575 0.967116
14 31 470 439 501.63 496 0.996832 0.970277
15 31 507 476 507.649 592 0.802816 0.965332
16 31 545 514 513.915 608 0.809 0.959431
17 31 584 553 520.384 624 0.815182 0.947848
18 32 624 592 526.136 624 0.701792 0.940091
19 32 662 630 530.441 608 0.887834 0.92729
2013-09-23 17:51:59.284552min lat: 0.570131 max lat: 2.63634 avg lat: 0.927297
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
20 31 701 670 535.915 640 0.720035 0.927297
21 32 740 708 539.344 608 0.79119 0.91755
22 32 775 743 540.28 560 0.937821 0.917467
23 32 811 779 541.83 576 1.69498 0.918834
24 31 850 819 545.917 640 0.707609 0.915907
25 31 894 863 552.236 704 0.739232 0.905454
26 31 922 891 548.225 448 0.905098 0.901534
27 32 959 927 549.251 576 0.894611 0.912962
28 32 994 962 549.631 560 0.778835 0.911098
29 32 1029 997 549.987 560 0.950483 0.909543
30 32 1059 1027 547.652 480 1.0991 0.912018
31 32 1092 1060 547.015 528 0.953138 0.914795
32 32 1130 1098 548.918 608 0.785575 0.913855
33 31 1173 1142 553.614 704 0.708181 0.908474
34 31 1198 1167 549.094 400 1.00662 0.907309
35 32 1243 1211 553.517 704 0.610841 0.911724
36 31 1281 1250 555.472 624 0.866509 0.906427
37 31 1316 1285 555.593 560 0.916337 0.904983
38 31 1354 1323 556.97 608 0.854641 0.902278
39 31 1376 1345 551.713 352 1.34414 0.905591
2013-09-23 17:52:19.287300min lat: 0.566257 max lat: 2.75597 avg lat: 0.911814
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
40 32 1412 1380 551.919 560 0.872171 0.911814
41 32 1450 1418 553.284 608 0.801945 0.911094
42 32 1492 1460 556.109 672 0.657317 0.908644
43 32 1534 1502 558.802 672 0.791866 0.903956
44 31 1561 1530 556.283 448 1.04407 0.903575
45 32 1593 1561 554.942 496 1.1128 0.909071
46 31 1621 1590 552.964 464 1.0225 0.909306
47 32 1652 1620 551.411 480 1.22612 0.915611
48 31 1686 1655 551.588 560 0.960975 0.915991
49 32 1714 1682 549.146 432 1.12425 0.917897
50 31 1751 1720 550.321 608 0.875741 0.919176
51 32 1776 1744 547.059 384 1.144 0.92096
52 32 1823 1791 550.998 752 0.714947 0.91894
53 31 1859 1828 551.77 592 0.914371 0.917125
54 32 1905 1873 554.884 720 0.695822 0.912803
55 31 1944 1913 556.43 640 0.893588 0.908337
56 32 1973 1941 554.492 448 1.16289 0.909842
57 32 1994 1962 550.659 336 1.45184 0.915379
58 31 2036 2005 553.025 688 0.7453 0.915796
59 32 2061 2029 550.16 384 1.23776 0.917931
2013-09-23 17:52:39.289855min lat: 0.566257 max lat: 2.75597 avg lat: 0.919711
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
60 32 2086 2054 547.656 400 0.979331 0.919711
61 32 2121 2089 547.857 560 0.825813 0.92453
62 32 2147 2115 545.73 416 1.10541 0.927203
63 32 2170 2138 542.908 368 1.21986 0.930147
64 31 2195 2164 540.924 416 1.1443 0.935303
65 32 2212 2180 536.539 256 1.45592 0.939103
66 32 2236 2204 534.227 384 1.29511 0.944971
67 32 2267 2235 533.655 496 1.15071 0.949104
68 32 2292 2260 531.689 400 1.07133 0.950718
69 32 2313 2281 528.852 336 1.52854 0.954131
70 32 2342 2310 527.925 464 1.40644 0.958493
71 32 2368 2336 526.347 416 1.04168 0.960131
72 32 2399 2367 525.924 496 1.26904 0.963418
73 32 2433 2401 526.171 544 0.933912 0.963581
74 32 2456 2424 524.032 368 1.39795 0.964818
75 32 2480 2448 522.164 384 1.23374 0.968231
76 32 2508 2476 521.187 448 1.14382 0.971629
77 32 2535 2503 520.028 432 1.16248 0.973864
78 31 2557 2526 518.078 368 1.47916 0.976908
79 32 2577 2545 515.367 304 1.53336 0.979014
2013-09-23 17:52:59.293111min lat: 0.566257 max lat: 2.75597 avg lat: 0.98457
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
80 32 2599 2567 513.325 352 1.67903 0.98457
81 32 2638 2606 514.69 624 0.876542 0.986945
82 31 2676 2645 516.022 624 0.759485 0.984959
83 32 2709 2677 515.974 512 0.917286 0.984339
84 32 2747 2715 517.068 608 0.789335 0.982441
85 31 2783 2752 517.949 592 0.900808 0.98047
86 32 2822 2790 518.995 608 0.753515 0.9799
87 32 2869 2837 521.672 752 0.68252 0.975027
88 32 2897 2865 520.834 448 1.06714 0.974308
89 31 2931 2900 521.273 560 0.852529 0.975558
90 31 2958 2927 520.28 432 1.00496 0.975193
91 31 2988 2957 519.837 480 1.08082 0.978018
92 32 3009 2977 517.664 320 1.17396 0.978598
93 32 3032 3000 516.054 368 1.62493 0.98365
94 32 3073 3041 517.542 656 0.6902 0.982253
95 31 3095 3064 515.967 368 1.29344 0.982884
96 32 3132 3100 516.592 576 0.867192 0.984411
97 32 3167 3135 517.038 560 0.913034 0.983178
98 32 3204 3172 517.802 592 0.841481 0.982476
99 31 3235 3204 517.743 512 1.01731 0.982612
2013-09-23 17:53:19.295973min lat: 0.566257 max lat: 2.75597 avg lat: 0.982179
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
100 31 3269 3238 518.004 544 0.856677 0.982179
101 32 3298 3266 517.311 448 1.13459 0.981765
102 32 3341 3309 518.983 688 0.782421 0.980835
103 32 3372 3340 518.759 496 1.01106 0.980056
104 32 3396 3364 517.463 384 1.26139 0.980999
105 32 3431 3399 517.867 560 0.837868 0.982756
106 32 3457 3425 516.906 416 1.16187 0.98276
107 31 3492 3461 517.457 576 0.933955 0.983275
108 32 3529 3497 517.998 576 0.859948 0.982832
109 32 3568 3536 518.97 624 0.776788 0.980565
110 32 3596 3564 518.324 448 1.11067 0.981761
111 32 3625 3593 517.834 464 0.926186 0.981187
112 31 3661 3630 518.495 592 0.907946 0.981884
113 32 3690 3658 517.871 448 0.990787 0.982482
114 32 3724 3692 518.099 544 0.927768 0.982065
115 31 3746 3715 516.794 368 1.3144 0.983664
116 32 3765 3733 514.821 288 1.43722 0.986332
117 32 3788 3756 513.566 368 1.59618 0.989938
118 32 3822 3790 513.823 544 0.967381 0.990613
119 32 3843 3811 512.328 336 1.40179 0.991898
2013-09-23 17:53:39.299082min lat: 0.566257 max lat: 2.75597 avg lat: 0.994292
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
120 25 3867 3842 512.191 496 1.01717 0.994292
Total time run: 120.508515
Total writes made: 3867
Write size: 16777216
Bandwidth (MB/sec): 513.424

Stddev Bandwidth: 123.94
Max bandwidth (MB/sec): 752
Min bandwidth (MB/sec): 0
Average Latency: 0.991749
Stddev Latency: 0.244053
Max latency: 2.75597
Min latency: 0.276771

# rados bench -p test 60 seq -t 32 --block-size=8388608
*** Caught signal (Segmentation fault) **
in thread 7fc31615f760
ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
1: rados() [0x43da81]
2: /lib64/libpthread.so.0() [0x381680f500]
3: /lib64/libc.so.6() [0x38164844ed]
4: (ObjBencher::aio_bench(int, int, int, int, int, bool)+0x12b) [0x43693b]
5: rados() [0x41b8b3]
6: (main()+0xe38) [0x420918]
7: (__libc_start_main()+0xfd) [0x381641ecdd]
8: rados() [0x40f5a9]
2013-09-23 17:53:44.461746 7fc31615f760 -1 *** Caught signal (Segmentation fault) **
in thread 7fc31615f760

ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
1: rados() [0x43da81]
2: /lib64/libpthread.so.0() [0x381680f500]
3: /lib64/libc.so.6() [0x38164844ed]
4: (ObjBencher::aio_bench(int, int, int, int, int, bool)+0x12b) [0x43693b]
5: rados() [0x41b8b3]
6: (main()+0xe38) [0x420918]
7: (__libc_start_main()+0xfd) [0x381641ecdd]
8: rados() [0x40f5a9]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-60> 2013-09-23 17:53:44.409586 7fc31615f760 5 asok(0x108aa70) register_command perfcounters_dump hook 0x108aa50
-59> 2013-09-23 17:53:44.409610 7fc31615f760 5 asok(0x108aa70) register_command 1 hook 0x108aa50
-58> 2013-09-23 17:53:44.409623 7fc31615f760 5 asok(0x108aa70) register_command perf dump hook 0x108aa50
-57> 2013-09-23 17:53:44.409629 7fc31615f760 5 asok(0x108aa70) register_command perfcounters_schema hook 0x108aa50
-56> 2013-09-23 17:53:44.409632 7fc31615f760 5 asok(0x108aa70) register_command 2 hook 0x108aa50
-55> 2013-09-23 17:53:44.409634 7fc31615f760 5 asok(0x108aa70) register_command perf schema hook 0x108aa50
-54> 2013-09-23 17:53:44.409641 7fc31615f760 5 asok(0x108aa70) register_command config show hook 0x108aa50
-53> 2013-09-23 17:53:44.409644 7fc31615f760 5 asok(0x108aa70) register_command config set hook 0x108aa50
-52> 2013-09-23 17:53:44.409646 7fc31615f760 5 asok(0x108aa70) register_command config get hook 0x108aa50
-51> 2013-09-23 17:53:44.409648 7fc31615f760 5 asok(0x108aa70) register_command log flush hook 0x108aa50
-50> 2013-09-23 17:53:44.409650 7fc31615f760 5 asok(0x108aa70) register_command log dump hook 0x108aa50
-49> 2013-09-23 17:53:44.409656 7fc31615f760 5 asok(0x108aa70) register_command log reopen hook 0x108aa50
-48> 2013-09-23 17:53:44.441065 7fc31615f760 10 monclient(hunting): build_initial_monmap
-47> 2013-09-23 17:53:44.441164 7fc31615f760 1 librados: starting msgr at :/0
-46> 2013-09-23 17:53:44.441201 7fc31615f760 1 librados: starting objecter
-45> 2013-09-23 17:53:44.441243 7fc31615f760 1 -- :/0 messenger.start
-44> 2013-09-23 17:53:44.441327 7fc31615f760 1 librados: setting wanted keys
-43> 2013-09-23 17:53:44.441371 7fc31615f760 1 librados: calling monclient init
-42> 2013-09-23 17:53:44.441373 7fc31615f760 10 monclient(hunting): init
-41> 2013-09-23 17:53:44.441376 7fc31615f760 5 adding auth protocol: none
-40> 2013-09-23 17:53:44.441377 7fc31615f760 10 monclient(hunting): auth_supported 1 method none
-39> 2013-09-23 17:53:44.441552 7fc31615f760 10 monclient(hunting): _reopen_session rank -1 name
-38> 2013-09-23 17:53:44.441620 7fc31615f760 10 monclient(hunting): picked mon.a con 0x10b1000 addr 10.1.5.67:6789/0
-37> 2013-09-23 17:53:44.441654 7fc31615f760 10 monclient(hunting): _send_mon_message to mon.a at 10.1.5.67:6789/0
-36> 2013-09-23 17:53:44.441660 7fc31615f760 1 -- :/1018018 --> 10.1.5.67:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x10b1490 con 0x10b1000
-35> 2013-09-23 17:53:44.441674 7fc31615f760 10 monclient(hunting): renew_subs
-34> 2013-09-23 17:53:44.441676 7fc31615f760 10 monclient(hunting): authenticate will time out at 2013-09-23 17:58:44.441676
-33> 2013-09-23 17:53:44.441964 7fc312c8a700 1 -- 10.1.5.67:0/1018018 learned my addr 10.1.5.67:0/1018018
-32> 2013-09-23 17:53:44.442616 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 1 ==== mon_map v1 ==== 191+0+0 (1083688949 0 0) 0x7fc3040009f0 con 0x10b1000
-31> 2013-09-23 17:53:44.442700 7fc314c8e700 10 monclient(hunting): handle_monmap mon_map v1
-30> 2013-09-23 17:53:44.442713 7fc314c8e700 10 monclient(hunting): got monmap 1, mon.a is now rank 0
-29> 2013-09-23 17:53:44.442719 7fc314c8e700 10 monclient(hunting): dump:
epoch 1
fsid 14fe4a82-1e2f-4bf3-88d2-90abda4bbc34
last_changed 2013-09-23 14:56:23.574976
created 2013-09-23 14:56:23.574976
0: 10.1.5.67:6789/0 mon.a

-28> 2013-09-23 17:53:44.442736 7fc314c8e700 1 monclient(hunting): found mon.a
-27> 2013-09-23 17:53:44.442746 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 2 ==== auth_reply(proto 1 0 Success) v1 ==== 24+0+0 (3469762266 0 0) 0x7fc304000e10 con 0x10b1000
-26> 2013-09-23 17:53:44.442808 7fc314c8e700 10 monclient: my global_id is 4327
-25> 2013-09-23 17:53:44.442810 7fc314c8e700 10 monclient: _send_mon_message to mon.a at 10.1.5.67:6789/0
-24> 2013-09-23 17:53:44.442812 7fc314c8e700 1 -- 10.1.5.67:0/1018018 --> 10.1.5.67:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x10b1a20 con 0x10b1000
-23> 2013-09-23 17:53:44.442873 7fc31615f760 5 monclient: authenticate success, global_id 4327
-22> 2013-09-23 17:53:44.442960 7fc31615f760 5 asok(0x108aa70) register_command objecter_requests hook 0x10ae6c0
-21> 2013-09-23 17:53:44.443007 7fc31615f760 10 monclient: renew_subs
-20> 2013-09-23 17:53:44.443069 7fc31615f760 10 monclient: _send_mon_message to mon.a at 10.1.5.67:6789/0
-19> 2013-09-23 17:53:44.443074 7fc31615f760 1 -- 10.1.5.67:0/1018018 --> 10.1.5.67:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x10b26c0 con 0x10b1000
-18> 2013-09-23 17:53:44.443115 7fc31615f760 10 monclient: renew_subs
-17> 2013-09-23 17:53:44.443162 7fc31615f760 10 monclient: _send_mon_message to mon.a at 10.1.5.67:6789/0
-16> 2013-09-23 17:53:44.443164 7fc31615f760 1 -- 10.1.5.67:0/1018018 --> 10.1.5.67:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x10b2c60 con 0x10b1000
-15> 2013-09-23 17:53:44.443222 7fc31615f760 1 librados: init done
-14> 2013-09-23 17:53:44.443332 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 3 ==== mon_map v1 ==== 191+0+0 (1083688949 0 0) 0x7fc3040009f0 con 0x10b1000
-13> 2013-09-23 17:53:44.443380 7fc314c8e700 10 monclient: handle_monmap mon_map v1
-12> 2013-09-23 17:53:44.443392 7fc314c8e700 10 monclient: got monmap 1, mon.a is now rank 0
-11> 2013-09-23 17:53:44.443394 7fc314c8e700 10 monclient: dump:
epoch 1
fsid 14fe4a82-1e2f-4bf3-88d2-90abda4bbc34
last_changed 2013-09-23 14:56:23.574976
created 2013-09-23 14:56:23.574976
0: 10.1.5.67:6789/0 mon.a

-10> 2013-09-23 17:53:44.443423 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 4 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3819298868 0 0) 0x7fc304000e10 con 0x10b1000
-9> 2013-09-23 17:53:44.443434 7fc314c8e700 10 monclient: handle_subscribe_ack sent 2013-09-23 17:53:44.441675 renew after 2013-09-23 17:56:14.441675
-8> 2013-09-23 17:53:44.443626 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 5 ==== osd_map(122..122 src has 1..122) v3 ==== 43354+0+0 (1528757157 0 0) 0x7fc30400b290 con 0x10b1000
-7> 2013-09-23 17:53:44.443901 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3819298868 0 0) 0x7fc30400b750 con 0x10b1000
-6> 2013-09-23 17:53:44.443913 7fc314c8e700 10 monclient: handle_subscribe_ack sent 0.000000, ignoring
-5> 2013-09-23 17:53:44.443916 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 7 ==== osd_map(122..122 src has 1..122) v3 ==== 43354+0+0 (1528757157 0 0) 0x7fc3040162d0 con 0x10b1000
-4> 2013-09-23 17:53:44.443926 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== mon.0 10.1.5.67:6789/0 8 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3819298868 0 0) 0x7fc3040166d0 con 0x10b1000
-3> 2013-09-23 17:53:44.443933 7fc314c8e700 10 monclient: handle_subscribe_ack sent 0.000000, ignoring
-2> 2013-09-23 17:53:44.444009 7fc31615f760 1 -- 10.1.5.67:0/1018018 --> 10.1.5.67:7012/3381 -- osd_op(client.4327.0:1 benchmark_last_metadata [read 0~12] 3.8e7a861d e122) v4 -- ?+0 0x10b42b0 con 0x10b3ef0
-1> 2013-09-23 17:53:44.444634 7fc314c8e700 1 -- 10.1.5.67:0/1018018 <== osd.57 10.1.5.67:7012/3381 1 ==== osd_op_reply(1 benchmark_last_metadata [read 0~12] ondisk = 0) v4 ==== 122+0+12 (1222353278 0 3826882326) 0x7fc300000a60 con 0x10b3ef0
0> 2013-09-23 17:53:44.461746 7fc31615f760 -1 *** Caught signal (Segmentation fault) **
in thread 7fc31615f760

ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
1: rados() [0x43da81]
2: /lib64/libpthread.so.0() [0x381680f500]
3: /lib64/libc.so.6() [0x38164844ed]
4: (ObjBencher::aio_bench(int, int, int, int, int, bool)+0x12b) [0x43693b]
5: rados() [0x41b8b3]
6: (main()+0xe38) [0x420918]
7: (__libc_start_main()+0xfd) [0x381641ecdd]
8: rados() [0x40f5a9]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 hadoop
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
-2/-2 (syslog threshold)
99/99 (stderr threshold)
max_recent 500
max_new 1000
log_file
--- end dump of recent events ---
Segmentation fault
    (1-1/1)