Bug #1192 (closed): multi-client stall under heavy write load

Added by Jim Schutt almost 13 years ago. Updated almost 13 years ago.

Status: Resolved
Priority: Normal
Assignee: Sage Weil
Category: OSD
Target version: v0.31
% Done: 0%


Description

I've been having trouble with Linux client stalls under heavy
write load. I think I've got the signature isolated. The
following logs were collected using the current stable branch
(commit f9e9490d77d), patched with a little extra debugging output.

This was with 64 clients against 48 OSDs, with 4 OSDs/server.

When this type of stall happens, multiple clients each have at least
one request that they retry forever, or until I get tired of waiting.
Each of these requests is directed to the same OSD.

Apparently the stalled requests are waiting in the policy throttler;
here's an example:

# grep -Hn "" osd.32.log | grep "0x983dc80 sd=312" 
osd.32.log:4278322:2011-06-15 11:14:29.325620 7f010b5d5940 -- 172.17.40.30:6800/7317 >> :/0 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=0).accept
osd.32.log:4278323:2011-06-15 11:14:29.325766 7f010b5d5940 -- 172.17.40.30:6800/7317 >> :/0 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=0).accept sd=312
osd.32.log:4278324:2011-06-15 11:14:29.325805 7f010b5d5940 -- 172.17.40.30:6800/7317 >> :/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=0).accept peer addr is :/3749931588
osd.32.log:4278325:2011-06-15 11:14:29.325820 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=0).accept peer addr is really 172.17.40.104:0/3749931588 (socket is 172.17.40.104:40535/0)
osd.32.log:4278326:2011-06-15 11:14:29.325855 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=0).accept got peer connect_seq 0 global_seq 374
osd.32.log:4278327:2011-06-15 11:14:29.325868 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=1).accept of host_type 8, policy.lossy=1
osd.32.log:4278328:2011-06-15 11:14:29.325881 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=1).accept my proto 24, their proto 24
osd.32.log:4278331:2011-06-15 11:14:29.325949 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=1).accept existing 0x5f69500.gseq 352 <= 374, looks ok
osd.32.log:4278332:2011-06-15 11:14:29.325959 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel (new one lossy=1)
osd.32.log:4278337:2011-06-15 11:14:29.326022 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=0 cs=0 l=1).accept replacing 0x5f69500
osd.32.log:4278343:2011-06-15 11:14:29.326106 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).accept success, connect_seq = 1, sending READY
osd.32.log:4278344:2011-06-15 11:14:29.326119 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).accept features 138
osd.32.log:4278345:2011-06-15 11:14:29.326143 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).register_pipe 0x983dc80
osd.32.log:4278348:2011-06-15 11:14:29.326208 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).accept starting writer, state=2
osd.32.log:4278349:2011-06-15 11:14:29.326254 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).accept done
osd.32.log:4278350:2011-06-15 11:14:29.326282 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).reader reading tag...
osd.32.log:4278351:2011-06-15 11:14:29.326295 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).reader got MSG
osd.32.log:4278352:2011-06-15 11:14:29.326359 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4278353:2011-06-15 11:14:29.326371 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).reader wants 2097280 from policy throttler 23068672/23333333
osd.32.log:4278354:2011-06-15 11:14:29.326494 7f010b4d4940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).writer: state = 2 policy.server=1
osd.32.log:4278355:2011-06-15 11:14:29.326521 7f010b4d4940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).writer sleeping
osd.32.log:4408968:2011-06-15 11:15:29.566658 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).was_session_reset
osd.32.log:4408969:2011-06-15 11:15:29.566677 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).discard_queue
osd.32.log:4408970:2011-06-15 11:15:29.566690 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1). dequeued pipe
osd.32.log:4408973:2011-06-15 11:15:29.566744 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=0 l=1).stop
osd.32.log:4408974:2011-06-15 11:15:29.566760 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=0 l=1).SHUT_RDWR on sd=312
osd.32.log:4408977:2011-06-15 11:15:29.566832 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=0 l=1).unregister_pipe 0x983dc80
osd.32.log:4408981:2011-06-15 11:15:29.566911 7f010b4d4940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=0 l=1).writer finishing
osd.32.log:4408982:2011-06-15 11:15:29.566939 7f010b4d4940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=0 l=1).writer done
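
To read the throttler line above: the reader needs to take 2097280 bytes (the 128-byte front plus 2097152 bytes of data) from a byte throttler that already holds 23068672 of its 23333333-byte cap, so it sleeps until some of those bytes are given back. A minimal sketch of that kind of throttler (names here are illustrative, not the actual messenger code):

// Minimal sketch of a byte-count throttler like the one the reader blocks on.
// Illustrative only; not Ceph's actual Throttle implementation.
#include <condition_variable>
#include <cstdint>
#include <mutex>

class ByteThrottle {
  std::mutex m;
  std::condition_variable cv;
  const uint64_t max;      // e.g. 23333333 bytes for client messages
  uint64_t count = 0;      // bytes currently held by in-flight messages
public:
  explicit ByteThrottle(uint64_t max_bytes) : max(max_bytes) {}

  // Reader thread: block until 'bytes' can be taken without exceeding 'max'.
  // A message stuck here is exactly the "wants X from policy throttler Y/Z" state.
  void get(uint64_t bytes) {
    std::unique_lock<std::mutex> l(m);
    cv.wait(l, [&] { return count + bytes <= max; });
    count += bytes;
  }

  // Dispatch/cleanup path: release the bytes once the message is processed
  // (or discarded), waking any reader blocked in get().
  void put(uint64_t bytes) {
    std::lock_guard<std::mutex> l(m);
    count -= bytes;
    cv.notify_all();
  }
};

If messages that already acquired their bytes are never dispatched and freed, put() is never called for them, and every later reader on that connection stalls at the same point; that is what the allocated/freed counting below is getting at.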

That particular request was retried 11 times before I gave up:

# grep -Hn "" osd.32.log | grep "reader got envelope type=42 src client4209 tid=3084" 
osd.32.log:3751659:2011-06-15 11:12:28.790521 7f011b6d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x1125000 sd=184 pgs=314 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4117033:2011-06-15 11:13:30.492619 7f0114261940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x5f69500 sd=248 pgs=352 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4278352:2011-06-15 11:14:29.326359 7f010b5d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983dc80 sd=312 pgs=374 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4408989:2011-06-15 11:15:29.567876 7f0103555940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x503d280 sd=376 pgs=398 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4516095:2011-06-15 11:16:29.806672 7f00fb4d5940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x908c780 sd=440 pgs=417 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4606297:2011-06-15 11:17:29.960538 7f00f3657940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x1236000 sd=503 pgs=423 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4678884:2011-06-15 11:18:30.082235 7f00eb5d7940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x983f500 sd=567 pgs=428 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4739569:2011-06-15 11:19:30.195269 7f00e3557940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x973da00 sd=631 pgs=430 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4800037:2011-06-15 11:20:30.306227 7f00db4d7940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x908f000 sd=695 pgs=433 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4864997:2011-06-15 11:21:30.416680 7f00d3457940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x2069000 sd=759 pgs=434 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4924028:2011-06-15 11:22:30.527851 7f00cb3d7940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x5e9f500 sd=823 pgs=437 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:4980694:2011-06-15 11:23:30.639673 7f00c3357940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x5f4ca00 sd=887 pgs=438 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:5038812:2011-06-15 11:24:30.751323 7f00bb2d7940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x97bf000 sd=951 pgs=441 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0
osd.32.log:5098966:2011-06-15 11:25:30.862398 7f00b3257940 -- 172.17.40.30:6800/7317 >> 172.17.40.104:0/3749931588 pipe(0x9815500 sd=1015 pgs=442 cs=1 l=1).reader got envelope type=42 src client4209 tid=3084 front=128 data=2097152 off 0

When I parse that OSD log looking for messages that were allocated but not
freed, I find 11 such:

102062 messages allocated, 102051 freed, 11 leaked

osd.32.log:2887851:2011-06-15 11:11:22.812374 7f0125d7b940 -- 172.17.40.30:6800/7317 >> 172.17.40.48:0/710998528 pipe(0x205b000 sd=101 pgs=197 cs=1 l=1).reader got message 23 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2888373:2011-06-15 11:11:22.898354 7f0121d3b940 -- 172.17.40.30:6800/7317 >> 172.17.40.102:0/4124033206 pipe(0x209f500 sd=109 pgs=141 cs=1 l=1).reader got message 42 0x2fe96c0 osd_op(client4202.1:2218 1000000f658.00000899 [write 0~2097152 [1@-1]] 0.54a9 RETRY snapc 1=[])
osd.32.log:2888669:2011-06-15 11:11:22.914830 7f0120927940 -- 172.17.40.30:6800/7317 >> 172.17.40.99:0/4088545569 pipe(0xfde000 sd=119 pgs=99 cs=1 l=1).reader got message 29 0x97f8b40 osd_op(client4201.1:2761 1000000f26f.00000ab4 [write 0~2097152 [1@-1]] 0.43c9 RETRY snapc 1=[])
osd.32.log:2888717:2011-06-15 11:11:22.915870 7f011f715940 -- 172.17.40.30:6800/7317 >> 172.17.40.66:0/2121378630 pipe(0x20bfa00 sd=154 pgs=140 cs=1 l=1).reader got message 42 0x96ecb40 osd_op(client4160.1:2285 10000006d7c.000008da [write 0~2097152 [1@-1]] 0.2879 RETRY snapc 1=[])
osd.32.log:2889440:2011-06-15 11:11:22.965759 7f0122b49940 -- 172.17.40.30:6800/7317 >> 172.17.40.87:0/2771882616 pipe(0x2069280 sd=138 pgs=81 cs=1 l=1).reader got message 46 0x11dec480 osd_op(client4190.1:2397 1000000b7c7.0000094a [write 0~2097152 [1@-1]] 0.8822 RETRY snapc 1=[])
osd.32.log:2889643:2011-06-15 11:11:22.971452 7f0124e6c940 -- 172.17.40.30:6800/7317 >> 172.17.40.71:0/598682247 pipe(0x13f0a00 sd=147 pgs=216 cs=1 l=1).reader got message 33 0x13e6240 osd_op(client4178.1:2817 100000059ef.00000aeb [write 0~2097152 [1@-1]] 0.5b4d RETRY snapc 1=[])
osd.32.log:2890879:2011-06-15 11:11:22.997740 7f0125775940 -- 172.17.40.30:6800/7317 >> 172.17.40.46:0/134575799 pipe(0x1432a00 sd=104 pgs=34 cs=1 l=1).reader got message 77 0x5f106c0 osd_op(client4150.1:2650 10000001f48.00000a4a [write 0~2097152 [1@-1]] 0.2822 RETRY snapc 1=[])
osd.32.log:2891083:2011-06-15 11:11:23.021261 7f011f917940 -- 172.17.40.30:6800/7317 >> 172.17.40.69:0/3984853652 pipe(0x13dc000 sd=157 pgs=178 cs=1 l=1).reader got message 33 0x2077b40 osd_op(client4169.1:3001 10000004279.00000ba3 [write 0~2097152 [1@-1]] 0.6e9c RETRY snapc 1=[])
osd.32.log:2891144:2011-06-15 11:11:23.031152 7f012516f940 -- 172.17.40.30:6800/7317 >> 172.17.40.55:0/2649293018 pipe(0x9088c80 sd=134 pgs=212 cs=1 l=1).reader got message 22 0x97d5480 osd_op(client4177.1:2919 100000088db.00000b51 [write 0~2097152 [1@-1]] 0.ae22 RETRY snapc 1=[])
osd.32.log:2891159:2011-06-15 11:11:23.038210 7f0120d2b940 -- 172.17.40.30:6800/7317 >> 172.17.40.81:0/1726187101 pipe(0x2069a00 sd=173 pgs=138 cs=1 l=1).reader got message 51 0x5111900 osd_op(client4180.1:2846 100000090ad.00000b09 [write 0~2097152 [1@-1]] 0.a677 RETRY snapc 1=[])
osd.32.log:2891782:2011-06-15 11:11:23.188190 7f011d0ef940 -- 172.17.40.30:6800/7317 >> 172.17.40.67:0/988437998 pipe(0x2066500 sd=158 pgs=175 cs=1 l=1).reader got message 31 0x900db40 osd_op(client4171.1:2632 10000004a4b.00000a34 [write 0~2097152 [1@-1]] 0.8ca9 RETRY snapc 1=[]) 
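
That tally comes from pairing message pointers in the log. A rough sketch of the kind of pass that produces it; the free-side pattern here ("message freed 0x...") is a stand-in for whatever the extra debugging output actually prints, since only the allocation-side lines are quoted in this report:

// Sketch: pair "reader got message ... 0xPTR" lines with later free lines
// mentioning the same pointer, and report pointers that were never freed.
#include <fstream>
#include <iostream>
#include <regex>
#include <string>
#include <unordered_map>

int main(int argc, char** argv) {
  std::ifstream log(argc > 1 ? argv[1] : "osd.32.log");
  std::regex alloc_re("reader got message \\d+ (0x[0-9a-f]+)");
  std::regex free_re("message freed (0x[0-9a-f]+)");   // placeholder pattern
  std::unordered_map<std::string, std::string> live;   // ptr -> allocation line
  std::size_t allocated = 0, freed = 0;
  std::string line;
  std::smatch m;
  while (std::getline(log, line)) {
    if (std::regex_search(line, m, alloc_re)) {
      ++allocated;
      live[m.str(1)] = line;
    } else if (std::regex_search(line, m, free_re)) {
      ++freed;
      live.erase(m.str(1));
    }
  }
  std::cout << allocated << " messages allocated, " << freed << " freed, "
            << live.size() << " leaked\n";
  for (const auto& kv : live)
    std::cout << kv.second << "\n";
  return 0;
}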

If I look at the processing of one of those messages, I see
that it seems to be stuck waiting for require_same_or_newer_map()
to succeed:

osd.32.log:2886557:2011-06-15 11:11:22.606598 7f0125d7b940 -- 172.17.40.30:6800/7317 >> 172.17.40.48:0/710998528 pipe(0x205b000 sd=101 pgs=197 cs=1 l=1).reader got envelope type=42 src client4153 tid=2857 front=128 data=2097152 off 0
osd.32.log:2887851:2011-06-15 11:11:22.812374 7f0125d7b940 -- 172.17.40.30:6800/7317 >> 172.17.40.48:0/710998528 pipe(0x205b000 sd=101 pgs=197 cs=1 l=1).reader got message 23 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2887864:2011-06-15 11:11:22.812773 7f0137b4b940 -- 172.17.40.30:6800/7317 dispatch_entry pipe 0x205b000 dequeued 0x5f5f900
osd.32.log:2887865:2011-06-15 11:11:22.812795 7f0137b4b940 -- 172.17.40.30:6800/7317 <== client4153 172.17.40.48:0/710998528 23 ==== osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[]) ==== 128+0+2097152 (1533754582 0 0) 0x5f5f900 con 0x9866500
osd.32.log:2887867:2011-06-15 11:11:22.812828 7f0137b4b940 osd32 46 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2887868:2011-06-15 11:11:22.812840 7f0137b4b940 osd32 46 require_same_or_newer_map 47 (i am 46) 0x5f5f900
osd.32.log:2887869:2011-06-15 11:11:22.812850 7f0137b4b940 osd32 46 waiting for newer map epoch 47 > my 46 with 0x5f5f900
osd.32.log:2887878:2011-06-15 11:11:22.812964 7f0137b4b940 -- 172.17.40.30:6800/7317 done calling dispatch on 0x5f5f900
osd.32.log:2901509:2011-06-15 11:11:24.218758 7f0137b4b940 osd32 47 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2901510:2011-06-15 11:11:24.218766 7f0137b4b940 osd32 47 require_same_or_newer_map 47 (i am 47) 0x5f5f900
osd.32.log:2915628:2011-06-15 11:11:25.498130 7f0137b4b940 osd32 49 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2915629:2011-06-15 11:11:25.498137 7f0137b4b940 osd32 49 require_same_or_newer_map 47 (i am 49) 0x5f5f900
osd.32.log:2930434:2011-06-15 11:11:26.616930 7f013734a940 osd32 50 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2930435:2011-06-15 11:11:26.616938 7f013734a940 osd32 50 require_same_or_newer_map 47 (i am 50) 0x5f5f900
osd.32.log:2934698:2011-06-15 11:11:26.754837 7f013734a940 osd32 50 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2934699:2011-06-15 11:11:26.754845 7f013734a940 osd32 50 require_same_or_newer_map 47 (i am 50) 0x5f5f900
osd.32.log:2970736:2011-06-15 11:11:28.662143 7f0137b4b940 osd32 52 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:2970737:2011-06-15 11:11:28.662152 7f0137b4b940 osd32 52 require_same_or_newer_map 47 (i am 52) 0x5f5f900
osd.32.log:3048499:2011-06-15 11:11:34.166768 7f0137b4b940 osd32 58 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:3048500:2011-06-15 11:11:34.166779 7f0137b4b940 osd32 58 require_same_or_newer_map 47 (i am 58) 0x5f5f900
osd.32.log:3091922:2011-06-15 11:11:36.250031 7f0137b4b940 osd32 60 _dispatch 0x5f5f900 osd_op(client4153.1:2857 10000000bbb.00000b17 [write 0~2097152 [1@-1]] 0.c879 RETRY snapc 1=[])
osd.32.log:3091923:2011-06-15 11:11:36.250045 7f0137b4b940 osd32 60 require_same_or_newer_map 47 (i am 60) 0x5f5f900
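
What those lines show: the op was stamped with osdmap epoch 47 while this osd was still at 46, so it was parked to wait for a newer map; on each subsequent map the op is re-dispatched and the epoch check now passes (i am 47, 49, 50, ...), yet the same 0x5f5f900 message keeps reappearing instead of completing, which suggests it is being requeued again somewhere further downstream. A rough sketch of the epoch gate itself (illustrative shapes, not the actual OSD code):

// Sketch of the epoch gate above: an op stamped with a newer osdmap epoch
// than the OSD currently has is parked until the OSD catches up, then
// re-dispatched on each new map. Illustrative only.
#include <cstdint>
#include <list>

struct OpRequest {
  uint32_t map_epoch;   // epoch the client built the request against
  // ... message payload, throttle bytes, etc.
};

struct OSDSketch {
  uint32_t osdmap_epoch = 0;
  std::list<OpRequest*> waiting_for_osdmap;

  // Returns true if the op may be processed now; otherwise parks it.
  // Corresponds to "waiting for newer map epoch 47 > my 46" in the log.
  bool require_same_or_newer_map(OpRequest* op) {
    if (op->map_epoch > osdmap_epoch) {
      waiting_for_osdmap.push_back(op);
      return false;
    }
    return true;
  }

  // Called when a new osdmap arrives: parked ops are re-dispatched, which is
  // why the same message pointer shows up again at later epochs.
  void on_new_osdmap(uint32_t new_epoch) {
    osdmap_epoch = new_epoch;
    std::list<OpRequest*> requeue;
    requeue.swap(waiting_for_osdmap);
    for (OpRequest* op : requeue)
      dispatch(op);
  }

  void dispatch(OpRequest* op) {
    if (!require_same_or_newer_map(op))
      return;
    // ... continue: look up the PG; if it is not active here, the op may land
    // on another wait list (cf. the waiting_for_active list mentioned in #3 below).
  }
};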

I can reproduce this, so I can try debug patches if needed, but
this issue is hard to trigger, and can take multiple trials, so
turnaround might be slow.

Actions #1

Updated by Jim Schutt almost 13 years ago

I tried to upload osd.32.log.bz2, but got this:

Internal error

An error occurred on the page you were trying to access.
If you continue to experience problems please contact your redMine administrator for assistance.

Back

Actions #2

Updated by Greg Farnum almost 13 years ago

  • Assignee set to Sage Weil
Actions #3

Updated by Sage Weil almost 13 years ago

  • Category set to OSD
  • Target version set to v0.30

Can you try branch osd_pg_active? The request you mention gets kept around on the waiting_for_active list when it should have been discarded (the pg moved to another osd). This may explain why the throttler stopped accepting new io (too much memory in use).
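
A hedged sketch of what that fix amounts to (illustrative, not the actual osd_pg_active branch): when an op lands on a PG whose primary has moved off this OSD, drop it and return its throttle bytes rather than parking it on waiting_for_active:

// Sketch of the behavior described above: an op for a PG that has moved away
// is discarded (returning its policy-throttle bytes) instead of sitting on
// waiting_for_active. Illustrative names only.
#include <list>

struct Op {
  unsigned long bytes;        // bytes held in the policy throttler for this message
};

struct Throttler {
  unsigned long held = 0;
  void put(unsigned long b) { held -= b; }   // give bytes back to the reader
};

struct PGSketch {
  bool primary_here = true;   // does this OSD still own the PG?
  bool active = false;        // has peering finished?
  Throttler* throttle = nullptr;
  std::list<Op*> waiting_for_active;

  void queue_op(Op* op) {
    if (!primary_here) {
      // The fix: drop the op here. The client resends to the new primary,
      // and the throttle bytes it was pinning are released. Before the fix,
      // ops like this could sit on waiting_for_active and pin the throttler.
      throttle->put(op->bytes);
      delete op;
      return;
    }
    if (!active) {
      waiting_for_active.push_back(op);   // replayed once the PG goes active
      return;
    }
    // ... actually apply the write ...
  }
};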

Actions #4

Updated by Jim Schutt almost 13 years ago

I'm compiling it now.

Actions #5

Updated by Sage Weil almost 13 years ago

Any news?

Actions #6

Updated by Jim Schutt almost 13 years ago

I've caught another stall using your patch, but it is slightly different.

Maybe I waited longer, but this one eventually worked itself out.

During the stall, I had 12 clients resetting osds for half an hour or so.
It seems that some of the requests were redirected to OSDs other than the ones
the kernel client was reporting it was resetting.

Do you want the logs?

Actions #7

Updated by Sage Weil almost 13 years ago

  • Target version changed from v0.30 to v0.31
Actions #8

Updated by Sage Weil almost 13 years ago

  • Position set to 694
Actions #9

Updated by Sage Weil almost 13 years ago

  • Story points set to 5
  • Position deleted (696)
  • Position set to 696
Actions #10

Updated by Sage Weil almost 13 years ago

  • Status changed from New to Resolved