Project

General

Profile

Actions

Bug #10947

closed

ceph osd setcrushmap loops when crushtool lags

Added by Loïc Dachary about 9 years ago. Updated about 9 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
hammer
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If crushtool takes longer than mon lease and block the mon, an election will happen and the command will run again, indefinitely.

mon lease renew interval < mon lease < mon lease ack timeout
zhuzha:~/ceph/ceph/src% PATH=$(pwd):$PATH ./vstart.sh
zhuzha:~/ceph/ceph/src% cat > crushtool
#!/bin/sh
sleep 10
exit 0 # success
^D
zhuzha:~/ceph/ceph/src% ceph osd getcrushmap -o /tmp/map
got crush map from osdmap epoch 18
zhuzha:~/ceph/ceph/src% ceph osd setcrushmap -i /tmp/map

Hangs forever. In logs at that time:

zhuzha:~/ceph/ceph/src% fgrep 'preprocess_query mon_command({"prefix": "osd setcrushmap"}' out/mon.a.log |tail -5
2015-02-25 12:07:17.734768 7f9154604700 10 mon.a@0(leader).osd e23 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.14102 172.18.128.29:0/1020924
2015-02-25 12:07:28.158111 7f9154604700 10 mon.a@0(leader).osd e23 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.14102 172.18.128.29:0/1020924
2015-02-25 12:07:38.504739 7f9154604700 10 mon.a@0(leader).osd e23 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.14102 172.18.128.29:0/1020924
2015-02-25 12:07:49.209307 7f9154604700 10 mon.a@0(leader).osd e24 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.14102 172.18.128.29:0/1020924
2015-02-25 12:07:59.577932 7f9154604700 10 mon.a@0(leader).osd e24 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.14102 172.18.128.29:0/1020924

Changing the following fixes the problem:
    Modified   src/vstart.sh
diff --git a/src/vstart.sh b/src/vstart.sh
index bf863dc..e1440e6 100755
--- a/src/vstart.sh
+++ b/src/vstart.sh
@@ -358,6 +358,10 @@ if [ "$start_mon" -eq 1 ]; then
         mon osd full ratio = .99
         mon data avail warn = 10
         mon data avail crit = 1
+        mon lease = 20
+        mon lease renew interval = 18
+        mon lease ack timeout = 40
         osd pool default erasure code directory = $EC_PATH
         osd pool default erasure code profile = plugin=jerasure technique=reed_sol_van k=2 m=1 ruleset-failure-domain=osd
         rgw frontends = fastcgi, civetweb port=$CEPH_RGW_PORT
Actions #1

Updated by Loïc Dachary about 9 years ago

I can't reproduce it on https://github.com/ceph/ceph/tree/hammer at https://github.com/ceph/ceph/commit/29861b1b64540b6b09a735adf619335a48edfad8

2015-02-25 11:38:21.657533 7f757b7ae700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4110 127.0.0.1:0/1017648
2015-02-25 11:38:21.657539 7f757b7ae700  5 mon.a@0(leader).paxos(paxos active c 1..43) is_readable = 1 - now=2015-02-25 11:38:21.657540 lease_expire=0.000000 has v0 lc 43
2015-02-25 11:38:21.657554 7f757b7ae700 10 mon.a@0(leader).osd e10 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4110 127.0.0.1:0/1017648
2015-02-25 11:38:21.657630 7f757b7ae700  7 mon.a@0(leader).osd e10 prepare_update mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4110 127.0.0.1:0/1017648
2015-02-25 11:38:21.657699 7f757b7ae700 10 mon.a@0(leader).osd e10 prepare_command setting new crush map
2015-02-25 11:38:21.657825 7f757b7ae700 10 mon.a@0(leader).osd e10  testing map
2015-02-25 11:38:31.660127 7f757b7ae700 10 mon.a@0(leader).osd e10  result 
2015-02-25 11:38:31.660321 7f757b7ae700 10 mon.a@0(leader).osd e10 should_propose
2015-02-25 11:38:31.660336 7f757b7ae700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  setting proposal_timer 0x7f75680a2a50 with delay of 0.05
2015-02-25 11:38:31.660448 7f757b7ae700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7f75680c37b0 con 0x4e4fc40
2015-02-25 11:38:31.660471 7f757bfff700 11 mon.a@0(leader) e1 tick
2015-02-25 11:38:31.660505 7f757bfff700 10 mon.a@0(leader).pg v15  clearing pg_map delta (28.924947 > 10 seconds old)
2015-02-25 11:38:31.660538 7f757bfff700 10 mon.a@0(leader).pg v15  clearing pg_map delta for pool 0 (28.924981 > 10 seconds old)
2015-02-25 11:38:31.660568 7f757bfff700 10 mon.a@0(leader).pg v15 v15: 8 pgs: 8 active+clean; 0 bytes data, 501 GB used, 46698 MB / 547 GB avail
2015-02-25 11:38:31.660610 7f757bfff700 10 mon.a@0(leader).osd e10 e10: 3 osds: 3 up, 3 in

Actions #2

Updated by Loïc Dachary about 9 years ago

  • Status changed from 12 to Can't reproduce

Mykola, could you let me know the SHA I should use to reproduce the problem ?

Actions #3

Updated by Loïc Dachary about 9 years ago

  • Status changed from Can't reproduce to In Progress

Nevermind, got it. My test was only using one monitor. Using three monitors triggers the problem.

Actions #4

Updated by Loïc Dachary about 9 years ago

2015-02-25 11:53:43.981719 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4114 127.0.0.1:0/1018996
2015-02-25 11:53:43.981727 7ff318914700  5 mon.a@0(leader).paxos(paxos updating c 1..61) is_readable = 1 - now=2015-02-25 11:53:43.981727 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:43.981732 7ff318914700 10 mon.a@0(leader).osd e10 preprocess_query mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4114 127.0.0.1:0/1018996
2015-02-25 11:53:43.981805 7ff318914700  7 mon.a@0(leader).osd e10 prepare_update mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4114 127.0.0.1:0/1018996
2015-02-25 11:53:43.981868 7ff318914700 10 mon.a@0(leader).osd e10 prepare_command setting new crush map
2015-02-25 11:53:43.981986 7ff318914700 10 mon.a@0(leader).osd e10  testing map
2015-02-25 11:53:53.984626 7ff318914700 10 mon.a@0(leader).osd e10  result 
2015-02-25 11:53:53.984801 7ff318914700 10 mon.a@0(leader).osd e10 should_propose
2015-02-25 11:53:53.984817 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  setting proposal_timer 0x7ff300148d50 with delay of 0.05
2015-02-25 11:53:53.984952 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 206 ==== forward(log(1 entries) v1 caps allow * tid 36 con_features 0) to leader v3 ==== 587+0+0 (2713400823 0 0) 0x7ff2fc001b70 con 0x4236480
2015-02-25 11:53:53.984979 7ff319165700 11 mon.a@0(leader) e1 tick
2015-02-25 11:53:53.985008 7ff319165700 10 mon.a@0(leader).pg v16 v16: 8 pgs: 8 active+clean; 0 bytes data, 501 GB used, 46679 MB / 547 GB avail
2015-02-25 11:53:53.985063 7ff319165700 10 mon.a@0(leader).osd e10 e10: 3 osds: 3 up, 3 in
2015-02-25 11:53:53.985081 7ff319165700  5 mon.a@0(leader).paxos(paxos updating c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.985082 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.985104 7ff319165700  5 mon.a@0(leader).paxos(paxos updating c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.985108 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.985118 7ff319165700  5 mon.a@0(leader).paxos(paxos updating c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.985119 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.985131 7ff319165700 10 mon.a@0(leader).auth v5 auth
2015-02-25 11:53:53.985140 7ff319165700 10 cephx keyserver: _check_rotating_secrets
2015-02-25 11:53:53.985156 7ff319165700 20 mon.a@0(leader) e1 sync_trim_providers
2015-02-25 11:53:53.985177 7ff319165700  1 mon.a@0(leader).paxos(paxos updating c 1..61) accept timeout, calling fresh election
2015-02-25 11:53:53.985189 7ff319165700 10 mon.a@0(leader) e1 bootstrap
2015-02-25 11:53:53.985193 7ff319165700 10 mon.a@0(leader) e1 sync_reset_requester
2015-02-25 11:53:53.985197 7ff319165700 10 mon.a@0(leader) e1 unregister_cluster_logger
2015-02-25 11:53:53.985208 7ff319165700 10 mon.a@0(leader) e1 cancel_probe_timeout (none scheduled)
2015-02-25 11:53:53.985213 7ff319165700 10 mon.a@0(probing) e1 _reset
2015-02-25 11:53:53.985217 7ff319165700 10 mon.a@0(probing) e1 cancel_probe_timeout (none scheduled)
2015-02-25 11:53:53.985220 7ff319165700 10 mon.a@0(probing) e1 timecheck_finish
2015-02-25 11:53:53.985245 7ff319165700 15 mon.a@0(probing) e1 health_tick_stop
2015-02-25 11:53:53.985260 7ff319165700 15 mon.a@0(probing) e1 health_interval_stop
2015-02-25 11:53:53.985271 7ff319165700 10 mon.a@0(probing) e1 scrub_reset
2015-02-25 11:53:53.985276 7ff319165700 10 mon.a@0(probing).paxos(paxos updating c 1..61) restart -- canceling timeouts
2015-02-25 11:53:53.985352 7ff319165700 10 mon.a@0(probing).paxosservice(pgmap 1..16) restart
2015-02-25 11:53:53.985374 7ff319165700 10 mon.a@0(probing).paxosservice(mdsmap 1..1) restart
2015-02-25 11:53:53.985378 7ff319165700 10 mon.a@0(probing).paxosservice(osdmap 1..10) restart
2015-02-25 11:53:53.985380 7ff319165700 10 mon.a@0(probing).paxosservice(osdmap 1..10)  canceling proposal_timer 0x7ff300148d50
2015-02-25 11:53:53.985392 7ff319165700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.985397 7ff319165700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: client.4114 :/0 is openallow * for client.4114 :/0
2015-02-25 11:53:53.985408 7ff319165700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.985706 7ff319165700  0 mon.a@0(probing) e1 handle_command mon_command({"prefix": "osd setcrushmap"} v 0) v1
2015-02-25 11:53:53.985775 7ff319165700 20 is_capable service=osd command=osd setcrushmap read write on cap allow *
2015-02-25 11:53:53.985782 7ff319165700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.985785 7ff319165700 20  allow all
2015-02-25 11:53:53.985787 7ff319165700 10 mon.a@0(probing) e1 _allowed_command capable
2015-02-25 11:53:53.985806 7ff319165700  0 log_channel(audit) log [INF] : from='client.4114 :/0' entity='client.admin' cmd=[{"prefix": "osd setcrushmap"}]: dispatch
2015-02-25 11:53:53.985869 7ff319165700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x7ff2ec0539f0 con 0x4225cc0
2015-02-25 11:53:53.985899 7ff319165700 10 mon.a@0(probing).paxosservice(osdmap 1..10) dispatch mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4114 127.0.0.1:0/1018996
2015-02-25 11:53:53.985913 7ff319165700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.985914 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.985927 7ff319165700 10 mon.a@0(probing).paxosservice(osdmap 1..10)  waiting for paxos -> readable (v0)
2015-02-25 11:53:53.985932 7ff319165700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.985933 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.985956 7ff319165700 10 mon.a@0(probing).paxosservice(logm 1..37) restart
2015-02-25 11:53:53.985966 7ff319165700  7 mon.a@0(probing).log v37 _updated_log for osd.2 127.0.0.1:6813/18809
2015-02-25 11:53:53.985978 7ff319165700 15 mon.a@0(probing) e1 send_reply routing reply to 127.0.0.1:6813/18809 via 127.0.0.1:6791/0 for request log(1 entries) v1
2015-02-25 11:53:53.985991 7ff319165700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6791/0 -- route(log(last 5) v1 tid 21) v2 -- ?+0 0x7ff2ec0543f0 con 0x423be50
2015-02-25 11:53:53.986035 7ff319165700 10 mon.a@0(probing).paxosservice(monmap 1..1) restart
2015-02-25 11:53:53.986040 7ff319165700 10 mon.a@0(probing).paxosservice(auth 1..5) restart
2015-02-25 11:53:53.986050 7ff319165700 10 mon.a@0(probing) e1 cancel_probe_timeout (none scheduled)
2015-02-25 11:53:53.986059 7ff319165700 10 mon.a@0(probing) e1 reset_probe_timeout 0x7ff2ec01b630 after 2 seconds
2015-02-25 11:53:53.986079 7ff319165700 10 mon.a@0(probing) e1 probing other monitors
2015-02-25 11:53:53.986086 7ff319165700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name a) v6 -- ?+0 0x7ff2ec054bc0
2015-02-25 11:53:53.986125 7ff319165700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name a) v6 -- ?+0 0x7ff2ec013000
2015-02-25 11:53:53.986178 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986194 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.986218 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986224 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.986228 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986232 7ff318914700 20  allow all
2015-02-25 11:53:53.986236 7ff318914700 10 mon.a@0(probing) e1 received forwarded message from mon.1 127.0.0.1:6790/0 via mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.986247 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:53.986251 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986253 7ff318914700 20  allow all
2015-02-25 11:53:53.986267 7ff318914700 10 mon.a@0(probing) e1  caps are allow *
2015-02-25 11:53:53.986271 7ff318914700 10 mon.a@0(probing) e1  entity name 'mon.' type 1
2015-02-25 11:53:53.986276 7ff318914700 10 mon.a@0(probing) e1  mesg 0x7ff2fc002330 from 127.0.0.1:6790/0
2015-02-25 11:53:53.986289 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986292 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.1 :/0 is openallow * for mon.1 :/0
2015-02-25 11:53:53.986298 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986302 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.986315 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.986316 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.986333 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:53.986338 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.986339 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.986376 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 190 ==== paxos(accept lc 61 fc 0 pn 300 opn 0) v3 ==== 80+0+0 (3635305193 0 0) 0x7ff2f8001190 con 0x423be50
2015-02-25 11:53:53.986397 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986400 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:53.986409 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986413 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.986416 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986418 7ff318914700 20  allow all
2015-02-25 11:53:53.986421 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:53.986424 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986426 7ff318914700 20  allow all
2015-02-25 11:53:53.986428 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) election in progress, dropping paxos(accept lc 61 fc 0 pn 300 opn 0) v3
2015-02-25 11:53:53.986449 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7ff30004b5a0 con 0x4225cc0
2015-02-25 11:53:53.986462 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986465 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.0 127.0.0.1:6789/0 is openallow * for mon.0 127.0.0.1:6789/0
2015-02-25 11:53:53.986474 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986477 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.0 127.0.0.1:6789/0
2015-02-25 11:53:53.986486 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.986490 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.986500 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:53.986503 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.986505 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.986520 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 207 ==== paxos(accept lc 61 fc 0 pn 300 opn 0) v3 ==== 80+0+0 (2503014196 0 0) 0x7ff2fc002600 con 0x4236480
2015-02-25 11:53:53.986534 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986537 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.986546 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986549 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.986553 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986557 7ff318914700 20  allow all
2015-02-25 11:53:53.986561 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:53.986564 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986566 7ff318914700 20  allow all
2015-02-25 11:53:53.986567 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) election in progress, dropping paxos(accept lc 61 fc 0 pn 300 opn 0) v3
2015-02-25 11:53:53.986584 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 191 ==== mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name c) v6 ==== 58+0+0 (1529458076 0 0) 0x7ff2f8001a90 con 0x423be50
2015-02-25 11:53:53.986604 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986607 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:53.986615 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986619 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.986622 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986624 7ff318914700 20  allow all
2015-02-25 11:53:53.986626 7ff318914700 10 mon.a@0(probing) e1 handle_probe mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name c) v6
2015-02-25 11:53:53.986634 7ff318914700 10 mon.a@0(probing) e1 handle_probe_probe mon.2 127.0.0.1:6791/0mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name c) v6 features 1125899906842623
2015-02-25 11:53:53.986705 7ff318914700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6791/0 -- mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name a paxos( fc 1 lc 61 )) v6 -- ?+0 0x7ff3000649d0 con 0x423be50
2015-02-25 11:53:53.986777 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7ff2ec0539f0 con 0x4225cc0
2015-02-25 11:53:53.986809 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986816 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.0 127.0.0.1:6789/0 is openallow * for mon.0 127.0.0.1:6789/0
2015-02-25 11:53:53.986833 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986839 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.0 127.0.0.1:6789/0
2015-02-25 11:53:53.986857 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.986859 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.986886 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:53.986896 7ff318914700  5 mon.a@0(probing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:53.986898 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:53.986935 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 208 ==== mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name b) v6 ==== 58+0+0 (635250253 0 0) 0x7ff2fc002890 con 0x4236480
2015-02-25 11:53:53.986962 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.986965 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.986974 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.986979 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.986982 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.986984 7ff318914700 20  allow all
2015-02-25 11:53:53.986987 7ff318914700 10 mon.a@0(probing) e1 handle_probe mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name b) v6
2015-02-25 11:53:53.986995 7ff318914700 10 mon.a@0(probing) e1 handle_probe_probe mon.1 127.0.0.1:6790/0mon_probe(probe c98533db-e94a-4062-8b01-43319c52b048 name b) v6 features 1125899906842623
2015-02-25 11:53:53.987052 7ff318914700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6790/0 -- mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name a paxos( fc 1 lc 61 )) v6 -- ?+0 0x7ff300042fa0 con 0x4236480
2015-02-25 11:53:53.987115 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 192 ==== election(c98533db-e94a-4062-8b01-43319c52b048 propose 7) v5 ==== 529+0+0 (3021517391 0 0) 0x7ff2f8001fe0 con 0x423be50
2015-02-25 11:53:53.987154 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.987160 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:53.987177 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.987188 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.987194 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.987198 7ff318914700 20  allow all
2015-02-25 11:53:53.987203 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:53.987208 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.987211 7ff318914700 20  allow all
2015-02-25 11:53:53.987239 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 209 ==== election(c98533db-e94a-4062-8b01-43319c52b048 propose 7) v5 ==== 529+0+0 (3021517391 0 0) 0x7ff2fc002e20 con 0x4236480
2015-02-25 11:53:53.987274 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.987280 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.987297 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.987304 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.987309 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.987312 7ff318914700 20  allow all
2015-02-25 11:53:53.987317 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:53.987322 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.987325 7ff318914700 20  allow all
2015-02-25 11:53:53.987350 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 210 ==== mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name b paxos( fc 1 lc 61 )) v6 ==== 527+0+0 (1101047061 0 0) 0x7ff2fc001170 con 0x4236480
2015-02-25 11:53:53.987387 7ff318914700 20 mon.a@0(probing) e1 have connection
2015-02-25 11:53:53.987393 7ff318914700 20 mon.a@0(probing) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:53.987410 7ff318914700 20 mon.a@0(probing) e1  caps allow *
2015-02-25 11:53:53.987417 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:53.987421 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:53.987425 7ff318914700 20  allow all
2015-02-25 11:53:53.987427 7ff318914700 10 mon.a@0(probing) e1 handle_probe mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name b paxos( fc 1 lc 61 )) v6
2015-02-25 11:53:53.987441 7ff318914700 10 mon.a@0(probing) e1 handle_probe_reply mon.1 127.0.0.1:6790/0mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name b paxos( fc 1 lc 61 )) v6
2015-02-25 11:53:53.987452 7ff318914700 10 mon.a@0(probing) e1  monmap is e1: 3 mons at {a=127.0.0.1:6789/0,b=127.0.0.1:6790/0,c=127.0.0.1:6791/0}
2015-02-25 11:53:53.987523 7ff318914700 10 mon.a@0(probing) e1  peer name is b
2015-02-25 11:53:53.987533 7ff318914700 10 mon.a@0(probing) e1  mon.b is outside the quorum
2015-02-25 11:53:53.987542 7ff318914700 10 mon.a@0(probing) e1  outside_quorum now a,b, need 2
2015-02-25 11:53:53.987549 7ff318914700 10 mon.a@0(probing) e1  that's enough to form a new quorum, calling election
2015-02-25 11:53:53.987552 7ff318914700 10 mon.a@0(probing) e1 start_election
2015-02-25 11:53:53.987556 7ff318914700 10 mon.a@0(probing) e1 _reset
2015-02-25 11:53:53.987560 7ff318914700 10 mon.a@0(probing) e1 cancel_probe_timeout 0x7ff2ec01b630
2015-02-25 11:53:53.987568 7ff318914700 10 mon.a@0(probing) e1 timecheck_finish
2015-02-25 11:53:53.987573 7ff318914700 15 mon.a@0(probing) e1 health_tick_stop
2015-02-25 11:53:53.987576 7ff318914700 15 mon.a@0(probing) e1 health_interval_stop
2015-02-25 11:53:53.987582 7ff318914700 10 mon.a@0(probing) e1 scrub_reset
2015-02-25 11:53:53.987586 7ff318914700 10 mon.a@0(probing).paxos(paxos recovering c 1..61) restart -- canceling timeouts
2015-02-25 11:53:53.987596 7ff318914700 10 mon.a@0(probing).paxosservice(pgmap 1..16) restart
2015-02-25 11:53:53.987601 7ff318914700 10 mon.a@0(probing).paxosservice(mdsmap 1..1) restart
2015-02-25 11:53:53.987605 7ff318914700 10 mon.a@0(probing).paxosservice(osdmap 1..10) restart
2015-02-25 11:53:53.987608 7ff318914700 10 mon.a@0(probing).paxosservice(logm 1..37) restart
2015-02-25 11:53:53.987611 7ff318914700 10 mon.a@0(probing).paxosservice(monmap 1..1) restart
2015-02-25 11:53:53.987614 7ff318914700 10 mon.a@0(probing).paxosservice(auth 1..5) restart
2015-02-25 11:53:53.987619 7ff318914700 10 mon.a@0(electing) e1 cancel_probe_timeout (none scheduled)
2015-02-25 11:53:53.987631 7ff318914700  0 log_channel(cluster) log [INF] : mon.a calling new monitor election
2015-02-25 11:53:53.987679 7ff318914700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x7ff300053bb0 con 0x4225cc0
2015-02-25 11:53:53.987780 7ff318914700  5 mon.a@0(electing).elector(6) start -- can i be leader?
2015-02-25 11:53:53.987848 7ff318914700  1 mon.a@0(electing).elector(6) init, last seen epoch 6
2015-02-25 11:53:53.987852 7ff318914700 10 mon.a@0(electing).elector(6) bump_epoch 6 to 7
2015-02-25 11:53:54.020000 7ff318914700 10 mon.a@0(electing) e1 join_election
2015-02-25 11:53:54.020026 7ff318914700 10 mon.a@0(electing) e1 _reset
2015-02-25 11:53:54.020029 7ff318914700 10 mon.a@0(electing) e1 cancel_probe_timeout (none scheduled)
2015-02-25 11:53:54.020031 7ff318914700 10 mon.a@0(electing) e1 timecheck_finish
2015-02-25 11:53:54.020034 7ff318914700 15 mon.a@0(electing) e1 health_tick_stop
2015-02-25 11:53:54.020036 7ff318914700 15 mon.a@0(electing) e1 health_interval_stop
2015-02-25 11:53:54.020039 7ff318914700 10 mon.a@0(electing) e1 scrub_reset
2015-02-25 11:53:54.020041 7ff318914700 10 mon.a@0(electing).paxos(paxos recovering c 1..61) restart -- canceling timeouts
2015-02-25 11:53:54.020050 7ff318914700 10 mon.a@0(electing).paxosservice(pgmap 1..16) restart
2015-02-25 11:53:54.020054 7ff318914700 10 mon.a@0(electing).paxosservice(mdsmap 1..1) restart
2015-02-25 11:53:54.020056 7ff318914700 10 mon.a@0(electing).paxosservice(osdmap 1..10) restart
2015-02-25 11:53:54.020057 7ff318914700 10 mon.a@0(electing).paxosservice(logm 1..37) restart
2015-02-25 11:53:54.020059 7ff318914700 10 mon.a@0(electing).paxosservice(monmap 1..1) restart
2015-02-25 11:53:54.020060 7ff318914700 10 mon.a@0(electing).paxosservice(auth 1..5) restart
2015-02-25 11:53:54.020109 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- election(c98533db-e94a-4062-8b01-43319c52b048 propose 7) v5 -- ?+0 0x7ff300058f50
2015-02-25 11:53:54.020146 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- election(c98533db-e94a-4062-8b01-43319c52b048 propose 7) v5 -- ?+0 0x7ff300059210
2015-02-25 11:53:54.020181 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 193 ==== mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name c paxos( fc 1 lc 61 )) v6 ==== 527+0+0 (3748692843 0 0) 0x7ff2f8001820 con 0x423be50
2015-02-25 11:53:54.020195 7ff318914700 20 mon.a@0(electing) e1 have connection
2015-02-25 11:53:54.020198 7ff318914700 20 mon.a@0(electing) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.020203 7ff318914700 20 mon.a@0(electing) e1  caps allow *
2015-02-25 11:53:54.020207 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.020209 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.020210 7ff318914700 20  allow all
2015-02-25 11:53:54.020212 7ff318914700 10 mon.a@0(electing) e1 handle_probe mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name c paxos( fc 1 lc 61 )) v6
2015-02-25 11:53:54.020216 7ff318914700 10 mon.a@0(electing) e1 handle_probe_reply mon.2 127.0.0.1:6791/0mon_probe(reply c98533db-e94a-4062-8b01-43319c52b048 name c paxos( fc 1 lc 61 )) v6
2015-02-25 11:53:54.020220 7ff318914700 10 mon.a@0(electing) e1  monmap is e1: 3 mons at {a=127.0.0.1:6789/0,b=127.0.0.1:6790/0,c=127.0.0.1:6791/0}
2015-02-25 11:53:54.020239 7ff318914700 10 mon.a@0(electing) e1  peer name is c
2015-02-25 11:53:54.020255 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7ff300053bb0 con 0x4225cc0
2015-02-25 11:53:54.020268 7ff318914700 20 mon.a@0(electing) e1 have connection
2015-02-25 11:53:54.020271 7ff318914700 20 mon.a@0(electing) e1 ms_dispatch existing session MonSession: mon.0 127.0.0.1:6789/0 is openallow * for mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.020275 7ff318914700 20 mon.a@0(electing) e1  caps allow *
2015-02-25 11:53:54.020276 7ff318914700 10 mon.a@0(electing).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.020281 7ff318914700  5 mon.a@0(electing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.020281 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:54.020289 7ff318914700 10 mon.a@0(electing).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:54.020291 7ff318914700  5 mon.a@0(electing).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.020292 lease_expire=2015-02-25 11:53:47.553936 has v0 lc 61
2015-02-25 11:53:54.020622 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 194 ==== election(c98533db-e94a-4062-8b01-43319c52b048 ack 7) v5 ==== 26058+0+0 (108993628 0 0) 0x7ff2f80008c0 con 0x423be50
2015-02-25 11:53:54.020650 7ff318914700 20 mon.a@0(electing) e1 have connection
2015-02-25 11:53:54.020653 7ff318914700 20 mon.a@0(electing) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.020658 7ff318914700 20 mon.a@0(electing) e1  caps allow *
2015-02-25 11:53:54.020660 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.020663 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.020664 7ff318914700 20  allow all
2015-02-25 11:53:54.020665 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.020666 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.020667 7ff318914700 20  allow all
2015-02-25 11:53:54.020687 7ff318914700  5 mon.a@0(electing).elector(7) handle_ack from mon.2
2015-02-25 11:53:54.020691 7ff318914700  5 mon.a@0(electing).elector(7)  so far i have {0=1125899906842623,2=1125899906842623}
2015-02-25 11:53:54.020705 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 211 ==== election(c98533db-e94a-4062-8b01-43319c52b048 ack 7) v5 ==== 26058+0+0 (108993628 0 0) 0x7ff2fc001b70 con 0x4236480
2015-02-25 11:53:54.020714 7ff318914700 20 mon.a@0(electing) e1 have connection
2015-02-25 11:53:54.020715 7ff318914700 20 mon.a@0(electing) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.020719 7ff318914700 20 mon.a@0(electing) e1  caps allow *
2015-02-25 11:53:54.020721 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.020722 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.020723 7ff318914700 20  allow all
2015-02-25 11:53:54.020724 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.020725 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.020726 7ff318914700 20  allow all
2015-02-25 11:53:54.020739 7ff318914700  5 mon.a@0(electing).elector(7) handle_ack from mon.1
2015-02-25 11:53:54.020742 7ff318914700  5 mon.a@0(electing).elector(7)  so far i have {0=1125899906842623,1=1125899906842623,2=1125899906842623}
2015-02-25 11:53:54.020750 7ff318914700 10 mon.a@0(electing).elector(7) bump_epoch 7 to 8
2015-02-25 11:53:54.053558 7ff318914700 10 mon.a@0(electing) e1 join_election
2015-02-25 11:53:54.053594 7ff318914700 10 mon.a@0(electing) e1 _reset
2015-02-25 11:53:54.053598 7ff318914700 10 mon.a@0(electing) e1 cancel_probe_timeout (none scheduled)
2015-02-25 11:53:54.053601 7ff318914700 10 mon.a@0(electing) e1 timecheck_finish
2015-02-25 11:53:54.053605 7ff318914700 15 mon.a@0(electing) e1 health_tick_stop
2015-02-25 11:53:54.053608 7ff318914700 15 mon.a@0(electing) e1 health_interval_stop
2015-02-25 11:53:54.053613 7ff318914700 10 mon.a@0(electing) e1 scrub_reset
2015-02-25 11:53:54.053617 7ff318914700 10 mon.a@0(electing).paxos(paxos recovering c 1..61) restart -- canceling timeouts
2015-02-25 11:53:54.053629 7ff318914700 10 mon.a@0(electing).paxosservice(pgmap 1..16) restart
2015-02-25 11:53:54.053637 7ff318914700 10 mon.a@0(electing).paxosservice(mdsmap 1..1) restart
2015-02-25 11:53:54.053640 7ff318914700 10 mon.a@0(electing).paxosservice(osdmap 1..10) restart
2015-02-25 11:53:54.053642 7ff318914700 10 mon.a@0(electing).paxosservice(logm 1..37) restart
2015-02-25 11:53:54.053644 7ff318914700 10 mon.a@0(electing).paxosservice(monmap 1..1) restart
2015-02-25 11:53:54.053647 7ff318914700 10 mon.a@0(electing).paxosservice(auth 1..5) restart
2015-02-25 11:53:54.053732 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- election(c98533db-e94a-4062-8b01-43319c52b048 victory 8) v5 -- ?+0 0x7ff30005b4f0
2015-02-25 11:53:54.053802 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- election(c98533db-e94a-4062-8b01-43319c52b048 victory 8) v5 -- ?+0 0x7ff30018f460
2015-02-25 11:53:54.053825 7ff318914700 10 mon.a@0(electing) e1 win_election epoch 8 quorum 0,1,2 features 1125899906842623
2015-02-25 11:53:54.053843 7ff318914700  0 log_channel(cluster) log [INF] : mon.a@0 won leader election with quorum 0,1,2
2015-02-25 11:53:54.053883 7ff318914700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x7ff300070690 con 0x4225cc0
2015-02-25 11:53:54.053906 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) leader_init -- starting paxos recovery
2015-02-25 11:53:54.054068 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) learned uncommitted 62 pn 300 (13530 bytes) from myself
2015-02-25 11:53:54.112165 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) get_new_proposal_number = 400
2015-02-25 11:53:54.112208 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) collect with pn 400
2015-02-25 11:53:54.112225 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- paxos(collect lc 61 fc 1 pn 400 opn 0) v3 -- ?+0 0x7ff3000709a0
2015-02-25 11:53:54.112272 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- paxos(collect lc 61 fc 1 pn 400 opn 0) v3 -- ?+0 0x7ff300070d00
2015-02-25 11:53:54.112306 7ff318914700 10 mon.a@0(leader).paxosservice(monmap 1..1) election_finished
2015-02-25 11:53:54.112314 7ff318914700 10 mon.a@0(leader).paxosservice(monmap 1..1) _active - not active
2015-02-25 11:53:54.112322 7ff318914700 10 mon.a@0(leader).paxosservice(pgmap 1..16) election_finished
2015-02-25 11:53:54.112325 7ff318914700 10 mon.a@0(leader).paxosservice(pgmap 1..16) _active - not active
2015-02-25 11:53:54.112330 7ff318914700 10 mon.a@0(leader).paxosservice(mdsmap 1..1) election_finished
2015-02-25 11:53:54.112335 7ff318914700 10 mon.a@0(leader).paxosservice(mdsmap 1..1) _active - not active
2015-02-25 11:53:54.112344 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) election_finished
2015-02-25 11:53:54.112348 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) _active - not active
2015-02-25 11:53:54.112355 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37) election_finished
2015-02-25 11:53:54.112358 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37) _active - not active
2015-02-25 11:53:54.112362 7ff318914700 10 mon.a@0(leader).paxosservice(auth 1..5) election_finished
2015-02-25 11:53:54.112366 7ff318914700 10 mon.a@0(leader).paxosservice(auth 1..5) _active - not active
2015-02-25 11:53:54.112371 7ff318914700 10 mon.a@0(leader).data_health(8) start_epoch epoch 8
2015-02-25 11:53:54.112399 7ff318914700 10 mon.a@0(leader) e1 timecheck_finish
2015-02-25 11:53:54.112406 7ff318914700 10 mon.a@0(leader) e1 resend_routed_requests
2015-02-25 11:53:54.112413 7ff318914700 10 mon.a@0(leader) e1 register_cluster_logger
2015-02-25 11:53:54.112430 7ff318914700 10 mon.a@0(leader) e1 timecheck_start
2015-02-25 11:53:54.112435 7ff318914700 10 mon.a@0(leader) e1 timecheck_start_round curr 0
2015-02-25 11:53:54.112441 7ff318914700 10 mon.a@0(leader) e1 timecheck_start_round new 1
2015-02-25 11:53:54.112445 7ff318914700 10 mon.a@0(leader) e1 timecheck
2015-02-25 11:53:54.112448 7ff318914700 10 mon.a@0(leader) e1 timecheck start timecheck epoch 8 round 1
2015-02-25 11:53:54.112469 7ff318914700 10 mon.a@0(leader) e1 timecheck send time_check( ping e 8 r 1 ) v1 to mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.112486 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- time_check( ping e 8 r 1 ) v1 -- ?+0 0x7ff300191010
2015-02-25 11:53:54.112521 7ff318914700 10 mon.a@0(leader) e1 timecheck send time_check( ping e 8 r 1 ) v1 to mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.112533 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- time_check( ping e 8 r 1 ) v1 -- ?+0 0x7ff30018bfa0
2015-02-25 11:53:54.112563 7ff318914700 10 mon.a@0(leader) e1 timecheck_start_round setting up next event
2015-02-25 11:53:54.112577 7ff318914700 15 mon.a@0(leader) e1 health_tick_start
2015-02-25 11:53:54.112581 7ff318914700 15 mon.a@0(leader) e1 health_tick_stop
2015-02-25 11:53:54.112589 7ff318914700 10 mon.a@0(leader) e1 do_health_to_clog_interval
2015-02-25 11:53:54.112592 7ff318914700 10 mon.a@0(leader) e1 do_health_to_clog (force)
2015-02-25 11:53:54.112661 7ff318914700 10 mon.a@0(leader).data_health(8) get_health
2015-02-25 11:53:54.112691 7ff318914700  0 log_channel(cluster) log [INF] : HEALTH_WARN; 3 near full osd(s)
2015-02-25 11:53:54.112732 7ff318914700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x7ff30018dfb0 con 0x4225cc0
2015-02-25 11:53:54.112758 7ff318914700 15 mon.a@0(leader) e1 health_interval_start
2015-02-25 11:53:54.112762 7ff318914700 15 mon.a@0(leader) e1 health_interval_stop
2015-02-25 11:53:54.112765 7ff318914700 20 mon.a@0(leader) e1 health_interval_calc_next_update now: 2015-02-25 11:53:54.112764, next: 2015-02-25 12:00:00.000000, interval: 3600
2015-02-25 11:53:54.112811 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7ff300070690 con 0x4225cc0
2015-02-25 11:53:54.112827 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.112830 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.0 127.0.0.1:6789/0 is openallow * for mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.112840 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.112843 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.112852 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.112853 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.112862 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:54.112866 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.112867 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.113034 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7ff30018dfb0 con 0x4225cc0
2015-02-25 11:53:54.113065 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.113069 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.0 127.0.0.1:6789/0 is openallow * for mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.113078 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.113081 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.113090 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.113091 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.113104 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:54.113137 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.113139 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.113335 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 195 ==== forward(osd_boot(osd.2 booted 0 features 1125899906842623 v8) v6 caps allow * tid 13 con_features 1125899906842623) to leader v3 ==== 1978+0+0 (4197059272 0 0) 0x7ff2f80008c0 con 0x423be50
2015-02-25 11:53:54.113382 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.113387 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.113395 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.113402 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.113406 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.113410 7ff318914700 20  allow all
2015-02-25 11:53:54.113413 7ff318914700 10 mon.a@0(leader) e1 received forwarded message from osd.2 127.0.0.1:6813/18809 via mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.113420 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.113423 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.113424 7ff318914700 20  allow all
2015-02-25 11:53:54.113433 7ff318914700 10 mon.a@0(leader) e1  caps are allow *
2015-02-25 11:53:54.113436 7ff318914700 10 mon.a@0(leader) e1  entity name 'unknown.' type 0
2015-02-25 11:53:54.113440 7ff318914700 10 mon.a@0(leader) e1  mesg 0x7ff2f8001d50 from 127.0.0.1:6791/0
2015-02-25 11:53:54.113444 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.113447 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: osd.2 :/0 is openallow * for osd.2 :/0
2015-02-25 11:53:54.113452 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.113454 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch osd_boot(osd.2 booted 0 features 1125899906842623 v8) v6 from osd.2 127.0.0.1:6813/18809
2015-02-25 11:53:54.113463 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.113464 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.113476 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  waiting for paxos -> readable (v8)
2015-02-25 11:53:54.113479 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.113480 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115223 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 212 ==== forward(log(1 entries) v1 caps allow * tid 36 con_features 0) to leader v3 ==== 587+0+0 (3901180112 0 0) 0x7ff2fc001b70 con 0x4236480
2015-02-25 11:53:54.115262 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115269 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115284 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115291 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.115295 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115299 7ff318914700 20  allow all
2015-02-25 11:53:54.115304 7ff318914700 10 mon.a@0(leader) e1 received forwarded message from mon.1 127.0.0.1:6790/0 via mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115318 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.115322 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115325 7ff318914700 20  allow all
2015-02-25 11:53:54.115338 7ff318914700 10 mon.a@0(leader) e1  caps are allow *
2015-02-25 11:53:54.115344 7ff318914700 10 mon.a@0(leader) e1  entity name 'unknown.' type 0
2015-02-25 11:53:54.115351 7ff318914700 10 mon.a@0(leader) e1  mesg 0x7ff2fc002860 from 127.0.0.1:6790/0
2015-02-25 11:53:54.115360 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115364 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 :/0 is openallow * for mon.1 :/0
2015-02-25 11:53:54.115374 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115378 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115393 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.115394 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115410 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:54.115417 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.115418 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115451 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 213 ==== forward(osd_boot(osd.0 booted 0 features 1125899906842623 v3) v6 caps allow * tid 5 con_features 1125899906842623) to leader v3 ==== 1978+0+0 (3061426784 0 0) 0x7ff2fc003300 con 0x4236480
2015-02-25 11:53:54.115479 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115484 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115497 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115504 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.115508 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115511 7ff318914700 20  allow all
2015-02-25 11:53:54.115514 7ff318914700 10 mon.a@0(leader) e1 received forwarded message from osd.0 127.0.0.1:6800/18322 via mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115528 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.115532 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115535 7ff318914700 20  allow all
2015-02-25 11:53:54.115545 7ff318914700 10 mon.a@0(leader) e1  caps are allow *
2015-02-25 11:53:54.115550 7ff318914700 10 mon.a@0(leader) e1  entity name 'unknown.' type 0
2015-02-25 11:53:54.115556 7ff318914700 10 mon.a@0(leader) e1  mesg 0x7ff2fc003ac0 from 127.0.0.1:6790/0
2015-02-25 11:53:54.115564 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115568 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: osd.0 :/0 is openallow * for osd.0 :/0
2015-02-25 11:53:54.115577 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115581 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch osd_boot(osd.0 booted 0 features 1125899906842623 v3) v6 from osd.0 127.0.0.1:6800/18322
2015-02-25 11:53:54.115594 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.115595 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115608 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  waiting for paxos -> readable (v3)
2015-02-25 11:53:54.115614 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.115615 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115642 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 214 ==== forward(osd_boot(osd.1 booted 0 features 1125899906842623 v5) v6 caps allow * tid 11 con_features 1125899906842623) to leader v3 ==== 1978+0+0 (1198834337 0 0) 0x7ff2fc005720 con 0x4236480
2015-02-25 11:53:54.115668 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115672 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115685 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115740 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.115745 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115748 7ff318914700 20  allow all
2015-02-25 11:53:54.115751 7ff318914700 10 mon.a@0(leader) e1 received forwarded message from osd.1 127.0.0.1:6809/18556 via mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115766 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.115770 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115772 7ff318914700 20  allow all
2015-02-25 11:53:54.115787 7ff318914700 10 mon.a@0(leader) e1  caps are allow *
2015-02-25 11:53:54.115792 7ff318914700 10 mon.a@0(leader) e1  entity name 'unknown.' type 0
2015-02-25 11:53:54.115798 7ff318914700 10 mon.a@0(leader) e1  mesg 0x7ff2fc005a30 from 127.0.0.1:6790/0
2015-02-25 11:53:54.115807 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115811 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: osd.1 :/0 is openallow * for osd.1 :/0
2015-02-25 11:53:54.115820 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115824 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch osd_boot(osd.1 booted 0 features 1125899906842623 v5) v6 from osd.1 127.0.0.1:6809/18556
2015-02-25 11:53:54.115837 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.115839 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115853 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  waiting for paxos -> readable (v5)
2015-02-25 11:53:54.115858 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.115860 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.115890 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 215 ==== forward(osd_alive(want up_thru 6 have 6) v1 caps allow * tid 14 con_features 1125899906842623) to leader v3 ==== 286+0+0 (407096547 0 0) 0x7ff2fc007780 con 0x4236480
2015-02-25 11:53:54.115915 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.115921 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115934 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.115939 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.115943 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115946 7ff318914700 20  allow all
2015-02-25 11:53:54.115949 7ff318914700 10 mon.a@0(leader) e1 received forwarded message from osd.0 127.0.0.1:6800/18322 via mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.115963 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.115966 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.115969 7ff318914700 20  allow all
2015-02-25 11:53:54.115979 7ff318914700 10 mon.a@0(leader) e1  caps are allow *
2015-02-25 11:53:54.115984 7ff318914700 10 mon.a@0(leader) e1  entity name 'unknown.' type 0
2015-02-25 11:53:54.115990 7ff318914700 10 mon.a@0(leader) e1  mesg 0x7ff2fc0082b0 from 127.0.0.1:6790/0
2015-02-25 11:53:54.115998 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.116002 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: osd.0 :/0 is openallow * for osd.0 :/0
2015-02-25 11:53:54.116010 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.116014 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch osd_alive(want up_thru 6 have 6) v1 from osd.0 127.0.0.1:6800/18322
2015-02-25 11:53:54.116026 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.116027 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.116040 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  waiting for paxos -> readable (v6)
2015-02-25 11:53:54.116046 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.116047 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.116081 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 216 ==== forward(mon_command({"prefix": "osd setcrushmap"} v 0) v1 caps allow * tid 35 con_features 1125899906842623) to leader v3 ==== 776+0+0 (644466655 0 0) 0x7ff2fc007f10 con 0x4236480
2015-02-25 11:53:54.116106 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.116110 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.116123 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.116128 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.116132 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.116135 7ff318914700 20  allow all
2015-02-25 11:53:54.116138 7ff318914700 10 mon.a@0(leader) e1 received forwarded message from client.4114 127.0.0.1:0/1018996 via mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.116152 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.116155 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.116158 7ff318914700 20  allow all
2015-02-25 11:53:54.116167 7ff318914700 10 mon.a@0(leader) e1  caps are allow *
2015-02-25 11:53:54.116171 7ff318914700 10 mon.a@0(leader) e1  entity name 'unknown.' type 0
2015-02-25 11:53:54.116177 7ff318914700 10 mon.a@0(leader) e1  mesg 0x7ff2fc008d40 from 127.0.0.1:6790/0
2015-02-25 11:53:54.116185 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.116189 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: client.4114 :/0 is openallow * for client.4114 :/0
2015-02-25 11:53:54.116198 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.116491 7ff318914700  0 mon.a@0(leader) e1 handle_command mon_command({"prefix": "osd setcrushmap"} v 0) v1
2015-02-25 11:53:54.116544 7ff318914700 20 is_capable service=osd command=osd setcrushmap read write on cap allow *
2015-02-25 11:53:54.116551 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.116554 7ff318914700 20  allow all
2015-02-25 11:53:54.116556 7ff318914700 10 mon.a@0(leader) e1 _allowed_command capable
2015-02-25 11:53:54.116574 7ff318914700  0 log_channel(audit) log [INF] : from='client.4114 :/0' entity='unknown.' cmd=[{"prefix": "osd setcrushmap"}]: dispatch
2015-02-25 11:53:54.116628 7ff318914700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x7ff300194030 con 0x4225cc0
2015-02-25 11:53:54.116662 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10) dispatch mon_command({"prefix": "osd setcrushmap"} v 0) v1 from client.4114 127.0.0.1:0/1018996
2015-02-25 11:53:54.116678 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.116680 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.116693 7ff318914700 10 mon.a@0(leader).paxosservice(osdmap 1..10)  waiting for paxos -> readable (v0)
2015-02-25 11:53:54.116700 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.116701 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.116745 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.0 127.0.0.1:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x7ff300194030 con 0x4225cc0
2015-02-25 11:53:54.116768 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.116773 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.0 127.0.0.1:6789/0 is openallow * for mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.116787 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.116791 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37) dispatch log(1 entries) v1 from mon.0 127.0.0.1:6789/0
2015-02-25 11:53:54.116804 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.116805 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.116817 7ff318914700 10 mon.a@0(leader).paxosservice(logm 1..37)  waiting for paxos -> readable (v0)
2015-02-25 11:53:54.116822 7ff318914700  5 mon.a@0(leader).paxos(paxos recovering c 1..61) is_readable = 0 - now=2015-02-25 11:53:54.116824 lease_expire=0.000000 has v0 lc 61
2015-02-25 11:53:54.164582 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 196 ==== paxos(last lc 61 fc 1 pn 400 opn 300) v3 ==== 13622+0+0 (1717090164 0 0) 0x7ff2f80008c0 con 0x423be50
2015-02-25 11:53:54.164603 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.164606 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.164621 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.164624 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.164625 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.164626 7ff318914700 20  allow all
2015-02-25 11:53:54.164627 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.164629 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.164629 7ff318914700 20  allow all
2015-02-25 11:53:54.164631 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) handle_last paxos(last lc 61 fc 1 pn 400 opn 300) v3
2015-02-25 11:53:54.164638 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) store_state nothing to commit
2015-02-25 11:53:54.164662 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61)  they accepted our pn, we now have 2 peons
2015-02-25 11:53:54.164667 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) we learned an uncommitted value for 62 pn 300 13530 bytes
2015-02-25 11:53:54.164677 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 197 ==== time_check( pong e 8 r 1 ts 2015-02-25 11:53:54.164458 ) v1 ==== 36+0+0 (2444859969 0 0) 0x7ff2f80068b0 con 0x423be50
2015-02-25 11:53:54.164686 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.164688 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.164692 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.164694 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.164695 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.164696 7ff318914700 20  allow all
2015-02-25 11:53:54.164697 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck time_check( pong e 8 r 1 ts 2015-02-25 11:53:54.164458 ) v1
2015-02-25 11:53:54.164701 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck_leader time_check( pong e 8 r 1 ts 2015-02-25 11:53:54.164458 ) v1
2015-02-25 11:53:54.164713 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck_leader from mon.2 127.0.0.1:6791/0 ts 2015-02-25 11:53:54.164458 delta -0.000246525 skew_bound 0 latency 0.0521936
2015-02-25 11:53:54.187725 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 217 ==== paxos(last lc 61 fc 1 pn 400 opn 300) v3 ==== 13622+0+0 (3722418234 0 0) 0x7ff2fc003300 con 0x4236480
2015-02-25 11:53:54.187754 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.187757 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.187760 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.187762 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.187763 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.187764 7ff318914700 20  allow all
2015-02-25 11:53:54.187765 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.187766 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.187766 7ff318914700 20  allow all
2015-02-25 11:53:54.187767 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) handle_last paxos(last lc 61 fc 1 pn 400 opn 300) v3
2015-02-25 11:53:54.187772 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) store_state nothing to commit
2015-02-25 11:53:54.187793 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61)  they accepted our pn, we now have 3 peons
2015-02-25 11:53:54.187797 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) we learned an uncommitted value for 62 pn 300 13530 bytes
2015-02-25 11:53:54.187803 7ff318914700 10 mon.a@0(leader).paxos(paxos recovering c 1..61) that's everyone.  begin on old learned value
2015-02-25 11:53:54.187804 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61) begin for 62 13530 bytes
2015-02-25 11:53:54.221061 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61)  sending begin to mon.1
2015-02-25 11:53:54.221100 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- paxos(begin lc 61 fc 0 pn 400 opn 0) v3 -- ?+0 0x7ff3000770f0
2015-02-25 11:53:54.221129 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61)  sending begin to mon.2
2015-02-25 11:53:54.221136 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- paxos(begin lc 61 fc 0 pn 400 opn 0) v3 -- ?+0 0x7ff300077f20
2015-02-25 11:53:54.221178 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 218 ==== time_check( pong e 8 r 1 ts 2015-02-25 11:53:54.187582 ) v1 ==== 36+0+0 (128518225 0 0) 0x7ff2fc005720 con 0x4236480
2015-02-25 11:53:54.221201 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.221205 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.221211 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.221215 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.221217 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.221219 7ff318914700 20  allow all
2015-02-25 11:53:54.221221 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck time_check( pong e 8 r 1 ts 2015-02-25 11:53:54.187582 ) v1
2015-02-25 11:53:54.221225 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck_leader time_check( pong e 8 r 1 ts 2015-02-25 11:53:54.187582 ) v1
2015-02-25 11:53:54.221240 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck_leader from mon.1 127.0.0.1:6790/0 ts 2015-02-25 11:53:54.187582 delta -0.0336463 skew_bound 0 latency 0.108768
2015-02-25 11:53:54.221256 7ff318914700 10 mon.a@0(leader) e1 handle_timecheck_leader got pongs from everybody (3 total)
2015-02-25 11:53:54.221258 7ff318914700 10 mon.a@0(leader) e1 timecheck_finish_round curr 1
2015-02-25 11:53:54.221260 7ff318914700 10 mon.a@0(leader) e1 timecheck_report
2015-02-25 11:53:54.221271 7ff318914700 10 mon.a@0(leader) e1 timecheck_report send report to mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.221274 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- time_check( report e 8 r 2 #skews 3 #latencies 3 ) v1 -- ?+0 0x7ff30007a0c0
2015-02-25 11:53:54.221297 7ff318914700 10 mon.a@0(leader) e1 timecheck_report send report to mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.221303 7ff318914700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- time_check( report e 8 r 2 #skews 3 #latencies 3 ) v1 -- ?+0 0x7ff30007b190
2015-02-25 11:53:54.263269 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.2 127.0.0.1:6791/0 198 ==== paxos(accept lc 61 fc 0 pn 400 opn 0) v3 ==== 80+0+0 (3718585490 0 0) 0x7ff2f80008c0 con 0x423be50
2015-02-25 11:53:54.263299 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.263302 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.2 127.0.0.1:6791/0 is openallow * for mon.2 127.0.0.1:6791/0
2015-02-25 11:53:54.263308 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.263311 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.263314 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.263315 7ff318914700 20  allow all
2015-02-25 11:53:54.263317 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.263318 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.263319 7ff318914700 20  allow all
2015-02-25 11:53:54.263321 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61) handle_accept paxos(accept lc 61 fc 0 pn 400 opn 0) v3
2015-02-25 11:53:54.263329 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61)  now 0,2 have accepted
2015-02-25 11:53:54.263343 7ff318914700  1 -- 127.0.0.1:6789/0 <== mon.1 127.0.0.1:6790/0 219 ==== paxos(accept lc 61 fc 0 pn 400 opn 0) v3 ==== 80+0+0 (3889582458 0 0) 0x7ff2fc007780 con 0x4236480
2015-02-25 11:53:54.263351 7ff318914700 20 mon.a@0(leader) e1 have connection
2015-02-25 11:53:54.263352 7ff318914700 20 mon.a@0(leader) e1 ms_dispatch existing session MonSession: mon.1 127.0.0.1:6790/0 is openallow * for mon.1 127.0.0.1:6790/0
2015-02-25 11:53:54.263356 7ff318914700 20 mon.a@0(leader) e1  caps allow *
2015-02-25 11:53:54.263358 7ff318914700 20 is_capable service=mon command= read on cap allow *
2015-02-25 11:53:54.263359 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.263360 7ff318914700 20  allow all
2015-02-25 11:53:54.263362 7ff318914700 20 is_capable service=mon command= exec on cap allow *
2015-02-25 11:53:54.263363 7ff318914700 20  allow so far , doing grant allow *
2015-02-25 11:53:54.263364 7ff318914700 20  allow all
2015-02-25 11:53:54.263364 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61) handle_accept paxos(accept lc 61 fc 0 pn 400 opn 0) v3
2015-02-25 11:53:54.263369 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61)  now 0,1,2 have accepted
2015-02-25 11:53:54.263373 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61)  got majority, committing, done with update
2015-02-25 11:53:54.263375 7ff318914700 10 mon.a@0(leader).paxos(paxos updating-previous c 1..61) commit_start 62
2015-02-25 11:53:54.296667 7ff31a207700 20 mon.a@0(leader).paxos(paxos writing-previous c 1..61) commit_finish 62
2015-02-25 11:53:54.296764 7ff31a207700 10 mon.a@0(leader).paxos(paxos writing-previous c 1..62)  sending commit to mon.1
2015-02-25 11:53:54.296781 7ff31a207700  1 -- 127.0.0.1:6789/0 --> mon.1 127.0.0.1:6790/0 -- paxos(commit lc 62 fc 0 pn 400 opn 0) v3 -- ?+0 0x7ff2f0168080
2015-02-25 11:53:54.296809 7ff31a207700 10 mon.a@0(leader).paxos(paxos writing-previous c 1..62)  sending commit to mon.2
2015-02-25 11:53:54.296816 7ff31a207700  1 -- 127.0.0.1:6789/0 --> mon.2 127.0.0.1:6791/0 -- paxos(commit lc 62 fc 0 pn 400 opn 0) v3 -- ?+0 0x7ff2f0040960
2015-02-25 11:53:54.296841 7ff31a207700 10 mon.a@0(leader) e1 refresh_from_paxos
2015-02-25 11:53:54.296920 7ff31a207700 10 mon.a@0(leader).paxosservice(pgmap 1..16) refresh
2015-02-25 11:53:54.296968 7ff31a207700 10 mon.a@0(leader).paxosservice(mdsmap 1..1) refresh
2015-02-25 11:53:54.297010 7ff31a207700 10 mon.a@0(leader).paxosservice(osdmap 1..10) refresh
2015-02-25 11:53:54.297079 7ff31a207700 10 mon.a@0(leader).paxosservice(logm 1..38) refresh
Actions #5

Updated by Loïc Dachary about 9 years ago

  • Description updated (diff)
Actions #6

Updated by Loïc Dachary about 9 years ago

  • Description updated (diff)
Actions #7

Updated by Loïc Dachary about 9 years ago

  • Description updated (diff)
Actions #8

Updated by Loïc Dachary about 9 years ago

  • Description updated (diff)
Actions #9

Updated by Loïc Dachary about 9 years ago

<jluis> if the tool hangs for more than one of the timeouts (I believe the lowest is 5 seconds), elections will be triggered when it snaps out of it
<jluis> by then it will retry the message once election finishes
<jluis> that doesn't happen solely with the crushtool though; anything that takes more than that time, on the monitor, will end up having the same ill effects
<jluis> one of the reasons why leveldb writes are now performed on a separate thread
<jluis> this could be improved though
<loicd> the problem is that it is quite possible that crushtool will lag for that long
<jluis> for this particular purpose, having the crushtool map stuff be performed asynchronously, having a callback that would finish processing the update once the validation is finished; or retrying the message if we lost quorum
<loicd> yeah
<jluis> we would have to make sure that the callback would check if the state once if finishes validation is the same as the state once it started validation
<jluis> otherwise would have to revalidate
<jluis> e.g., two concurrent requests, as it is now, would not break anything as they'd be serialized
<jluis> if we async the validation, a concurrent request could change the crushmap in-between
Actions #10

Updated by Loïc Dachary about 9 years ago

Actions #11

Updated by Loïc Dachary about 9 years ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF