Bug #13921 (closed): Error connecting to cluster: TypeError

Added by Sage Weil over 8 years ago. Updated over 8 years ago.

Status: Can't reproduce
Priority: Urgent
Assignee:
Category: Monitor
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2015-11-30T08:57:24.981 INFO:tasks.workunit.client.0.burnupi26.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:1707: test_mon_ping:  expect_false ceph ping mon.foo
2015-11-30T08:57:24.981 INFO:tasks.workunit.client.0.burnupi26.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:30: expect_false:  set -x
2015-11-30T08:57:24.981 INFO:tasks.workunit.client.0.burnupi26.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:31: expect_false:  ceph ping mon.foo
2015-11-30T08:57:25.095 INFO:tasks.workunit.client.0.burnupi26.stderr:Error connecting to cluster: ObjectNotFound
2015-11-30T08:57:25.108 INFO:tasks.workunit.client.0.burnupi26.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:31: expect_false:  return 0
2015-11-30T08:57:25.109 INFO:tasks.workunit.client.0.burnupi26.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:1709: test_mon_ping:  ceph ping 'mon.*'
2015-11-30T08:57:25.246 INFO:tasks.workunit.client.0.burnupi26.stdout:mon.a
2015-11-30T08:57:25.246 INFO:tasks.workunit.client.0.burnupi26.stdout:{
2015-11-30T08:57:25.247 INFO:tasks.workunit.client.0.burnupi26.stdout:    "health": {
2015-11-30T08:57:25.247 INFO:tasks.workunit.client.0.burnupi26.stdout:        "health": {
...
2015-11-30T08:57:25.288 INFO:tasks.workunit.client.0.burnupi26.stdout:            ]
2015-11-30T08:57:25.288 INFO:tasks.workunit.client.0.burnupi26.stdout:        }
2015-11-30T08:57:25.288 INFO:tasks.workunit.client.0.burnupi26.stdout:    }
2015-11-30T08:57:25.288 INFO:tasks.workunit.client.0.burnupi26.stdout:}
2015-11-30T08:57:25.288 INFO:tasks.workunit.client.0.burnupi26.stdout:
2015-11-30T08:57:25.289 INFO:tasks.workunit.client.0.burnupi26.stderr:Error connecting to cluster: TypeError

/a/sage-2015-11-30_05:29:21-rados-wip-sage-testing---basic-multi/1163936
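
The `expect_false` helper visible in the trace (test.sh lines 30-31) behaves roughly as follows. This is a reconstruction from the trace, not the verbatim cephtool/test.sh source:

```shell
# Reconstructed sketch (an assumption, not the actual test.sh code) of the
# expect_false helper: it succeeds exactly when the wrapped command fails,
# which is why "ceph ping mon.foo" erroring out makes it return 0 above.
expect_false() {
    set -x
    if "$@"; then return 1; else return 0; fi
}
```

In the run above, `ceph ping mon.foo` failed as expected (`ObjectNotFound`), so `expect_false` returned 0 and the test proceeded to `ceph ping 'mon.*'`, which is where the TypeError appeared.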
#1

Updated by Josh Durgin over 8 years ago

  • Assignee set to Josh Durgin

This is likely related to the Python 3 rados.py changes.
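
One plausible mechanism (an assumption; the log alone does not show the failing call): Python 2 treats `str` and `bytes` as the same type, so a binding that expects `bytes` silently accepts a `str` argument, while under Python 3 the same call raises `TypeError`, which surfaces to the user as "Error connecting to cluster: TypeError". A minimal sketch of this failure mode, using a hypothetical stand-in for the binding call:

```python
# Hypothetical stand-in (not the actual rados.py code) for a C-binding
# entry point that requires bytes, illustrating how Python 3 str/bytes
# strictness can surface as a bare TypeError.
def send_command(cmd):
    if not isinstance(cmd, bytes):
        # Under Python 2 a str literal is bytes, so this never fired;
        # under Python 3 it does.
        raise TypeError("command must be bytes, got %s" % type(cmd).__name__)
    return len(cmd)

# A Python 3 caller passing a str fails the way the log shows:
try:
    send_command('{"prefix": "ping"}')
except TypeError:
    pass  # this is the path taken under Python 3
```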

#2

Updated by Josh Durgin over 8 years ago

  • Assignee deleted (Josh Durgin)

Can't reproduce on master (at least not easily); it may be a bug in a testing commit.

#3

Updated by Loïc Dachary over 8 years ago

  • Assignee set to Loïc Dachary
#4

Updated by Loïc Dachary over 8 years ago

  • Status changed from New to In Progress
#5

Updated by Loïc Dachary over 8 years ago

source ~/Downloads/ovh-gra1-openrc.sh
teuthology-openstack --verbose --num 10 --simultaneous-jobs 50 --key-filename ~/Downloads/myself --key-name loic --teuthology-git-url http://github.com/dachary/teuthology --teuthology-branch openstack --ceph-qa-suite-git-url http://github.com/ceph/ceph-qa-suite --suite-branch master --suite rados --filter 'rados/singleton/{all/cephtool.yaml fs/btrfs.yaml msgr/simple.yaml msgr-failures/many.yaml}' 
#6

Updated by Loïc Dachary over 8 years ago

I don't see anything even remotely capable of causing the error in the commits from wip-sage-testing that were not already in master.

a7f520c auth: fix a crash issue due to CryptoHandler::create() failed 
e9e0533 auth: fix double PK11_DestroyContext() if PK11_DigestFinal() failed 
574e319 Test:bencher ops counter doesn't increase Signed-off-by: Tao Chang <changtao@hihuron.com> 
f5e0cce osd: don't update rollback_info for replicated pool rollback_info is just needed for ec-pool to rollback the patial committed chunks to previous version. Avoid recording rollback_info in replicated pool to save cpu cost and disk bandwidth 
2b390fc osd: don't update unneccessary epoch for pg epoch always remains unless state of cluster changes. Therefore, avoid update epoch for every Op in order to same cpu cost and disk bandwidth. 
84945d0 Implemented log message size predictor. It tracks size of log messages. Initial allocation size is derived from last log message from the same line of code. Fixed bug in test. 
a172fef Speed optimizations. Merged 3 writes into 1. Got rid of std::string construction. More unification on syslog,stderr,fd. 
56da106 Test:bencher wrong test margin casuses writes over object_size 
917d85f osbench: Adds handling for the lack of required folders ( data & journal ) and adds checking for previous data presence to avoid assertion 
2902030 osbench: Fix race condition that may cause Sequencer::dtor assertion on benchmark completion 
daae180 Doubled marking from line 1151 
ada6e32 ceph/bp-smaller-pglog-2 osd: slightly reduce actual size of pg_log_entry_t 
93984e1 osd: reuse coll and object encode resut in get_encoded_bytes 
#7

Updated by Loïc Dachary over 8 years ago

With wip-sage-testing

teuthology-openstack --verbose --num 10 --simultaneous-jobs 50 --key-filename ~/Downloads/myself --key-name loic --teuthology-git-url http://github.com/dachary/teuthology --teuthology-branch openstack --ceph-qa-suite-git-url http://github.com/ceph/ceph-qa-suite --suite-branch master --ceph-git-url http://github.com/dachary/ceph --ceph wip-sage-testing --suite rados --filter 'rados/singleton/{all/cephtool.yaml fs/btrfs.yaml msgr/simple.yaml msgr-failures/many.yaml}' 

This run hit a bug that has since been fixed in master.

#8

Updated by Loïc Dachary over 8 years ago

After rebasing wip-sage-testing on master:

teuthology-openstack --verbose --num 10 --simultaneous-jobs 50 --key-filename ~/Downloads/myself --key-name loic --teuthology-git-url http://github.com/dachary/teuthology --teuthology-branch openstack --ceph-qa-suite-git-url http://github.com/ceph/ceph-qa-suite --suite-branch master --ceph-git-url http://github.com/dachary/ceph --ceph wip-sage-testing-rebased --suite rados --filter 'rados/singleton/{all/cephtool.yaml fs/btrfs.yaml msgr/simple.yaml msgr-failures/many.yaml}' 

After rebasing, the commits that remain are:

b53cce2 wip-sage-testing-rebased loic/wip-sage-testing-rebased Revert "Merge branch 'wip-log-alloc-predictor' of git://github.com/aclamk/ceph into wip-sage-testing" 
5dafb9f Revert "Merge branch 'wip-encode-bytes' of git://github.com/XinzeChi/ceph into wip-sage-testing" 
60ab746 auth: fix a crash issue due to CryptoHandler::create() failed 
a7736ea auth: fix double PK11_DestroyContext() if PK11_DigestFinal() failed 
a613ffe Implemented log message size predictor. It tracks size of log messages. Initial allocation size is derived from last log message from the same line of code. Fixed bug in test. 

9 out of 10 jobs passed, and the one failure is unrelated:

2015-12-04T13:35:28.964 INFO:tasks.workunit.client.0.target149202191082.stderr:2015-12-04 13:35:28.963437 7f6a0657e700  0 monclient: hunting for new mon
2015-12-04T13:35:29.162 INFO:tasks.workunit.client.0.target149202191082.stdout:ceph version 10.0.0-861-gb53cce2 (b53cce2c674707b42ffe1ad166ebfa30194c509d)
2015-12-04T13:35:29.174 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:1638: test_mon_tell:  ceph_watch_wait 'mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch'
2015-12-04T13:35:29.174 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:158: ceph_watch_wait:  local 'regexp=mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch'
2015-12-04T13:35:29.174 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:159: ceph_watch_wait:  local timeout=30
2015-12-04T13:35:29.175 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:161: ceph_watch_wait:  '[' -n '' ']'
2015-12-04T13:35:29.176 INFO:tasks.workunit.client.0.target149202191082.stderr://home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:165: ceph_watch_wait:  seq 30
2015-12-04T13:35:29.176 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:165: ceph_watch_wait:  for i in '`seq ${timeout}`'
2015-12-04T13:35:29.176 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:166: ceph_watch_wait:  sleep 1
2015-12-04T13:35:30.178 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:167: ceph_watch_wait:  grep -q 'mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch' /tmp/cephtool10061/CEPH_WATCH_10061
2015-12-04T13:35:30.180 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:165: ceph_watch_wait:  for i in '`seq ${timeout}`'
2015-12-04T13:35:30.180 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:166: ceph_watch_wait:  sleep 1
2015-12-04T13:35:31.182 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:167: ceph_watch_wait:  grep -q 'mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch' /tmp/cephtool10061/CEPH_WATCH_10061
2015-12-04T13:35:31.185 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:165: ceph_watch_wait:  for i in '`seq ${timeout}`'
2015-12-04T13:35:31.185 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:166: ceph_watch_wait:  sleep 1
2015-12-04T13:35:31.944 INFO:tasks.workunit.client.0.target149202191082.stderr:2015-12-04 13:35:31.942940 7faff4f35700  0 -- 149.202.191.82:0/3836557283 submit_message mon_command({"prefix": "status"} v 0) v1 remote, 149.202.191.82:6791/0, failed lossy con, dropping message 0x7faff004e400
2015-12-04T13:35:31.944 INFO:tasks.workunit.client.0.target149202191082.stderr:2015-12-04 13:35:31.942977 7fafedd7d700  0 monclient: hunting for new mon
2015-12-04T13:35:31.949 INFO:tasks.workunit.client.0.target149202191082.stderr:2015-12-04 13:35:31.946968 7fafedd7d700  0 monclient: hunting for new mon
2015-12-04T13:35:32.187 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:167: ceph_watch_wait:  grep -q 'mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch' /tmp/cephtool10061/CEPH_WATCH_10061
...
n"}\]: dispatch' /tmp/cephtool10061/CEPH_WATCH_10061
2015-12-04T13:35:59.325 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:170: ceph_watch_wait:  kill 5973
2015-12-04T13:35:59.325 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:172: ceph_watch_wait:  grep 'mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch' /tmp/cephtool10061/CEPH_WATCH_10061
2015-12-04T13:35:59.329 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh: line 157:  5973 Terminated              ceph $whatch_opt > $CEPH_WATCH_FILE
2015-12-04T13:35:59.329 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:173: ceph_watch_wait:  echo 'pattern mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch not found in watch file. Full watch file content:'
2015-12-04T13:35:59.330 INFO:tasks.workunit.client.0.target149202191082.stderr:pattern mon.0 \[DBG\] from.*cmd=\[{"prefix": "version"}\]: dispatch not found in watch file. Full watch file content:
2015-12-04T13:35:59.330 INFO:tasks.workunit.client.0.target149202191082.stderr:/home/ubuntu/cephtest/workunit.client.0/cephtool/test.sh:174: ceph_watch_wait:  cat /tmp/cephtool10061/CEPH_WATCH_10061
2015-12-04T13:35:59.331 INFO:tasks.workunit.client.0.target149202191082.stderr:    cluster aaa48ccc-347f-463f-9f90-c3dacf556458
2015-12-04T13:35:59.331 INFO:tasks.workunit.client.0.target149202191082.stderr:     health HEALTH_OK
2015-12-04T13:35:59.331 INFO:tasks.workunit.client.0.target149202191082.stderr:     monmap e1: 3 mons at {a=149.202.191.82:6789/0,b=149.202.191.82:6790/0,c=149.202.191.82:6791/0}
2015-12-04T13:35:59.331 INFO:tasks.workunit.client.0.target149202191082.stderr:            election epoch 8, quorum 0,1,2 a,b,c
2015-12-04T13:35:59.331 INFO:tasks.workunit.client.0.target149202191082.stderr:     osdmap e407: 3 osds: 3 up, 3 in
2015-12-04T13:35:59.332 INFO:tasks.workunit.client.0.target149202191082.stderr:      pgmap v499: 13 pgs, 2 pools, 0 bytes data, 0 objects
2015-12-04T13:35:59.332 INFO:tasks.workunit.client.0.target149202191082.stderr:            29289 MB used, 86558 MB / 118 GB avail
2015-12-04T13:35:59.332 INFO:tasks.workunit.client.0.target149202191082.stderr:                  13 active+clean
2015-12-04T13:35:59.332 INFO:tasks.workunit.client.0.target149202191082.stderr:
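
The `ceph_watch_wait` loop visible in the trace polls a watch file for a log pattern once per second until a timeout. A sketch reconstructed from the trace (variable names and details are assumptions, not the verbatim test.sh source, and the real helper also kills a background `ceph -w` watcher):

```shell
# Reconstructed sketch of a ceph_watch_wait-style loop: poll the watch
# file once per second, up to a timeout, for the expected log pattern;
# on timeout, dump the watch file contents for debugging and fail.
ceph_watch_wait() {
    regexp=$1
    timeout=${2:-30}
    for i in $(seq "$timeout"); do
        sleep 1
        grep -q "$regexp" "$CEPH_WATCH_FILE" && return 0
    done
    echo "pattern $regexp not found in watch file. Full watch file content:"
    cat "$CEPH_WATCH_FILE"
    return 1
}
```

In the failing job the expected `[DBG] ... dispatch` line never appeared within the 30-second timeout, so the helper dumped the watch file, which contained `ceph status` output instead, and the test failed.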
#9

Updated by Loïc Dachary over 8 years ago

  • Status changed from In Progress to Can't reproduce