Bug #12550

CEPH_QA_SUITE/AARCH64: ceph_test_async_driver fail

Added by Yazen Ghannam over 8 years ago. Updated over 7 years ago.

Status: Can't reproduce
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags: aarch64 arm64
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The following test fails on AArch64:
rados:singleton-nomsgr/{all/msgr.yaml}

2015-07-09T06:46:32.044 INFO:teuthology.orchestra.run.teuth6.stdout:[----------] Global test environment tear-down
2015-07-09T06:46:32.044 INFO:teuthology.orchestra.run.teuth6.stdout:[==========] 5 tests from 2 test cases ran. (5122 ms total)
2015-07-09T06:46:32.045 INFO:teuthology.orchestra.run.teuth6.stdout:[  PASSED  ] 1 test.
2015-07-09T06:46:32.046 INFO:teuthology.orchestra.run.teuth6.stdout:[  FAILED  ] 4 tests, listed below:
2015-07-09T06:46:32.047 INFO:teuthology.orchestra.run.teuth6.stdout:[  FAILED  ] EventCenterTest.FileEventExpansion
2015-07-09T06:46:32.048 INFO:teuthology.orchestra.run.teuth6.stdout:[  FAILED  ] AsyncMessenger/EventDriverTest.PipeTest/1, where GetParam() = "select" 
2015-07-09T06:46:32.049 INFO:teuthology.orchestra.run.teuth6.stdout:[  FAILED  ] AsyncMessenger/EventDriverTest.NetworkSocketTest/0, where GetParam() = "epoll" 
2015-07-09T06:46:32.050 INFO:teuthology.orchestra.run.teuth6.stdout:[  FAILED  ] AsyncMessenger/EventDriverTest.NetworkSocketTest/1, where GetParam() = "select" 
2015-07-09T06:46:32.051 INFO:teuthology.orchestra.run.teuth6.stdout:
2015-07-09T06:46:32.051 INFO:teuthology.orchestra.run.teuth6.stdout: 4 FAILED TESTS
2015-07-09T06:46:32.053 INFO:teuthology.orchestra.run.teuth6.stderr:SetUp start set up select
2015-07-09T06:46:32.053 INFO:teuthology.orchestra.run.teuth6.stderr:2015-07-09 06:46:32.028267 3ff925c95f0 -1 EpollDriver.init unable to do epoll_create: (24) Too many open files
2015-07-09T06:46:32.107 ERROR:teuthology.run_tasks:Saw exception from tasks.

Attachment: teuthology-rados_singleton-nomsgr_ceph_test_async_driver_fail.log (604 KB), Yazen Ghannam, 07/31/2015 12:29 PM

History

#1 Updated by Haomai Wang over 8 years ago

I don't know why "::socket(AF_INET, SOCK_STREAM, 0)" would return -1 on AArch64:

2015-07-09 06:46:26.966675 3ff925c95f0 -1 EpollDriver.add_event epoll_ctl: add fd=-1 failed. (9) Bad file descriptor

#2 Updated by Loïc Dachary about 8 years ago

  • Tags changed from aarch64 to aarch64 arm64

#3 Updated by Dan Mick almost 8 years ago

  • Assignee set to Dan Mick

#4 Updated by Dan Mick almost 8 years ago

So, the error message says "too many open files". Any idea what the open files resource limits might have been?

#5 Updated by Haomai Wang almost 8 years ago

Is it this? https://github.com/ceph/ceph/blob/master/src/test/msgr/test_async_driver.cc#L258

But it only opens 300 fds, which I don't think is a large number.

#6 Updated by Dan Mick almost 8 years ago

Just noticed this was 8 mos old. Yazen, do you have any info about what the open-file limit on the machine you were testing might have been, or any more-recent experience with this?

#7 Updated by Yazen Ghannam almost 8 years ago

Dan,
We were using a mix of Fedora 21 and Ubuntu 14.04, and I didn't change the limits from the defaults. Unfortunately, we tore down our Teuthology cluster months ago so I can't verify what the exact values were.

I think some Ceph developers are using machines in the Linaro Colo Lab for testing. I'm not sure if they're running through the Ceph QA Suite, though. If they are, they may have more information that could confirm or invalidate this issue.

#8 Updated by Dan Mick over 7 years ago

  • Status changed from New to Can't reproduce

Please reopen if this reoccurs.
