Bug #9381
"jerasure load dlopen(/usr/lib64/ceph/erasure-code/libec_lrc.so)" error in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
100%
Description
Per Josh's analysis:
"looking at one of the ones that timed out waiting to be healthy: http://pulpito.ceph.com/teuthology-2014-09-06_17:08:01-upgrade:dumpling-firefly-x-master-distro-basic-vps/470461/
teuthology.log reports that 1 mon is down
2:28 and it's mon.a
2:29 the end of mon.a's log has an error: -1 load: jerasure load dlopen(/usr/lib64/ceph/erasure-code/libec_lrc.so): /usr/lib64/ceph/erasure-code/libec_lrc.so: cannot open shared object file: No such file or directory
2:29 no crash, but maybe that failure caused the mon to exit"
Error from vpm019/log/ceph-mon.a.log.gz:
vpm019/log/ceph-mon.a.log.gz:75490040-2014-09-06 21:16:11.934638 7f532445e700 15 mon.a@0(leader).mds e10 _note_beacon mdsbeacon(4500/a up:active seq 89 v10) v2 noting time
vpm019/log/ceph-mon.a.log.gz:75490174-2014-09-06 21:16:11.934646 7f532445e700 1 -- 10.214.138.58:6789/0 --> 10.214.138.58:6808/19385 -- mdsbeacon(4500/a up:active seq 89 v10) v2 -- ?+0 0x1e0b9c0 con 0x1d9d960
vpm019/log/ceph-mon.a.log.gz:75490346-2014-09-06 21:16:19.749711 7f37967407a0 0 ceph version 0.84-1029-g7d8fe2d (7d8fe2d994a673f2187bf99ac8e20df6a0cd2514), process ceph-mon, pid 20161
vpm019/log/ceph-mon.a.log.gz:75490493:2014-09-06 21:16:19.992684 7f37967407a0 -1 load: jerasure load dlopen(/usr/lib64/ceph/erasure-code/libec_lrc.so): /usr/lib64/ceph/erasure-code/libec_lrc.so: cannot open shared object file: No such file or directory
^C
teuthology@teuthology:/a/teuthology-2014-09-06_17:08:01-upgrade:dumpling-firefly-x-master-distro-basic-vps/470461/remote$ zgrep "cannot open shared object file" vpm019/log/ceph-mon.a.log.gz -a20
vpm019/log/ceph-mon.a.log.gz:2014-09-06 21:16:11.743071 7f532445e700 1 mon.a@0(leader).paxos(paxos active c 4268..4827) is_readable now=2014-09-06 21:16:11.743072 lease_expire=2014-09-06 21:16:16.739306 has v0 lc 4827
vpm019/log/ceph-mon.a.log.gz:2014-09-06 21:16:11.743083 7f532445e700 10 mon.a@0(leader).osd e1292 preprocess_query mon_command({"prefix": "osd pool create", "pool": "test-rados-api-vpm050-10162-59", "pool_type":"erasure", "pg_num":8, "pgp_num":8, "erasure_code_profile":"testprofile"} v 0) v1 from client.4681 10.214.138.113:0/59010162
vpm019/log/ceph-mon.a.log.gz:2014-09-06 21:16:11.743142 7f532445e700 7 mon.a@0(leader).osd e1292 prepare_update mon_command({"prefix": "osd pool create", "pool": "test-rados-api-vpm050-10162-59", "pool_type":"erasure", "pg_num":8, "pgp_num":8, "erasure_code_profile":"testprofile"} v 0) v1 from client.4681 10.214.138.113:0/59010162
vpm019/log/ceph-mon.a.log.gz:2014-09-06 21:16:11.743258 7f532445e700 1 mon.a@0(leader).osd e1292 implicitly use ruleset named after the pool: test-rados-api-vpm050-10162-59
vpm019/log/ceph-mon.a.log.gz:2014-09-06 21:16:11.743496 7f532445e700 10 mon.a@0(leader).osd e1292 should_propose
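For re-checking other mon logs for the same failure, a small scan along these lines may help. It is only a sketch: the function name and the regular expression are illustrative assumptions, modeled on the single error line quoted in this report rather than on any Ceph tooling.

```python
import re

# Pattern modeled on the error line quoted in this report (an assumption,
# not a format guaranteed by Ceph):
#   load: jerasure load dlopen(<path>): <path>: cannot open shared object file: ...
DLOPEN_FAILURE = re.compile(
    r"load dlopen\((?P<path>[^)]+)\): .*?cannot open shared object file"
)


def missing_shared_objects(log_text):
    """Return the distinct library paths dlopen failed to open, in order of first appearance."""
    paths = []
    for match in DLOPEN_FAILURE.finditer(log_text):
        path = match.group("path")
        if path not in paths:
            paths.append(path)
    return paths
```

The log text would first have to be decompressed (for example with Python's `gzip` module) before being passed in.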
Associated revisions
packaging: add all erasure code plugins to RPM packages
It means distributing a few plugins that are only used for unit testing
but it does not use much disk space and this is otherwise harmless.
Explicitly listing which plugins are to be installed is problematic
because some of them (isa for now and maybe more later) are not
available for all architectures. Properly maintaining the list of
plugins to install would therefore mean exactly matching which
architecture has which plugins.
http://tracker.ceph.com/issues/9381
Fixes: #9381
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
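The trade-off described in the commit message can be illustrated with a short sketch. It assumes, hypothetically, that "all plugins" is expressed as a wildcard over `libec_*.so`; the actual packaging change is not shown in this report, and the function names and the `example` plugin are stand-ins.

```python
import fnmatch


def select_explicit(built_files, plugin_names):
    """Explicit list: every named plugin must have been built, otherwise packaging fails."""
    wanted = ["libec_%s.so" % name for name in plugin_names]
    missing = [f for f in wanted if f not in built_files]
    if missing:
        raise FileNotFoundError("not built on this architecture: " + ", ".join(missing))
    return wanted


def select_all(built_files, pattern="libec_*.so"):
    """Wildcard: package whatever plugins the build produced on this architecture."""
    return sorted(f for f in built_files if fnmatch.fnmatch(f, pattern))


# Hypothetical build outputs; "example" stands in for a unit-test-only plugin.
x86_64_build = {"libec_jerasure.so", "libec_lrc.so", "libec_isa.so", "libec_example.so"}
other_arch_build = x86_64_build - {"libec_isa.so"}  # isa is not available everywhere
```

With these inputs, `select_all(other_arch_build)` simply omits the isa plugin, while `select_explicit(other_arch_build, ["jerasure", "lrc", "isa"])` raises, which is the per-architecture maintenance burden the commit message wants to avoid.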
History
#1 Updated by Yuri Weinstein about 9 years ago
More from Josh:
Also, 'ceph pg dump --format json' failed. It's the same root cause, but in this case all 3 mons went down from errors loading libec_lrc.so, so the 'ceph pg dump' timed out, e.g. http://pulpito.ceph.com/teuthology-2014-09-07_17:08:02-upgrade:dumpling-firefly-x-master-distro-basic-vps/471368/
#2 Updated by Yuri Weinstein about 9 years ago
Looks the same on giant (CentOS and RHEL specific?): http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-08_17:00:01-upgrade:dumpling-giant-x-master-distro-basic-vps/472895/teuthology.log
suite:upgrade:dumpling-giant-x
#3 Updated by Loïc Dachary about 9 years ago
- Status changed from New to Duplicate
It is a duplicate of http://tracker.ceph.com/issues/9343
#4 Updated by Loïc Dachary about 9 years ago
- Category set to Monitor
- Status changed from Duplicate to 12
- Assignee set to Loïc Dachary
- Priority changed from Normal to High
It does not look like a duplicate after all. It fails when preloading the lrc erasure code plugin.
#5 Updated by Loïc Dachary about 9 years ago
- Status changed from 12 to In Progress
#6 Updated by Loïc Dachary about 9 years ago
ceph-mon and the plugins are both in the ceph package. However, the lrc and isa plugins need to be explicitly listed in the RPM packaging, and that was not done.
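A quick way to confirm which plugins a given package build actually ships is to compare its file list against the expected plugin names. The sketch below is an assumption-laden illustration: the function name is hypothetical, the plugin directory is taken from the dlopen error in this report, and the `libec_<name>.so` naming is inferred from `libec_lrc.so`.

```python
import posixpath

# Directory taken from the dlopen error in this report.
PLUGIN_DIR = "/usr/lib64/ceph/erasure-code"


def plugins_missing_from_package(package_files, plugin_names, plugin_dir=PLUGIN_DIR):
    """Return the plugin names whose libec_<name>.so is absent from a package file list."""
    shipped = set(line.strip() for line in package_files)
    return [
        name
        for name in plugin_names
        if posixpath.join(plugin_dir, "libec_%s.so" % name) not in shipped
    ]
```

The file list could come from the output of `rpm -ql ceph`, split into lines; a non-empty result for `["jerasure", "lrc", "isa"]` would point at the packaging gap described here.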
#7 Updated by Loïc Dachary about 9 years ago
- Status changed from In Progress to Fix Under Review
#8 Updated by Loïc Dachary about 9 years ago
- % Done changed from 0 to 90
#9 Updated by Sage Weil about 9 years ago
- Status changed from Fix Under Review to Pending Backport
merged to master
#10 Updated by Yuri Weinstein about 9 years ago
Note for re-testing:
The same issues occur on suite:upgrade:dumpling-giant-x.
Tests visibly fail with the error:
"CommandFailedError: Command failed on vpm078 with status 1: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd dump --format=json'"
#11 Updated by Loïc Dachary about 9 years ago
This is because giant is at tag v0.85, which does not include the fix. The fix is in the giant branch, though, so it will work when the next giant release candidate is available.
2014-09-11T16:59:55.839 INFO:teuthology.orchestra.run.vpm078:Running: 'sudo yum -y install ceph-debuginfo-0.85 ceph-radosgw-0.85 ceph-test-0.85 ceph-devel-0.85 ceph-0.85 ceph-fuse-0.85 rest-bench-0.85 libcephfs_jni1-0.85 libcephfs1-0.85 python-ceph-0.85'
2014-09-11T17:03:02.577 INFO:teuthology.task.install:config contains sha1|tag|branch, removing those keys from override
2014-09-11T17:03:02.577 INFO:teuthology.task.install:remote ubuntu@vpm068.front.sepia.ceph.com config {'branch': 'giant'}
2014-09-11T17:03:02.578 INFO:teuthology.orchestra.run.vpm068:Running: 'sudo lsb_release -is'
2014-09-11T17:03:04.436 DEBUG:teuthology.misc:System to be installed: CentOS
2014-09-11T17:03:04.436 INFO:teuthology.task.install:Upgrading ceph rpm packages: ceph-debuginfo, ceph-radosgw, ceph-test, ceph-devel, ceph, ceph-fuse, rest-bench, libcephfs_jni1, libcephfs1, python-ceph
2014-09-11T17:03:04.436 INFO:teuthology.orchestra.run.vpm068:Running: 'arch'
2014-09-11T17:03:04.467 INFO:teuthology.orchestra.run.vpm068:Running: 'lsb_release -is'
2014-09-11T17:03:04.579 INFO:teuthology.orchestra.run.vpm068:Running: 'lsb_release -rs'
2014-09-11T17:03:04.624 INFO:teuthology.task.install:config is {'project': 'ceph', 'branch': 'giant'}
2014-09-11T17:03:04.624 INFO:teuthology.task.install:Host vpm068 is: CentOS 6.5 x86_64
2014-09-11T17:03:04.624 INFO:teuthology.orchestra.run.vpm068:Running: 'arch'
2014-09-11T17:03:04.717 INFO:teuthology.orchestra.run.vpm068:Running: 'lsb_release -is'
2014-09-11T17:03:04.827 INFO:teuthology.orchestra.run.vpm068:Running: 'lsb_release -rs'
2014-09-11T17:03:04.936 INFO:teuthology.task.install:config is {'project': 'ceph', 'branch': 'giant'}
2014-09-11T17:03:04.937 INFO:teuthology.orchestra.run.vpm068:Running: 'sudo lsb_release -is'
2014-09-11T17:03:05.055 DEBUG:teuthology.misc:System to be installed: CentOS
2014-09-11T17:03:05.055 INFO:teuthology.task.install:Repo base URL: http://gitbuilder.ceph.com/ceph-rpm-centos6-x86_64-basic/ref/giant
2014-09-11T17:03:05.056 INFO:teuthology.orchestra.run.vpm068:Running: 'wget -q -O- http://gitbuilder.ceph.com/ceph-rpm-centos6-x86_64-basic/ref/giant/version'
2014-09-11T17:03:05.371 INFO:teuthology.orchestra.run.vpm068:Running: 'sudo rpm -ev ceph-release'
2014-09-11T17:03:06.036 INFO:teuthology.orchestra.run.vpm068:Running: 'sudo rpm -Uv http://gitbuilder.ceph.com/ceph-rpm-centos6-x86_64-basic/ref/giant/noarch/ceph-release-1-0.el6.noarch.rpm'
2014-09-11T17:03:06.668 INFO:teuthology.orchestra.run.vpm068:Running: 'arch'
2014-09-11T17:03:06.694 INFO:teuthology.orchestra.run.vpm068:Running: 'lsb_release -is'
2014-09-11T17:03:06.803 INFO:teuthology.orchestra.run.vpm068:Running: 'lsb_release -rs'
2014-09-11T17:03:06.846 INFO:teuthology.task.install:config is {'project': 'ceph', 'branch': 'giant'}
2014-09-11T17:03:06.846 INFO:teuthology.orchestra.run.vpm068:Running: "sudo sed -i -e ':a;N;$!ba;s/enabled=1\\ngpg/enabled=1\\npriority=1\\ngpg/g' -e 's;ref/[a-zA-Z0-9_]*/;ref/giant/;g' /etc/yum.repos.d/ceph.repo"
2014-09-11T17:03:06.945 INFO:teuthology.orchestra.run.vpm068:Running: 'sudo yum clean all'
2014-09-11T17:03:08.182 INFO:teuthology.orchestra.run.vpm068.stdout:Loaded plugins: fastestmirror, priorities
2014-09-11T17:03:08.288 INFO:teuthology.orchestra.run.vpm068.stdout:Cleaning repos: Ceph Ceph-noarch base centos6-apache-ceph centos6-fcgi-ceph
2014-09-11T17:03:08.288 INFO:teuthology.orchestra.run.vpm068.stdout: : centos6-misc-ceph centos6-qemu-ceph ceph-source epel extras
2014-09-11T17:03:08.289 INFO:teuthology.orchestra.run.vpm068.stdout: : rpmforge updates
2014-09-11T17:03:08.289 INFO:teuthology.orchestra.run.vpm068.stdout:Cleaning up Everything
2014-09-11T17:03:08.328 INFO:teuthology.orchestra.run.vpm068.stdout:Cleaning up list of fastest mirrors
2014-09-11T17:03:08.346 INFO:teuthology.orchestra.run.vpm068:Running: 'sudo yum -y install ceph-debuginfo-0.85 ceph-radosgw-0.85 ceph-test-0.85 ceph-devel-0.85 ceph-0.85 ceph-fuse-0.85 rest-bench-0.85 libcephfs_jni1-0.85 libcephfs1-0.85 python-ceph-0.85'
2014-09-11T17:06:42.937 INFO:teuthology.run_tasks:Running task print...
2014-09-11T17:06:42.937 INFO:teuthology.task.print:**** done install.upgrade
2014-09-11T17:06:42.937 INFO:teuthology.run_tasks:Running task ceph.restart...
2014-09-11T17:06:48.937 INFO:tasks.ceph.mon.a:Stopped
2014-09-11T17:06:48.937 INFO:tasks.ceph.mon.a:Restarting daemon
2014-09-11T17:06:48.937 INFO:teuthology.orchestra.run.vpm078:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i a'
2014-09-11T17:06:49.000 INFO:tasks.ceph.mon.a:Started
2014-09-11T17:06:51.857 INFO:tasks.ceph.mon.a.vpm078.stderr:2014-09-11 20:06:51.856791 7f6fc03407a0 -1 load: jerasure load dlopen(/usr/lib64/ceph/erasure-code/libec_lrc.so): /usr/lib64/ceph/erasure-code/libec_lrc.so: cannot open shared object file: No such file or directory
#12 Updated by Loïc Dachary about 9 years ago
- Status changed from Pending Backport to Resolved
- % Done changed from 90 to 100
All rpm packages were eventually updated.