https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2015-10-18T10:53:29Z
Ceph
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=60298
2015-10-18T10:53:29Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/60298/diff?detail_id=58053">diff</a>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=60299
2015-10-18T10:54:12Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/60299/diff?detail_id=58054">diff</a>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=60300
2015-10-18T10:54:52Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/60300/diff?detail_id=58055">diff</a>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=60807
2015-10-28T01:41:37Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/60807/diff?detail_id=58525">diff</a>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=61493
2015-11-12T16:02:21Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Can't reproduce</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67675
2016-03-17T08:13:54Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-5 priority-high3 closed" href="/issues/8433">Bug #8433</a>: SSHException: Key-exchange timed out waiting for key negotiation</i> added</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67677
2016-03-17T08:14:43Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Duplicated by</strong> deleted (<i><a class="issue tracker-1 status-10 priority-5 priority-high3 closed" href="/issues/8433">Bug #8433</a>: SSHException: Key-exchange timed out waiting for key negotiation</i>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67678
2016-03-17T08:14:47Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Duplicates</strong> <i><a class="issue tracker-1 status-10 priority-5 priority-high3 closed" href="/issues/8433">Bug #8433</a>: SSHException: Key-exchange timed out waiting for key negotiation</i> added</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67680
2016-03-17T08:15:06Z
Nathan Cutler
ncutler@suse.cz
<ul></ul><p>I can reproduce this.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67686
2016-03-17T08:32:30Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Status</strong> changed from <i>Can't reproduce</i> to <i>12</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67914
2016-03-21T09:56:24Z
Loïc Dachary
loic@dachary.org
<ul></ul><p>When I looked at the VMs while the test was failing, they had plenty of RAM and disk space. I think getting past the re-keying problem reveals another problem. It happens at</p>
<pre>
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- print: '**** done ceph.restart mon.b 6-next-mon'
</pre>
<p>which is not supposed to lose an OSD or even restart one. Yet one of them goes down. I am running another test and setting up an alarm so that I can look into the logs while it happens.</p>
<code>ceph-workbench ceph-qa-suite --verbose --simultaneous-jobs 10 --suite upgrade/infernalis-x --suite-branch master --ceph jewel --ceph-git-url https://github.com/ceph/ceph --filter 'upgrade:infernalis-x/stress-split/{0-cluster/start.yaml 1-infernalis-install/infernalis.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/{rbd-cls.yaml rbd-import-export.yaml readwrite.yaml snaps-few-objects.yaml} 6-next-mon/monb.yaml 7-workload/{radosbench.yaml rbd_api.yaml} 8-next-mon/monc.yaml 9-workload/{rbd-python.yaml rgw-swift.yaml snaps-many-objects.yaml} distros/centos_7.2.yaml}'</code>
<ul>
<li><strong>running</strong> <a class="external" href="http://167.114.241.163:8081/ubuntu-2016-03-21_09:59:39-upgrade:infernalis-x-jewel---basic-openstack/">http://167.114.241.163:8081/ubuntu-2016-03-21_09:59:39-upgrade:infernalis-x-jewel---basic-openstack/</a></li>
</ul>
<p>After ~7h, during the last workload:<br /><pre>
2016-03-21T15:58:12.906 INFO:teuthology.task.print:**** done swift 9-workload
...
2016-03-21T16:49:44.947 INFO:tasks.rados.rados.0.target167114242166.stderr:
2016-03-21T16:49:44.948 DEBUG:paramiko.transport:[chan 108] EOF received (108)
2016-03-21T16:49:44.949 DEBUG:paramiko.transport:[chan 108] EOF sent (108)
2016-03-21T16:49:44.950 INFO:tasks.ceph.ceph_manager:removing pool_name unique_pool_3
2016-03-21T16:49:44.951 INFO:teuthology.orchestra.run.target167114242167:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage rados rmpool unique_pool_3 unique_pool_3 --yes-i-really-really-mean-it'
2016-03-21T16:49:44.951 DEBUG:paramiko.transport:[chan 5212] Max packet in: 32768 bytes
2016-03-21T16:49:44.953 DEBUG:paramiko.transport:Rekeying (hit 173137 packets, 2100587824 bytes received)
2016-03-21T16:49:44.953 DEBUG:paramiko.transport:[chan 5212] Max packet out: 32768 bytes
2016-03-21T16:49:44.953 DEBUG:paramiko.transport:Secsh channel 5212 opened.
2016-03-21T16:49:44.962 DEBUG:paramiko.transport:Rekeying (hit 173138 packets, 2100587888 bytes received)
2016-03-21T16:49:44.963 DEBUG:paramiko.transport:Rekeying (hit 173139 packets, 2100587936 bytes received)
2016-03-21T16:49:44.963 DEBUG:paramiko.transport:[chan 5212] Sesch channel 5212 request ok
2016-03-21T16:49:44.964 DEBUG:paramiko.transport:[chan 5212] EOF sent (5212)
2016-03-21T16:49:45.184 DEBUG:paramiko.transport:Rekeying (hit 173140 packets, 2100588240 bytes received)
2016-03-21T16:49:45.184 INFO:teuthology.orchestra.run.target167114242167.stderr:2016-03-21 16:49:45.230131 7efd7cf01a40 -1 asok(0x7efd7d61ac60) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.16638.asok': (13) Permission denied
2016-03-21T16:49:45.314 DEBUG:paramiko.transport:Rekeying (hit 173141 packets, 2100588336 bytes received)
2016-03-21T16:49:45.315 INFO:teuthology.orchestra.run.target167114242167.stdout:successfully deleted pool unique_pool_3
2016-03-21T16:49:45.316 DEBUG:paramiko.transport:Rekeying (hit 173142 packets, 2100588416 bytes received)
2016-03-21T16:49:45.316 DEBUG:paramiko.transport:Rekeying (hit 173143 packets, 2100588464 bytes received)
2016-03-21T16:49:45.316 DEBUG:paramiko.transport:[chan 5212] EOF received (5212)
2016-03-21T16:49:45.317 DEBUG:paramiko.transport:Rekeying (hit 173144 packets, 2100588512 bytes received)
2016-03-21T16:49:45.317 DEBUG:teuthology.run_tasks:Unwinding manager swift
2016-03-21T16:49:45.336 DEBUG:teuthology.run_tasks:Unwinding manager rgw
2016-03-21T16:49:45.350 INFO:tasks.rgw:Stopping apache...
2016-03-21T16:49:45.350 DEBUG:paramiko.transport:[chan 98] EOF sent (98)
2016-03-21T16:49:45.361 DEBUG:paramiko.transport:[chan 98] EOF received (98)
2016-03-21T16:49:45.362 INFO:teuthology.misc:Shutting down rgw daemons...
2016-03-21T16:49:45.363 DEBUG:paramiko.transport:[chan 95] EOF sent (95)
2016-03-21T16:49:45.363 DEBUG:tasks.rgw.client.0:waiting for process to exit
2016-03-21T16:49:45.405 INFO:tasks.rgw.client.0.target167114242166.stdout:2016-03-21 16:49:45.452149 7f5d66993880 -1 shutting down
2016-03-21T16:49:45.809 DEBUG:paramiko.transport:[chan 95] EOF received (95)
2016-03-21T16:49:46.819 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T16:49:47.828 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T16:49:48.837 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T16:49:49.846 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T16:49:50.855 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T16:49:51.364 INFO:tasks.rgw.client.0:Stopped
2016-03-21T16:49:51.365 INFO:teuthology.orchestra.run.target167114242166:Running: 'rm -f /home/ubuntu/cephtest/rgw.opslog.client.0.sock'
2016-03-21T16:49:51.365 DEBUG:paramiko.transport:[chan 109] Max packet in: 32768 bytes
2016-03-21T16:49:51.366 DEBUG:paramiko.transport:[chan 109] Max packet out: 32768 bytes
2016-03-21T16:49:51.367 DEBUG:paramiko.transport:Secsh channel 109 opened.
2016-03-21T16:49:51.373 DEBUG:paramiko.transport:[chan 109] Sesch channel 109 request ok
2016-03-21T16:49:51.373 DEBUG:paramiko.transport:[chan 109] EOF sent (109)
2016-03-21T16:49:51.444 DEBUG:paramiko.transport:[chan 109] EOF received (109)
2016-03-21T16:49:51.445 INFO:tasks.rgw:Removing apache config...
2016-03-21T16:49:51.446 INFO:teuthology.orchestra.run.target167114242166:Running: 'rm -f /home/ubuntu/cephtest/apache/apache.client.0.conf && rm -f /home/ubuntu/cephtest/apache/htdocs.client.0/rgw.fcgi'
2016-03-21T16:49:51.446 DEBUG:paramiko.transport:[chan 110] Max packet in: 32768 bytes
2016-03-21T16:49:51.447 DEBUG:paramiko.transport:[chan 110] Max packet out: 32768 bytes
2016-03-21T16:49:51.448 DEBUG:paramiko.transport:Secsh channel 110 opened.
2016-03-21T16:49:51.453 DEBUG:paramiko.transport:[chan 110] Sesch channel 110 request ok
2016-03-21T16:49:51.453 DEBUG:paramiko.transport:[chan 110] EOF sent (110)
2016-03-21T16:49:51.515 DEBUG:paramiko.transport:[chan 110] EOF received (110)
2016-03-21T16:49:51.516 INFO:tasks.rgw:Cleaning up apache directories...
2016-03-21T16:49:51.516 INFO:teuthology.orchestra.run.target167114242166:Running: 'rm -rf /home/ubuntu/cephtest/apache/tmp.client.0 && rmdir /home/ubuntu/cephtest/apache/htdocs.client.0'
2016-03-21T16:49:51.517 DEBUG:paramiko.transport:[chan 111] Max packet in: 32768 bytes
2016-03-21T16:49:51.518 DEBUG:paramiko.transport:[chan 111] Max packet out: 32768 bytes
2016-03-21T16:49:51.519 DEBUG:paramiko.transport:Secsh channel 111 opened.
2016-03-21T16:49:51.523 DEBUG:paramiko.transport:[chan 111] Sesch channel 111 request ok
2016-03-21T16:49:51.523 DEBUG:paramiko.transport:[chan 111] EOF sent (111)
2016-03-21T16:49:51.581 DEBUG:paramiko.transport:[chan 111] EOF received (111)
2016-03-21T16:49:51.583 INFO:teuthology.orchestra.run.target167114242166:Running: 'rmdir /home/ubuntu/cephtest/apache'
2016-03-21T16:49:51.583 DEBUG:paramiko.transport:[chan 112] Max packet in: 32768 bytes
2016-03-21T16:49:51.584 DEBUG:paramiko.transport:[chan 112] Max packet out: 32768 bytes
2016-03-21T16:49:51.585 DEBUG:paramiko.transport:Secsh channel 112 opened.
2016-03-21T16:49:51.590 DEBUG:paramiko.transport:[chan 112] Sesch channel 112 request ok
2016-03-21T16:49:51.590 DEBUG:paramiko.transport:[chan 112] EOF sent (112)
2016-03-21T16:49:51.645 DEBUG:paramiko.transport:[chan 112] EOF received (112)
2016-03-21T16:49:51.646 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2016-03-21T16:49:51.665 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2016-03-21T16:49:51.679 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2016-03-21T16:49:51.693 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2016-03-21T16:49:51.708 INFO:tasks.thrashosds:joining thrashosds
2016-03-21T16:49:52.615 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T16:49:52.632 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
...
2016-03-21T17:32:32.622 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
2016-03-21T17:32:32.720 DEBUG:paramiko.transport:Sending global request "keepalive@lag.net"
</pre></p>
<p>On the machine running osd.5:<br /><pre>
[ubuntu@target167114242169 ~]$ sudo ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 6.00000 root default
-3 6.00000 rack localrack
-2 6.00000 host localhost
0 1.00000 osd.0 up 0.27551 0.56276
1 1.00000 osd.1 up 0 0
2 1.00000 osd.2 up 1.00000 1.00000
3 1.00000 osd.3 up 1.00000 0.32735
4 1.00000 osd.4 up 1.00000 0.12149
5 1.00000 osd.5 down 0 1.00000
[ubuntu@target167114242169 ~]$ ps fauwwwx
...
root 15035 0.0 0.0 142860 4668 ? Ss 11:58 0:00 \_ sshd: ubuntu [priv]
ubuntu 15040 0.0 0.0 142976 2456 ? S 11:58 0:09 | \_ sshd: ubuntu@notty
root 4938 0.0 0.0 195512 2792 ? Ss 15:42 0:00 | \_ sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 3
root 4989 0.0 0.0 151604 6112 ? S 15:42 0:04 | | \_ /usr/bin/python /bin/daemon-helper kill ceph-osd -f -i 3
root 4992 5.3 2.1 927168 169208 ? Ssl 15:42 6:20 | | \_ ceph-osd -f -i 3
root 13890 0.0 0.0 195512 2788 ? Ss 15:51 0:00 | \_ sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 4
root 13940 0.0 0.0 151604 6112 ? S 15:51 0:03 | | \_ /usr/bin/python /bin/daemon-helper kill ceph-osd -f -i 4
root 13944 4.2 1.9 881596 151844 ? Ssl 15:51 4:39 | | \_ ceph-osd -f -i 4
root 17571 0.0 0.0 195512 2788 ? Ss 15:55 0:00 | \_ sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 5
root 17621 0.0 0.0 151604 6108 ? S 15:55 0:03 | | \_ /usr/bin/python /bin/daemon-helper kill ceph-osd -f -i 5
root 17625 0.0 0.2 325260 17400 ? Ssl 15:55 0:05 | | \_ ceph-osd -f -i 5
root 17754 0.0 0.0 195512 2784 ? Ss 15:55 0:00 | \_ sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_ops_in_flight
root 17779 0.0 0.1 209116 11132 ? S 15:55 0:00 | \_ python /bin/ceph --admin-daemon /var/run/ceph/ceph
[ubuntu@target167114242169 ~]$ sudo lsof -p 17625
ceph-osd 17625 root 0r CHR 1,3 0t0 1028 /dev/null
ceph-osd 17625 root 1w FIFO 0,8 0t0 366541 pipe
ceph-osd 17625 root 2w FIFO 0,8 0t0 366542 pipe
ceph-osd 17625 root 3w REG 253,1 8250976636 109618888 /var/log/ceph/ceph-osd.5.log
ceph-osd 17625 root 4u IPv4 367639 0t0 TCP *:6800 (LISTEN)
ceph-osd 17625 root 5u IPv4 367640 0t0 TCP *:acnet (LISTEN)
ceph-osd 17625 root 6u IPv4 367641 0t0 TCP *:6802 (LISTEN)
ceph-osd 17625 root 7u IPv4 367642 0t0 TCP *:6803 (LISTEN)
ceph-osd 17625 root 8r FIFO 0,8 0t0 367643 pipe
ceph-osd 17625 root 9w FIFO 0,8 0t0 367643 pipe
ceph-osd 17625 root 10u unix 0xffff880225ac7480 0t0 367644 /var/run/ceph/ceph-osd.5.asok
ceph-osd 17625 root 11uW REG 253,1 37 109618889 /var/lib/ceph/osd/ceph-5/fsid
ceph-osd 17625 root 12r DIR 253,1 4096 109618885 /var/lib/ceph/osd/ceph-5
ceph-osd 17625 root 13r DIR 253,1 24576 11018 /var/lib/ceph/osd/ceph-5/current
ceph-osd 17625 root 14u REG 253,1 7 11019 /var/lib/ceph/osd/ceph-5/current/commit_op_seq
ceph-osd 17625 root 15w REG 253,1 319 8646712 /var/lib/ceph/osd/ceph-5/current/omap/LOG
ceph-osd 17625 root 16uW REG 253,1 0 8646713 /var/lib/ceph/osd/ceph-5/current/omap/LOCK
ceph-osd 17625 root 17u REG 253,1 0 10240874 /var/lib/ceph/osd/ceph-5/current/omap/000352.log
ceph-osd 17625 root 18u REG 253,1 65536 10240876 /var/lib/ceph/osd/ceph-5/current/omap/MANIFEST-000350
ceph-osd 17625 root 19u REG 253,1 104857600 109618892 /var/lib/ceph/osd/ceph-5/journal
ceph-osd 17625 root 20u REG 253,1 417 17531111 /var/lib/ceph/osd/ceph-5/current/meta/osd\usuperblock__0_23C2FCDE__none
ceph-osd 17625 root 21r DIR 253,1 4096 75684106 /usr/lib64/rados-classes
ceph-osd 17625 root 22u unix 0xffff8802287a61c0 0t0 366796 /var/run/ceph/ceph-osd.5.asok
[ubuntu@target167114242169 ~]$ tail /var/log/ceph/ceph-osd.5.log
2016-03-21 15:55:36.350651 7f9b1a781900 10 _load_class log from /usr/lib64/rados-classes/libcls_log.so
2016-03-21 15:55:36.351549 7f9b1a781900 10 register_class log status 3
2016-03-21 15:55:36.351551 7f9b1a781900 10 register_cxx_method log.add flags 3 0x7f9b0a130380
2016-03-21 15:55:36.351553 7f9b1a781900 10 register_cxx_method log.list flags 1 0x7f9b0a131810
2016-03-21 15:55:36.351554 7f9b1a781900 10 register_cxx_method log.trim flags 3 0x7f9b0a130e50
2016-03-21 15:55:36.351556 7f9b1a781900 10 register_cxx_method log.info flags 1 0x7f9b0a12fc90
2016-03-21 15:55:36.351557 7f9b1a781900 10 _load_class log success
2016-03-21 15:55:36.351559 7f9b1a781900 10 open_all_classes found numops
2016-03-21 15:55:36.351561 7f9b1a781900 10 _get_class adding new class name numops 0x7f9b1ee1cda8
2016-03-21 15:55:36.351562 7f9b1a781900 10 _load_class numops from /usr/lib64/rados-classes/libcls_numops.so
[ubuntu@target167114242169 ~]$ sudo ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_ops_in_flight
(hangs)
[ubuntu@target167114242169 ~]$ sudo strace -p 17625
Process 17625 attached
restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f9b19af1820, FUTEX_WAIT_PRIVATE, 2, {0, 118126640}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f9b19af1820, FUTEX_WAIT_PRIVATE, 2, {0, 149946576}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f9b19af1820, FUTEX_WAIT_PRIVATE, 2, {0, 183111664}) = -1 ETIMEDOUT (Connection timed out)
futex(0x7f9b19af1820, FUTEX_WAIT_PRIVATE, 2, {0, 61817392}) = -1 ETIMEDOUT (Connection timed out)
</pre></p>
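<p>The hanging <code>dump_ops_in_flight</code> call above blocks the shell until it is interrupted. A minimal sketch of probing a command with a deadline instead, so that a hang is reported rather than waited on. The helper name <code>probe</code> and the 10-second value mentioned below are assumptions for illustration, not part of teuthology or Ceph:</p>

```python
import subprocess
import sys


def probe(cmd, timeout_s):
    """Run cmd with a deadline.

    Returns ("ok", stdout), ("failed", stderr) or ("hung", None).
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return ("hung", None)
    if done.returncode != 0:
        return ("failed", done.stderr)
    return ("ok", done.stdout)
```

<p>On the OSD host this could be called as <code>probe(["ceph", "--admin-daemon", "/var/run/ceph/ceph-osd.5.asok", "dump_ops_in_flight"], 10)</code>, assuming the caller already has the privileges that <code>sudo</code> provided in the session above. A <code>"hung"</code> result would match what was observed for osd.5.</p>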
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67922
2016-03-21T12:51:24Z
Loïc Dachary
loic@dachary.org
<ul></ul><code>ceph-workbench ceph-qa-suite --dry-run --verbose --suite upgrade/infernalis-x --suite-branch master --ceph jewel --ceph-git-url https://github.com/ceph/ceph</code>
<ul>
<li><strong>running</strong> <a class="external" href="http://167.114.241.163:8081/ubuntu-2016-03-21_12:49:26-upgrade:infernalis-x-jewel---basic-openstack">http://167.114.241.163:8081/ubuntu-2016-03-21_12:49:26-upgrade:infernalis-x-jewel---basic-openstack</a></li>
</ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67961
2016-03-21T17:49:05Z
Samuel Just
sjust@redhat.com
<ul></ul><p>(gdb) bt<br />
#0 0x00007f9b198ba573 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4<br />
#1 0x00007f9b198ba447 in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4<br />
#2 0x00007f9b198ab078 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br />
#3 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br />
#4 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br />
#5 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br />
#6 0x00007f9b198bd31b in tc_calloc () from /lib64/libtcmalloc.so.4<br />
#7 0x00007f9b1a58a9ae in _dl_check_map_versions () from /lib64/ld-linux-x86-64.so.2<br />
#8 0x00007f9b1a58dd36 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2<br />
#9 0x00007f9b1a5891b4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2<br />
#10 0x00007f9b1a58d1ab in _dl_open () from /lib64/ld-linux-x86-64.so.2<br />
#11 0x00007f9b18f0402b in dlopen_doit () from /lib64/libdl.so.2<br />
#12 0x00007f9b1a5891b4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2<br />
#13 0x00007f9b18f0462d in _dlerror_run () from /lib64/libdl.so.2<br />
#14 0x00007f9b18f040c1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2<br />
#15 0x00007f9b1ab3f3fb in ClassHandler::_load_class(ClassHandler::ClassData*) ()<br />
#16 0x00007f9b1ab3f7ff in ClassHandler::open_class(std::string const&, ClassHandler::ClassData**) ()<br />
#17 0x00007f9b1ab3fad1 in ClassHandler::open_all_classes() ()<br />
#18 0x00007f9b1aadc34c in OSD::init() ()<br />
#19 0x00007f9b1aa5e1c8 in main ()</p>
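<p>The backtrace shows the main thread waiting on a tcmalloc spinlock from inside <code>dlopen</code>. A rough sketch of checking a gdb backtrace for that pattern automatically; the names <code>parse_frames</code> and <code>allocates_inside_dlopen</code> are hypothetical, and the regular expression only assumes frame lines shaped like the ones in this issue:</p>

```python
import re

# Matches gdb frame lines such as
#   "#6 0x00007f9b198bd31b in tc_calloc () from /lib64/libtcmalloc.so.4"
# The " from <lib>" suffix is optional (absent for frames in the main binary).
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+0x[0-9a-fA-F]+ in (?P<func>.+?) \(\)(?: from (?P<lib>\S+))?$"
)


def parse_frames(bt_text):
    """Return a list of (frame_number, function, library_or_None) tuples."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((int(m["num"]), m["func"], m["lib"]))
    return frames


def allocates_inside_dlopen(frames):
    """Heuristic: True if a tcmalloc SpinLock wait is nested inside dlopen.

    gdb lists the innermost frame first, so the lock wait must appear
    at a smaller list index than the dlopen frame.
    """
    funcs = [func for _, func, _ in frames]
    lock = next(
        (i for i, f in enumerate(funcs) if f.startswith("SpinLock::SlowLock")),
        None,
    )
    dl = next((i for i, f in enumerate(funcs) if f.startswith("dlopen")), None)
    return lock is not None and dl is not None and lock < dl
```

<p>Run against the plain-text frames of the backtrace above, <code>allocates_inside_dlopen(parse_frames(text))</code> should return <code>True</code>. It only flags the symptom; it does not identify which thread holds the lock.</p>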
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67962
2016-03-21T17:49:24Z
Samuel Just
sjust@redhat.com
<ul></ul><p>(gdb) thread apply all bt</p>
<p>Thread 23 (Thread 0x7f9b16e6a700 (LWP 17626)):<br />
#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />
#1 0x00007f9b1afc1dfb in ceph::log::Log::entry() ()<br />
#2 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />
#3 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 22 (Thread 0x7f9b15ef6700 (LWP 17633)):<br />
#0 0x00007f9b19112a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />
#1 0x00007f9b1b09625a in CephContextServiceThread::entry() ()<br />
#2 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />
#3 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 21 (Thread 0x7f9b156f5700 (LWP 17634)):<br />#0 0x00007f9b198ba573 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: gpf in tcp_sendpage (Closed)" href="https://tracker.ceph.com/issues/1">#1</a> 0x00007f9b198ba447 in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: BUG at fs/ceph/caps.c:2178 (Closed)" href="https://tracker.ceph.com/issues/2">#2</a> 0x00007f9b198ab078 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: leaked dentry ref on umount (Closed)" href="https://tracker.ceph.com/issues/3">#3</a> 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: lockdep warning in socket code (Closed)" href="https://tracker.ceph.com/issues/4">#4</a> 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-3 priority-lowest closed" title="Bug: ./rados lspools sometimes hangs after listing all pools? 
(Closed)" href="https://tracker.ceph.com/issues/5">#5</a> 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-2 status-6 priority-3 priority-lowest closed" title="Feature: libceph could use a backward-compatible-to function (Rejected)" href="https://tracker.ceph.com/issues/6">#6</a> 0x00007f9b198be718 in tc_new () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-6 status-3 priority-3 priority-lowest closed" title="Documentation: Document Monitor Commands (Resolved)" href="https://tracker.ceph.com/issues/7">#7</a> 0x00007f9b1b0c71aa in boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t>& boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t>::operator=<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::action<boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, 
boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t>, boost::function<void (_<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>, _<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>)> >, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::action<boost::spirit::classic::strlit<char const*>, boost::function<void (_<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>, _<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>)> > >, boost::spirit::classic::action<boost::spirit::classic::strlit<char const*>, boost::function<void (_<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>, _<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>)> > >, boost::spirit::classic::action<boost::spirit::classic::strlit<char const*>, boost::function<void 
(_<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>, _<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>)> > > >(boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::alternative<boost::spirit::classic::action<boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t>, boost::function<void (_<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>, _<em>gnu_cxx::</em>_normal_iterator<char const*, std::string>)> >, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, 
boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::action<boost::spirit::classic::strlit<char const*>, boost::function<void (__gnu_cxx::__normal_iterator<char const*, std::string>, __gnu_cxx::__normal_iterator<char const*, std::string>)> > >, boost::spirit::classic::action<boost::spirit::classic::strlit<char const*>, boost::function<void (__gnu_cxx::__normal_iterator<char const*, std::string>, __gnu_cxx::__normal_iterator<char const*, std::string>)> > >, boost::spirit::classic::action<boost::spirit::classic::strlit<char const*>, boost::function<void (__gnu_cxx::__normal_iterator<char const*, std::string>, __gnu_cxx::__normal_iterator<char const*, std::string>)> > > const&) ()<br />#8 0x00007f9b1b0cf15e in json_spirit::Json_grammer<json_spirit::Value_impl<json_spirit::Config_map<std::string> >, __gnu_cxx::__normal_iterator<char const*, std::string> >::definition<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >::definition(json_spirit::Json_grammer<json_spirit::Value_impl<json_spirit::Config_map<std::string> >, __gnu_cxx::__normal_iterator<char const*, std::string> > const&) ()<br />#9 0x00007f9b1b0d00b8 in json_spirit::Json_grammer<json_spirit::Value_impl<json_spirit::Config_map<std::string> >, 
__gnu_cxx::__normal_iterator<char const*, std::string> >::definition<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >& boost::spirit::classic::impl::get_definition<json_spirit::Json_grammer<json_spirit::Value_impl<json_spirit::Config_map<std::string> >, __gnu_cxx::__normal_iterator<char const*, std::string> >, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t>, boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >(boost::spirit::classic::grammar<json_spirit::Json_grammer<json_spirit::Value_impl<json_spirit::Config_map<std::string> >, __gnu_cxx::__normal_iterator<char const*, std::string> >, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t> > const*) ()<br />#10 0x00007f9b1b0d03a4 in __gnu_cxx::__normal_iterator<char const*, std::string> json_spirit::read_range_or_throw<__gnu_cxx::__normal_iterator<char const*, std::string>, json_spirit::Value_impl<json_spirit::Config_map<std::string> > >(__gnu_cxx::__normal_iterator<char const*, std::string>, __gnu_cxx::__normal_iterator<char const*, std::string>, json_spirit::Value_impl<json_spirit::Config_map<std::string> >&) ()<br />
#11 0x00007f9b1b0d050c in bool json_spirit::read_range<__gnu_cxx::__normal_iterator<char const*, std::string>, json_spirit::Value_impl<json_spirit::Config_map<std::string> > >(__gnu_cxx::__normal_iterator<char const*, std::string>&, __gnu_cxx::__normal_iterator<char const*, std::string>, json_spirit::Value_impl<json_spirit::Config_map<std::string> >&) ()<br />#12 0x00007f9b1b0bb54d in json_spirit::read(std::string const&, json_spirit::Value_impl<json_spirit::Config_map<std::string> >&) ()<br />#13 0x00007f9b1b09d78e in cmdmap_from_json(std::vector<std::string, std::allocator<std::string> >, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, 
boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >*, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) ()<br />#14 0x00007f9b1b06d237 in AdminSocket::do_accept() ()<br />#15 0x00007f9b1b06f560 in AdminSocket::entry() ()<br />#16 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#17 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 20 (Thread 0x7f9b1437b700 (LWP 17636)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b063c97 in SafeTimer::timer_thread() ()<br />#2 0x00007f9b1b06529d in SafeTimerThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 19 (Thread 0x7f9b13b7a700 (LWP 17637)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b052eff in SimpleMessenger::reaper_entry() ()<br />#2 0x00007f9b1b05aacd in SimpleMessenger::ReaperThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 18 (Thread 0x7f9b13379700 (LWP 17638)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b052eff in SimpleMessenger::reaper_entry() ()<br />#2 0x00007f9b1b05aacd in SimpleMessenger::ReaperThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 17 (Thread 0x7f9b12b78700 (LWP 17639)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b052eff in SimpleMessenger::reaper_entry() ()<br />#2 0x00007f9b1b05aacd in SimpleMessenger::ReaperThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 16 (Thread 0x7f9b12377700 (LWP 17640)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b052eff in SimpleMessenger::reaper_entry() ()<br />#2 0x00007f9b1b05aacd in SimpleMessenger::ReaperThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 15 (Thread 0x7f9b11b76700 (LWP 17641)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b052eff in SimpleMessenger::reaper_entry() ()<br />#2 0x00007f9b1b05aacd in SimpleMessenger::ReaperThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 14 (Thread 0x7f9b11375700 (LWP 17642)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b052eff in SimpleMessenger::reaper_entry() ()<br />#2 0x00007f9b1b05aacd in SimpleMessenger::ReaperThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 13 (Thread 0x7f9b10b74700 (LWP 17643)):<br />#0 0x00007f9b19114f4d in __lll_lock_wait () from /lib64/libpthread.so.0<br />#1 0x00007f9b19110d02 in _L_lock_791 () from /lib64/libpthread.so.0<br />#2 0x00007f9b19110c08 in pthread_mutex_lock () from /lib64/libpthread.so.0<br />#3 0x00007f9b1b02b638 in Mutex::Lock(bool) ()<br />#4 0x00007f9b1b0637d2 in SafeTimer::timer_thread() ()<br />#5 0x00007f9b1b06529d in SafeTimerThread::entry() ()<br />#6 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#7 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 12 (Thread 0x7f9b10373700 (LWP 17644)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b063c97 in SafeTimer::timer_thread() ()<br />#2 0x00007f9b1b06529d in SafeTimerThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 11 (Thread 0x7f9b0fb72700 (LWP 17645)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b063c97 in SafeTimer::timer_thread() ()<br />#2 0x00007f9b1b06529d in SafeTimerThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 10 (Thread 0x7f9b0f371700 (LWP 17646)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1adebbbf in WBThrottle::get_next_should_flush(boost::tuples::tuple<ghobject_t, std::shared_ptr<FDCache::FD>, WBThrottle::PendingWB, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>*) ()<br />#2 0x00007f9b1adec242 in WBThrottle::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 9 (Thread 0x7f9b0eb70700 (LWP 17647)):<br />#0 0x00007f9b198ba573 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4<br />#1 0x00007f9b198ba447 in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4<br />#2 0x00007f9b198ab078 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br />#3 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br />#4 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br />#5 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br />#6 0x00007f9b198be718 in tc_new () from /lib64/libtcmalloc.so.4<br />#7 0x00007f9b1afc13b4 in ceph::log::Log::create_entry(int, int) ()<br />#8 0x00007f9b1ad236d6 in FileStore::sync_entry() ()<br />#9 0x00007f9b1ad4c45d in FileStore::SyncThread::entry() ()<br />#10 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#11 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 8 (Thread 0x7f9b0e36f700 (LWP 17648)):<br />#0 0x00007f9b19114f4d in __lll_lock_wait () from /lib64/libpthread.so.0<br />#1 0x00007f9b19110d1d in _L_lock_840 () from /lib64/libpthread.so.0<br />#2 0x00007f9b19110c3a in pthread_mutex_lock () from /lib64/libpthread.so.0<br />#3 0x00007f9b1a57b029 in tls_get_addr_tail () from /lib64/ld-linux-x86-64.so.2<br />#4 0x00007f9b198ba849 in GetStackTrace_libunwind(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#5 0x00007f9b198bb0be in GetStackTrace(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#6 0x00007f9b198ac314 in tcmalloc::PageHeap::GrowHeap(unsigned long) () from /lib64/libtcmalloc.so.4<br />#7 0x00007f9b198ac633 in tcmalloc::PageHeap::New(unsigned long) () from /lib64/libtcmalloc.so.4<br />#8 0x00007f9b198aaf64 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br />#9 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br />#10 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br />#11 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br />#12 0x00007f9b198be718 in tc_new () from /lib64/libtcmalloc.so.4<br />#13 0x00007f9b1afc13b4 in ceph::log::Log::create_entry(int, int) ()<br />#14 0x00007f9b1af0c9f7 in FileJournal::do_write(ceph::buffer::list&) ()<br />#15 0x00007f9b1af0e5dd in FileJournal::write_thread_entry() ()<br />#16 0x00007f9b1ad442bd in FileJournal::Writer::entry() ()<br />#17 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#18 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 7 (Thread 0x7f9b0db6e700 (LWP 17649)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1afa9bbc in Finisher::finisher_thread_entry() ()<br />#2 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#3 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 6 (Thread 0x7f9b0d36d700 (LWP 17650)):<br />#0 0x00007f9b19112a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b06b2b2 in ThreadPool::worker(ThreadPool::WorkThread*) ()<br />#2 0x00007f9b1b06c530 in ThreadPool::WorkThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 5 (Thread 0x7f9b0cb6c700 (LWP 17651)):<br />#0 0x00007f9b19112a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b06b2b2 in ThreadPool::worker(ThreadPool::WorkThread*) ()<br />#2 0x00007f9b1b06c530 in ThreadPool::WorkThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 4 (Thread 0x7f9b0c36b700 (LWP 17652)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1afa9bbc in Finisher::finisher_thread_entry() ()<br />#2 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#3 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 3 (Thread 0x7f9b0bb6a700 (LWP 17653)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1afa9bbc in Finisher::finisher_thread_entry() ()<br />#2 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#3 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 2 (Thread 0x7f9b0b369700 (LWP 17654)):<br />#0 0x00007f9b191126d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0<br />#1 0x00007f9b1b063c97 in SafeTimer::timer_thread() ()<br />#2 0x00007f9b1b06529d in SafeTimerThread::entry() ()<br />#3 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#4 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>Thread 1 (Thread 0x7f9b1a781900 (LWP 17625)):<br />#0 0x00007f9b198ba573 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: gpf in tcp_sendpage (Closed)" href="https://tracker.ceph.com/issues/1">#1</a> 0x00007f9b198ba447 in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: BUG at fs/ceph/caps.c:2178 (Closed)" href="https://tracker.ceph.com/issues/2">#2</a> 0x00007f9b198ab078 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: leaked dentry ref on umount (Closed)" href="https://tracker.ceph.com/issues/3">#3</a> 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: lockdep warning in socket code (Closed)" href="https://tracker.ceph.com/issues/4">#4</a> 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br />---Type <return> to continue, or q <return> to quit---<br /><a class="issue tracker-1 status-5 priority-3 priority-lowest closed" title="Bug: ./rados lspools sometimes hangs after listing all pools? 
(Closed)" href="https://tracker.ceph.com/issues/5">#5</a> 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-2 status-6 priority-3 priority-lowest closed" title="Feature: libceph could use a backward-compatible-to function (Rejected)" href="https://tracker.ceph.com/issues/6">#6</a> 0x00007f9b198bd31b in tc_calloc () from /lib64/libtcmalloc.so.4<br /><a class="issue tracker-6 status-3 priority-3 priority-lowest closed" title="Documentation: Document Monitor Commands (Resolved)" href="https://tracker.ceph.com/issues/7">#7</a> 0x00007f9b1a58a9ae in _dl_check_map_versions () from /lib64/ld-linux-x86-64.so.2<br /><a class="issue tracker-3 status-5 priority-4 priority-default closed" title="Support: Document differences from S3 (Closed)" href="https://tracker.ceph.com/issues/8">#8</a> 0x00007f9b1a58dd36 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2<br /><a class="issue tracker-2 status-8 priority-3 priority-lowest closed" title="Feature: Access unimported data (Won't Fix)" href="https://tracker.ceph.com/issues/9">#9</a> 0x00007f9b1a5891b4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2<br /><a class="issue tracker-2 status-3 priority-4 priority-default closed" title="Feature: osd: Replace ALLOW_MESSAGES_FROM macro (Resolved)" href="https://tracker.ceph.com/issues/10">#10</a> 0x00007f9b1a58d1ab in _dl_open () from /lib64/ld-linux-x86-64.so.2<br /><a class="issue tracker-4 status-3 priority-3 priority-lowest closed" title="Cleanup: mds: replace ALLOW_MESSAGES_FROM macro (Resolved)" href="https://tracker.ceph.com/issues/11">#11</a> 0x00007f9b18f0402b in dlopen_doit () from /lib64/libdl.so.2<br /><a class="issue tracker-2 status-3 priority-3 priority-lowest closed" title="Feature: uclient: Make cap handling smarter (Resolved)" href="https://tracker.ceph.com/issues/12">#12</a> 0x00007f9b1a5891b4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2<br 
/><a class="issue tracker-2 status-3 priority-4 priority-default closed parent" title="Feature: uclient: Make readdir use the cache (Resolved)" href="https://tracker.ceph.com/issues/13">#13</a> 0x00007f9b18f0462d in _dlerror_run () from /lib64/libdl.so.2<br /><a class="issue tracker-1 status-10 priority-4 priority-default closed" title="Bug: osd: pg split breaks if not all osds are up (Duplicate)" href="https://tracker.ceph.com/issues/14">#14</a> 0x00007f9b18f040c1 in dlopen@@GLIBC_2.2.5</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67963
2016-03-21T17:51:34Z
Samuel Just
sjust@redhat.com
<ul></ul><p>Thread 8 (Thread 0x7f9b0e36f700 (LWP 17648)):<br />#0 0x00007f9b19114f4d in __lll_lock_wait () from /lib64/libpthread.so.0<br />#1 0x00007f9b19110d1d in _L_lock_840 () from /lib64/libpthread.so.0<br />#2 0x00007f9b19110c3a in pthread_mutex_lock () from /lib64/libpthread.so.0<br />#3 0x00007f9b1a57b029 in tls_get_addr_tail () from /lib64/ld-linux-x86-64.so.2<br />#4 0x00007f9b198ba849 in GetStackTrace_libunwind(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#5 0x00007f9b198bb0be in GetStackTrace(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#6 0x00007f9b198ac314 in tcmalloc::PageHeap::GrowHeap(unsigned long) () from /lib64/libtcmalloc.so.4<br />#7 0x00007f9b198ac633 in tcmalloc::PageHeap::New(unsigned long) () from /lib64/libtcmalloc.so.4<br />#8 0x00007f9b198aaf64 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br />#9 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br />#10 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br />#11 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br />#12 0x00007f9b198be718 in tc_new () from /lib64/libtcmalloc.so.4<br />#13 0x00007f9b1afc13b4 in ceph::log::Log::create_entry(int, int) ()<br />#14 0x00007f9b1af0c9f7 in FileJournal::do_write(ceph::buffer::list&) ()<br />#15 0x00007f9b1af0e5dd in FileJournal::write_thread_entry() ()<br />#16 0x00007f9b1ad442bd in FileJournal::Writer::entry() ()<br />#17 0x00007f9b1910edc5 in start_thread () from /lib64/libpthread.so.0<br />#18 0x00007f9b179b528d in clone () from /lib64/libc.so.6</p>
<p>So it failed to allocate a logging message entry?</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67964
2016-03-21T18:36:50Z
Samuel Just
sjust@redhat.com
<ul></ul><p><a class="external" href="http://osxr.org:8080/glibc/source/elf/dl-tls.c?v=glibc-2.17#0696">http://osxr.org:8080/glibc/source/elf/dl-tls.c?v=glibc-2.17#0696</a></p>
<p>#0 0x00007f9b198ba573 in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4<br />#1 0x00007f9b198ba447 in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4<br />#2 0x00007f9b198ab078 in tcmalloc::CentralFreeList::Populate() () from /lib64/libtcmalloc.so.4<br />#3 0x00007f9b198ab148 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4<br />#4 0x00007f9b198ab1dd in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4<br />#5 0x00007f9b198ae235 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4<br />#6 0x00007f9b198bd31b in tc_calloc () from /lib64/libtcmalloc.so.4<br />#7 0x00007f9b1a58a9ae in _dl_check_map_versions () from /lib64/ld-linux-x86-64.so.2<br />#8 0x00007f9b1a58dd36 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2<br />#9 0x00007f9b1a5891b4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2<br />#10 0x00007f9b1a58d1ab in _dl_open () from /lib64/ld-linux-x86-64.so.2<br />#11 0x00007f9b18f0402b in dlopen_doit () from /lib64/libdl.so.2<br />#12 0x00007f9b1a5891b4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2<br />#13 0x00007f9b18f0462d in _dlerror_run () from /lib64/libdl.so.2<br />#14 0x00007f9b18f040c1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2<br />#15 0x00007f9b1ab3f3fb in ClassHandler::_load_class(ClassHandler::ClassData*) ()<br />#16 0x00007f9b1ab3f7ff in ClassHandler::open_class(std::string const&, ClassHandler::ClassData**) ()<br />#17 0x00007f9b1ab3fad1 in ClassHandler::open_all_classes() ()<br />#18 0x00007f9b1aadc34c in OSD::init() ()<br />#19 0x00007f9b1aa5e1c8 in main ()</p>
<p>Suspicious, tls_get_addr_tail has special handling for when it's called concurrently with dlopen.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67965
2016-03-21T18:39:25Z
Samuel Just
sjust@redhat.com
<ul></ul><p>glibc is 2.17</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67966
2016-03-21T18:41:45Z
Samuel Just
sjust@redhat.com
<ul></ul><p>This looks like a circular lock problem between glibc (dlopen trying to call into tcmalloc) and tcmalloc (calling into glibc via libunwind and blocking on state held by dlopen in tls_get_addr_tail?).</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67967
2016-03-21T18:49:09Z
Samuel Just
sjust@redhat.com
<ul></ul><p><a class="external" href="https://github.com/gperftools/gperftools/releases">https://github.com/gperftools/gperftools/releases</a></p>
<p>I guess in the new version of gperftools they have an --enable-emergency-malloc configure flag.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67968
2016-03-21T18:59:03Z
Samuel Just
sjust@redhat.com
<ul></ul><p>Hmm, more information.</p>
<p>#0 0x00007f9b19114f4d in __lll_lock_wait () from /lib64/libpthread.so.0<br />#1 0x00007f9b19110d1d in _L_lock_840 () from /lib64/libpthread.so.0<br />#2 0x00007f9b19110c3a in pthread_mutex_lock () from /lib64/libpthread.so.0<br />#3 0x00007f9b1a57b029 in tls_get_addr_tail () from /lib64/ld-linux-x86-64.so.2<br />#4 0x00007f9b198ba849 in GetStackTrace_libunwind(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#5 0x00007f9b198bb0be in GetStackTrace(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#6 0x00007f9b198ac314 in tcmalloc::PageHeap::GrowHeap(unsigned long) () from /lib64/libtcmalloc.so.4<br />#7 0x00007f9b198ac633 in tcmalloc::PageHeap::New(unsigned long) () from /lib64/libtcmalloc.so.4</p>
<p>It looks like a crash, but the call to GetStackTrace is in page_heap.cc::RecordGrowth, which appears to be informational, not a crash (it's called in the normal path of PageHeap::GrowHeap). So, tcmalloc here is just fine. It continues to look like a deadlock between threads 8 and 1. Probably time to get a libc master on this.</p>
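<p>To illustrate why a stack trace can show up on a perfectly healthy allocation path, here is a rough sketch of growth bookkeeping. It is not the gperftools code: <code>GrowthRecord</code>, <code>record_growth</code>, <code>grow_heap</code>, <code>heap_lock</code> and <code>growth_log</code> are hypothetical names, and glibc's <code>backtrace()</code> stands in for the libunwind-based <code>GetStackTrace</code>. The point is only that the trace is captured for later reporting, and that it is captured while an allocator lock is held.</p>

```cpp
#include <execinfo.h>  // backtrace(); a glibc extension, used here as a stand-in
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

constexpr int kMaxDepth = 16;

// One entry per heap growth: how much was requested and from where.
struct GrowthRecord {
    std::size_t bytes;
    int depth;
    void* frames[kMaxDepth];
};

std::mutex heap_lock;                  // hypothetical allocator lock
std::vector<GrowthRecord> growth_log;  // hypothetical growth history

// Purely informational: remember the call stack that caused the growth.
void record_growth(std::size_t bytes) {
    GrowthRecord rec{};
    rec.bytes = bytes;
    rec.depth = backtrace(rec.frames, kMaxDepth);
    assert(rec.depth <= kMaxDepth);
    growth_log.push_back(rec);
}

// Normal (non-error) path: the trace is taken with heap_lock held, so
// anything the unwinder blocks on is blocked on while holding that lock.
bool grow_heap(std::size_t bytes) {
    std::lock_guard<std::mutex> guard(heap_lock);
    record_growth(bytes);
    return true;
}
```

<p>Calling <code>grow_heap(4096)</code> once leaves a single record in <code>growth_log</code>. Nothing has gone wrong at that point, which matches the observation that the trace in thread 8 is bookkeeping rather than error handling.</p>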
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=67969
2016-03-21T19:01:12Z
Samuel Just
sjust@redhat.com
<ul><li><strong>Project</strong> changed from <i>teuthology</i> to <i>Ceph</i></li><li><strong>Subject</strong> changed from <i>openstack: SSHException: Key-exchange timed out</i> to <i>Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=68000
2016-03-22T14:42:38Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/68000/diff?detail_id=64970">diff</a>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=68034
2016-03-22T19:54:54Z
Carlos O'Donell
carlos@redhat.com
<ul></ul><p>Samuel Just wrote:</p>
<blockquote>
<p>Hmm, more information.</p>
<p>#0 0x00007f9b19114f4d in __lll_lock_wait () from /lib64/libpthread.so.0<br />#1 0x00007f9b19110d1d in _L_lock_840 () from /lib64/libpthread.so.0<br />#2 0x00007f9b19110c3a in pthread_mutex_lock () from /lib64/libpthread.so.0<br />#3 0x00007f9b1a57b029 in tls_get_addr_tail () from /lib64/ld-linux-x86-64.so.2<br />#4 0x00007f9b198ba849 in GetStackTrace_libunwind(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#5 0x00007f9b198bb0be in GetStackTrace(void**, int, int) () from /lib64/libtcmalloc.so.4<br />#6 0x00007f9b198ac314 in tcmalloc::PageHeap::GrowHeap(unsigned long) () from /lib64/libtcmalloc.so.4<br />#7 0x00007f9b198ac633 in tcmalloc::PageHeap::New(unsigned long) () from /lib64/libtcmalloc.so.4</p>
<p>It looks like a crash, but the call to GetStackTrace is in page_heap.cc::RecordGrowth, which appears to be informational, not a crash (it's called in the normal path of PageHeap::GrowHeap). So, tcmalloc here is just fine. It continues to look like a deadlock between threads 8 and 1. Probably time to get a libc master on this.</p>
</blockquote>
<p>I'm a senior upstream glibc developer and project steward and I'll try to provide some detail here and some possible solutions. I am neither a Ceph nor TCMalloc expert, but some friends have asked me to comment here and I'm very happy to do that. The glibc community is here to help :-)</p>
<p>The deadlock in tcmalloc is most likely because tcmalloc fails to initialize all TLS variables which are required for logging, and this creates a deadlock between the dynamic loader loading a shared library and tcmalloc's own internal spinlocks.</p>
<p>There are many intricate details of how the implementation supports thread local storage, lazy allocation, and dynamic library loading that must all be understood in order to implement a robust and high performance memory allocator. There is no guarantee that all C/C++ code can be run from a malloc given the caller's context. Malloc is used by the entire runtime and may be called in places where it is not safe to dynamically load other libraries (tcmalloc expects this via assert/logging dlopen) or even call other C runtime functions (tcmalloc appears to call anything in the C/C++ runtime). It is sufficiently complex that in upstream we have coined the term "synchronously-reentrant safe" (SR-safe) to refer to those functions which malloc might be able to call safely. We have discussed upstream that libdl's entire API should be SR-safe, to allow interposed mallocs to load libraries that contain helper routines, but nothing further is set in stone. Consider that malloc itself is trying to call back into the same runtime that provides its own services. Extreme care must be taken.</p>
<p>In this particular case it's a lock ordering issue and tcmalloc needs fixing to avoid this problematic ordering.</p>
<p>At a high level I <strong>believe</strong> the deadlock is this:</p>
<pre><code>T1<br /> -> dlopen (hold the load lock for shared library loading consistency)<br /> -> new</code></pre>
<pre><code>T8<br /> -> new<br /> -> tcmalloc::CentralFreeList (lock the free list spinlock)</code></pre>
<pre><code>T1<br /> -> tcmalloc::CentralFreeList (wait on free list spinlock)</code></pre>
<pre><code>T8<br /> -> tls_get_addr_tail (initialize TLS variable)<br /> This TLS variable is likely from tcmalloc (lazy initialized) or libunwind.<br /> -> pthread_mutex_lock (wait on dlopen to finish).<br /> TLS lazy initialization is waiting for the in-progress<br /> dlopen which may impact the correctness of the TLS variable placement.</code></pre>
<p>T1 is waiting for the free list spinlock while holding the dlopen lock.<br />T8 is waiting for the dlopen lock while holding the free list spinlock.</p>
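<p>The inversion described above can be reproduced in miniature. The following is only a sketch, not code from Ceph, glibc or gperftools: <code>load_lock</code>, <code>free_list_lock</code>, <code>InversionResult</code> and <code>demonstrate_inversion</code> are hypothetical names, and timed mutexes stand in for the real locks so that the demonstration gives up after a timeout instead of hanging.</p>

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

struct InversionResult {
    bool t1_got_second;  // did "T1" ever obtain the free list lock?
    bool t8_got_second;  // did "T8" ever obtain the load lock?
};

InversionResult demonstrate_inversion() {
    std::timed_mutex load_lock;       // stands in for the dlopen load lock
    std::timed_mutex free_list_lock;  // stands in for the free list spinlock
    std::atomic<int> holding_first{0};
    std::atomic<int> done_trying{0};
    InversionResult result{true, true};

    auto worker = [&](std::timed_mutex& first, std::timed_mutex& second,
                      bool& got_second) {
        std::lock_guard<std::timed_mutex> hold(first);
        // Wait until both threads hold their first lock (the race window).
        holding_first.fetch_add(1);
        while (holding_first.load() < 2) std::this_thread::yield();
        // With plain lock() this call would block forever.
        got_second = second.try_lock_for(std::chrono::milliseconds(50));
        if (got_second) second.unlock();
        // Keep holding `first` until the peer has finished its attempt too.
        done_trying.fetch_add(1);
        while (done_trying.load() < 2) std::this_thread::yield();
    };

    // T1: load lock first, then the free list lock (dlopen calling calloc).
    std::thread t1([&] { worker(load_lock, free_list_lock, result.t1_got_second); });
    // T8: free list lock first, then the load lock (allocation touching TLS).
    std::thread t8([&] { worker(free_list_lock, load_lock, result.t8_got_second); });
    t1.join();
    t8.join();
    assert(done_trying.load() == 2);
    return result;
}
```

<p>Both flags in the returned <code>InversionResult</code> come back false: each thread holds the lock the other one needs for the whole duration of the other's attempt. Making both threads take the locks in the same order, or making sure the second lock is never needed on that path, removes the cycle.</p>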
<p>I can't reproduce this because in all of my tests RecordGrowth is called early on in the application execution and thus initializes the required TLS variables before other threads run. Something about your mode of execution here means RecordGrowth is not called early. The race window is quite small between the two racing threads, but given enough intervals you'll hit it.</p>
<p>In summary:</p>
<ul>
<li>In tcmalloc it is unsafe to call dlopen and a memory allocation in parallel for the first time from different threads when the allocation would grow the PageHeap (and call RecordGrowth). Subsequent parallel calls are safe as the lazy TLS initialization, part of the logging, is complete and has no dependencies on a consistent dynamic loader state.</li>
</ul>
<p>Workarounds:</p>
<p>(1) Initialize TLS variables early.<br />(1a) Ensure that you load and touch all TLS variables required by the implementation before allowing anything else to proceed e.g. calling the logging early just to exercise the code paths required.<br />(1b) Figure out why RecordGrowth is not called early under the singular starting thread before other threads are started (large thread cache size?), and find a way to get it called.</p>
<p>(2) Simplify the logging in tcmalloc.<br />(2a) Don't collect backtraces. Simplify RecordGrowth by removing the backtrace recording.<br />(2b) Careful with backtraces during error conditions. Simplify the ASSERT macro used in GrowHeap. These will also deadlock.<br />(2c) Avoid `static __thread int recursive;` in gperftools/src/stacktrace_libunwind-inl.h to avoid the TLS initialization.</p>
<p>I would take this upstream to gperftools to talk about the problem.</p>
<p>In glibc's malloc we avoid these problems by using IE-model TLS variables which need no dynamic initialization (a limited, distribution-wide resource) and by limiting the libraries we load and working with their developers (gcc's libgcc_s.so has no TLS and is used for backtracing).</p>
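<p>As a rough illustration of the IE-model approach, a re-entrancy flag can be pinned to the initial-exec TLS model with the GCC/Clang <code>tls_model</code> attribute, so that accessing it does not go through the lazy allocation path. This is a sketch under assumptions, not the gperftools patch: <code>capture_guarded</code> and its callback are hypothetical, and only the name <code>recursive</code> is borrowed from the discussion above.</p>

```cpp
#include <cassert>

// Initial-exec TLS is placed in the static TLS block at load time, so the
// first access from a thread needs no lazy allocation. The attribute is a
// GCC/Clang extension, and static TLS space is a limited resource.
static __thread int recursive __attribute__((tls_model("initial-exec")));

// Hypothetical guard: run `capture` unless this thread is already inside
// a capture, in which case report zero frames instead of re-entering.
int capture_guarded(int (*capture)()) {
    if (recursive) {
        return 0;
    }
    recursive = 1;
    int depth = capture();
    assert(recursive == 1);  // nested calls must not have cleared the flag
    recursive = 0;
    return depth;
}
```

<p>A nested call made from inside <code>capture</code> sees the flag set and returns 0 immediately, which is the behaviour a re-entrancy guard in an unwinder wrapper would want. Whether initial-exec is acceptable for a given library depends on how it is loaded, since a dlopen'ed library competes for the limited static TLS space.</p>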
<p>In the future glibc is going to try to make it easier for users to write custom allocators, particularly by avoiding deferred TLS allocations (doing them at dlopen time). This is desirable because it makes first use of TLS variables async-signal-safe (a problem which we've tried to correct once already), and dlopen can return reasonable errors about out-of-memory conditions (for which there is no standards compliant way to do this when a variable is accessed). We haven't done this yet because it's not easy. The core dynamic loader is at the bottom of a very large software stack and we need to make sure everything keeps working as expected. Until then, you have to understand some intricate details of the runtime APIs you are interposing. Even then we'll likely have a dlopen flag to revert to the old lazy-allocation behaviour, so both must be supported for legacy applications.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=68061
2016-03-23T15:07:43Z
Aliaksei Kandratsenka
alkondratenko@gmail.com
<ul></ul><p>Hi. Upstream gperftools maintainer here.</p>
<p>Indeed, it looks like the thread that calls dlopen took a glibc lock, called into tcmalloc, and is now waiting for one of the central free list locks. So far that's fine. Meanwhile another thread took that central free list lock and triggered heap growth sampling as part of requesting more memory from the kernel. It is capturing a backtrace and, as part of the TLS usage there, tries to take the glibc lock held by the first thread.</p>
<p>One quick recommendation is to link to libtcmalloc_minimal rather than libtcmalloc. The former is just malloc; the latter adds support for malloc sampling, heap growth sampling, the heap checker, the heap profiler, etc. Only the full malloc ever captures backtraces. So if you only need a fast malloc and nothing else, just link to tcmalloc_minimal; it is a tiny bit faster too. Note that, with the exception of this seemingly easily fixable bug, our stack trace capturing in libtcmalloc is believed to be safe (but stack trace capturing in our CPU profiler can still occasionally deadlock for reasons unrelated to this ticket).</p>
<p>It does look like the issue is that the 'recursive' variable isn't initial-exec like the rest of our TLS variables, so I agree with Carlos's opinion above. This is easy to fix. Created a ticket at: <a class="external" href="https://github.com/gperftools/gperftools/issues/786">https://github.com/gperftools/gperftools/issues/786</a></p>
<p>Also if you have any issues with tcmalloc, feel free to escalate to us.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=68343
2016-03-29T15:06:27Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>12</i> to <i>15</i></li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Urgent</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=69054
2016-04-11T21:44:33Z
Yuri Weinstein
yweinste@redhat.com
<ul></ul><p>Possibly the same problem:<br /><a class="external" href="http://qa-proxy.ceph.com/teuthology/teuthology-2016-04-11_02:10:01-upgrade:hammer-x-jewel-distro-basic-vps/120647/teuthology.log">http://qa-proxy.ceph.com/teuthology/teuthology-2016-04-11_02:10:01-upgrade:hammer-x-jewel-distro-basic-vps/120647/teuthology.log</a></p>
<pre>
2016-04-11T04:06:13.432 INFO:tasks.ceph.mon.b:Started
2016-04-11T04:06:13.433 INFO:tasks.ceph:Waiting until ceph osds are all up...
2016-04-11T04:06:13.433 INFO:teuthology.orchestra.run.vpm100:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd dump --format=json'
2016-04-11T04:06:14.671 INFO:tasks.ceph.mon.b.vpm100.stdout:starting mon.b rank 1 at 172.21.2.100:6790/0 mon_data /var/lib/ceph/mon/ceph-b fsid 7afdb138-0cd5-44be-bfdd-49823a70e523
2016-04-11T04:06:15.181 INFO:teuthology.misc.health.vpm100.stderr:2016-04-11 11:06:15.180668 7fde3fbe1700 -1 asok(0x7fde38001680) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.10889.asok': (13) Permission denied
2016-04-11T04:06:15.345 DEBUG:teuthology.misc:4 of 6 OSDs are up
</pre>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=69090
2016-04-12T17:23:00Z
Sage Weil
sage@newdream.net
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-10 priority-5 priority-high3 closed" href="/issues/14457">Bug #14457</a>: tcmalloc oom bug</i> added</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=69744
2016-04-29T03:01:41Z
Haomai Wang
haomaiwang@gmail.com
<ul></ul><p>We have recently been hitting this bug frequently on CentOS 7. The Ceph QA lab seems to hit it only rarely. Could it be related to the tcmalloc or libunwind version?</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=69895
2016-05-03T08:30:06Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Assignee</strong> deleted (<del><i>Loïc Dachary</i></del>)</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=70229
2016-05-09T09:43:58Z
Haomai Wang
haomaiwang@gmail.com
<ul></ul><p>After switching to tcmalloc_minimal, the deadlock has not happened again.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=71133
2016-05-20T16:55:51Z
Sage Weil
sage@newdream.net
<ul></ul><p>hit this on smithi015 (centos7). gperftools-libs-2.4-7.el7.x86_64</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=71443
2016-05-25T16:38:31Z
Ken Dreyer
kdreyer@redhat.com
<ul></ul><p>I've opened <a class="external" href="https://bugzilla.redhat.com/1339710">https://bugzilla.redhat.com/1339710</a> to request that the RHEL 7 gperftools maintainers take that patch ( <a class="external" href="https://github.com/gperftools/gperftools/commit/7852eeb75b9375cf52a7da01be044da6e915dd08">https://github.com/gperftools/gperftools/commit/7852eeb75b9375cf52a7da01be044da6e915dd08</a> ) in their downstream package.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=71459
2016-05-26T09:01:26Z
Loïc Dachary
loic@dachary.org
<ul></ul><p>Daniel reports this happens fairly frequently on RHEL VMs. Although we could time out ceph-osd mkfs and retry, I'm not sure how to tell whether it took too long because of this bug or because of something else. Is it reasonable to assume that ceph-osd mkfs can only time out because of this bug, at least until <a class="external" href="https://bugzilla.redhat.com/show_bug.cgi?id=1339710">https://bugzilla.redhat.com/show_bug.cgi?id=1339710</a> is resolved?</p>
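<p>A minimal sketch of the "time out and retry" idea discussed above, in Python. It is not the code from any Ceph pull request: the helper name <code>run_with_retry</code> and its parameters are hypothetical, and the command to run is left to the caller. As noted in the comment, a timeout alone cannot show why the command was slow, so this only bounds the wait.</p>

```python
import subprocess
import sys


def run_with_retry(cmd, timeout_s, attempts):
    """Run cmd, killing and retrying it if it exceeds timeout_s seconds.

    Hypothetical helper for illustration only. Returns the CompletedProcess
    of the first attempt that finishes in time, or re-raises
    subprocess.TimeoutExpired once the last attempt has timed out.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child and raises when the timeout expires.
            return subprocess.run(cmd, timeout=timeout_s, check=False)
        except subprocess.TimeoutExpired as err:
            last_error = err
            print(f"attempt {attempt}/{attempts} timed out after {timeout_s}s",
                  file=sys.stderr)
    raise last_error
```

<p>A caller would pass the blocking command as a list of arguments together with a timeout it considers generous, and treat a final <code>TimeoutExpired</code> as "possibly this bug, possibly something else".</p>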
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=71469
2016-05-26T11:00:02Z
Loïc Dachary
loic@dachary.org
<ul></ul><p>When ceph-osd --mkfs blocks because of this, <a class="external" href="https://github.com/ceph/ceph/pull/9343">https://github.com/ceph/ceph/pull/9343</a> may be an acceptable workaround</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=71860
2016-06-01T13:41:13Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-3 priority-4 priority-default closed" href="/issues/16103">Bug #16103</a>: ceph-disk: workaround gperftool hang</i> added</li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=81586
2016-11-18T20:11:45Z
Ken Dreyer
kdreyer@redhat.com
<ul></ul><p>RHBZ #1339710 is resolved in RHEL 7.3, and CentOS 7.3 (in progress) will have the updated gperftools as well.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=82213
2016-11-29T23:44:04Z
Sage Weil
sage@newdream.net
<ul></ul><p>This appears to trigger very frequently with ceph-objectstore-tool. :(</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=82262
2016-11-30T22:33:08Z
Ken Dreyer
kdreyer@redhat.com
<ul></ul><p>David copied CentOS CR's gperftools packages to lab-extras today in order to test. (see ticket <a class="issue tracker-4 status-3 priority-3 priority-lowest closed" title="Cleanup: Remove gperftools from lab-extras when CentOS 7.3 is released (Resolved)" href="https://tracker.ceph.com/issues/18094">#18094</a>)</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=82498
2016-12-04T14:56:22Z
Sage Weil
sage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>Urgent</i> to <i>Immediate</i></li></ul><p>quick way to tell if this is (probably) the root cause:</p>
<p>grep ceph-objectstore-tool teuthology.log<br />verify that the last machine is centos</p>
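<p>The check above could be scripted roughly as follows. This is a sketch under assumptions: it expects the host name to appear in the logger name as <code>teuthology.orchestra.run.HOST</code>, as in the log excerpt earlier in this thread, and the function name is made up for illustration. It only finds the last host that ran ceph-objectstore-tool; confirming that the host runs CentOS is still a manual step.</p>

```python
import re

# Assumed log layout: "... INFO:teuthology.orchestra.run.<host>:Running: ..."
HOST_RE = re.compile(r"teuthology\.orchestra\.run\.([^:.\s]+)")


def last_objectstore_tool_host(lines):
    """Return (line, host) for the last log line mentioning ceph-objectstore-tool.

    host is None when the line does not carry a recognisable host name;
    (None, None) is returned when no line mentions the tool at all.
    """
    last_line = None
    for line in lines:
        if "ceph-objectstore-tool" in line:
            last_line = line.rstrip("\n")
    if last_line is None:
        return None, None
    match = HOST_RE.search(last_line)
    return last_line, (match.group(1) if match else None)
```

<p>Typical use would be <code>with open("teuthology.log") as f: print(last_objectstore_tool_host(f))</code>.</p>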
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=82624
2016-12-06T22:13:59Z
Sage Weil
sage@newdream.net
<ul></ul><p>Just saw this on xenial! oh goodie.</p>
<pre>
ii libgoogle-perftools4 2.4-0ubuntu5 amd64 libraries for CPU and heap analysis, plus an efficient thread-caching malloc
ii libtcmalloc-minimal4 2.4-0ubuntu5 amd64 efficient thread-caching malloc
</pre>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=82626
2016-12-06T23:02:31Z
Sage Weil
sage@newdream.net
<ul></ul><p><a class="external" href="https://bugs.launchpad.net/ubuntu/+source/google-perftools/+bug/1647864">https://bugs.launchpad.net/ubuntu/+source/google-perftools/+bug/1647864</a></p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=83133
2016-12-19T22:10:47Z
Sage Weil
sage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>Immediate</i> to <i>Urgent</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=83193
2016-12-20T14:33:25Z
Sage Weil
sage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>Urgent</i> to <i>Immediate</i></li></ul><p><a class="external" href="http://pulpito.ceph.com/sage-2016-12-20_03:05:39-rados-wip-sage-testing---basic-smithi/">http://pulpito.ceph.com/sage-2016-12-20_03:05:39-rados-wip-sage-testing---basic-smithi/</a></p>
<p>8 hung jobs out of 236 on my last rados run! :(</p>
<p>xenial and centos both.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=86053
2017-02-14T14:34:37Z
Sage Weil
sage@newdream.net
<ul></ul><p>still see this on xenial: /a/sage-2017-02-14_06:55:13-rados-wip-pg-split-interval---basic-smithi/813242</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=86073
2017-02-14T20:33:00Z
Nathan Cutler
ncutler@suse.cz
<ul></ul><p>That this is occurring on xenial is disturbing, because it seems to have gperftools 2.4:</p>
<pre>
(virtualenv) ubuntu@teuthology:~$ apt search libgoogle-perftools4
Sorting... Done
Full Text Search... Done
libgoogle-perftools4/xenial,now 2.4-0ubuntu5 amd64 [installed,automatic]
libraries for CPU and heap analysis, plus an efficient thread-caching malloc
</pre>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=86641
2017-02-23T15:23:25Z
Sage Weil
sage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>Immediate</i> to <i>Urgent</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=89734
2017-04-19T15:07:29Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>15</i> to <i>Resolved</i></li></ul>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=90553
2017-04-28T23:39:53Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>15</i></li></ul><p>just saw this again on xenial:</p>
<pre>
Thread 4 (Thread 0x7fb16d3af700 (LWP 93913)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007fb170e81e92 in __GI___pthread_mutex_lock (mutex=0x7fb1729bc948 <_rtld_global+2312>) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007fb1727a8eed in tls_get_addr_tail (ti=0x7fb171c83c30, dtv=0x55bd3d224490, the_map=0x7fb1729a6000) at dl-tls.c:765
#3 0x00007fb171a6d459 in ?? () from /usr/lib/libtcmalloc.so.4
#4 0x00007fb171a6dcee in GetStackTrace(void**, int, int) () from /usr/lib/libtcmalloc.so.4
#5 0x00007fb171a5f0c0 in tcmalloc::PageHeap::GrowHeap(unsigned long) () from /usr/lib/libtcmalloc.so.4
#6 0x00007fb171a5f423 in tcmalloc::PageHeap::New(unsigned long) () from /usr/lib/libtcmalloc.so.4
#7 0x00007fb171a5dd34 in tcmalloc::CentralFreeList::Populate() () from /usr/lib/libtcmalloc.so.4
#8 0x00007fb171a5df28 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /usr/lib/libtcmalloc.so.4
#9 0x00007fb171a5dfbf in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /usr/lib/libtcmalloc.so.4
#10 0x00007fb171a60faa in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /usr/lib/libtcmalloc.so.4
#11 0x00007fb171a52289 in ?? () from /usr/lib/libtcmalloc.so.4
#12 0x00007fb171a72d52 in tc_posix_memalign () from /usr/lib/libtcmalloc.so.4
#13 0x000055bd32dd3eb5 in ceph::buffer::list::append(char const*, unsigned int) ()
#14 0x000055bd32df9656 in GetdescsHook::call(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> > > > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list&) ()
#15 0x000055bd32df632b in AdminSocket::do_accept() ()
#16 0x000055bd32df78d8 in AdminSocket::entry() ()
#17 0x00007fb170e7f70a in start_thread (arg=0x7fb16d3af700) at pthread_create.c:333
#18 0x00007fb16fef682d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 3 (Thread 0x7fb16dbb0700 (LWP 93912)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x000055bd32efe1f0 in CephContextServiceThread::entry() ()
#2 0x00007fb170e7f70a in start_thread (arg=0x7fb16dbb0700) at pthread_create.c:333
#3 0x00007fb16fef682d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 2 (Thread 0x7fb16eb4c700 (LWP 93911)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x000055bd32e1fc7b in ceph::logging::Log::entry() ()
#2 0x00007fb170e7f70a in start_thread (arg=0x7fb16eb4c700) at pthread_create.c:333
#3 0x00007fb16fef682d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (Thread 0x7fb17299bc80 (LWP 93910)):
#0 0x00007fb171a6d159 in base::internal::SpinLockDelay(int volatile*, int, int) () from /usr/lib/libtcmalloc.so.4
#1 0x00007fb171a6d026 in SpinLock::SlowLock() () from /usr/lib/libtcmalloc.so.4
#2 0x00007fb171a5de38 in tcmalloc::CentralFreeList::Populate() () from /usr/lib/libtcmalloc.so.4
#3 0x00007fb171a5df28 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /usr/lib/libtcmalloc.so.4
#4 0x00007fb171a5dfbf in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /usr/lib/libtcmalloc.so.4
#5 0x00007fb171a60faa in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long, unsigned long) () from /usr/lib/libtcmalloc.so.4
#6 0x00007fb171a72a4b in tc_calloc () from /usr/lib/libtcmalloc.so.4
#7 0x00007fb1727a04fa in do_lookup_unique (undef_map=0x55bd3d1c9700, ref=0x7fb16c5a9028, strtab=0x55bd324ca418 "", sym=<optimized out>, type_class=4, result=0x7ffc82b1bf80, map=0x7fb1729bd168, new_hash=2818826075,
undef_name=0x7fb16c6331b5 "_ZZN5boost9function1IvdE9assign_toINS_3_bi6bind_tIvNS_4_mfi3mf1IvN11json_spirit16Semantic_actionsINS7_10Value_implINS7_10Config_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEENS_6spirit7c"...) at dl-lookup.c:268
#8 do_lookup_x (undef_name=undef_name@entry=0x7fb16c6331b5 "_ZZN5boost9function1IvdE9assign_toINS_3_bi6bind_tIvNS_4_mfi3mf1IvN11json_spirit16Semantic_actionsINS7_10Value_implINS7_10Config_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEENS_6spirit7c"..., new_hash=new_hash@entry=2818826075,
old_hash=old_hash@entry=0x7ffc82b1bf70, ref=0x7fb16c5a9028, result=result@entry=0x7ffc82b1bf80, scope=<optimized out>, i=<optimized out>, version=0x0, flags=1, skip=0x0, type_class=4, undef_map=0x55bd3d1c9700) at dl-lookup.c:540
#9 0x00007fb1727a094f in _dl_lookup_symbol_x (undef_name=0x7fb16c6331b5 "_ZZN5boost9function1IvdE9assign_toINS_3_bi6bind_tIvNS_4_mfi3mf1IvN11json_spirit16Semantic_actionsINS7_10Value_implINS7_10Config_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEENS_6spirit7c"...,
undef_map=undef_map@entry=0x55bd3d1c9700, ref=ref@entry=0x7ffc82b1c0d0, symbol_scope=symbol_scope@entry=0x55bd3d1c9a58, version=0x0, type_class=4, flags=1, skip_map=0x0) at dl-lookup.c:829
#10 0x00007fb1727a25ad in elf_machine_rela (skip_ifunc=0, reloc_addr_arg=0x7fb16c9616a8, version=<optimized out>, sym=0x7fb16c5a9028, reloc=0x7fb16c669a70, map=0x55bd3d1c9700) at ../sysdeps/x86_64/dl-machine.h:301
#11 elf_dynamic_do_Rela (skip_ifunc=0, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, map=0x55bd3d1c9700) at do-rel.h:137
#12 _dl_relocate_object (scope=<optimized out>, reloc_mode=reloc_mode@entry=0, consider_profiling=<optimized out>, consider_profiling@entry=0) at dl-reloc.c:258
#13 0x00007fb1727ab681 in dl_open_worker (a=a@entry=0x7ffc82b1c3a0) at dl-open.c:435
#14 0x00007fb1727a6394 in _dl_catch_error (objname=objname@entry=0x7ffc82b1c390, errstring=errstring@entry=0x7ffc82b1c398, mallocedp=mallocedp@entry=0x7ffc82b1c38f, operate=operate@entry=0x7fb1727ab300 <dl_open_worker>, args=args@entry=0x7ffc82b1c3a0) at dl-error.c:187
#15 0x00007fb1727aabd9 in _dl_open (file=0x55bd3d16dcc0 "/usr/lib/ceph/erasure-code/libec_lrc.so", mode=-2147483646,
caller_dlopen=0x55bd3323b016 <ceph::ErasureCodePluginRegistry::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::ErasureCodePlugin**, std::ostream*)+406>, nsid=-2,
argc=<optimized out>, argv=<optimized out>, env=0x55bd3d07a000) at dl-open.c:660
#16 0x00007fb170c74f09 in dlopen_doit (a=a@entry=0x7ffc82b1c5d0) at dlopen.c:66
#17 0x00007fb1727a6394 in _dl_catch_error (objname=0x55bd3d07e0b0, errstring=0x55bd3d07e0b8, mallocedp=0x55bd3d07e0a8, operate=0x7fb170c74eb0 <dlopen_doit>, args=0x7ffc82b1c5d0) at dl-error.c:187
#18 0x00007fb170c75571 in _dlerror_run (operate=operate@entry=0x7fb170c74eb0 <dlopen_doit>, args=args@entry=0x7ffc82b1c5d0) at dlerror.c:163
#19 0x00007fb170c74fa1 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#20 0x000055bd3323b016 in ceph::ErasureCodePluginRegistry::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::ErasureCodePlugin**, std::ostream*) ()
#21 0x000055bd3323b6b5 in ceph::ErasureCodePluginRegistry::preload(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::ostream*) ()
#22 0x000055bd32dce3f6 in global_init_preload_erasure_code(CephContext const*) ()
#23 0x000055bd327a7c41 in main ()
(gdb) q
A debugging session is active.
Inferior 1 [process 93910] will be detached.
Quit anyway? (y or n) y
Detaching from program: /usr/bin/ceph-osd, process 93910
root@smithi088:~# dpkg -l | grep tcmal
ii libtcmalloc-minimal4 2.4-0ubuntu5 amd64 efficient thread-caching malloc
</pre>
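<p>One way to read the backtrace above, in line with the issue title: Thread 1 is inside dlopen and spins on a tcmalloc lock in tc_calloc, while Thread 4 is inside tcmalloc (CentralFreeList::Populate, then GrowHeap, then GetStackTrace) and blocks on a mutex in _rtld_global. If each thread holds the lock the other one wants, neither can make progress. The sketch below reproduces only that shape with two plain Python locks; the lock names are illustrative stand-ins, and nothing here touches tcmalloc or the dynamic loader. Timed acquires are used so that the demonstration reports the inversion instead of hanging.</p>

```python
import threading


def demo_lock_inversion(timeout_s=0.5):
    """Show two threads taking the same two locks in opposite order."""
    loader_lock = threading.Lock()  # stand-in for the dynamic loader's lock
    heap_lock = threading.Lock()    # stand-in for the allocator's lock
    sync = threading.Barrier(2)     # reusable rendezvous point for both threads
    acquired_second = {}

    def worker(name, first, second):
        with first:
            sync.wait()  # both threads now hold their first lock
            got = second.acquire(timeout=timeout_s)
            acquired_second[name] = got
            sync.wait()  # keep holding `first` until both attempts are over
            if got:
                second.release()

    threads = [
        # dlopen path: loader lock first, then needs the allocator
        threading.Thread(target=worker,
                         args=("dlopen_thread", loader_lock, heap_lock)),
        # malloc path: allocator lock first, then needs the loader
        threading.Thread(target=worker,
                         args=("malloc_thread", heap_lock, loader_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acquired_second
```

<p>Both entries in the returned dictionary come back <code>False</code>: with untimed locks, as in the real process, the two threads would wait on each other forever.</p>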
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=90579
2017-05-01T15:40:59Z
Sage Weil
sage@newdream.net
<ul></ul><p>xenial package presumably does not backport gperftools commit 7852eeb75b9375cf52a7da01be044da6e915dd08 like the rhel/centos package does.</p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=91175
2017-05-17T15:35:41Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>15</i> to <i>Resolved</i></li></ul><pre><jamespage> sage: hey - that google-perftools fix is now in Xenial - hopefully that should resolve the testing issues you've been seeing!
<jamespage> (had a post summit blitz on SRU's this week)</pre>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=91473
2017-05-24T02:46:17Z
Sage Weil
sage@newdream.net
<ul></ul><p>just saw this on smithi081, which had 2.4-0ubuntu5, but apt update + apt install libgoogle-perftools4 upgraded to 2.4-0ubuntu5.16.04.1.</p>
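<p>A small sketch for spotting nodes that still carry the older package, working on the text of <code>dpkg -l</code> as pasted earlier in this thread. The set of "affected" versions contains only the one version reported here as hitting the hang (2.4-0ubuntu5); treating 2.4-0ubuntu5.16.04.1 as fixed is an assumption based on the comments above, and the function name is hypothetical.</p>

```python
# Version reported in this thread on machines that hit the hang.
KNOWN_AFFECTED_VERSIONS = {"2.4-0ubuntu5"}
PACKAGE = "libgoogle-perftools4"


def affected_perftools(dpkg_lines):
    """Return (name, version) pairs for installed, known-affected packages.

    dpkg_lines is an iterable of lines in `dpkg -l` format:
    status, package name, version, architecture, description.
    """
    hits = []
    for line in dpkg_lines:
        fields = line.split()
        if len(fields) < 3 or fields[0] != "ii":
            continue  # skip headers and packages that are not installed
        name = fields[1].split(":")[0]  # drop an ":amd64"-style suffix
        version = fields[2]
        if name == PACKAGE and version in KNOWN_AFFECTED_VERSIONS:
            hits.append((name, version))
    return hits
```

<p>Feeding it the output of <code>dpkg -l</code> from each test node gives an empty list once the node has picked up the updated package.</p>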
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=91485
2017-05-24T14:51:14Z
David Galloway
<ul></ul><p>Sage Weil wrote:</p>
<blockquote>
<p>just saw this on smithi081, which had 2.4-0ubuntu5, but apt update + apt install libgoogle-perftools4 upgraded to 2.4-0ubuntu5.16.04.1.</p>
</blockquote>
<p><a class="external" href="https://github.com/ceph/ceph-cm-ansible/pull/317">https://github.com/ceph/ceph-cm-ansible/pull/317</a></p>
Ceph - Bug #13522: Apparent deadlock between tcmalloc getting a stacktrace and dlopen allocating memory
https://tracker.ceph.com/issues/13522?journal_id=91912
2017-05-31T19:40:00Z
David Galloway
<ul></ul><p>David Galloway wrote:</p>
<blockquote>
<p>Sage Weil wrote:</p>
<blockquote>
<p>just saw this on smithi081, which had 2.4-0ubuntu5, but apt update + apt install libgoogle-perftools4 upgraded to 2.4-0ubuntu5.16.04.1.</p>
</blockquote>
<p><a class="external" href="https://github.com/ceph/ceph-cm-ansible/pull/317">https://github.com/ceph/ceph-cm-ansible/pull/317</a></p>
</blockquote>
<p>Merged</p>