Actions
Bug #9768
closedceph-osd mkfs hangs
% Done:
0%
Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Workaround for Firefly <= 0.80.7¶
If it shows with
# dpkg -l | grep aio ii libaio1:amd64 0.3.109-4 amd64 Linux kernel AIO access library - shared library $ ceph --version ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae) $ uname -a Linux g3 3.14-2-amd64 #1 SMP Debian 3.14.15-2 (2014-08-09) x86_64 GNU/Linux
It can be worked around with the following in ceph.conf
journal_aio = false
It can be removed once the ceph-disk activate has run. Restarting the OSD will have it use AIO.
Description¶
$ ceph-disk prepare --mkfs ext4 /dev/sdc $ ps fauwwwx | grep mkfs root 366 0.0 0.0 40672 1548 ? Ss Oct10 0:00 udevd --daemon root 21963 0.0 0.0 40644 1360 ? S 13:57 0:00 \_ udevd --daemon root 21988 0.0 0.0 4308 636 ? S 13:57 0:00 | \_ /bin/sh /usr/sbin/ceph-disk-activate /dev/sdc1 root 21991 0.2 0.0 48736 9968 ? S 13:57 0:00 | \_ /usr/bin/python /usr/sbin/ceph-disk activate /dev/sdc1 root 22082 0.0 0.0 192040 8820 ? Sl 13:57 0:00 | \_ /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 6 --monmap /var/lib/ceph/tmp/mnt.5mDa2j/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.5mDa2j --osd-journal /var/lib/ceph/tmp/mnt.5mDa2j/journal --osd-uuid 0b27732f-bee9-4212-9395-bad5eaeffc37 --keyring /var/lib/ceph/tmp/mnt.5mDa2j/keyring
The log with debug_filestore=20
2014-10-14 14:23:05.600868 7fd3f68c47c0 0 ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae), process ceph-osd, pid 1609 2014-10-14 14:23:05.602648 7fd3f68c47c0 1 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) mkfs in /var/lib/ceph/tmp/mnt.9lLCnp 2014-10-14 14:23:05.602666 7fd3f68c47c0 1 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) mkfs fsid is already set to cd654a0a-08d7-4ff8-a79d-372c73ae1f3a 2014-10-14 14:23:05.647609 7fd3f68c47c0 1 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) leveldb db exists/created 2014-10-14 14:23:05.649577 7fd3f68c47c0 1 journal _open /var/lib/ceph/tmp/mnt.9lLCnp/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:23:05.649934 7fd3f68c47c0 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected cd654a0a-08d7-4ff8-a79d-372c73ae1f3a, invalid (someone else's?) journal 2014-10-14 14:23:05.651841 7fd3f68c47c0 1 journal _open /var/lib/ceph/tmp/mnt.9lLCnp/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:23:05.655205 7fd3f68c47c0 0 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) mkjournal created journal on /var/lib/ceph/tmp/mnt.9lLCnp/journal 2014-10-14 14:23:05.655236 7fd3f68c47c0 1 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) mkfs done in /var/lib/ceph/tmp/mnt.9lLCnp 2014-10-14 14:23:05.662726 7fd3f68c47c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.9lLCnp) detect_features: FIEMAP ioctl is supported and appears to work 2014-10-14 14:23:05.662746 7fd3f68c47c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.9lLCnp) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option 2014-10-14 14:23:05.667236 7fd3f68c47c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.9lLCnp) detect_features: syncfs(2) syscall fully supported (by glibc and kernel) 2014-10-14 14:23:05.691436 7fd3f68c47c0 0 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) limited size xattrs 2014-10-14 14:23:05.704333 7fd3f68c47c0 0 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled 2014-10-14 14:23:05.706694 7fd3f68c47c0 1 journal _open /var/lib/ceph/tmp/mnt.9lLCnp/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:23:05.709404 7fd3f68c47c0 1 journal _open /var/lib/ceph/tmp/mnt.9lLCnp/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:23:05.709841 7fd3f68c47c0 -1 filestore(/var/lib/ceph/tmp/mnt.9lLCnp) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory 2014-10-14 14:25:14.161524 7f95042a87c0 0 ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae), process ceph-osd, pid 3052 2014-10-14 14:25:14.163866 7f95042a87c0 1 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) mkfs in /var/lib/ceph/tmp/mnt.8GTDCM 2014-10-14 14:25:14.163884 7f95042a87c0 1 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) mkfs fsid is already set to 5a9ad874-ee9e-4b92-8607-c32c23eb7e83 2014-10-14 14:25:14.199386 7f95042a87c0 1 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) leveldb db exists/created 2014-10-14 14:25:14.201252 7f95042a87c0 1 journal _open /var/lib/ceph/tmp/mnt.8GTDCM/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:25:14.201574 7f95042a87c0 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 5a9ad874-ee9e-4b92-8607-c32c23eb7e83, invalid (someone else's?) journal 2014-10-14 14:25:14.203518 7f95042a87c0 1 journal _open /var/lib/ceph/tmp/mnt.8GTDCM/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:25:14.206768 7f95042a87c0 0 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) mkjournal created journal on /var/lib/ceph/tmp/mnt.8GTDCM/journal 2014-10-14 14:25:14.206801 7f95042a87c0 1 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) mkfs done in /var/lib/ceph/tmp/mnt.8GTDCM 2014-10-14 14:25:14.213720 7f95042a87c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.8GTDCM) detect_features: FIEMAP ioctl is supported and appears to work 2014-10-14 14:25:14.213748 7f95042a87c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.8GTDCM) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option 2014-10-14 14:25:14.225341 7f95042a87c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.8GTDCM) detect_features: syncfs(2) syscall fully supported (by glibc and kernel) 2014-10-14 14:25:14.245278 7f95042a87c0 0 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) limited size xattrs 2014-10-14 14:25:14.258212 7f95042a87c0 0 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled 2014-10-14 14:25:14.260042 7f95042a87c0 1 journal _open /var/lib/ceph/tmp/mnt.8GTDCM/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:25:14.261695 7f95042a87c0 1 journal _open /var/lib/ceph/tmp/mnt.8GTDCM/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 14:25:14.261955 7f95042a87c0 -1 filestore(/var/lib/ceph/tmp/mnt.8GTDCM) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory 2014-10-14 15:22:59.330720 7f5f022c17c0 0 ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae), process ceph-osd, pid 28482 2014-10-14 15:22:59.332796 7f5f022c17c0 1 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) mkfs in /var/lib/ceph/tmp/mnt.GbLkrx 2014-10-14 15:22:59.332819 7f5f022c17c0 1 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) mkfs fsid is already set to 9b224aad-1365-4d35-95a6-3eb59041c4ab 2014-10-14 15:22:59.363696 7f5f022c17c0 1 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) leveldb db exists/created 2014-10-14 15:22:59.365460 7f5f022c17c0 1 journal _open /var/lib/ceph/tmp/mnt.GbLkrx/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:22:59.365789 7f5f022c17c0 -1 journal check: ondisk fsid 5a9ad874-ee9e-4b92-8607-c32c23eb7e83 doesn't match expected 9b224aad-1365-4d35-95a6-3eb59041c4ab, invalid (someone else's?) journal 2014-10-14 15:22:59.367804 7f5f022c17c0 1 journal _open /var/lib/ceph/tmp/mnt.GbLkrx/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:22:59.371068 7f5f022c17c0 0 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) mkjournal created journal on /var/lib/ceph/tmp/mnt.GbLkrx/journal 2014-10-14 15:22:59.371095 7f5f022c17c0 1 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) mkfs done in /var/lib/ceph/tmp/mnt.GbLkrx 2014-10-14 15:22:59.385444 7f5f022c17c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.GbLkrx) detect_features: FIEMAP ioctl is supported and appears to work 2014-10-14 15:22:59.385466 7f5f022c17c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.GbLkrx) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option 2014-10-14 15:22:59.398287 7f5f022c17c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.GbLkrx) detect_features: syncfs(2) syscall fully supported (by glibc and kernel) 2014-10-14 15:22:59.415874 7f5f022c17c0 0 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) limited size xattrs 2014-10-14 15:22:59.427902 7f5f022c17c0 0 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled 2014-10-14 15:22:59.429929 7f5f022c17c0 1 journal _open /var/lib/ceph/tmp/mnt.GbLkrx/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:22:59.432652 7f5f022c17c0 1 journal _open /var/lib/ceph/tmp/mnt.GbLkrx/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:22:59.432908 7f5f022c17c0 -1 filestore(/var/lib/ceph/tmp/mnt.GbLkrx) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory 2014-10-14 15:27:04.082152 7f5efdff2700 1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7f5ef9fea700' had timed out after 4 2014-10-14 15:27:04.082187 7f5efdff2700 1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7f5efa7eb700' had timed out after 4 2014-10-14 15:27:04.084983 7f5ef9fea700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f5ef9fea700' had timed out after 4 2014-10-14 15:27:04.085017 7f5efa7eb700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f5efa7eb700' had timed out after 4 2014-10-14 15:35:39.632678 7f32b27be7c0 0 ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae), process ceph-osd, pid 530 2014-10-14 15:35:39.634200 7f32b27be7c0 10 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) dump_stop 2014-10-14 15:35:39.634648 7f32b27be7c0 1 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mkfs in /var/lib/ceph/tmp/mnt.8uRLPI 2014-10-14 15:35:39.634676 7f32b27be7c0 1 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mkfs fsid is already set to 3043b7af-d7a0-45d7-bcd1-5fa8e01def54 2014-10-14 15:35:39.668382 7f32b27be7c0 1 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) leveldb db exists/created 2014-10-14 15:35:39.668400 7f32b27be7c0 10 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) open_journal at /var/lib/ceph/tmp/mnt.8uRLPI/journal 2014-10-14 15:35:39.668450 7f32b27be7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5367661056) 2014-10-14 15:35:39.670276 7f32b27be7c0 20 journal _check_disk_write_cache: disk write cache is on, but your kernel is new enough to handle it correctly. (fn:/var/lib/ceph/tmp/mnt.8uRLPI/journal) 2014-10-14 15:35:39.670454 7f32b27be7c0 1 journal _open /var/lib/ceph/tmp/mnt.8uRLPI/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:35:39.670471 7f32b27be7c0 10 journal read_header 2014-10-14 15:35:39.670779 7f32b27be7c0 10 journal header: block_size 4096 alignment 4096 max_size 5367660544 2014-10-14 15:35:39.670795 7f32b27be7c0 10 journal header: start 4096 2014-10-14 15:35:39.670798 7f32b27be7c0 10 journal write_pos 0 2014-10-14 15:35:39.670802 7f32b27be7c0 -1 journal check: ondisk fsid 9b224aad-1365-4d35-95a6-3eb59041c4ab doesn't match expected 3043b7af-d7a0-45d7-bcd1-5fa8e01def54, invalid (someone else's?) journal 2014-10-14 15:35:39.670896 7f32b27be7c0 2 journal create /var/lib/ceph/tmp/mnt.8uRLPI/journal fsid 3043b7af-d7a0-45d7-bcd1-5fa8e01def54 2014-10-14 15:35:39.670917 7f32b27be7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5367661056) 2014-10-14 15:35:39.672612 7f32b27be7c0 20 journal _check_disk_write_cache: disk write cache is on, but your kernel is new enough to handle it correctly. (fn:/var/lib/ceph/tmp/mnt.8uRLPI/journal) 2014-10-14 15:35:39.672805 7f32b27be7c0 1 journal _open /var/lib/ceph/tmp/mnt.8uRLPI/journal fd 11: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:35:39.672815 7f32b27be7c0 10 journal header: block_size 4096 alignment 4096 max_size 5367660544 2014-10-14 15:35:39.672818 7f32b27be7c0 10 journal header: start 4096 2014-10-14 15:35:39.672824 7f32b27be7c0 10 journal write_pos 0 2014-10-14 15:35:39.675994 7f32b27be7c0 2 journal create done 2014-10-14 15:35:39.676052 7f32b27be7c0 0 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mkjournal created journal on /var/lib/ceph/tmp/mnt.8uRLPI/journal 2014-10-14 15:35:39.676094 7f32b27be7c0 1 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mkfs done in /var/lib/ceph/tmp/mnt.8uRLPI 2014-10-14 15:35:39.676105 7f32b27be7c0 5 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) basedir /var/lib/ceph/tmp/mnt.8uRLPI journal /var/lib/ceph/tmp/mnt.8uRLPI/journal 2014-10-14 15:35:39.676121 7f32b27be7c0 10 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mount fsid is 3043b7af-d7a0-45d7-bcd1-5fa8e01def54 2014-10-14 15:35:39.683189 7f32b27be7c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.8uRLPI) detect_features: FIEMAP ioctl is supported and appears to work 2014-10-14 15:35:39.683211 7f32b27be7c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.8uRLPI) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option 2014-10-14 15:35:39.688255 7f32b27be7c0 0 genericfilestorebackend(/var/lib/ceph/tmp/mnt.8uRLPI) detect_features: syncfs(2) syscall fully supported (by glibc and kernel) 2014-10-14 15:35:39.705087 7f32b27be7c0 0 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) limited size xattrs 2014-10-14 15:35:39.705166 7f32b27be7c0 5 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mount op_seq is 1 2014-10-14 15:35:39.723568 7f32b27be7c0 20 filestore (init)dbobjectmap: seq is 1 2014-10-14 15:35:39.723580 7f32b27be7c0 10 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) open_journal at /var/lib/ceph/tmp/mnt.8uRLPI/journal 2014-10-14 15:35:39.723602 7f32b27be7c0 0 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled 2014-10-14 15:35:39.723606 7f32b27be7c0 10 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) list_collections 2014-10-14 15:35:39.723656 7f32b27be7c0 10 journal journal_replay fs op_seq 1 2014-10-14 15:35:39.723660 7f32b27be7c0 2 journal open /var/lib/ceph/tmp/mnt.8uRLPI/journal fsid 3043b7af-d7a0-45d7-bcd1-5fa8e01def54 fs_op_seq 1 2014-10-14 15:35:39.723676 7f32b27be7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5367661056) 2014-10-14 15:35:39.723688 7f32ad4ed700 20 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) sync_entry waiting for max_interval 5.000000 2014-10-14 15:35:39.725499 7f32b27be7c0 20 journal _check_disk_write_cache: disk write cache is on, but your kernel is new enough to handle it correctly. (fn:/var/lib/ceph/tmp/mnt.8uRLPI/journal) 2014-10-14 15:35:39.725688 7f32b27be7c0 1 journal _open /var/lib/ceph/tmp/mnt.8uRLPI/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:35:39.725704 7f32b27be7c0 10 journal read_header 2014-10-14 15:35:39.725952 7f32b27be7c0 10 journal header: block_size 4096 alignment 4096 max_size 5367660544 2014-10-14 15:35:39.725971 7f32b27be7c0 10 journal header: start 4096 2014-10-14 15:35:39.725974 7f32b27be7c0 10 journal write_pos 4096 2014-10-14 15:35:39.725977 7f32b27be7c0 10 journal open header.fsid = 3043b7af-d7a0-45d7-bcd1-5fa8e01def54 2014-10-14 15:35:39.726066 7f32b27be7c0 2 journal No further valid entries found, journal is most likely valid 2014-10-14 15:35:39.726068 7f32b27be7c0 10 journal open reached end of journal. 2014-10-14 15:35:39.726076 7f32b27be7c0 2 journal No further valid entries found, journal is most likely valid 2014-10-14 15:35:39.726079 7f32b27be7c0 3 journal journal_replay: end of journal, done. 2014-10-14 15:35:39.726080 7f32b27be7c0 10 journal make_writeable 2014-10-14 15:35:39.726377 7f32b27be7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5367661056) 2014-10-14 15:35:39.728080 7f32b27be7c0 20 journal _check_disk_write_cache: disk write cache is on, but your kernel is new enough to handle it correctly. (fn:/var/lib/ceph/tmp/mnt.8uRLPI/journal) 2014-10-14 15:35:39.728281 7f32b27be7c0 1 journal _open /var/lib/ceph/tmp/mnt.8uRLPI/journal fd 17: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2014-10-14 15:35:39.728354 7f32b27be7c0 10 journal journal_start 2014-10-14 15:35:39.728376 7f32ac4eb700 10 journal write_thread_entry start 2014-10-14 15:35:39.728407 7f32ac4eb700 20 journal prepare_multi_write queue_pos now 4096 2014-10-14 15:35:39.728407 7f32abcea700 10 journal write_finish_thread_entry enter 2014-10-14 15:35:39.728445 7f32abcea700 20 journal write_finish_thread_entry sleeping 2014-10-14 15:35:39.728481 7f32ac4eb700 15 journal do_aio_write writing 4096~0 + header 2014-10-14 15:35:39.728495 7f32ac4eb700 20 journal write_aio_bl 0~4096 seq 0 2014-10-14 15:35:39.728510 7f32ac4eb700 20 journal write_aio_bl .. 0~4096 in 1 2014-10-14 15:35:39.728509 7f32b27be7c0 15 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) read meta/23c2fcde/osd_superblock/0//-1 0~0 2014-10-14 15:35:39.728549 7f32ac4eb700 20 journal write_aio_bl 4096~0 seq 0 2014-10-14 15:35:39.728553 7f32ac4eb700 5 journal put_throttle finished 0 ops and 0 bytes, now 0 ops and 0 bytes 2014-10-14 15:35:39.728556 7f32ac4eb700 20 journal write_thread_entry going to sleep 2014-10-14 15:35:39.728581 7f32abcea700 20 journal write_finish_thread_entry waiting for aio(s) 2014-10-14 15:35:39.728574 7f32b27be7c0 -1 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory 2014-10-14 15:35:39.728602 7f32b27be7c0 10 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) FileStore::read(meta/23c2fcde/osd_superblock/0//-1) open error: (2) No such file or directory 2014-10-14 15:35:39.728643 7f32b27be7c0 5 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) queue_transactions new osr(default 0x7f32b4c38b98)/0x7f32b4c38b98 2014-10-14 15:35:39.728655 7f32b27be7c0 10 journal op_submit_start 2 2014-10-14 15:35:39.728656 7f32b27be7c0 5 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) queue_transactions (writeahead) 2 0x7fff6ba941f0 2014-10-14 15:35:39.728659 7f32b27be7c0 10 journal op_journal_transactions 2 0x7fff6ba941f0 2014-10-14 15:35:39.728674 7f32b27be7c0 5 journal submit_entry seq 2 len 505 (0x7f32b4bf58d0) 2014-10-14 15:35:39.728685 7f32b27be7c0 10 journal op_submit_finish 2 2014-10-14 15:35:39.728716 7f32ac4eb700 20 journal write_thread_entry woke up 2014-10-14 15:35:39.728724 7f32ac4eb700 20 journal write_thread_entry aio throttle: aio num 1 bytes 4096 ... exp 2 min_new 4 ... pending 0 2014-10-14 15:35:39.728730 7f32ac4eb700 20 journal write_thread_entry deferring until more aios complete: 1 aios with 4096 bytes needs 4 bytes to start a new aio (currently 0 pending) 2014-10-14 15:35:39.730226 7f32abcea700 10 journal write_finish_thread_entry aio 0~4096 done 2014-10-14 15:35:39.730240 7f32abcea700 20 journal check_aio_completion 2014-10-14 15:35:39.730242 7f32abcea700 20 journal check_aio_completion completed seq 0 0~4096 2014-10-14 15:35:39.730247 7f32abcea700 20 journal write_finish_thread_entry sleeping 2014-10-14 15:35:44.724070 7f32ad4ed700 20 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) sync_entry woke after 5.000381 2014-10-14 15:35:44.724098 7f32ad4ed700 10 journal commit_start max_applied_seq 1, open_ops 0 2014-10-14 15:35:44.724102 7f32ad4ed700 10 journal commit_start blocked, all open_ops have completed 2014-10-14 15:35:44.724105 7f32ad4ed700 10 journal commit_start nothing to do 2014-10-14 15:35:44.724108 7f32ad4ed700 10 journal commit_start 2014-10-14 15:35:44.724115 7f32ad4ed700 20 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) sync_entry waiting for max_interval 5.000000 2014-10-14 15:35:49.724191 7f32ad4ed700 20 filestore(/var/lib/ceph/tmp/mnt.8uRLPI) sync_entry woke after 5.000076 2014-10-14 15:35:49.724221 7f32ad4ed700 10 journal commit_start max_applied_seq 1, open_ops 0 2014-10-14 15:35:49.724225 7f32ad4ed700 10 journal commit_start blocked, all open_ops have completed 2014-10-14 15:35:49.724227 7f32ad4ed700 10 journal commit_start nothing to do 2014-10-14 15:35:49.724236 7f32ad4ed700 10 journal commit_start ...
stack trace
(gdb) thread apply all bt Thread 14 (Thread 0x7f3177cd8700 (LWP 23461)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b67413b in ceph::log::Log::entry() () #2 0x00007f317a4dc0a4 in start_thread (arg=0x7f3177cd8700) at pthread_create.c:309 #3 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 13 (Thread 0x7f3176d4e700 (LWP 23462)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238 #1 0x00007f317b74021b in CephContextServiceThread::entry() () #2 0x00007f317a4dc0a4 in start_thread (arg=0x7f3176d4e700) at pthread_create.c:309 #3 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 12 (Thread 0x7f317654d700 (LWP 23463)): #0 0x00007f3178c370ed in poll () at ../sysdeps/unix/syscall-template.S:81 #1 0x00007f317b726433 in AdminSocket::entry() () #2 0x00007f317a4dc0a4 in start_thread (arg=0x7f317654d700) at pthread_create.c:309 #3 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 11 (Thread 0x7f317554b700 (LWP 23470)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b595dbf in WBThrottle::get_next_should_flush(boost::tuples::tuple<ghobject_t, std::tr1::shared_ptr<FDCache::FD>, WBThrottle::PendingWB, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>*) () #2 0x00007f317b596c83 in WBThrottle::entry() () #3 0x00007f317a4dc0a4 in start_thread (arg=0x7f317554b700) at pthread_create.c:309 #4 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 10 (Thread 0x7f3175d4c700 (LWP 23471)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238 #1 0x00007f317b51fc0d in Cond::WaitInterval(CephContext*, Mutex&, utime_t) () #2 0x00007f317b4ea490 in FileStore::sync_entry() () #3 0x00007f317b52076d in FileStore::SyncThread::entry() () #4 0x00007f317a4dc0a4 in start_thread (arg=0x7f3175d4c700) at pthread_create.c:309 #5 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 9 (Thread 0x7f3174d4a700 (LWP 23476)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b613286 in FileJournal::write_thread_entry() () #2 0x00007f317b51ee8d in FileJournal::Writer::entry() () #3 0x00007f317a4dc0a4 in start_thread (arg=0x7f3174d4a700) at pthread_create.c:309 #4 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 8 (Thread 0x7f3174549700 (LWP 23477)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b2bee06 in Cond::Wait(Mutex&) () #2 0x00007f317b610eea in FileJournal::write_finish_thread_entry() () #3 0x00007f317b51efad in FileJournal::WriteFinisher::entry() () #4 0x00007f317a4dc0a4 in start_thread (arg=0x7f3174549700) at pthread_create.c:309 #5 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 7 (Thread 0x7f3173d48700 (LWP 23478)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b6625f8 in Finisher::finisher_thread_entry() () #2 0x00007f317a4dc0a4 in start_thread (arg=0x7f3173d48700) at pthread_create.c:309 #3 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 6 (Thread 0x7f3173547700 (LWP 23479)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238 #1 0x00007f317b71cb32 in ThreadPool::worker(ThreadPool::WorkThread*) () #2 0x00007f317b71dd90 in ThreadPool::WorkThread::entry() () #3 0x00007f317a4dc0a4 in start_thread (arg=0x7f3173547700) at pthread_create.c:309 #4 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 5 (Thread 0x7f3172d46700 (LWP 23480)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238 #1 0x00007f317b71cb32 in ThreadPool::worker(ThreadPool::WorkThread*) () #2 0x00007f317b71dd90 in ThreadPool::WorkThread::entry() () #3 0x00007f317a4dc0a4 in start_thread (arg=0x7f3172d46700) at pthread_create.c:309 #4 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 4 (Thread 0x7f3172545700 (LWP 23481)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b6625f8 in Finisher::finisher_thread_entry() () #2 0x00007f317a4dc0a4 in start_thread (arg=0x7f3172545700) at pthread_create.c:309 #3 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 3 (Thread 0x7f3171d44700 (LWP 23482)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b6625f8 in Finisher::finisher_thread_entry() () #2 0x00007f317a4dc0a4 in start_thread (arg=0x7f3171d44700) at pthread_create.c:309 #3 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 2 (Thread 0x7f3171543700 (LWP 23483)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b717d84 in SafeTimer::timer_thread() () #2 0x00007f317b71892d in SafeTimerThread::entry() () #3 0x00007f317a4dc0a4 in start_thread (arg=0x7f3171543700) at pthread_create.c:309 #4 0x00007f3178c3fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 1 (Thread 0x7f317b01d7c0 (LWP 23460)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f317b35bbf4 in ObjectStore::apply_transactions(ObjectStore::Sequencer*, std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, Context*) () #2 0x00007f317b2c0d08 in ObjectStore::apply_transaction(ObjectStore::Transaction&, Context*) () #3 0x00007f317b27374a in OSD::mkfs(CephContext*, ObjectStore*, std::string const&, uuid_d, int) () #4 0x00007f317b24a872 in main ()
Files
Updated by Loïc Dachary over 9 years ago
- Subject changed from ceph-osd mkfs loops forever on sync_entry (firefly 0.80.6) to ceph-osd mkfs loops forever on sync_entry
- Description updated (diff)
Updated by Loïc Dachary over 9 years ago
Although https://github.com/ceph/ceph/commit/2f11631f3144f2cc0e04d718e40e716540c8af19 seems related, the log shows FileStore::umount() is not called.
Updated by Loïc Dachary over 9 years ago
$ git log --patch tags/v0.80.5..tags/v0.80.6 ... commit f7498124325fe2269c33e45fae4c8181c8f7be14 Author: Samuel Just <sam.just@inktank.com> Date: Mon Jun 9 11:00:32 2014 -0700 os/: add async flush_commit() method to Sequencer and implementations Signed-off-by: Samuel Just <sam.just@inktank.com> ... commit 3bbe29eb5cca7cc42266a6c5561ce68feb4072de Author: Samuel Just <sam.just@inktank.com> Date: Tue May 20 16:05:01 2014 -0700 common/Finisher: add queue(list<Context*>&) Signed-off-by: Samuel Just <sam.just@inktank.com>
Updated by Loïc Dachary over 9 years ago
I browsed the patches to aio in 3.12.7 until now and saw nothing that could related to this problem https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.12.7 . Significant problems were fixed before 3.12 but not after.
Updated by Loïc Dachary over 9 years ago
- Status changed from New to 12
- Assignee set to Loïc Dachary
Updated by Loïc Dachary over 9 years ago
- File ceph-osd-ok.txt added
On ubuntu-14.04 the logs of a ceph-osd mkfs on 0.80.5 that completes successfully.
Updated by Loïc Dachary over 9 years ago
- Subject changed from ceph-osd mkfs loops forever on sync_entry to ceph-osd mkfs hangs
Updated by Loïc Dachary over 9 years ago
Created with ceph-disk prepare --fs-type=ext4 and ceph-disk activate /dev/loop3p1
Actions