Bug #21820
Ceph OSD crash with Segfault (Closed)
Description
Hi,
I've observed that after a while some OSDs crash with a segfault. This has been happening since I switched to BlueStore.
This leads to reduced data redundancy and seems critical to me.
Here is some information:
- ceph --cluster ceph-mirror osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 17.06296 root default
-2 5.82999 host inf-0a38f9
1 hdd 2.91499 osd.1 up 1.00000 1.00000
2 hdd 2.91499 osd.2 up 1.00000 1.00000
-3 5.62140 host inf-30d985
4 hdd 2.81070 osd.4 up 1.00000 1.00000
5 hdd 2.81070 osd.5 down 0 1.00000
-4 5.61157 host inf-d7a3ca
0 hdd 2.80579 osd.0 down 0 1.00000
3 hdd 2.80579 osd.3 up 1.00000 1.00000
- ceph --cluster ceph-mirror -s
cluster:
id: 4b3bef10-7a76-491e-bf1a-c6ea4f5705cf
health: HEALTH_WARN
622/323253 objects misplaced (0.192%)
Degraded data redundancy: 9306/323253 objects degraded (2.879%), 11 pgs unclean, 11 pgs degraded, 8 pgs undersized
services:
mon: 3 daemons, quorum inf-d7a3ca,inf-30d985,inf-0a38f9
mgr: inf-0a38f9(active), standbys: inf-d7a3ca, inf-30d985
osd: 6 osds: 4 up, 4 in; 8 remapped pgs
rbd-mirror: 1 daemon active
data:
pools: 2 pools, 128 pgs
objects: 105k objects, 418 GB
usage: 1765 GB used, 9955 GB / 11721 GB avail
pgs: 9306/323253 objects degraded (2.879%)
622/323253 objects misplaced (0.192%)
117 active+clean
4 active+recovery_wait+undersized+degraded+remapped
3 active+recovery_wait+degraded
3 active+undersized+degraded+remapped+backfill_wait
1 active+undersized+degraded+remapped+backfilling
io:
client: 159 kB/s rd, 2004 kB/s wr, 19 op/s rd, 137 op/s wr
recovery: 1705 kB/s, 0 objects/s
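The degraded and misplaced percentages in the `ceph -s` output above follow directly from the reported object counts; a quick sanity check in plain Python (numbers copied from the status output):

```python
# Object counts taken from the `ceph -s` output above.
total_copies = 323253   # total object copies across all replicas
degraded = 9306         # copies currently missing a replica (OSDs down)
misplaced = 622         # copies sitting on the wrong OSD (remapped PGs)

degraded_pct = degraded / total_copies * 100
misplaced_pct = misplaced / total_copies * 100
print(f"degraded:  {degraded_pct:.3f}%")   # 2.879%
print(f"misplaced: {misplaced_pct:.3f}%")  # 0.192%
```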
Each node has 2x HDD and 2x SSD. On each SSD, partition number 4 is reserved for use as a separate BlueStore block DB / WAL device:
Disk /dev/sda: 234441648 sectors, 111.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 1BD0737C-CFB6-4A06-AB2F-3BF150E6CC12
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 234441614
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)
Number Start (sector) End (sector) Size Code Name
1 2048 16795647 8.0 GiB FD00 Linux RAID
2 16795648 58771455 20.0 GiB FD00 Linux RAID
3 58771456 58773503 1024.0 KiB EF02 BIOS boot partition
4 58773504 234441614 83.8 GiB 8300 Linux filesystem
This is how I provisioned the devices for each node:
- ceph-disk prepare --cluster ceph-mirror --bluestore --block.db /dev/sda4 /dev/sdc
- ceph-disk prepare --cluster ceph-mirror --bluestore --block.db /dev/sdb4 /dev/sdd
- ceph-disk activate /dev/sdc1
- ceph-disk activate /dev/sdd1
sdc and sdd are the HDDs; sda4 and sdb4 are the manually created (and otherwise unformatted) partitions for WAL/DB usage.
After this issue occurs I have to completely remove the OSD and recreate it. The next time, another OSD crashes. It's mysterious.
Please see the attached log for details.
Updated by Dan Williams over 6 years ago
I’m also seeing this.
With 72 OSDs (20 filestore / 50 bluestore) and ~10 MB/s of client I/O, the OSDs restart successfully, but the crashes happen often enough that the cluster never fully heals and there are constant recovery operations.
My log files contain exactly the same stack trace as the one attached above.
Updated by Dan Williams over 6 years ago
I forgot to mention that I'm running CentOS 7.4, up to date as of 2017-10-17 12:00 UTC.
I’m using bluestore all-in-one OSDs.
Updated by Yves Vogl over 6 years ago
It looks like other people are hitting the same issue.
Updated by Yves Vogl over 6 years ago
Changing from jemalloc to tcmalloc is a workaround.
Updated by Sage Weil over 6 years ago
- Related to Bug #20557: segmentation fault with rocksdb|BlueStore and jemalloc added
Updated by Sage Weil over 6 years ago
- Status changed from New to Duplicate
Disable jemalloc in /etc/{sysconfig,default}/ceph
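The fix above amounts to commenting out whatever enables jemalloc in the ceph environment file (typically an `LD_PRELOAD` line; the path is `/etc/sysconfig/ceph` on RHEL/CentOS and `/etc/default/ceph` on Debian, and the exact preload path varies by distro). A minimal sketch, shown on a sample copy rather than the live file:

```shell
#!/bin/sh
# Sketch: comment out a jemalloc LD_PRELOAD line so the daemons fall
# back to tcmalloc. The file path and library path below are sample
# values; check your own /etc/{sysconfig,default}/ceph.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# ceph environment file (sample)
LD_PRELOAD=/usr/lib64/libjemalloc.so.1
EOF

# Comment out any jemalloc preload line.
sed -i 's/^\(LD_PRELOAD=.*jemalloc.*\)$/#\1/' "$conf"
cat "$conf"
```

After changing the real file, restart the daemons (e.g. `systemctl restart ceph-osd.target`) so they pick up the new environment.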