Bug #23120

closed

OSDs continuously crash during recovery

Added by Oliver Freyermuth about 6 years ago. Updated over 5 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have several OSDs continuously crashing during recovery. This is Luminous 12.2.3.

 ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable)
 1: (()+0xa3c591) [0x55b3e5a85591]
 2: (()+0xf5e0) [0x7f8c237ca5e0]
 3: (gsignal()+0x37) [0x7f8c227f31f7]
 4: (abort()+0x148) [0x7f8c227f48e8]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x55b3e5ac4664]
 6: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1487) [0x55b3e5997a27]
 7: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x55b3e5998a70]
 8: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x55b3e5708a85]
 9: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x55b3e5828191]
 10: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x55b3e5838b27]
 11: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55b3e573d680]
 12: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x55b3e56a900c]
 13: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x55b3e552ef29]
 14: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55b3e57abad7]
 15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x55b3e555d99e]
 16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x55b3e5aca009]
 17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55b3e5acbfa0]
 18: (()+0x7e25) [0x7f8c237c2e25]
 19: (clone()+0x6d) [0x7f8c228b634d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This is using the officially released RPMs.

I've uploaded the logfile of one such OSD as:
ca0a29ae-0993-4faa-be4d-9ba2f7d6f905

The cluster will likely be recreated soon, since the system is now borked anyway, so please let me know quickly if more info is needed.
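
For reference, the NOTE in the backtrace refers to symbol resolution; a minimal sketch of how the anonymous frames could be resolved on a system running the official RPMs (package name and address are examples, not verified against this exact build):

  # install matching debug symbols for the running ceph version (yum-utils)
  debuginfo-install -y ceph-osd
  # resolve the offset from a frame such as "1: (()+0xa3c591)" to file:line
  addr2line -Cfe /usr/bin/ceph-osd 0xa3c591
  # or disassemble with interleaved source, as the NOTE suggests
  objdump -rdS /usr/bin/ceph-osd > ceph-osd.objdump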

Actions #1

Updated by Oliver Freyermuth about 6 years ago

It might be that this OSD was hit by the OOM killer at some point in the last 24 hours.
During recovery, while accepting many small objects, the OSDs seem to use 2-3 times as much memory as configured via bluestore_cache_size_hdd,
which exceeded RAM + swap on some of our machines.
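
For context, bluestore_cache_size_hdd only bounds the BlueStore cache, not the OSD's total memory, so some overshoot is expected; a minimal sketch of how the cache could be capped in ceph.conf (the value is only an example, and the OSDs need a restart to pick it up):

  [osd]
  # BlueStore cache cap for HDD-backed OSDs, in bytes; actual RSS will still
  # exceed this due to pglog, recovery buffers and allocator overhead
  bluestore_cache_size_hdd = 1073741824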

Actions #2

Updated by Oliver Freyermuth about 6 years ago

Here's a log from another OSD:
7de1dddf-27d4-4b6b-9128-0138bfaf85cf
The backtrace looks similar.

Actions #3

Updated by Oliver Freyermuth about 6 years ago

After many restarts of all OSDs and temporarily lowering min_size, they now stay up. I'll watch and see whether the cluster recovers.
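
For anyone following along, a sketch of the commands this typically involves; the pool name is a placeholder, and the values assume the k=4 m=2 profile mentioned later in this ticket:

  # temporarily allow I/O and recovery with one more shard missing
  ceph osd pool set <data_pool> min_size 4
  # restore the safer value once the PGs are active+clean again
  ceph osd pool set <data_pool> min_size 5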

Actions #4

Updated by Oliver Freyermuth about 6 years ago

The cluster has mostly recovered and looks good.
Still, hopefully the stack trace and logs can help track down the underlying issue that caused the crashes.

Actions #5

Updated by Oliver Freyermuth about 6 years ago

Here's the ceph osd tree, by popular request:

# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF 
 -1       700.74890 root default                            
 -3         0.43658     host mon001                         
  0   ssd   0.21829         osd.0       up  1.00000 1.00000 
  1   ssd   0.21829         osd.1       up  1.00000 1.00000 
 -5         0.43637     host mon002                         
  2   ssd   0.21819         osd.2       up  1.00000 1.00000 
  3   ssd   0.21819         osd.3       up  1.00000 1.00000 
-10       116.64600     host osd001                         
  4   hdd   3.64519         osd.4       up  1.00000 1.00000 
  5   hdd   3.64519         osd.5       up  1.00000 1.00000 
  6   hdd   3.64519         osd.6       up  1.00000 1.00000 
  7   hdd   3.64519         osd.7       up  1.00000 1.00000 
  8   hdd   3.64519         osd.8       up  1.00000 1.00000 
  9   hdd   3.64519         osd.9       up  1.00000 1.00000 
 10   hdd   3.64519         osd.10      up  1.00000 1.00000 
 11   hdd   3.64519         osd.11      up  1.00000 1.00000 
 12   hdd   3.64519         osd.12      up  1.00000 1.00000 
 13   hdd   3.64519         osd.13      up  1.00000 1.00000 
 14   hdd   3.64519         osd.14      up  1.00000 1.00000 
 15   hdd   3.64519         osd.15      up  1.00000 1.00000 
 16   hdd   3.64519         osd.16      up  1.00000 1.00000 
 17   hdd   3.64519         osd.17      up  1.00000 1.00000 
 18   hdd   3.64519         osd.18      up  1.00000 1.00000 
 19   hdd   3.64519         osd.19      up  1.00000 1.00000 
 20   hdd   3.64519         osd.20      up  1.00000 1.00000 
 21   hdd   3.64519         osd.21      up  1.00000 1.00000 
 22   hdd   3.64519         osd.22      up  1.00000 1.00000 
 23   hdd   3.64519         osd.23      up  1.00000 1.00000 
 24   hdd   3.64519         osd.24      up  1.00000 1.00000 
 25   hdd   3.64519         osd.25      up  1.00000 1.00000 
 26   hdd   3.64519         osd.26      up  1.00000 1.00000 
 27   hdd   3.64519         osd.27      up  1.00000 1.00000 
 28   hdd   3.64519         osd.28      up  1.00000 1.00000 
 29   hdd   3.64519         osd.29      up  1.00000 1.00000 
 30   hdd   3.64519         osd.30      up  1.00000 1.00000 
 31   hdd   3.64519         osd.31      up  1.00000 1.00000 
 32   hdd   3.64519         osd.32      up  1.00000 1.00000 
 33   hdd   3.64519         osd.33      up  1.00000 1.00000 
 34   hdd   3.64519         osd.34      up  1.00000 1.00000 
 35   hdd   3.64519         osd.35      up  1.00000 1.00000 
-13       116.64600     host osd002                         
 36   hdd   3.64519         osd.36      up  1.00000 1.00000 
 37   hdd   3.64519         osd.37      up  1.00000 1.00000 
 38   hdd   3.64519         osd.38      up  1.00000 1.00000 
 39   hdd   3.64519         osd.39      up  1.00000 1.00000 
 40   hdd   3.64519         osd.40      up  1.00000 1.00000 
 41   hdd   3.64519         osd.41      up  1.00000 1.00000 
 42   hdd   3.64519         osd.42      up  1.00000 1.00000 
 43   hdd   3.64519         osd.43      up  1.00000 1.00000 
 44   hdd   3.64519         osd.44      up  1.00000 1.00000 
 45   hdd   3.64519         osd.45      up  1.00000 1.00000 
 46   hdd   3.64519         osd.46      up  1.00000 1.00000 
 47   hdd   3.64519         osd.47      up  1.00000 1.00000 
 48   hdd   3.64519         osd.48      up  1.00000 1.00000 
 49   hdd   3.64519         osd.49      up  1.00000 1.00000 
 50   hdd   3.64519         osd.50      up  1.00000 1.00000 
 51   hdd   3.64519         osd.51      up  1.00000 1.00000 
 52   hdd   3.64519         osd.52      up  1.00000 1.00000 
 53   hdd   3.64519         osd.53      up  1.00000 1.00000 
 54   hdd   3.64519         osd.54      up  1.00000 1.00000 
 55   hdd   3.64519         osd.55      up  1.00000 1.00000 
 56   hdd   3.64519         osd.56      up  1.00000 1.00000 
 57   hdd   3.64519         osd.57      up  1.00000 1.00000 
 58   hdd   3.64519         osd.58      up  1.00000 1.00000 
 59   hdd   3.64519         osd.59      up  1.00000 1.00000 
 60   hdd   3.64519         osd.60      up  1.00000 1.00000 
 61   hdd   3.64519         osd.61      up  1.00000 1.00000 
 62   hdd   3.64519         osd.62      up  1.00000 1.00000 
 63   hdd   3.64519         osd.63      up  1.00000 1.00000 
 64   hdd   3.64519         osd.64      up  1.00000 1.00000 
 65   hdd   3.64519         osd.65      up  1.00000 1.00000 
 66   hdd   3.64519         osd.66      up  1.00000 1.00000 
 67   hdd   3.64519         osd.67      up  1.00000 1.00000 
-16       116.64600     host osd003                         
 68   hdd   3.64519         osd.68      up  1.00000 1.00000 
 69   hdd   3.64519         osd.69      up  1.00000 1.00000 
 70   hdd   3.64519         osd.70      up  1.00000 1.00000 
 71   hdd   3.64519         osd.71      up  1.00000 1.00000 
 72   hdd   3.64519         osd.72      up  1.00000 1.00000 
 73   hdd   3.64519         osd.73      up  1.00000 1.00000 
 74   hdd   3.64519         osd.74      up  1.00000 1.00000 
 75   hdd   3.64519         osd.75      up  1.00000 1.00000 
 76   hdd   3.64519         osd.76      up  1.00000 1.00000 
 77   hdd   3.64519         osd.77      up  1.00000 1.00000 
 78   hdd   3.64519         osd.78      up  1.00000 1.00000 
 79   hdd   3.64519         osd.79      up  1.00000 1.00000 
 80   hdd   3.64519         osd.80      up  1.00000 1.00000 
 81   hdd   3.64519         osd.81      up  1.00000 1.00000 
 82   hdd   3.64519         osd.82      up  1.00000 1.00000 
 83   hdd   3.64519         osd.83      up  1.00000 1.00000 
 84   hdd   3.64519         osd.84      up  1.00000 1.00000 
 85   hdd   3.64519         osd.85      up  1.00000 1.00000 
 86   hdd   3.64519         osd.86      up  1.00000 1.00000 
 87   hdd   3.64519         osd.87      up  1.00000 1.00000 
 88   hdd   3.64519         osd.88      up  1.00000 1.00000 
 89   hdd   3.64519         osd.89      up  1.00000 1.00000 
 90   hdd   3.64519         osd.90      up  1.00000 1.00000 
 91   hdd   3.64519         osd.91      up  1.00000 1.00000 
 92   hdd   3.64519         osd.92      up  1.00000 1.00000 
 93   hdd   3.64519         osd.93      up  1.00000 1.00000 
 94   hdd   3.64519         osd.94      up  1.00000 1.00000 
 95   hdd   3.64519         osd.95      up  1.00000 1.00000 
 96   hdd   3.64519         osd.96      up  1.00000 1.00000 
 97   hdd   3.64519         osd.97      up  1.00000 1.00000 
 98   hdd   3.64519         osd.98      up  1.00000 1.00000 
 99   hdd   3.64519         osd.99      up  1.00000 1.00000 
-19       116.64600     host osd004                         
100   hdd   3.64519         osd.100     up  1.00000 1.00000 
101   hdd   3.64519         osd.101     up  1.00000 1.00000 
102   hdd   3.64519         osd.102     up  1.00000 1.00000 
103   hdd   3.64519         osd.103     up  1.00000 1.00000 
104   hdd   3.64519         osd.104     up  1.00000 1.00000 
105   hdd   3.64519         osd.105     up  1.00000 1.00000 
106   hdd   3.64519         osd.106     up  1.00000 1.00000 
107   hdd   3.64519         osd.107     up  1.00000 1.00000 
108   hdd   3.64519         osd.108     up  1.00000 1.00000 
109   hdd   3.64519         osd.109     up  1.00000 1.00000 
110   hdd   3.64519         osd.110     up  1.00000 1.00000 
111   hdd   3.64519         osd.111     up  1.00000 1.00000 
112   hdd   3.64519         osd.112     up  1.00000 1.00000 
113   hdd   3.64519         osd.113     up  1.00000 1.00000 
114   hdd   3.64519         osd.114     up  1.00000 1.00000 
115   hdd   3.64519         osd.115     up  1.00000 1.00000 
116   hdd   3.64519         osd.116     up  1.00000 1.00000 
117   hdd   3.64519         osd.117     up  1.00000 1.00000 
118   hdd   3.64519         osd.118     up  1.00000 1.00000 
119   hdd   3.64519         osd.119     up  1.00000 1.00000 
120   hdd   3.64519         osd.120     up  1.00000 1.00000 
121   hdd   3.64519         osd.121     up  1.00000 1.00000 
122   hdd   3.64519         osd.122     up  1.00000 1.00000 
123   hdd   3.64519         osd.123     up  1.00000 1.00000 
124   hdd   3.64519         osd.124     up  1.00000 1.00000 
125   hdd   3.64519         osd.125     up  1.00000 1.00000 
126   hdd   3.64519         osd.126     up  1.00000 1.00000 
127   hdd   3.64519         osd.127     up  1.00000 1.00000 
128   hdd   3.64519         osd.128     up  1.00000 1.00000 
129   hdd   3.64519         osd.129     up  1.00000 1.00000 
130   hdd   3.64519         osd.130     up  1.00000 1.00000 
131   hdd   3.64519         osd.131     up  1.00000 1.00000 
-22       116.64600     host osd005                         
132   hdd   3.64519         osd.132     up  1.00000 1.00000 
133   hdd   3.64519         osd.133     up  1.00000 1.00000 
134   hdd   3.64519         osd.134     up  1.00000 1.00000 
135   hdd   3.64519         osd.135     up  1.00000 1.00000 
136   hdd   3.64519         osd.136     up  1.00000 1.00000 
137   hdd   3.64519         osd.137     up  1.00000 1.00000 
138   hdd   3.64519         osd.138     up  1.00000 1.00000 
139   hdd   3.64519         osd.139     up  1.00000 1.00000 
140   hdd   3.64519         osd.140     up  1.00000 1.00000 
141   hdd   3.64519         osd.141     up  1.00000 1.00000 
142   hdd   3.64519         osd.142     up  1.00000 1.00000 
143   hdd   3.64519         osd.143     up  1.00000 1.00000 
144   hdd   3.64519         osd.144     up  1.00000 1.00000 
145   hdd   3.64519         osd.145     up  1.00000 1.00000 
146   hdd   3.64519         osd.146     up  1.00000 1.00000 
147   hdd   3.64519         osd.147     up  1.00000 1.00000 
148   hdd   3.64519         osd.148     up  1.00000 1.00000 
149   hdd   3.64519         osd.149     up  1.00000 1.00000 
150   hdd   3.64519         osd.150     up  1.00000 1.00000 
151   hdd   3.64519         osd.151     up  1.00000 1.00000 
152   hdd   3.64519         osd.152     up  1.00000 1.00000 
153   hdd   3.64519         osd.153     up  1.00000 1.00000 
154   hdd   3.64519         osd.154     up  1.00000 1.00000 
155   hdd   3.64519         osd.155     up  1.00000 1.00000 
156   hdd   3.64519         osd.156     up  1.00000 1.00000 
157   hdd   3.64519         osd.157     up  1.00000 1.00000 
158   hdd   3.64519         osd.158     up  1.00000 1.00000 
159   hdd   3.64519         osd.159     up  1.00000 1.00000 
160   hdd   3.64519         osd.160     up  1.00000 1.00000 
161   hdd   3.64519         osd.161     up  1.00000 1.00000 
162   hdd   3.64519         osd.162     up  1.00000 1.00000 
163   hdd   3.64519         osd.163     up  1.00000 1.00000 
-25       116.64600     host osd006                         
164   hdd   3.64519         osd.164     up  1.00000 1.00000 
165   hdd   3.64519         osd.165     up  1.00000 1.00000 
166   hdd   3.64519         osd.166     up  1.00000 1.00000 
167   hdd   3.64519         osd.167     up  1.00000 1.00000 
168   hdd   3.64519         osd.168     up  1.00000 1.00000 
169   hdd   3.64519         osd.169     up  1.00000 1.00000 
170   hdd   3.64519         osd.170     up  1.00000 1.00000 
171   hdd   3.64519         osd.171     up  1.00000 1.00000 
172   hdd   3.64519         osd.172     up  1.00000 1.00000 
173   hdd   3.64519         osd.173     up  1.00000 1.00000 
174   hdd   3.64519         osd.174     up  1.00000 1.00000 
175   hdd   3.64519         osd.175     up  1.00000 1.00000 
176   hdd   3.64519         osd.176     up  1.00000 1.00000 
177   hdd   3.64519         osd.177     up  1.00000 1.00000 
178   hdd   3.64519         osd.178     up  1.00000 1.00000 
179   hdd   3.64519         osd.179     up  1.00000 1.00000 
180   hdd   3.64519         osd.180     up  1.00000 1.00000 
181   hdd   3.64519         osd.181     up  1.00000 1.00000 
182   hdd   3.64519         osd.182     up  1.00000 1.00000 
183   hdd   3.64519         osd.183     up  1.00000 1.00000 
184   hdd   3.64519         osd.184     up  1.00000 1.00000 
185   hdd   3.64519         osd.185     up  1.00000 1.00000 
186   hdd   3.64519         osd.186     up  1.00000 1.00000 
187   hdd   3.64519         osd.187     up  1.00000 1.00000 
188   hdd   3.64519         osd.188     up  1.00000 1.00000 
189   hdd   3.64519         osd.189     up  1.00000 1.00000 
190   hdd   3.64519         osd.190     up  1.00000 1.00000 
191   hdd   3.64519         osd.191     up  1.00000 1.00000 
192   hdd   3.64519         osd.192     up  1.00000 1.00000 
193   hdd   3.64519         osd.193     up  1.00000 1.00000 
194   hdd   3.64519         osd.194     up  1.00000 1.00000 
195   hdd   3.64519         osd.195     up  1.00000 1.00000

The metadata pool lives on the SSDs, the data pool on the HDDs (via device classes).
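
A rough sketch of how such a split is typically expressed with Luminous device classes; the rule, profile, pool names and PG counts below are illustrative, not taken from this cluster:

  # replicated metadata pool pinned to the SSD class
  ceph osd crush rule create-replicated meta-ssd default host ssd
  ceph osd pool set cephfs_metadata crush_rule meta-ssd
  # EC data pool restricted to the HDD class via its erasure-code profile
  ceph osd erasure-code-profile set ec-k4m2-hdd k=4 m=2 crush-device-class=hdd
  ceph osd pool create cephfs_data 2048 2048 erasure ec-k4m2-hdd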

Actions #6

Updated by Oliver Freyermuth about 6 years ago

All HDD OSDs are 4 TB, while the SSDs used for the metadata pool are 240 GB.

Actions #7

Updated by Peter Woodman about 6 years ago

Hey, I might be seeing the same bug. Can you paste in the operation dump that shows up right before that crash, plus maybe ~10 lines of preceding context?

Actions #8

Updated by Oliver Freyermuth about 6 years ago

@Peter Woodman: Since the system recovered after many OSD restarts (see my previous comment) and I did not think to take an ops dump, I sadly can't reproduce that now :-(. I'll keep it in mind in case the issue reappears; right now I can only share the logfiles (which I uploaded in full via ceph-post-file, but I could also share parts of them publicly if that helps).
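
For reference, the upload IDs in this ticket come from ceph-post-file; a minimal sketch, with an example description and log path:

  # uploads the file to the Ceph developers' drop point and prints an ID
  # like ca0a29ae-0993-4faa-be4d-9ba2f7d6f905 to quote in the ticket
  ceph-post-file -d 'tracker 23120: crashing OSD log' /var/log/ceph/ceph-osd.191.log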

Actions #9

Updated by Greg Farnum about 6 years ago

  • Project changed from Ceph to bluestore
Actions #10

Updated by Sage Weil about 6 years ago

  • Status changed from New to Need More Info

Can you reproduce the crash on one or more OSDs with 'debug osd = 20' and 'debug bluestore = 20'?

Also, can you check what the problematic PG is on the other crashing OSDs? For the attached log (osd.191) it is

  -202> 2018-02-25 17:21:58.829663 7f503e854700  0 bluestore(/var/lib/ceph/osd/ceph-191)  transaction dump:
{
    "ops": [
        {
            "op_num": 0,
            "op_name": "setattrs",
            "collection": "2.21es1_head",
            "oid": "1#2:785f4b65:::1000645413e.00000000:head#",
            "attr_lens": {
                "_": 275,
                "_layout": 30,
                "_parent": 346,
                "snapset": 35
            }
        },
        {
            "op_num": 1,
            "op_name": "setattr",
            "collection": "2.21es1_head",
            "oid": "1#2:785f4b65:::1000645413e.00000000:head#",
            "name": "hinfo_key",
            "length": 42
        }
    ]
}

so 2.21es1
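
A sketch of how the requested debug levels can be raised and how the problematic PG can be found on the other crashing OSDs; the OSD id and log path are examples:

  # raise logging on a running OSD (or set the same values under [osd]
  # in ceph.conf before restarting it)
  ceph tell osd.191 injectargs '--debug_osd 20 --debug_bluestore 20'
  # the offending PG appears as the "collection" field in the transaction
  # dump logged just before the assert
  grep -A 10 'transaction dump' /var/log/ceph/ceph-osd.191.log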

Actions #11

Updated by Oliver Freyermuth about 6 years ago

The bad news (for the ticket) is that the problem vanished after restarting all crashing OSDs often enough
and temporarily reducing min_size of the cluster (k=4 m=2; min_size is usually 5, I temporarily put it to 4).
So currently I can't reproduce it :-(.
Maybe Peter Woodman can add more info if his bug really is the same,
or maybe 7de1dddf-27d4-4b6b-9128-0138bfaf85cf (from my comment #2), which is from another crashing OSD, helps?

Actions #12

Updated by Peter Woodman about 6 years ago

Yeah, I've got some of that. The problem is, I'm not seeing debug log messages that should be there based on the failure, if I'm reading the code correctly. Will post when I get home.

Actions #13

Updated by Peter Woodman about 6 years ago

Actually, looks like your crashing ops are different from mine. I'll just open a new bug.

Actions #14

Updated by Sage Weil over 5 years ago

  • Status changed from Need More Info to Can't reproduce