Bug #5637
closed
OSD crash upon pool creation
Added by Andrey Korolyov almost 11 years ago. Updated almost 11 years ago.
Description
db2bb270e93ed44f9252d65d1d4c9b36875d0ea5 introduces some deadly behavior: every OSD in the cluster crashes with SIGABRT. Sample traces are attached.
Files
poolcreate.txt.gz (6.69 KB) | Andrey Korolyov, 07/15/2013 03:24 PM
poolcreate2.txt.gz (6.6 KB) | Andrey Korolyov, 07/15/2013 03:24 PM
createpool-osd-log.txt.gz (771 KB) | Andrey Korolyov, 07/16/2013 01:47 AM
Updated by Sage Weil almost 11 years ago
the osd log a few lines up should tell us what the error was that it got back from the fs. can you attach the last few hundred lines of that log (or the whole thing)?
Updated by Andrey Korolyov almost 11 years ago
Here you go. The log was truncated before pool creation to keep the gzip attachment small.
Updated by Sage Weil almost 11 years ago
actually, no:
-5> 2013-07-16 12:43:08.632339 7ffe2d5a9700 15 filestore(/var/lib/ceph/osd/5) create_collection /var/lib/ceph/osd/5/current/7.0_head
-4> 2013-07-16 12:43:08.632359 7ffe2d5a9700 10 filestore(/var/lib/ceph/osd/5) create_collection /var/lib/ceph/osd/5/current/7.0_head = -17
-3> 2013-07-16 12:43:08.632362 7ffe2d5a9700 0 filestore(/var/lib/ceph/osd/5) error (17) File exists not handled on operation 20 (6778010.0.1, or op 1, counting from 0)
-2> 2013-07-16 12:43:08.632378 7ffe2d5a9700 0 filestore(/var/lib/ceph/osd/5) unexpected error code
-1> 2013-07-16 12:43:08.632380 7ffe2d5a9700 0 filestore(/var/lib/ceph/osd/5) transaction dump:
{ "ops": [
    { "op_num": 0, "op_name": "mkcoll", "collection": "7.0_head"},
    { "op_num": 1, "op_name": "mkcoll", "collection": "7.0_head"},
    { "op_num": 2, "op_name": "collection_setattr", "collection": "7.0_head", "name": "info", "length": 1},
the mkcoll is in there twice.
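For anyone triaging a similar dump, the duplicate can be spotted mechanically. The following is only a minimal sketch, not Ceph code: `duplicate_mkcoll` is a hypothetical helper, and `ops` is hand-copied from the three ops visible in the truncated transaction dump above. It also looks up error number 17, which the log line reports as "File exists".

```python
import errno
from collections import Counter


def duplicate_mkcoll(ops):
    """Return the collections that one transaction tries to create more than once.

    Hypothetical helper for reading a transaction dump; not part of Ceph.
    """
    counts = Counter(
        op["collection"] for op in ops if op.get("op_name") == "mkcoll"
    )
    return sorted(name for name, n in counts.items() if n > 1)


# The three ops visible in the (truncated) dump, copied by hand.
ops = [
    {"op_num": 0, "op_name": "mkcoll", "collection": "7.0_head"},
    {"op_num": 1, "op_name": "mkcoll", "collection": "7.0_head"},
    {"op_num": 2, "op_name": "collection_setattr", "collection": "7.0_head",
     "name": "info", "length": 1},
]

if __name__ == "__main__":
    # Collections created twice in the same transaction.
    print(duplicate_mkcoll(ops))
    # Symbolic name for error 17 on this platform (assumes a POSIX-like errno table).
    print(errno.errorcode.get(17))
```

The second mkcoll (op 1, counting from 0) is the one the log flags: the first create succeeds, so the repeat fails with error 17.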
Updated by Sage Weil almost 11 years ago
- Status changed from Duplicate to 12
- Priority changed from High to Urgent
Updated by Sage Weil almost 11 years ago
Andrey, what version is the osd running? the sha1 in the log (43c453982c37a55defc5d9ffcfd8cf8a60755f24) is not in ceph.git
Updated by Andrey Korolyov almost 11 years ago
Please excuse me, I was horribly wrong: we're using our own patch, which affects collections too, and it was the cause of the current problem. By the way, may I ask for some work on collection separation via a config tunable in git? The same question came up about a year ago and nobody has implemented it in mainline yet.
Updated by Sage Weil almost 11 years ago
- Status changed from 12 to Rejected
Andrey Korolyov wrote:
Please excuse me, I was horribly wrong: we're using our own patch, which affects collections too, and it was the cause of the current problem. By the way, may I ask for some work on collection separation via a config tunable in git? The same question came up about a year ago and nobody has implemented it in mainline yet.
what do you mean by 'collection separation via config tunable'?
Updated by Andrey Korolyov almost 11 years ago
Sorry if it was unclear: dedicated parameter(s) for path of the omap and meta collections.