Bug #12538
closedAn image with a long clone chain blows up kernel stack
0%
Description
#I create a rbd block named r1
rbd -p rbd create r1 --size 1024
#then create a snapshot of r1
rbd -p rbd snap create --image r1 --snap r1-s1
#then protect this snapshot and clone a new block from this snapshot
rbd snap protect --image r1 --snap r1-s1
rbd clone --image r1 --snap r1-s1 --dest r2
#then create a snapshot of r2 , protect the snapshot , clone a new block named r3
rbd snap create --image r2 --snap r2-s1
rbd snap protect --image r2 --snap r2-s1
rbd clone --image r2 --snap r2-s1 --dest r3
#Do the same thing as above, loop 100 times (for example, you can cycle more)
#then I get a block named r101(get from 100 times clone)
#then I try to map this block?I use this command
rbd -p rbd map r101
#then my system crash
#In the below picture,You can see some infomation about system crash
This happens in the real machine, rather than the virtual machine, and I also tested on the virtual machine, found the same problem
Files
Updated by Ilya Dryomov almost 9 years ago
- Project changed from rbd to Linux kernel client
- Status changed from New to 12
- Assignee set to Ilya Dryomov
Updated by Ilya Dryomov almost 9 years ago
- Subject changed from rbd map leads to system crash! to An image with a long clone chain blows up kernel stack
Updated by Ilya Dryomov over 8 years ago
It's on the list, I'll get to it sometime after I'm done with #9779.
Is that really something you care about? What's your use case for an image with so many parents?
Updated by Xudong Cao over 8 years ago
Hello, I'm bo cai, the bug submitter's colleague, I have been focusing on this problem for two weeks and just reached a conclusion:
In function rbd_add(), there's a recursive-like invoking logic below:
rbd_dev_image_probe()->rbd_dev_probe_parent()->rbd_dev_image_probe()->rbd_dev_probe_parent()->......
When a image has too many parents, this logic will result in stack overflow (however, a process in kernel mode only has a little 8K stack in X64 arch), I changed the kernel stack size from 8K to 64K and recompiled kernel, and then the problem disappeared.
so, shoud I do something in rbd_add()? for example, change the recursion to a loop? add a clone-chain-length limit? or something else? @llya Dryomov
Updated by Ilya Dryomov over 8 years ago
Yeah, the stack overflow part was clear from the beginning. Changing the structure of the code so that it doesn't generate a recursive process is the way to do it, but all of that is tied with the submit/callback code which I'd rather not change just to fix this particular problem - there are other issues with it, so it will be reworked in the future. You can't really fix this in a backportable way, so there is no upside.
We could limit the clone chain length, but then again, what's your use case for an image with >50 parents? It's going to perform pretty horribly...
Updated by Xudong Cao over 8 years ago
@llya Dryomov
Could you indicate more specifically about the submit/callback code? Because from my side I can only see that the changes have very limited influence to other codes.
Also I would appreciate it if you could share some details about the related rework plan.
As for this issue, I'm afraid it could be a potential but serious security problem, as someone may use it to take the whole system down. Therefore, IMO it is necessary to limit the clone chain length in the user-mode code (or in kernel?) to serve as a temporary solution, what do you think?
Updated by Ilya Dryomov over 8 years ago
rbd_dev_image_probe() -> rbd_dev_probe_parent() -> rbd_dev_image_probe() is not the problem - at least not when we are talking about a 50-100 images chain. Kernel stack is blown in the reply path, where we end up with the following: rbd_obj_request_complete() -> rbd_img_obj_callback() -> rbd_img_parent_read_callback() -> rbd_obj_request_complete() for no good reason.
The rework (or, rather, part of the rework) would be to switch it from a callback-based scheme to a more state machine like scheme, with no recursion. Callbacks don't buy us anything since there is no module boundary, the code as written is just trying to be way more generic then it needs to be.
I'll post a patch for limiting the chain length.
Updated by Xudong Cao over 8 years ago
OK, thanks so much for your answer! Could you please give me the link after you posting the patch? my email is cao.xudong@h3c.com, thank you!
Updated by Ilya Dryomov over 8 years ago
- Status changed from 12 to Fix Under Review
[PATCH] rbd: don't leak parent_spec in rbd_dev_probe_parent()
[PATCH] rbd: prevent kernel stack blow up on rbd map
This just puts a cap on the parent chain length, but I think it would be a good idea to keep it around even after the problematic recursion is eliminated.
Updated by bo cai over 8 years ago
Ilya Dryomov wrote:
[PATCH] rbd: don't leak parent_spec in rbd_dev_probe_parent()
[PATCH] rbd: prevent kernel stack blow up on rbd mapThis just puts a cap on the parent chain length, but I think it would be a good idea to keep it around even after the problematic recursion is eliminated.
I see status changed to "Need Review", but where is the code about how to fix this?
could you give us some link?
thanks.
Updated by Ilya Dryomov over 8 years ago
I sent a link to a github commit to Xudong Cao a couple of days ago. And, like all other kernel client patches, it's been posted to the mailing list:
https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg25996.html
https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg25997.html
Updated by Ilya Dryomov over 8 years ago
- Status changed from Fix Under Review to Pending Backport
Merged into 4.3-rc7.
Updated by Ilya Dryomov over 8 years ago
- Status changed from Pending Backport to Resolved