Bug #54437

dbstore: concurrent use of sqlite prepared statements

Added by Casey Bodley about 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description

Several database operations make use of sqlite's prepared statements (sqlite3_stmt) but do not prevent multiple threads from binding parameters and evaluating the same statement at the same time.

Consider allocating several instances of each prepared statement so they can be used in parallel, and falling back to normal queries if all prepared statements are in use.

Actions #1

Updated by Soumya Koduri about 2 years ago

Casey Bodley wrote:

> several database operations make use of sqlite's prepared statements (sqlite3_stmt), but do not prevent multiple threads from binding parameters and evaluating the statement at the same time
>
> consider allocating several instances of each prepared statement so they can be used in parallel, and fall back to normal queries if all prepared statements are in use

In the current dbstore design, they are protected via mutex lock in SQL_EXECUTE -

#define SQL_EXECUTE(dpp, params, stmt, cbk, args...) \
do { \
  const std::lock_guard<std::mutex> lk(((DBOp*)(this))->mtx); \
  if (!stmt) { \
    ret = Prepare(dpp, params); \
  } \
  \
  if (!stmt) { \
    ldpp_dout(dpp, 0) << "No prepared statement " << dendl; \
    goto out; \
  } \
  \
  ret = Bind(dpp, params); \
  if (ret) { \
    ldpp_dout(dpp, 0) << "Bind parameters failed for stmt(" << stmt << ") " << dendl; \
    goto out; \
  } \
  /* ...remainder of the macro elided in this excerpt... */


Actions #2

Updated by Soumya Koduri about 2 years ago

> In the current dbstore design, they are protected via mutex lock in SQL_EXECUTE -

But yes, this limits each operation to only one thread at a time. I tried to use sqlite Transactions, but that also indirectly serializes the operations and sometimes results in EBUSY errors.

Do you suggest having a pool of prepared statements?

Actions #3

Updated by Casey Bodley about 2 years ago

Soumya Koduri wrote:

> In the current dbstore design, they are protected via mutex lock in SQL_EXECUTE -

oops thanks! i had missed the DBOp::mtx, so i broke this in https://github.com/ceph/ceph/pull/45199

> But yes, this limits each operation to only one thread at a time. I tried to use sqlite Transactions, but that also indirectly serializes the operations and sometimes results in EBUSY errors.
>
> Do you suggest having a pool of prepared statements?

i think the important thing is that we don't block operations that are otherwise ready to execute. if we have a pool of N statements, we can run that many in parallel but still have to block once we get to N+1. the model we have now is essentially a pool with N=1

so i think a 'fallback' to avoid blocking is probably the most important part. this fallback path for N+1 could prepare a new statement for each request, then throw it away after; but i'm guessing that compiling a statement is more expensive than executing a normal, non-prepared query instead

if we do go with pools for each statement, we'd want to be smart about sizing them. some statements like Put/GetObject will be way more frequent than others like InsertUser, so we'd want a bigger pool for those. adaptive sizing could be an option to deal with imbalances like that. adding PerfCounters to track the hits/misses would help there too, but ultimately we'll want to make these decisions based on benchmarking

Actions #4

Updated by Soumya Koduri about 2 years ago

> Do you suggest having a pool of prepared statements?
>
> i think the important thing is that we don't block operations that are otherwise ready to execute. if we have a pool of N statements, we can run that many in parallel but still have to block once we get to N+1. the model we have now is essentially a pool with N=1
>
> so i think a 'fallback' to avoid blocking is probably the most important part. this fallback path for N+1 could prepare a new statement for each request, then throw it away after; but i'm guessing that compiling a statement is more expensive than executing a normal, non-prepared query instead

https://github.com/ceph/ceph/pull/44441/files has my WIP changes for a DB connection pool. I have taken a simple approach to start with: there is one global DB connection (default_db) and a pool of connection handles created by default when dbstore is initialized. Each operation (could be selective) requests a DB connection from the connection pool. If the limit is reached, the operation blocks until it gets a free handle (https://github.com/ceph/ceph/pull/44441/files#diff-9864d50860bbc23591d052cb71ac88b3df66bf5ab7a83576774288e6d348e04fR92).

> if we do go with pools for each statement, we'd want to be smart about sizing them. some statements like Put/GetObject will be way more frequent than others like InsertUser, so we'd want a bigger pool for those. adaptive sizing could be an option to deal with imbalances like that. adding PerfCounters to track the hits/misses would help there too, but ultimately we'll want to make these decisions based on benchmarking

Yes, that was the idea (https://docs.google.com/document/d/1xCoHT5DCujqbe1pnEfYpSaSqiBcW8z87CgQCC5LivOo/edit#heading=h.eu0mzf5ujuzm), i.e. to use a thread pool for only object (and maybe bucket) operations but the global default_db connection handle for other User/LC and other misc operations.

However, while testing this patch (along with https://github.com/ceph/ceph/pull/43427), I noticed that most of the time many operations error out with EDATABASELOCKED/EBUSY, as each transaction locks the entire table. So I decided to first handle I/O atomicity by using ObjectID for object uploads.

So with the objectID changes in (versioned objects - still WIP), the main (or maybe only) reason to have a connection pool now is to improve DBStore performance. I have yet to revive these changes, but my plan is to first check if, and by how much, performance improves when compared to the complexity involved in error handling and retrying operations with multiple DB handles (especially in the case of SQLite).

Actions #5

Updated by Soumya Koduri about 2 years ago

> but i'm guessing that compiling a statement is more expensive than executing a normal, non-prepared query instead

Yes, that would defeat the purpose of using prepared statements.

Actions #6

Updated by Matt Benjamin about 2 years ago

I'm confused--does creating a pool of prepared statements really improve concurrency? I would not have thought so.

Matt

Actions #7

Updated by Casey Bodley about 2 years ago

if sqlite doesn't allow concurrent writers and would block anyway, then i agree that a mutex is the best we can do there. but can't we expect some concurrency from the read ops?

Actions #8

Updated by Casey Bodley about 2 years ago

ok, i'm probably just missing the interaction with connection pools?

if statements are per-connection, and connections in the pool can only be used by one thread at a time, then pooling of statements would indeed be unnecessary

Actions #9

Updated by Casey Bodley about 2 years ago

  • Status changed from New to Closed

closing this one, because i think it's covered by the connection pooling work
