sqlite: add batched async Database API#62015

Draft
BurningEnlightenment wants to merge 11 commits into nodejs:main from BurningEnlightenment:dev/sqlite-batched-async

@BurningEnlightenment

This PR adds Database and Statement, the asynchronous counterparts of DatabaseSync and StatementSync, to the node:sqlite module.

A single SQLite db connection cannot scale beyond a single CPU core. Unlike the old sqlite3, this implementation therefore avoids unordered dispatches to the thread pool and instead maintains a strict FIFO operation queue. This allows multiple db connections to be used in parallel without relying on the internal db connection mutex (so SQLITE_THREADSAFE=2 is sufficient). Furthermore, I've favored throughput over latency: the implementation collects asynchronous operations on a db connection and dispatches them in batches to the thread pool. This amortizes the inherent synchronization overhead over multiple operations and also improves cache locality.
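
To illustrate the ordering guarantee described above, here is a minimal TypeScript sketch of a queue that keeps at most one batch in flight and runs operations strictly in submission order. It is not the PR's C++ implementation: `ConnectionQueue`, `enqueue`, `pump`, and the `dispatch` callback (a stand-in for handing a batch to a thread-pool worker) are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: all names are illustrative, not taken from the PR.
type Job = () => void;

class ConnectionQueue {
  private pending: Job[] = [];
  private inFlight = false;
  private dispatch: (batch: Job) => void;

  // `dispatch` stands in for handing one batch to a worker thread.
  constructor(dispatch: (batch: Job) => void) {
    this.dispatch = dispatch;
  }

  // Queue an operation; the promise settles once its batch has executed.
  enqueue<T>(op: () => T): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.pending.push(() => {
        try {
          resolve(op());
        } catch (err) {
          reject(err);
        }
      });
      this.pump();
    });
  }

  // Dispatch everything collected so far, unless a batch is still running.
  private pump(): void {
    if (this.inFlight || this.pending.length === 0) return;
    this.inFlight = true;
    const batch = this.pending;
    this.pending = [];
    this.dispatch(() => {
      for (const run of batch) run(); // strict FIFO within the batch
      this.inFlight = false;
      this.pump(); // pick up operations queued in the meantime
    });
  }
}
```

Because a new batch is only dispatched after the previous one has finished, operations queued while a batch is running accumulate and are then executed together, which is where the amortization comes from in this sketch.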

The implementation has matured to the point where the overall approach should only need relatively small incremental changes to reach the outlined API scope. In brief, the following already works:

  • Opening a db connection and (asynchronously) preparing statements.
  • Disposal of db connection and statement instances. [1]
  • Database.prototype.exec() without a prepared statement.
  • Statement.prototype.get() with binding parameters (though currently not as fancy as the sync variant).
  • Implicit microtask-based batching of all previously mentioned operations (inspired by kysely).
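
The implicit batching in the last item can be sketched as follows, assuming the general technique is "collect everything submitted during the current synchronous run and flush it in one microtask". `MicrotaskBatcher`, `add`, and `flush` are hypothetical names; the PR implements this natively, so this only shows the idea.

```typescript
// Hypothetical sketch of microtask-based batching; not the PR's code.
class MicrotaskBatcher<T> {
  private batch: T[] = [];
  private scheduled = false;
  private flush: (items: T[]) => void;

  // `flush` receives every item added before the microtask ran.
  constructor(flush: (items: T[]) => void) {
    this.flush = flush;
  }

  add(item: T): void {
    this.batch.push(item);
    if (this.scheduled) return; // a flush is already pending
    this.scheduled = true;
    // A resolved promise's reaction runs as a microtask, i.e. after the
    // current synchronous code finishes but before timers or I/O callbacks.
    Promise.resolve().then(() => {
      const items = this.batch;
      this.batch = [];
      this.scheduled = false;
      this.flush(items);
    });
  }
}
```

With this shape, several calls made back to back without an intervening `await` end up in the same batch, while a call made after an `await` starts a new one.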

Things which still need work:

  • Function parameter validation should not throw but return rejected promises.
  • From a brief look, it seems that direct usage of v8::Promise loses the async context/call stack. I need some guidance on how to fix that.
  • An explicit batching API, either callback- or Symbol.dispose/using-based (different PR?)
  • Explore an API to move a db connection from Database to DatabaseSync and vice versa (different PR?)
  • Proper API documentation
  • Probably a few extensions to the test suite here and there.
  • The memory allocation strategy should be tweaked to use std::pmr::monotonic_buffer_resource in a few places.
  • Obviously implementing the remaining APIs outlined below.
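
For the first item in the list above (validation should reject instead of throwing), one way to express the intended behavior at the JavaScript level is sketched below. `validateSql` and `execChecked` are hypothetical helpers, and the error message is illustrative rather than Node's actual wording; the PR would do the equivalent in native code.

```typescript
// Hypothetical helpers; the message text is illustrative only.
function validateSql(sql: unknown): string {
  if (typeof sql !== "string") {
    throw new TypeError('The "sql" argument must be a string.');
  }
  return sql;
}

// Because this is an async function, a throw inside it (including one from
// validateSql) surfaces as a rejected promise, never as a synchronous throw.
async function execChecked(
  sql: unknown,
  run: (sql: string) => Promise<void>,
): Promise<void> {
  await run(validateSql(sql));
}
```

Callers can then handle invalid arguments and database errors through the same `catch` path.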

Fixes #54307
Previous attempt #59109 (I've only taken the test suites from there).

API synopsis

interface DatabaseOpenOptions {
    // IN PROGRESS (currently same as sync, though partially not respected)
}
interface DatabaseCtor {
    new(path: string, options?: DatabaseOpenOptions): Database;
}
interface Database {
    open(): void;
    isOpen: boolean;
    close(): Promise<void>;
    [Symbol.asyncDispose](): Promise<void>;

    location(dbName?: string): Promise<string|null>; // TODO
    loadExtension(path: string): Promise<void>; // TODO
    enableLoadExtension(allow: boolean): Promise<void>; // TODO
    enableDefensive(active: boolean): Promise<void>; // TODO

    inTransaction(): Promise<boolean>; // TODO

    exec(sql: string): Promise<void>;
    // IN PROGRESS (statement options are ignored)
    prepare(sql: string, options?: StatementOptions): Promise<Statement>;
}

type Literal = null | number | bigint | string | ArrayBufferView;
type BindParams = Record<string, Literal> | Array<Literal>;
type RowResult = Record<string, Literal> | Array<Literal>;
interface Statement {
    [Symbol.dispose](): void;

    get(params?: BindParams): Promise<RowResult | undefined>;
    all(params?: BindParams): Promise<RowResult[]>; // TODO
    run(params?: BindParams): Promise<{ changes: number, lastInsertRowid: number }>; // TODO
}
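
A possible usage pattern for the parts of the synopsis that already work (`exec`, `prepare`, `get`, `close`) is sketched below. The `DatabaseLike` and `StatementLike` interfaces restate a subset of the synopsis so the snippet type-checks on its own; `lookupName`, the table, and the query are made up for illustration.

```typescript
// Subset of the synopsis above, restated so the sketch is self-contained.
type Literal = null | number | bigint | string | ArrayBufferView;
type BindParams = Record<string, Literal> | Array<Literal>;
type RowResult = Record<string, Literal> | Array<Literal>;

interface StatementLike {
  get(params?: BindParams): Promise<RowResult | undefined>;
}
interface DatabaseLike {
  exec(sql: string): Promise<void>;
  prepare(sql: string): Promise<StatementLike>;
  close(): Promise<void>;
}

// Hypothetical helper: table name and query are illustrative.
async function lookupName(
  db: DatabaseLike,
  id: number,
): Promise<RowResult | undefined> {
  try {
    // Both calls are issued before awaiting, so a batching implementation
    // may group them; FIFO ordering means exec still runs before prepare.
    const [, stmt] = await Promise.all([
      db.exec("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"),
      db.prepare("SELECT name FROM users WHERE id = ?"),
    ]);
    return await stmt.get([id]);
  } finally {
    await db.close();
  }
}
```

The `try`/`finally` is used here instead of `await using` only to keep the sketch independent of explicit resource management support in the toolchain.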

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or

(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or

(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.

(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.

Footnotes

  1. I've followed the spirit of DEP0137; should probably be reconciled with the sync API at some point.

BurningEnlightenment and others added 11 commits February 26, 2026 21:07
The `node:sqlite` module `Initialize` would get quite large if both
the async and sync database templates were embedded. Therefore move
the template creation into a separate function. I've avoided the
`GetConstructorTemplate` pattern, because it seems to imply exposing the
template via `PER_ISOLATE_TEMPLATE_PROPERTIES` which is unnecessary in
our case.
Previously all `SQLTagStore` instances had unique prototypes. Note that
the class and its prototype are currently not exposed on `node:sqlite`,
i.e. it currently can't be directly used for `instanceof` checks.
The database opening and configuration logic can be shared between the
sync and async API variants, therefore extract the shared implementation
into a common base class.
Add a Database class skeleton which is currently only capable of opening
and closing a sqlite db connection. Also notably missing is any support
for asynchronous operations apart from changes to the close and dispose
method signatures. However, the skeleton is capable enough to pass the
most basic lifetime tests and therefore useful as a diff target for the
asynchronous operations to be added.

Re-enable the async database test suites and skip the tests on a
case-by-case basis.
Trade a bit of db operation latency for throughput by batching db
operations together before scheduling their execution on the thread pool.
Completion notifications are currently delivered eagerly through an
additional `uv_async` handle, but this is not strictly necessary for the
chosen implementation strategy and might get removed again.

Some care has been taken to not criss-cross memory allocations and
deallocations between threads as modern allocators tend to perform
worse in such scenarios.
The linter doesn't seem to like these. Is it a linter bug?
@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/security-wg
  • @nodejs/sqlite

@nodejs-github-bot added the lib / src (Issues and PRs related to general changes in the lib or src directory) and needs-ci (PRs that need a full CI run) labels on Feb 26, 2026
@BurningEnlightenment changed the title from "Dev/sqlite batched async" to "sqlite: add batched async Database API" on Feb 26, 2026
@BurningEnlightenment
Author

@geeksilva97 given that you spearheaded the previous attempt, what are your thoughts regarding this approach?

Development

Successfully merging this pull request may close these issues.

sqlite async API

3 participants