8000 SQLite Async API by geeksilva97 · Pull Request #59109 · nodejs/node · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@geeksilva97
Copy link
Contributor
@geeksilva97 geeksilva97 commented Jul 18, 2025

Closes #54307

This PR implements an async API for node:sqlite module. So far, it contains a very minimal implementation of exec method, misses some tests, docs and refactoring but it is good enough to share the whole theory I have for it; with that, anybody can share thoughts about it.

  • Docs
  • Tests

Design

On C++ land, I plan to have the Database class determine whether the operations will be asynchronous.

Public API

For the public API, I plan to have classes such as Database, Statement, etc., as counterparts to' DatabaseSync', ' StatementSync', and so on.

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/sqlite

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. sqlite Issues and PRs related to the SQLite subsystem. labels Jul 18, 2025
@geeksilva97
Copy link
Contributor Author

Concurrency Control

For concurrency control, SQLite provides a few options. Multi-threaded seems to be a good fit.

Multi-thread. In this mode, SQLite can be safely used by multiple threads provided that no single database connection nor any object derived from database connection, such as a prepared statement, is used in two or more threads at the same time.

We just need a way to guarantee that no two threads will be using connection or statements at the same time. Since we have a threadpool, hence no control over which thread a work will be executed, seems like we need a mutex. Please share your thoughts.

@geeksilva97
Copy link
Contributor Author

We just need a way to guarantee that no two threads will be using connection or statements at the same time. Since we have a threadpool, hence no control over which thread a work will be executed, seems like we need a mutex. Please share your thoughts.

I wonder if, for the first version of this API if the serialized mode is good enough

Serialized. In serialized mode, API calls to affect or use any SQLite database connection or any object derived from such a database connection can be made safely from multiple threads. The effect on an individual object is the same as if the API calls had all been made in the same order from a single thread. The name "serialized" arises from the fact that SQLite uses mutexes to serialize access to each object.

@geeksilva97 geeksilva97 added the wip Issues and PRs that are still a work in progress. label Aug 28, 2025
@geeksilva97 geeksilva97 force-pushed the sqlite-async branch 5 times, most recently from 6561e49 to 9af39ec Compare October 1, 2025 03:39
@geeksilva97 geeksilva97 force-pushed the sqlite-async branch 2 times, most recently from 4bf5609 to 8ce4e74 Compare November 28, 2025 04:07
@geeksilva97 geeksilva97 marked this pull request as ready for review November 28, 2025 04:09
@geeksilva97 geeksilva97 marked this pull request as draft November 28, 2025 15:36
@geeksilva97 geeksilva97 marked this pull request as ready for review November 28, 2025 21:16
@geeksilva97 geeksilva97 marked this pull request as draft November 28, 2025 21:38
@geeksilva97 geeksilva97 marked this pull request as ready for review November 29, 2025 02:16
@codecov
Copy link
codecov bot commented Nov 29, 2025

Codecov Report

❌ Patch coverage is 81.73258% with 97 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.50%. Comparing base (08d966c) to head (5bec78e).
⚠️ Report is 21 commits behind head on main.

Files with missing lines Patch % Lines
src/node_sqlite.cc 82.53% 49 Missing and 42 partials ⚠️
src/node_sqlite.h 40.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #59109      +/-   ##
==========================================
- Coverage   88.54%   88.50%   -0.05%     
==========================================
  Files         703      703              
  Lines      208291   208690     +399     
  Branches    40170    40244      +74     
==========================================
+ Hits       184430   184691     +261     
- Misses      15865    15961      +96     
- Partials     7996     8038      +42     
Files with missing lines Coverage Δ
src/node_sqlite.h 81.81% <40.00%> (+1.42%) ⬆️
src/node_sqlite.cc 79.56% <82.53%> (-0.37%) ⬇️

... and 41 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@geeksilva97
8000
Copy link
Contributor Author

It turns out I was very naive while conceiving this implementation. My approach was to utilize libuv's thread pool for running SQLite operations, as we do for backups.

In a situation where a query calls a user-defined function, authorizer, or aggregation, execution on the thread pool fails because it doesn't have access to an Isolate.

My first idea was to track the defined functions and identify when they are in the query; if functions are present, that query would run in the main thread. Another option would be using worker threads that have their own isolates.

@nodejs/sqlite would you have any clue? Thanks in advance.

@geeksilva97
Copy link
Contributor Author

It turns out I was very naive while conceiving this implementation. My approach was to utilize libuv's thread pool for running SQLite operations, as we do for backups.

In a situation where a query calls a user-defined function, authorizer, or aggregation, execution on the thread pool fails because it doesn't have access to an Isolate.

My first idea was to track the defined functions and identify when they are in the query; if functions are present, that query would run in the main thread. Another option would be using worker threads that have their own isolates.

@nodejs/sqlite would you have any clue? Thanks in advance.

better-sqlite lets the developer decide, suggesting worker threads https://github.com/WiseLibs/better-sqlite3/blob/master/docs/threads.md

@louwers
Copy link
Contributor
louwers commented Dec 1, 2025

I would consider not adding any async APIs, and instead

  1. Allow passing threading mode on DatabaseSync construction.
  2. Allow setting the threading mode at runtime.
  3. Add an example to the documentation how to use SQLite with worker threads.

The overhead of serializing queries and results probably makes this somewhat of an advanced use case. Defaulting to async for all queries is (probably) a mistake for most applications. Instead it is something you might do for a few expensive queries. I could be wrong though.

8000

@geeksilva97
Copy link
Contributor Author

I would consider not adding any async APIs, and instead

  1. Allow passing threading mode on DatabaseSync construction.
  2. Allow setting the threading mode at runtime.
  3. Add an example to the documentation how to use SQLite with worker threads.

The overhead of serializing queries and results probably makes this somewhat of an advanced use case. Defaulting to async for all queries is (probably) a mistake for most applications. Instead it is something you might do for a few expensive queries. I could be wrong though.

From the beginning, the idea is not to default all queries to run asynchronously. It's about to add the possibility for async. Like fs and fs/promises

@PhilipTrauner
Copy link
PhilipTrauner commented Dec 1, 2025

having implemented worker thread offloading i can state that there's a significant amount of gnarly lifecycle plumbing involved, especially if one wants to support transactions
if some of that plumbing could live within node.js itself it would be much appreciated
not sure where a line would have to be drawn be to support more advanced use cases (e.g. 1 writer, <n> reader setups) but i'm eager to test stuff out!

@geeksilva97
Copy link
Contributor Author

having implemented worker thread offloading i can state that there's a significant amount of gnarly lifecycle plumbing involved, especially if one wants to support transactions if some of that plumbing could live within node.js itself it would be much appreciated not sure where a line would have to be drawn be to support more advanced use cases (e.g. 1 writer, <n> reader setups) but i'm eager to test stuff out!

Yeah. I will build a solution based on the authorizer, and let's see what you think about it.

@louwers
Copy link
Contributor
louwers commented Dec 1, 2025

From the beginning, the idea is not to default all queries to run asynchronously. It's about to add the possibility for async.

I think the API should reflect this. If it is named Database then I think a lot of people will just go for that one, thinking fewer keystrokes! And async is better/faster! For reading files, async often makes most sense. Correct me if I am wrong, but I think for SQLite queries, most of the time, the overhead makes async the worse option, even measured in wall clock time on the main thread.

Maybe calling it something like DatabaseAsync, StatementAsync is an option?

Or maybe DatabaseSync can be renamed to Database and we can have Database.execAsync?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. sqlite Issues and PRs related to the SQLite subsystem. wip Issues and PRs that are still a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sqlite async API

5 participants

0