Add SQLite backend #12

hannahwhy · 2018-01-13T07:39:11Z

I'm interested in using SQLite as a database backend for rustodon. Truly database-agnostic code is difficult to do, and it might end up being impractical, but:

I like to think that the set of necessary queries won't push us into territory that can only be satisfied by Postgres features
I'm kinda banking on WWPD being more than SQLite marketing

Tasks:

Get rustodon booting with SQLite
Switch connection to WAL mode
- ~~make sure concurrent writers work~~ concurrent write isn't allowed, but WAL should at least result in fewer exclusive locks
Generalize connection pool and associated functions to work with either SQLite or Postgres
Come up with some way to select SQLite or Postgres (feature flag?)
Add .travis.yml to run CI in SQLite and Postgres configurations
Get migrations_sqlite in sync with migrations
Find a way to get i64 IDs working with INTEGER PRIMARY KEY NOT NULL (should be possible, because INTEGER columns in SQLite can hold signed 64-bit integers. this might end up as a Diesel patch if I can prove it's a Diesel concern)
Get updated_at trigger working on tables that use it
Make sure all of this works with an in-memory database

iliana · 2018-01-13T17:19:51Z

Another advantage of the SQLite backend is we can write smaller unit tests for the models that start with a fresh in-memory database.

We'll still want the same tests to run on the Postgres backend as part of CI, but this will help speed up development, hopefully...

iliana · 2018-01-13T17:47:26Z

Cargo.toml

+[features]
+default = ["postgres"]
+postgres = []
+sqlite = []


I think this could be

postgres = ["diesel/postgres", "diesel_infer_schema/postgres"]
sqlite = ["diesel/sqlite", "diesel_infer_schema/sqlite"]

Then remove those features from diesel and diesel_infer_schema below, and diesel will be compiled with or without the database engines depending on how you want to compile rustodon.

Thanks for the tip! I've applied that in 4ac4311.

Compatibility notes:

1. TIMESTAMP WITH TIME ZONE is not supported by SQLite. I opted to use DATETIME instead. This is probably not a big deal if the application reads and writes times in UTC (and only converts to/from a local time zone in the UI).
2. In SQLite, BIGINT and INTEGER have the same storage class. However:
   a. Diesel won't allow i64 fields to be used with a column having type "INTEGER"
   b. SQLite will interpret a column of type INTEGER PRIMARY KEY NOT NULL to be a rowid alias, but will not do so for BIGINT PRIMARY KEY NOT NULL

The short version is that we're currently using BIGINT in tables, but we lose autoincrement. This is not a problem if IDs are assigned by the application (e.g. with the Snowflake algorithm), but if we're expecting autoincrement (and it looks like the Postgres schema is) then we'll have to find some way around (a).

This switch is permanent until the journal mode is changed again; see https://sqlite.org/wal.html#persistence_of_wal_mode (archived: https://web.archive.org/web/20180114031518/https://sqlite.org/wal.html#persistence_of_wal_mode).

Another application reading the database *could* change the journal mode back to DELETE, which would have performance implications (at best); however, I think we can probably get away with saying "well don't do that". A more sophisticated solution might switch to WAL mode when a connection is first acquired.
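
A rough sketch of the "switch to WAL mode when a connection is first acquired" idea, using only the standard library. The `Execute` trait, `on_acquire`, `acquire`, and `RecordingConnection` are hypothetical names made up for illustration; they are not Diesel's or r2d2's API, and a real version would hang off whatever hook the connection pool provides.

```rust
/// Minimal stand-in for a database connection.
/// Hypothetical trait for this sketch, not Diesel's or r2d2's API.
pub trait Execute {
    fn execute(&mut self, sql: &str) -> Result<(), String>;
}

/// Hook assumed to run once for every connection the pool hands out.
pub fn on_acquire<C: Execute>(conn: &mut C) -> Result<(), String> {
    // WAL mode persists in the database file, so re-issuing the pragma is
    // harmless if the database is already in WAL mode. Note that the pragma
    // returns the resulting mode as a row, so a real binding may need a
    // query-style call rather than a plain execute.
    conn.execute("PRAGMA journal_mode = WAL;")
}

/// Opens a connection with `connect` and applies the hook before returning it.
pub fn acquire<C, F>(connect: F) -> Result<C, String>
where
    C: Execute,
    F: FnOnce() -> C,
{
    let mut conn = connect();
    on_acquire(&mut conn)?;
    Ok(conn)
}

/// Test double that records the statements it was asked to run.
#[derive(Default)]
pub struct RecordingConnection {
    pub statements: Vec<String>,
}

impl Execute for RecordingConnection {
    fn execute(&mut self, sql: &str) -> Result<(), String> {
        self.statements.push(sql.to_string());
        Ok(())
    }
}
```

With something like this, another application flipping the journal mode back to DELETE would only hold until the next connection is acquired, at the cost of one extra statement per connection.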

hannahwhy · 2018-01-14T21:32:39Z

re: i64 for autoincrementing IDs in SQLite, this has been raised before:

diesel-rs/diesel#1116
diesel-rs/diesel#852 (comment)

It looks like we will have to do manual override.

barzamin · 2018-01-14T22:45:47Z

@yipdw i really like the idea you posted in discord:

we could also conditionally define type Serial = i32 or type Serial = i64 I guess, which would be less churn than ripping out infer_schema!

hannahwhy · 2018-01-14T22:50:57Z

It could be; I'm not yet sure if it'll work or what the code would end up looking like.

One thing that will happen is that we'd assume the burden of remembering that ID fields (both autoincremented PK and FK) must be Serial, not i32 or i64; I don't know if that's a good thing or not. (The build won't pass if they're mixed up, which is nice, but the resulting error doesn't immediately give away the cause.)
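
For reference, a minimal sketch of what the conditional `Serial` alias might look like. The `sqlite` feature name follows the Cargo.toml discussed earlier; `Account`, `Status`, and `authored_by` are made-up names for illustration, not the actual rustodon models.

```rust
// Hypothetical sketch: `Serial`, `Account`, `Status` and `authored_by` are
// illustrative names; only the `sqlite` feature name comes from the thread.

/// With SQLite, `infer_schema!` infers `INTEGER PRIMARY KEY` columns as i32.
#[cfg(feature = "sqlite")]
pub type Serial = i32;

/// With Postgres, BIGSERIAL columns are inferred as i64.
#[cfg(not(feature = "sqlite"))]
pub type Serial = i64;

/// Every autoincremented primary key uses `Serial`, never a bare i32 or i64.
#[derive(Debug, Clone, PartialEq)]
pub struct Account {
    pub id: Serial,
    pub username: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Status {
    pub id: Serial,
    /// Foreign key: must use the same alias as the key it references.
    pub account_id: Serial,
    pub text: String,
}

/// Compiles on both backends because both sides of the comparison are `Serial`.
pub fn authored_by(status: &Status, account: &Account) -> bool {
    status.account_id == account.id
}
```

If one of the fields were declared as a plain `i64`, the comparison in `authored_by` would stop compiling under the `sqlite` feature, which is the "build won't pass if they're mixed up" behaviour described above.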

barzamin · 2018-01-15T01:10:10Z

@yipdw i had another thought: what if we forego BIGSERIAL columns even in postgres, and just move everything to i32s? Like, do we really need 64 bits of post ID?

🤷‍♀️ I initially pushed for using BIGSERIAL, but if it's too much work, I guess we could move away from that.

hannahwhy · 2018-01-15T07:01:20Z

@barzamin I think sticking with 64-bit IDs is fine in principle. Also, if e.g. you eventually want to switch away from autoincrementation to (say) a Snowflake-like ID generation algorithm, then infer_schema!'s interpretation of INTEGER PRIMARY KEY NOT NULL in SQLite is no longer relevant -- we'll re-declare the ID fields BIGINT and have i64 inferred for both databases.

I'll give the type Serial = ... thing a try and we'll see how it goes, I guess?

hannahwhy · 2018-01-15T07:07:09Z

Just to be clear, the issue with SQLite is an unfortunate clash of circumstances:

Autoincrementing row IDs in SQLite can be done in a couple ways (see https://sqlite.org/autoinc.html). The recommended approach is marking the column exactly as INTEGER PRIMARY KEY, in which case the column becomes an alias for the signed 64-bit integer rowid.
In most databases, however, INTEGER does not denote a 64-bit quantity but rather a 32-bit quantity, and infer_schema! infers such columns as i32. The SQLite behavior is the exception.

iliana · 2018-01-16T03:55:42Z

I think we should probably do snowflake IDs from the get-go.

hannahwhy · 2018-01-16T04:15:55Z

Yeah, if we're going to do Snowflake-ish IDs, it is much easier to do that starting out. I guess that'd be a separate issue if @barzamin is down with the idea.

barzamin · 2018-01-16T05:08:07Z

I am very down with snowflakes. Opened #17 to track this.
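
To make the Snowflake idea concrete, here is a small standard-library sketch of a Snowflake-style generator. The bit layout (41 bits of milliseconds, 10 bits of node ID, 12 bits of sequence) is the classic one and is an assumption here, as are the names `IdGenerator`, `next_at`, `generate`, and `parts`; whatever #17 ends up using may differ.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Assumed layout (classic Snowflake, not necessarily what rustodon adopts):
// 41 bits of milliseconds | 10 bits of node ID | 12 bits of sequence.
const NODE_BITS: u32 = 10;
const SEQ_BITS: u32 = 12;
const MAX_NODE: u64 = (1 << NODE_BITS) - 1;
const MAX_SEQ: u64 = (1 << SEQ_BITS) - 1;
const TIMESTAMP_MASK: u64 = (1 << 41) - 1;

pub struct IdGenerator {
    node: u64,
    last_ms: u64,
    seq: u64,
}

impl IdGenerator {
    pub fn new(node: u64) -> Self {
        assert!(node <= MAX_NODE, "node ID must fit in {} bits", NODE_BITS);
        IdGenerator { node, last_ms: 0, seq: 0 }
    }

    /// Builds an ID for an explicit timestamp in milliseconds.
    /// Split out from `generate` so the bit layout can be tested deterministically.
    pub fn next_at(&mut self, now_ms: u64) -> i64 {
        // Never move backwards, even if the clock does.
        let mut ms = now_ms.max(self.last_ms);
        if ms == self.last_ms {
            self.seq += 1;
            if self.seq > MAX_SEQ {
                // Sequence exhausted within this millisecond: use the next one.
                ms += 1;
                self.seq = 0;
            }
        } else {
            self.seq = 0;
        }
        self.last_ms = ms;
        // The timestamp is masked to 41 bits, so the sign bit stays clear and
        // the result fits a signed 64-bit (BIGINT) column as a positive value.
        let raw = ((ms & TIMESTAMP_MASK) << (NODE_BITS + SEQ_BITS))
            | (self.node << SEQ_BITS)
            | self.seq;
        raw as i64
    }

    /// Builds an ID from the current wall-clock time.
    pub fn generate(&mut self) -> i64 {
        let now_ms = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_millis() as u64)
            .unwrap_or(0);
        self.next_at(now_ms)
    }
}

/// Splits an ID back into (milliseconds, node, sequence).
pub fn parts(id: i64) -> (u64, u64, u64) {
    let raw = id as u64;
    (raw >> (NODE_BITS + SEQ_BITS), (raw >> SEQ_BITS) & MAX_NODE, raw & MAX_SEQ)
}
```

Since the application assigns these IDs, the ID columns can be declared BIGINT on both databases and the INTEGER PRIMARY KEY rowid-alias question stops mattering.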

hannahwhy · 2018-01-29T22:10:00Z

I'll update this to reflect the changes made in #21 in a little bit. (Turns out git rename detection works in my favor here.)

#17 is currently blocking this issue. I can look at that if it's unassigned.

barzamin · 2018-01-29T22:10:32Z

@yipdw yes, that would be much appreciated!

SQLite hacks

netshade · 2018-06-24T18:56:45Z

database/src/models/account.rs

-        diesel::insert_into(accounts)
-            .values(&self)
-            .get_result(&**conn)
+        let inserted = diesel::insert_into(accounts).values(&self).execute(&**conn);


If I could do this all over again, I would have done it with a lot more ?

netshade · 2018-06-24T18:58:17Z

scripts/cargo-rustodon

+
+set -euo pipefail
+
+exec scripts/run


This shim script is here mainly to allow cargo watch to continue to do the right thing while allowing us to infer the right build features and set them at runtime.

netshade · 2018-06-24T18:59:19Z

scripts/run

+fi
+
+if [ ! -f "database/src/schema.rs" ]; then
+  echo "Database schema does not exist, run scripts/setup before attempting to run the tests." >&2


This warning is correct and scripts/setup should likely continue to do the schema generation, but in retrospect build.rs could also just do the schema generation as well, and avoid having the user forced to take action.

Investigated this further, and because db stuff is broken out into its own crate, its own build.rs would need to run first in order to get the crate to build. Unfortunately, SQLite paths would likely be qualified in the root crate, not the dependent crate, so any DATABASE_URL that exists that would allow the schema generation to proceed would need to be requalified within the context of the database subdirectory, at which point this would all start to feel like it's doing too much work / assuming too much on what's inside that env var.

Just leaving things as is for now.

netshade · 2018-06-24T19:01:06Z

database/src/schema.rs

@@ -1 +0,0 @@
-infer_schema!("dotenv:DATABASE_URL");


This removal ( and replacement via manual schema generation ) was due to SQLite throwing several errors using infer_schema in the tests; since Diesel seems to be deprecating infer_schema anyways, it seemed a positive direction.

netshade · 2018-06-26T03:17:21Z

Tests passing again, there were some issues w/ things on master and the CI setup that were causing tests to fail.

hannahwhy · 2018-07-06T05:44:13Z

Find a way to get i64 IDs working with INTEGER PRIMARY KEY NOT NULL

This is no longer relevant -- we're using i64s generated via flaken.

netshade · 2018-07-08T00:41:20Z

There was an out of band discussion on this branch regarding handling SQLITE_BUSY correctly. The default behavior of the SQLite bindings ( and SQLite proper ) is to return SQLITE_BUSY when access to the database is not possible and the caller should try again. This is a "common case" state. However, rusqlite ( which is used by r2d2, which in turn manages our connections in Diesel ) manually sets a busy timeout of 5 seconds, which seemed like it was generous enough to simply allow the default behavior.

Primary folks in that discussion were @yipdw and @nightpool . Comment left here just to note that if that behavior changes in rusqlite, we'd need to make changes to adjust. ( Also for posterity for the plume folks who linked to this PR )
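
For anyone landing here later, this is roughly what a busy timeout amounts to from the caller's point of view: keep retrying while the database reports busy, and give up once the deadline passes. It is a standard-library illustration only. `DbError` and `with_busy_timeout` are hypothetical names, and this is not how rusqlite or SQLite implement the timeout internally.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Stand-in for SQLite result codes; hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
pub enum DbError {
    /// Corresponds to SQLITE_BUSY: the database is locked, try again.
    Busy,
    Other(String),
}

/// Retries `op` while it reports `Busy`, giving up once `timeout` has elapsed.
/// Any other result (success or a different error) is returned immediately.
pub fn with_busy_timeout<T, F>(timeout: Duration, mut op: F) -> Result<T, DbError>
where
    F: FnMut() -> Result<T, DbError>,
{
    let deadline = Instant::now() + timeout;
    let mut backoff = Duration::from_millis(1);
    loop {
        match op() {
            // Still busy and still within the deadline: wait, then retry.
            Err(DbError::Busy) if Instant::now() < deadline => {
                sleep(backoff);
                // Back off exponentially, capped at 100 ms per wait.
                backoff = (backoff * 2).min(Duration::from_millis(100));
            }
            // Success, a non-busy error, or busy past the deadline.
            other => return other,
        }
    }
}
```

If rusqlite ever stops setting its own timeout, a wrapper along these lines (or an explicit busy timeout on each connection) is the kind of change we'd need.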

hannahwhy · 2018-09-15T23:35:13Z

Whew, I need to look at this again -- sorry for falling off the world. It looks like all we need now is an updated_at trigger?

igalic · 2018-09-16T13:12:05Z

src/activitypub/mod.rs

@@ -1,4 +1,5 @@
 use db;
+use db::datetime::Rfc339able;


maybe we should add another 3 to this module name. db::datetime::Rfc3339able strikes me as something slightly more recognisable and usable in a Datetime context than "MLTNET - A "MULTI-TELNET" SUBSYSTEM FOR TENEX"

igalic · 2018-09-16T13:13:55Z

diesel.toml

@@ -1,3 +1,4 @@
 [print_schema]
 file = "database/src/schema.rs"


if this is the schema file, then how come we're deleting it, i'm a bit confused…

database/src/schema.rs is generated by diesel print-schema inside scripts/setup. (This branch adds some scripts to setup and run the server; if/when this is merged, those scripts will probably become part of the standard setup/run procedure.)

iirc it's diesel best practice to check schema.rs into source control (so you don't have to print-schema when deploying the app For Realz). don't quote me on that though

although that gets done as part of migrations so extremely 🤷‍♀️

I think that makes sense in situations where you're pushing source code to a bunch of app servers (which is the classic Web application deployment model), but I don't think it matters much when you're building a binary and deploying that to a bunch of app servers or (more common in the SQLite case) building a binary and installing it on many machines, since you won't be recompiling on the destination machine in either case.

That said, if there arises some benefit to checking in schema.rs, it doesn't seem like it would be a difficult change to implement.

Yeah, we have it checked in right now and i think that's the only reason to keep it around atm. Pulling it out is totally fine 💜

igalic · 2018-09-16T13:16:00Z

database/src/lib.rs

 pub mod validators;

+#[cfg(all(feature = "sqlite", feature = "postgres"))]
+compile_error!("sqlite and postgres features cannot be simultaneously selected");


i wish there was an easier (and prettier) way to do this… after all diesel doesn't use feature flags and can have all of its databases enabled at the same time.

Diesel (or at least the version of Diesel we're using) has both postgres and sqlite feature flags. The postgres and sqlite feature flags on the database crate toggle feature flags on diesel and diesel_infer_schema; we use them here as well.

I think there is a case to build in support for multiple servers for e.g. pre-built Rustodon binary distributions, but that seems to be some way off. In the meantime, switching using types seemed like the mechanism that required the fewest code changes.

i opened an issue for this in diesel diesel-rs/diesel#1853

hannahwhy force-pushed the feature/sqlite branch 4 times, most recently from afa34dc to 287aed7 Compare January 13, 2018 09:29

iliana reviewed Jan 13, 2018

hannahwhy added 8 commits January 13, 2018 17:53

Enable sqlite feature on diesel

0080422

Ignore CLion/IntelliJ settings and SQLite databases

d977010

You probably don't want me accidentally committing IDE-specific stuff or databases in here.

Find/replace PgConnection -> SqliteConnection

902af30

This should be made generic at some point. Soon.

Use feature flags to select between sqlite/pg

eb9aa50

Apply rustfmt to src/db/mod.rs

b8a7de5

rustfmt was applied to master; doing it here (with the same rustfmt.toml) will hopefully avoid the nastier merge conflicts.

Re-export features on diesel/diesel_infer_schema

9c1850d

We can use Cargo's feature dependencies feature (ahem) to reduce duplication in the diesel and diesel_infer_schema dependency specs.

ci: add sqlite build

ac0e5c3

hannahwhy force-pushed the feature/sqlite branch from 4ac4311 to ac0e5c3 Compare January 13, 2018 23:58

hannahwhy added 2 commits January 13, 2018 21:14

sqlite: first cut at porting status.uri migration

d9020ef

hannahwhy added 2 commits January 29, 2018 16:18

Merge branch 'split-out-db' into feature/sqlite

f5c72bf

Merge remote-tracking branch 'origin/master' into feature/sqlite

3d0f66c

Merge pull request #1 from netshade/feature/sqlite-hacks

b1e3877

SQLite hacks

netshade reviewed Jun 24, 2018

chris added 7 commits June 25, 2018 22:47

Merge remote-tracking branch 'upstream/master' into feature/sqlite

5c98c43

slim the definition for this method using unwrapping

a2963f3

rustfmt

c69589f

sass installed via bundler, execute bundler to check existence

b222c44

ignore generated schema

4e9b63f

on CI, just install bundler

0994a0c

Revert "sass installed via bundler, execute bundler to check existence"

219d40f

This reverts commit b222c44.

Merge remote-tracking branch 'upstream/master' into feature/sqlite

95174fc

elegaanz mentioned this pull request Jun 28, 2018

Add SQLite as an option Plume-org/Plume#93

Closed

chris added 2 commits June 28, 2018 21:13

import necessary traits for templates

96557a2

Merge remote-tracking branch 'upstream/master' into feature/sqlite

f572dc1

netshade mentioned this pull request Jul 1, 2018

Nightly update #79

Merged

netshade added 2 commits July 7, 2018 20:58

Merge remote-tracking branch 'upstream/master' into feature/sqlite

ffe8ab9

ignore .env files with suffixes

ca4050d

igalic reviewed Sep 16, 2018

igalic mentioned this pull request Sep 17, 2018

Expose multi-db connection inference from diesel-cli diesel-rs/diesel#1853

Closed

2 tasks

barzamin force-pushed the master branch from 19a804a to 271adbb Compare March 16, 2019 17:06

barzamin added the A: Backend label and removed the A-database label Jul 23, 2019

6 participants