[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency for JSON type field between mysql and sqlite #17713

Open
kolorafa opened this issue Jun 20, 2024 · 18 comments
Open

Inconsistency for JSON type field between mysql and sqlite #17713

kolorafa opened this issue Jun 20, 2024 · 18 comments
Milestone

Comments

@kolorafa
Copy link
Contributor
kolorafa commented Jun 20, 2024

Description

For MySQL:
The 'json' column is handled as a json, and is automatically converted from json-string into array in model on read, and converted back into json-string on save.

For Sqlite:
The 'json' column is handled as 'text' so it's not decoded and is not touch in any way so it's normal json-string.

Other engines untested

The key difference is in:

if (str_contains($col, 'json')) {
return ['type' => TableSchemaInterface::TYPE_JSON, 'length' => null];
}

As that part is missing in sqlite which in turn is getting changed to "text" for sqlite. Adding that part to SqliteSchemaDialect resolves it, but it is a 'default behavior change' so could break old projects that expect string not array.

the json functionality in sqlite is now enabled by default from 3.38.0
https://sqlite.org/json1.html#compiling_in_json_support

CakePHP Version

5.0.8

PHP Version

8.2

@kolorafa
Copy link
Contributor Author

I don't think this change should land as a fix in 5.0.9, more like 5.1 (or 6.x?) with breaking changes?

But I think the automatic handling the field type should be unified somehow at some point.
So never convert to array making it stay as string on mysql, or always for both.

@markstory markstory added the ORM label Jun 20, 2024
@markstory markstory added this to the 5.1.0 milestone Jun 20, 2024
@markstory
Copy link
Member

I agree that we can't change the default behavior in 5.0. However, I think we could do that change in 5.1 as we have opt-in ways for applications to prepare for the upgrade, and for them to preserve the existing behavior via setColumnType

@othercorey
Copy link
Member

We'd have to detect the support since sqlite 3.38 isn't our minimum version yet, right?

@kolorafa
Copy link
Contributor Author
kolorafa commented Jul 5, 2024

The issue is also in fixtures, because now depending on on the engine (mysql vs sqlite) we need to provide the field content as array (for mysql) or json-string (for sqlite).
Array in sqlite will fail on type conversion, and json-string with mysql will double encode it (json in json).

I have set schema override to always force json in app but it's not working for Fixtures. That might be because we have those fixtures in a plugin, and the fixtures are inside plugin code but not prefixed in any way, so they don't load plugin models for seeding data.

I don't mind either way as long it's unified (to be able to run the same tests using both sqlite for simplicity/speed and mysql).

Debian 11 (bullseye) have sqlite 3.34.1 (not supporting json type)
Debian 12 (bookworm) have sqlite 3.40.1 (supporting json type)

We'd have to detect the support since sqlite 3.38 isn't our minimum version yet, right?

This should be behind feature flag until cakephp 6, but I don't know if we need to check sqlite version to handle json type, because If sqlite is not supporting json then you simply can't set json field type so this feature will not trigger and you are forced to use 'text' field type in migrations or they fail when run against sqlite.
Unless the "fix/change" will do something more like automatically convert json type into text in migrations. But then once gain you can't have this feature working as you will not be able to know if a specific 'text' field is a json type.

@othercorey
Copy link
Member

So, looking into this, I don't think the original issue can be addressed. There is no "json" field in sqlite, it's only text with json functions. There's nothing for us to parse and set as a json field.

However, it sounds like users are not able to override a column type for fixtures defined in plugins, which seems like an issue.

@kolorafa You are using cakephp/migrations to create the fixture tables or something else?

@ADmad
Copy link
Member
ADmad commented Jul 6, 2024

However, it sounds like users are not able to override a column type for fixtures defined in plugins, which seems like an issue.

Shouldn't they be contacting the plugin devs to address that?

@kolorafa
Copy link
Contributor Author
kolorafa commented Jul 6, 2024

I don't think the original issue can be addressed

The orignal issue can be fixes as easilly as adding:

 if (str_contains($col, 'json')) { 
     return ['type' => TableSchemaInterface::TYPE_JSON, 'length' => null]; 
 } 

into SqliteSchemaDialect.php

I'm already doing that locally (modifying directly in vendor) to fix it.

There is no "json" field in sqlite

There is:

sqlite> .d test_table
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE test_table
(
    example_text   TEXT,
    example_json   JSON_TEXT,
    column_integer integer
);
COMMIT;
sqlite> 

You are using cakephp/migrations to create the fixture tables or something else?

Fixtures only import data (and with the above fix they work the same way for both mysql and sqlite, without they fail for sqlite), tables are created by running cakephp/migrations in tests/bootstrap.php and the migrations are correctly working and handling json field type for both mysql and sqlite just fine. No issues there for migrations.

However, it sounds like users are not able to override a column type for fixtures defined in plugins, which seems like an issue.

For Fixtures I will go deep dive into it first as they might just be defined wrongly that's why they "generate" generic new entity when inserting data hence not handling the schema override set in models. That's a topic for another potential issue in the future.

Shouldn't they be contacting the plugin devs to address that?

We write both the app and the plugin (that will be shared between apps), and I just noticed that while working with fixtures in plugin - but they might be just wrongly implemented (missing plugin prefix or something)

@kolorafa
Copy link
Contributor Author
kolorafa commented Jul 6, 2024

we have opt-in ways for applications to prepare for the upgrade

@ADmad How is that done? Thanks

(I read the Contributing and few migartion guides but didn't find anything like that, but maybe i missed it)

@othercorey
Copy link
Member

example_json JSON_TEXT

I cannot find any documentation on JSON_TEXT. What I found said they cannot add a new type. Can you show me?

@ADmad
Copy link
Member
ADmad commented Jul 6, 2024

https://www.sqlite.org/json1.html#interface_overview

SQLite stores JSON as ordinary text. Backwards compatibility constraints mean that SQLite is only able to store values that are NULL, integers, floating-point numbers, text, and BLOBs. It is not possible to add a new "JSON" type.

Googling I can't find any info about JSON_TEXT type either.

@kolorafa
Copy link
Contributor Author
kolorafa commented Jul 6, 2024

I see, that the core engine still handle this field as a TEXT (which doesn't make any difference to me) but just preserves the type that was set in table definition.

Found the explantation at: https://sqlite.org/forum/forumpost/89c83eb676f810c0?t=h

How I arrived at this state is just by using cakephp/migrations having:

$table->addColumn('configuration_schema', AdapterInterface::PHINX_TYPE_JSON, [
            'null' => true,
            'default' => null,
]);

It was designed with MySQL in mind, and as it turned out it somehow "works" for sqlite (by setting this "custom type").

The way to search inside json field was implemented as:

public function getQueryByDriver(string $driver): string
{
    return match ($driver) {
        Mysql::class => 'LOWER(JSON_VALUE(%s, "$.%s")) COLLATE utf8mb4_general_ci LIKE',
        Sqlite::class => 'LOWER(json_extract(%s, "$.%s")) COLLATE utf8mb4_general_ci LIKE',
        default => throw new Exception('Not implemented'),
    };
}

And CakePHP by default if it see field type contain "json" it handle it using Cake\Database\Type\JsonType which on tye fly decode/encode it while reading/saving. It was a nice addition for MySQL as it did work out of the box, but it did fail in sqlite tests.

So I digged in and found that the type is mached by looking for "json" in type, (henc the fix above), and after applying the same logic from MySQL Dialect into Sqlite it work the same way perfectly for simple read/write, but ofc require different json manipulation functions to do some inside-database json manipulation.


so long story short, it looks like that it 'just works' as sqlite allows you to make any field type you want :D

sqlite> .d test_table
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE test_table
(
    example_text   TEXT,
    example_json   JSON_TEXT,
    column_integer integer
, test SOMETHING_TEXT);
INSERT INTO test_table VALUES('example row','example row json field',123,'random field');
INSERT INTO test_table VALUES('{"some":"think2"}','{"some":"think"}',123,'random field2');
COMMIT;

(added the test column using phpstorm, and just put "SOMETHING_TEXT" in the field type)

but yes, it having the JSON_TEXT doesn't give you any json validation compared to MySQL :)

@kolorafa
Copy link
Contributor Author
kolorafa commented Jul 6, 2024

So, looking at it now, because sqlite going forward will have functions to manipulate json by default, and that the JSON_TYPE (or any type with "json" in the type name) should clearly indicate that it is a json type, and it is "supported" as in it doesn't fail and is preserved...

It would be nice to have CakePHP handling json type for MySQL and SQLite the same way by default. Like in my case to make it easier to use SQLite for testing stuff with tables having "json" fields)

But I don't want to potentially break other people apps if their migration did set their "json" fields the same way and they already have some manual (if sqlite do json_decode) workarounds.

@ADmad
Copy link
Member
ADmad commented Jul 6, 2024

https://www.sqlite.org/datatype3.html#determination_of_column_affinity

Wow, that's a really wacky feature of Sqlite :) Given this "column affinity" feature of Sqlite I don't mind if we map sqlite column types with string JSON in their name to CakePHP's json type. We can potentially even extend it to other types like types with string DATE to our date type.

Edit: DATE type is already accounted for in SqliteSchemaDialect.

@ADmad
Copy link
Member
ADmad commented Jul 6, 2024

How is that done? Thanks

We just use a config var which is checked. For e.g. for this case we could use ORM.mapJsonTypeForSqlite and users who want to opt-in for this feature can set Configure::write('ORM.mapJsonTypeForSqlite', true) in their app's bootstrap.

P.S. New features need to target the 5.next branch.

@othercorey
Copy link
Member
othercorey commented Jul 6, 2024

Thanks for explaining. I wanted to know how you ended up with JSON_TEXT. Since it's arbitrary, we should try to "standardize" it so it's not such a hidden feature for our users.

It looks like sqlite supports a binary BLOB json format as well. Would mapping both of those to json type work? Probably not since we wouldn't know how to parse the blob, right?

Should we support JSON_TEXT only or anything with JSON in it?

@ADmad
Copy link
Member
ADmad commented Jul 7, 2024

Should we support JSON_TEXT only or anything with JSON in it?

I would just check for the latter, anything with JSON in it.

@othercorey
Copy link
Member

I know it's "technically" a breaking change, but how many developers are actually using JSON as their column type currently since it's always text?

I wonder if we could change this before 6.0.

We'd have to warn not to name your jsonb BLOB column types JSON since we can't support reading it.

@kolorafa
Copy link
Contributor Author
kolorafa commented Jul 7, 2024

how many developers are actually using JSON as their column type

at least one :)

I want this change to happen, but personally I would not want any breaking change to happen without knowing upfront for a little bit of time to prepare, in case someone somehow had a huge sqlite database in production that have json or jsonb as column type it... yes, it probably would be very stupid but god only knows.
So in my opinion the 6.0 should be the place where this change should be enabled by default.
Or at least it should not be enabled by default in 5.1 as there is already RC1 released that someone would expect that no new breaking changes would happen and they already tested their app against that release :)

We'd have to warn not to name your jsonb BLOB column types JSON since we can't support reading it.

Valid point, we probably should exclude jsonb from the column matching, as you can't json_decode a jsonb binary, so if someone somehow not used exactly "BLOB" then it surely used "jsonb" in it.

And supporting jsonb for sqlite by default in core probably a little bit over kill, as when someone actually go that far then they do that for speeding up json manipulation in database, at that point it would be way better to switch to an engine that support indexing values from inside json (like MySQL or PostgreSQL).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants