Cube support by kirkbrauer · Pull Request #3867 · npgsql/npgsql

kirkbrauer · 2021-07-01T04:29:45Z

Here's my implementation of the cube type from #698.

The binary send/recv functions are not documented anywhere at the moment, so I had to read the PostgreSQL source files to determine the binary type of the cube type.

https://github.com/postgres/postgres/blob/c30f54ad732ca5c8762bb68bbe0f51de9137dd72/contrib/cube/cube.c

From what I found, there appears to be a 32-bit header that indicates the number of dimensions (i31) in the cube and whether the cube is a point (the first bit) meaning it has the same lower-left and upper-right coordinates. After that there is a linear list of doubles containing all the coordinates in the cube, if the cube is a point they don't bother storing the upper-right coordinates since they are the same as the lower-left ones.

For example, here is a hex dump of the cube (1):

00000000  80 00 00 01 3F F0 00 00 00 00 00 00 -> (1)

And here is a hex dump for a cube with lower-left and upper-right coordinates:

00000000  00 00 00 02 40 14 00 00 00 00 00 00 40 08 00 00
00000010  00 00 00 00 40 00 00 00 00 00 00 00 40 18 00 00
00000020  00 00 00 00 -> (5, 3),(2, 6)

We also need to determine which parts of the API need to be implemented on the client side so users can do basic cube manipulation and comparison directly in their program (or if we even need this).

roji

Thanks, this looks pretty good! See some comments/questions below.

The NpgsqlCube API surface is indeed the thing we need to tie down well. I'm not convinced we need to provide too many client-side operation implementations - the type seems simple enough (basically two coordinate lists) that users can do whatever they want. But am definitely open if you think we should provide some operations.

PS I'm on semi-vacation for the coming week, so expect some delays with responses.

src/Npgsql/Internal/TypeHandlers/CubeHandler.cs

src/Npgsql/TypeMapping/GlobalTypeMapper.cs

src/Npgsql/NpgsqlTypes/NpgsqlDbType.cs

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

kirkbrauer · 2021-07-01T16:54:58Z

@roji What do you think about adding the basic client-side API for manipulating and querying the cube?

I think that the cube type is specific enough that it needs the basic client-side operations so that users can do local math on the cube and manipulate its values. We already have to provide a few of the query methods (cube_is_point, cube_dim, cube_ll_coord, cube_ur_coord) for our implementation to display itself and serialize/deserialize.

My reasoning for this is that manually manipulating the cube's lower-left or upper-right coordinate lists is inherently unsafe and makes it very easy to corrupt the cube. If the number of coordinates in the two lists gets out of sync, we cannot properly serialize/deserialize the cube anymore because such a scenario is not allowed in the cube spec.

Making the LowerLeft and UpperRight IEnumerable<double>s would prevent the user from unsafely changing the dimensions of the cube and require them to use the manipulation API therefore preventing corruption of the cube data. We could also provide utility methods like AddDimension(double c1, double? c2 = null) to help with cube manipulation if we want the cube to be a mutable data type.

roji · 2021-07-03T06:57:35Z

Making the LowerLeft and UpperRight IEnumerables would prevent the user from unsafely changing the dimensions of the cube and require them to use the manipulation API therefore preventing corruption of the cube data

For preventing direct manipulation of the lists, IReadOnlyList can be used instead of IEnumerable.

More generally, I'd avoid going in the direction of doing client-side implementations - we generally don't want to fully reproduce PostgreSQL types, with all their complexity. That can easily lead us to a place where we duplicate the entire PostgreSQL function set in .NET (because why implement one operation client-side but not another?), and have to maintain it, fix bugs, etc.

We can always start with a minimal implementation, where the data is made fully available, and see what users request; if there's a a lot of need for certain client-side operations, we can always add those later. Otherwise, if you strongly feel we should be providing client-side implementations, do you have certain operations in mind (and reasoning for them)? Or just all of them?

kirkbrauer · 2021-07-03T14:45:38Z

That makes sense, I forgot about the specific advantages of the 'IReadOnlyList' interface.

I also agree that we should limit the client-side API to the bare minimum operations to start out and then expand it if necessary. I think the best way that users can manipulate the cube is just by using the constructor and the provided lists to re-build a new cube if they desire.

We should probably also leave all the mathematical operations to the database so we don't have to code or debug those.

The only exception to this could be the AddDimension(double a, double b) API (this does not exist in the database impl), which might make it a bit easier to add a dimension to the cube.

roji · 2021-07-03T14:50:05Z

Great, sounds like we're aligned. There's still an open question in my mind as to whether the type should be mutable or immutable (i.e. if we should expose List or IReadOnlyList). Immutable would allow better enforcing of correctness, but if the lists are long there's a perf hit for having to generate copies etc.

kirkbrauer · 2021-07-03T14:53:25Z

@roji The maximum size of each list is 100 elements, so AddDimension might be warranted, of course we have to determine if this method returns a new instance of the cube or if it just 8000 mutates the existing lists internally.

roji · 2021-07-03T14:58:42Z

Yeah. Since the data size here isn't small and fixed, it may make sense to have a fully mutable API and leave it to the user's responsibility to not mess up (e.g. by having different list lengths).

kirkbrauer · 2021-07-03T18:02:39Z

That makes sense, if the list sizes get out of sync, we can just pad the opposite array with zeros when serializing or displaying the string rep.

roji · 2021-07-12T15:41:26Z

@kirkbrauer just a note to say that this is waiting on you (but absolutely no rush of course!). You can push more commits, and when you'd like another review, ask for one in the reviews panel above.

kirkbrauer · 2021-07-12T15:43:35Z

@roji No worries, I got a bit busy last week, but I'll push a new updated version with the changes we discussed above when I get a chance.

kirkbrauer · 2021-07-12T16:34:34Z

@roji Quick question regarding the public shipped API vs public unshipped API. Which file would be the best to place the declarations for the NpgsqlCube so we can run the tests? I am not super familiar with these files, but I know that they are commonly used on MS open-source projects to denote the public API. I also know that there isn't great support for Rider, but I'll try to autogenerate them in Visual Studio.

roji · 2021-07-13T10:24:45Z

Re the public API files, all changes should be added to the Unshipped file (as you've done) - this is basically a snapshot of all changes done in the current, unreleased version. When we release 6.0, we'll merge those changes into the Shipped file, and the Unshipped file will be empty again.

Unfortunately you're right that the analyzer has issues under Rider - I usually add into Unshipped manually, but VS can indeed be used as well.

roji

@kirkbrauer thanks for the work on this - here are some more comments.

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

src/Npgsql/Internal/TypeHandlers/CubeHandler.cs

roji

@kirkbrauer sorry for the long wait (was away on vacation). This looks almost ready - see remaining comments below, I'm now available to finalize this and merge.

src/Npgsql/Internal/TypeHandlers/CubeHandler.cs

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

src/Npgsql/Internal/TypeHandlers/CubeHandler.cs

src/Npgsql/TypeMapping/GlobalTypeMapper.cs

roji · 2021-08-02T18:16:37Z

@kirkbrauer can you please squash and rebase this on the latest main branch?

kirkbrauer · 2021-08-02T21:28:34Z

@roji Doing it right now.

kirkbrauer · 2021-08-02T21:32:42Z

@roji All squashed and rebased. We should be ready for merge, though we might need to do a little work on the tests first.

roji · 2021-08-03T07:46:33Z

@kirkbrauer agreed - everything LGTM, but test coverage currently seems to be mainly for the type handler (reading/writing from PG). Some coverage for the APIs on the type would be good.

vonzshik

Just as @roji suggested, we should have some tests for the cube API. After that. it should be good enough to merge.

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

kirkbrauer · 2021-08-03T18:14:52Z

@roji I'm working on adding additional tests for the cube type, however I have run into a very obvious problem related to our reference optimizations.

If the cube is a point, we set both the lowerLeft and upperRight lists to the same List instance. This is super efficient, but if a user wants to change one list, they will end up manipulating both lists and the source list since all those values reference the same original list instance.

This is probably fine if we treat the cube as immutable, however, if the user is often creating or manipulating their own cubes, it would produce the (possibly) unexpected behavior especially for users who might not be as familiar with pass-by-reference languages.

In other types such as the NpgsqlPath, we are instead re-instantiating the List objects with the same values within the constructor at the penalty of performance, I think this might be a better approach to maintain parity with the rest of the library.

roji · 2021-08-04T07:55:05Z

You raise a good point... Here are some options I'm seeing:

Duplicate the user-provided list when constructing a point. As you pointed out, this wouldn't be good for perf (note that NpgsqlPath and others were written a long time ago, and shouldn't constrain this type).
Make the lists immutable - also bad for perf.
Have a separate type for points, which only exposes a single list. This seems a bit heavy/over-complicated to me, and I also worry a bit about confusion between this new point type and PostGIS points.
Keep the 2nd list internally null for points, and duplicate it only when accessed through the public property (a sort of "copy-on-write"). The handler would use a separate, internal access that bypasses duplication. This is still non-ideal perf-wise, as duplication happens when reading the property (and not when updating the list), so certain user access patterns would still cause needless duplication.
Simply say that UpperRight is null for points. This would make that property nullable (possibly annoying), but would be simple and consistent.
Do nothing, and explicitly document that when using the point constructor (or factory), a point remains a point, and so any change in one list is reflected in the other.

Since this type is generally about reading/writing data from PostgreSQL, I think perf should be prioritized. I actually think the current situation isn't necessarily that bad - some users may trip over it, but with good documentation things should be OK. Otherwise options 4 or 5 above also seem reasonable - what do you think?

(@vonzshik @NinoFloris any thoughts here?)

kirkbrauer · 2021-08-04T14:58:05Z

I've been thinking about the best way to go about this. I feel that options 2, and 3 are probably off the table because they introduce additional complexity/confusion that would not benefit end users at all.

Option 6 could work because we already have that implemented, but this would also create the issues where a user would be locked into a specific type of cube, which reduces flexibility and reusability of code making use of the NpgsqlCube type.

As you mentioned, 5 is probably the most reasonable balance between perf and DX, I like the simplicity of setting the UpperRight property to null, however if we make the UpperRight nullable, it could create a situation where a point has to be treated as a special case in any business logic a user would create. If we deem this OK, then we can go ahead with this approach.

If we wanted to go with option 4, it would make much more sense to just make the lists read-only and introduce custom APIs to add/remove coordinates, that way we can preform any optimizations we want including only duplicating the list on write if the lower-left and upper-right values are different. I'm not convinced this is the best approach because it requires custom cube manipulation methods, but at least it allows us to provide the most optimized internal impl and keeps the list lengths the same, preventing us from having to throw exceptions in the properties.

A more radical solution might be to just totally abandon the LowerLeft and UpperRight lists concept and make the entire cube an enumerable of named value tuples (double LowerLeft, double UpperRight) and provide APIs to manipulate the internal List<(double, double)>. This would require duplicating the double for each coordinate of a point, however since ValueTuples are just structs on the stack, the performance shouldn't take too much of a hit compared to a list of decimals and we get the advantage of only having to memory manage a single list.

keyneom · 2023-03-24T17:21:29Z

Awesome, yes, I see that makes sense I hadn't been paying attention to the differences between the two repos, this one handles the underlying types and that one interfaces with efcore. Thanks for the explanation! This looks good to me then!

NinoFloris · 2025-10-29T15:04:35Z

@kirkbrauer are you still interested in bringing this over the finish line? Maybe there is something we can do to help?

kirkbrauer · 2025-10-29T15:17:20Z

Hi @NinoFloris it's been quite a while, but I do think it would be valuable to get this merged into core. Even though I no longer work with low-level Npgsql things on a daily basis, having full coverage for the Postgres API is still good for the project. We have this reference implementation now, but it needs to be rebased and thoroughly tested. Also, I don't think I ever contributed the LINQ side of things which would be required to make this actually useful, so perhaps someone who is more of a LINQ expert than myself could tackle that later.

Jimmacle · 2025-10-30T00:07:14Z

I started working on EF Core support for this a while back but it's been idle for a couple years: npgsql/efcore.pg#2808

roji · 2025-10-30T08:16:29Z

@Jimmacle for EF Core support we can absolutely do this, but I think we need to merge the lower-level support here first.

- Arrays of cube - Dimension mismatches - Exceeding max dimensions (100) - Negative values - HashCode equality - Zero values - Maximum dimensions (100)

…ResolverFactory

kirkbrauer · 2025-10-31T13:27:39Z

@roji I attempted to rebase my changes on the latest. I had to make a few modifications to how we register the type, but my understanding is that since cube is an extension type, it should be treated as a plug-in like ltree. I also expanded the test cases to cover more scenarios. Things seem to be working locally when running the tests.

roji

Thanks, this looks pretty good! I'll take a deeper look later, but here are some first comments.

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

roji · 2025-10-31T17:34:20Z

test/Npgsql.Tests/Types/CubeTests.cs

+
+    [Test, TestCaseSource(nameof(CubeValues))]
+    public Task Cube(NpgsqlCube cube, string sqlLiteral)
+        => AssertType(cube, sqlLiteral, "cube", NpgsqlDbType.Cube, isDefaultForWriting: false);


Shouldn't isDefaultForWriting be true, as this is a dedicated .NET type that corresponds exactly to the PG type?

isDefault in general should be true, we only have a single mapping for cube.

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

NinoFloris

This is great! Thanks a lot for updating this. I commented on quite a few things, but that should be all from my end.

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

src/Npgsql/Internal/ResolverFactories/CubeTypeInfoResolverFactory.cs

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs

src/Npgsql/Internal/Converters/Geometric/CubeConverter.cs

…lied API suggestions, fixed CLR type inference (need to discuss further)

kirkbrauer · 2025-11-02T02:24:21Z

src/Npgsql/PostgresMinimalDatabaseInfo.cs

        Add(DataTypeNames.PgLsn, oid: 3220, arrayOid: 3221);
        Add(DataTypeNames.Unknown, oid: 705, arrayOid: 0);
        Add(DataTypeNames.Void, oid: 2278, arrayOid: 0);
+        Add(DataTypeNames.Cube, oid: 100000, arrayOid: 100001);


We need to figure out a solution to this problem, this was required to make the tests pass, but probably creates many other unintended consequences.

We definitely shouldn't have cube in PostgresMinimalDatabaseInfo.

kirkbrauer · 2025-11-02T02:25:16Z

src/Npgsql/Internal/Postgres/DataTypeNames.cs

    public static DataTypeName TimeTz => ValidatedName("pg_catalog.timetz");
    public static DataTypeName Inet => ValidatedName("pg_catalog.inet");
    public static DataTypeName Cidr => ValidatedName("pg_catalog.cidr");
+    public static DataTypeName Cube => ValidatedName("pg_catalog.cube");


The cube extension type can be installed in any schema not just pg_catalog, typically it is installed in public, but the user may choose to install it somewhere else. Therefore, this qualifier is actually invalid.

kirkbrauer · 2025-11-02T03:16:22Z

@roji @NinoFloris I discovered a critical issue with the implementation of the cube type. Since this type is technically an extension type, it won't be installed in the pg_catalog schema, instead it will usually be installed in public or whatever schema the user chooses to install it in.

It appears that there is currently no mechanism to support extension types within the GlobalTypeMapper's CLR type inference unless we have registered them in PostgresMinimalDatabaseInfo which is designed to only handle pg_catalog types.

We could pretend that the cube type is part of the pg_catalog, but I don't think that is a long-term solution.

roji · 2025-11-02T06:03:21Z

@kirkbrauer are you sure about that? Isn't cube exactly like, say, the PostGIS types, which also come from an extension (and can be installed in any schema) - and AFAIK work fine?

If we do have a limitation/problem, I'd treat that as separate and open a new issue, without blocking cube per se; again, whatever problem we have here would also affect PostGIS (and probably some other types as well)...

kirkbrauer · 2025-11-02T12:15:36Z

Thanks @NinoFloris, disabling CLR type inference makes sense. I think I was trying to force it to work the same way as the built-in types which obviously won't work hence the confusion. @roji if this works properly then we should be good.

NinoFloris · 2025-11-02T12:28:38Z

No problem, it's something to improve in Npgsql. All external plugins have to rely on DataTypeName anyway, so this part of the inference isn't relevant there. And as it really only affects the plugin types that have NpgsqlDbType cases it hasn't really been worth the effort.

Note, it really just disables NpgsqlDbType inference. So when we see just an NpgsqlCube on a parameter, and you inspect NpgsqlDbType we fail to tell you it's NpgsqlDbType.Cube.

Stale review from 2021

NinoFloris · 2025-11-03T15:05:36Z

@kirkbrauer thank you for your work and persistence. I'm very glad to see this be part of npgsql for 10.0!

roji · 2025-11-03T16:29:49Z

Likewise, really happy this got in!

kirkbrauer requested review from roji and vonzshik as code owners July 1, 2021 04:29

roji reviewed Jul 1, 2021

View reviewed changes

kirkbrauer requested a review from roji July 12, 2021 19:20

roji reviewed Jul 13, 2021

View reviewed changes

kirkbrauer requested a rev BEA4 iew from roji July 19, 2021 13:23

roji requested changes Aug 2, 2021

View reviewed changes

vonzshik requested changes Aug 2, 2021

View reviewed changes

src/Npgsql/Internal/TypeHandlers/CubeHandler.cs Outdated Show resolved Hide resolved

src/Npgsql/Internal/TypeHandlers/CubeHandler.cs Outdated Show resolved Hide resolved

roji approved these changes Aug 2, 2021

View reviewed changes

src/Npgsql/TypeMapping/GlobalTypeMapper.cs Outdated Show resolved Hide resolved

kirkbrauer force-pushed the main branch from 1981dc7 to e7e2916 Compare August 2, 2021 21:31

kirkbrauer requested a review from vonzshik August 3, 2021 14:10

vonzshik previously requested changes Aug 3, 2021

View reviewed changes

src/Npgsql/NpgsqlTypes/NpgsqlCube.cs Outdated Show resolved Hide resolved

kirkbrauer requested review from roji and vonzshik April 1, 2023 03:36

Jimmacle mentioned this pull request Jul 9, 2023

Cube support npgsql/efcore.pg#2808

Closed

kirkbrauer added 2 commits October 30, 2025 23:05

Add NpgsqlCube type and converter

29758db

Add additional NpgsqlCube tests for:

2e8b5e8

- Arrays of cube - Dimension mismatches - Exceeding max dimensions (100) - Negative values - HashCode equality - Zero values - Maximum dimensions (100)

kirkbrauer force-pushed the main branch from e437449 to 2e8b5e8 Compare October 31, 2025 03:28

Properly treat NpgsqlCube type as a plug-in type and add CubeTypeInfo…

130817c

…ResolverFactory

roji reviewed Oct 31, 2025

View reviewed changes

NinoFloris requested changes Nov 1, 2025

View reviewed changes

Performance optimizations to reduce allocations on cube creation, app…

6d74882

…lied API suggestions, fixed CLR type inference (need to discuss further)

kirkbrauer commented Nov 2, 2025

View reviewed changes

Fix tests and undo minimal database info inclusion

4837c24

NinoFloris approved these changes Nov 2, 2025

View reviewed changes

NinoFloris merged commit 9163444 into npgsql:main Nov 3, 2025
13 checks passed

roji linked an issue Nov 3, 2025 that may be closed by this pull request

Support for the cube type #698

Closed

kirkbrauer mentioned this pull request Nov 4, 2025

Cube support (revamped) npgsql/efcore.pg#3651

Merged

Conversation

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!