8000 Error on invalid store mode by dstansby · Pull Request #3068 · zarr-developers/zarr-python · GitHub
[go: up one dir, main page]

Skip to content

Error on invalid store mode #3068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 21, 2025
Merged

Conversation

dstansby
Copy link
Contributor

This avoids instances where mode='r' could be passed, by the resulting array is not read-only. Fixes #2949

@github-actions github-actions bot added needs release notes Automatically applied to PRs which haven't added release notes and removed needs release notes Automatically applied to PRs which haven't added release notes labels May 18, 2025
@dstansby dstansby marked this pull request as ready for review May 18, 2025 16:27
@dstansby dstansby added this to the 3.0.8 milestone May 19, 2025
@d-v-b
Copy link
Contributor
d-v-b commented May 19, 2025

this looks good to me, curious to hear your thoughts @jhamman since you are the mode architect

@dstansby dstansby enabled auto-merge (squash) May 21, 2025 11:31
@dstansby dstansby merged commit 481550a into zarr-developers:main May 21, 2025
29 of 30 checks passed
@maxrjones
Copy link
Member

Would it be possible instead to provide a UserWarning and return a read-only copy of the store? I'm concerned about the downstream impacts of this change (e.g., see xarray failures reported in #3105 (comment)).

@dstansby dstansby deleted the invalid-mode branch May 30, 2025 21:01
@dstansby
Copy link
Contributor Author

My thinking here with an error was to avoid situations where one opened an array with mode='r', but writing to it still worked. So although it has the potential to be disruptive, it's avoiding confusion with the API so I think worth it.

At a slightly higher level, I don't really understand why arrays can't have their own read/write permissions, but I missed that design decision.

@maxrjones
Copy link
Member

My thinking here with an error was to avoid situations where one opened an array with mode='r', but writing to it still worked. So although it has the potential to be disruptive, it's avoiding confusion with the API so I think worth it.

I definitely agree with the premise of avoiding situations where one opened an array with mode='r' but writing still works.

I still question this solution because it will certainly be disruptive. The decision to defer read/write permissions to only the store level is an internal design detail that could be changed without any user facing disruptions, so IMO it'd be better to reconsider that decision before releasing a disruptive change.

IIUC since the array has a reference to the store used at creation rather than a copy, this check doesn't really suffice to solving the situation where one opened an array with mode="r" if the store mode were become mutable (xref #3105). While it seems likely that @d-v-b understands that consequence and is arguing against it, a separate array mode would protect against store changes badly influencing array behavior in a more robust way.

@dstansby
Copy link
Contributor Author

We should decide one way or another before doing a 3.0.9 release. I'm still minded to go ahead with this, but expand the error message to say to open the store in read only mode if you want to open an array in read only mode. I'd also like to look at the xarray test failures, to see if there's an easy fix, or if there is a sensible use case that we're breaking with this change.

So before we release 3.0.9 with this, I think the todo is:

@maxrjones does that sound good? Anything I missed? Thanks for the downstream testing, it helps a lot!

@dstansby
Copy link
Contributor Author

Okay, I had a poke around at pydata/xarray#10430, and convinced myself that we should revert this change for now. Long term we really need to sort out the complete mess of permissions and work out what our permission model actually is...

@d-v-b
Copy link
Contributor
d-v-b commented Jun 18, 2025

for posterity can you elaborate a bit more on how the linked xarray PR convinced you to revert?

dstansby added a commit to dstansby/zarr-python that referenced this pull request Jun 18, 2025
@dstansby
Copy link
Contributor Author

In the xarray tests, there are examples where they create a memory store that is not read only, write an array to it, and then try and open that array in read only mode. I think it's reasonable to ask for a read only array from a writeable store, and not be able to write to the array.

@d-v-b
Copy link
Contributor
d-v-b commented Jun 18, 2025

an alternative solution would be to go with ##3138 and modify xarray to use that method as needed. I'd be curious to hear the pros / cons for the revert solution vs the "add functionality" solution.

@dstansby
Copy link
Contributor Author

My sense is #3138 is a bit of a hack anyway, when it should be possible to open an array in read-only mode from a writeable store. Because no-one did a write up (?) of the permission model changes for zarr-python v3 I don't really understand whether this is deliberately not possible, or if it was just an oversight.

@d-v-b
Copy link
Contributor
d-v-b commented Jun 18, 2025

it should be possible to open an array in read-only mode from a writeable store.

This is the key question. In Zarr-python 2 arrays and groups had a permission model. But in Zarr-python 3, we expressly moved the permissions model to the storage level, and so arrays / groups have defer entirely to the store for that. If we stick with that decision, then it should not be possible to open an array in read-only mode from a writeable store, without creating a new read-only version of that store (which #3138 implements).

@maxrjones
Copy link
Member

it should be possible to open an array in read-only mode from a writeable store.

This is the key question. In Zarr-python 2 arrays and groups had a permission model. But in Zarr-python 3, we expressly moved the permissions model to the storage level, and so arrays / groups have defer entirely to the store for that. If we stick with that decision, then it should not be possible to open an array in read-only mode from a writeable store, without creating a new read-only version of that store (which #3138 implements).

Rather than erroring on invalid store mode, we could emit a warning and create a copy of the store in the specified mode using #3138 (if implemented in the store class). This would balance matching the store mode with the API call and avoiding breaking changes. What would be the downsides of that approach?

@dstansby
Copy link
Contributor Author

👍 that sounds like a good approach to me - I'd even consider not warning and just silently returning a copy.

How shall we move forward then? Should we roll that change into #3138?

@dcherian
Copy link
Contributor

I think it's reasonable to ask for a read only array from a writeable store, and not be able to write to the array.

Yes. for example I'd like to read from array a and write to another array b in the same store, and have some protections around not writing to a.

Rather than erroring on invalid store mode, we could emit a warning and create a copy of the store in the specified mode using #3138 (if implemented in the store class).

👍 great idea!

@maxrjones
Copy link
Member

How shall we move forward then? Should we roll that change into #3138?

I can update #3138 to add with_read_only() to the other store classes as the first step. To keep PRs well-constrained, I think it would work well for you to either modify #3145 to use with_read_only() as the second step or someone does that as a different PR after #3138 is finished/reviewed/merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Array.read_only incorrect
4 participants
0