correct and reasonable new example to replace the old one · Issue #25409 · scikit-learn/scikit-learn · GitHub

correct and reasonable new example to replace the old one #25409


Closed
Song-Pingfan opened this issue Jan 16, 2023 · 1 comment · Fixed by #26002

Comments

@Song-Pingfan

>>> X = load_sample_images().images[1]

The problem with the old example is that it ignored the "n_samples" dimension expected by sklearn.feature_extraction.image.PatchExtractor, and therefore produced a strange and confusing result with no channel dimension.

The new example corrects this and gives a clear demonstration of how to use PatchExtractor sensibly. It reads as follows:

>>> import numpy as np
>>> from sklearn.datasets import load_sample_images
>>> from sklearn.feature_extraction import image
>>> # Use the array data from the second image in this dataset:
>>> X = load_sample_images().images[1]
>>> X = X[np.newaxis,:]  # give X the shape (n_samples, image_height, image_width, n_channels). Very important!
>>> print('Image shape: {}'.format(X.shape))
Image shape: (1, 427, 640, 3)
>>> pe = image.PatchExtractor(patch_size=(10, 10))
>>> pe_fit = pe.fit(X)  # fit does nothing and returns the estimator unchanged
>>> pe_trans = pe.transform(X)
>>> print('Patches shape: {}'.format(pe_trans.shape))
Patches shape: (263758, 10, 10, 3)

Note that the line X = X[np.newaxis,:] adds a new leading axis as the "n_samples" dimension, giving X the correct shape (n_samples, image_height, image_width, n_channels). This is the key difference from the old example. The change matters because, without it, "image_height" is treated as "n_samples" and the remaining dimensions are misinterpreted in the same way. That is why the result in the old example looks strange and confusing: it has only 3 dimensions and no channel dimension.
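The shape bookkeeping described above can be checked by hand. The following is a minimal sketch, assuming that every overlapping patch is extracted (no max_patches limit) and that the first three axes are read as (n_samples, image_height, image_width). The helper count_patches is hypothetical and not part of scikit-learn.

```python
def count_patches(shape, patch_size):
    """Count dense overlapping patches for an array whose first three
    axes are read as (n_samples, height, width).

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    n_samples, height, width = shape[:3]
    patch_h, patch_w = patch_size
    return n_samples * (height - patch_h + 1) * (width - patch_w + 1)


# With the leading axis: one 427x640 image, 10x10 patches.
with_axis = count_patches((1, 427, 640, 3), (10, 10))
print(with_axis)  # 263758

# Without the leading axis the same array is read as 427 "images" of
# size 640x3. A 2x2 patch is used here only because a 10x10 patch
# would not fit inside a width of 3.
without_axis = count_patches((427, 640, 3), (2, 2))
print(without_axis)  # 545706
```

The first count matches the 263758 patches reported in the example above. The second shows how the channel axis ends up being consumed as the image width when the sample axis is missing.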

It would be good to add the following reconstruction script to verify that the original image can be recovered from the extracted patches with the reconstruct_from_patches_2d function. This makes the example more complete, like the existing example for the extract_patches_2d function. Again, it is important to keep the "n_samples" dimension in mind.

>>> X_reconstructed = image.reconstruct_from_patches_2d(pe_trans, X.shape[1:])
>>> print(X_reconstructed.shape)
(427, 640, 3)
>>> np.testing.assert_array_equal(X[0], X_reconstructed)
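The same round trip can be tried on a tiny array, which avoids loading the sample photos. This is a sketch under stated assumptions: the random 6x8 RGB array is a made-up stand-in for the real image, and transform is called without a prior fit.

```python
import numpy as np
from sklearn.feature_extraction import image

# Small synthetic RGB "image" (assumption: stands in for the sample photo).
rng = np.random.RandomState(0)
img = rng.randint(0, 256, size=(6, 8, 3)).astype(np.float64)

# Add the leading n_samples axis: (1, 6, 8, 3).
X = img[np.newaxis, ...]
patches = image.PatchExtractor(patch_size=(2, 2)).transform(X)
print(patches.shape)  # (35, 2, 2, 3)

# Rebuild the single image from its overlapping patches.
restored = image.reconstruct_from_patches_2d(patches, X.shape[1:])
np.testing.assert_allclose(restored, img)

# Without the extra axis, the array is read as 6 images of size 8x3,
# so the patches come back with no channel dimension.
flat_patches = image.PatchExtractor(patch_size=(2, 2)).transform(img)
print(flat_patches.shape)  # (84, 2, 2)
```

The last call reproduces the symptom described in this issue on a small scale: a 3-dimensional result where a 4-dimensional one was expected.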
@github-actions bot added the "Needs Triage" (Issue requires triage) label Jan 16, 2023
@thomasjpfan
Member

I agree with the change to add in n_samples. I'd recommend something like this:

from sklearn.datasets import load_sample_images
from sklearn.feature_extraction import image
# Use the array data from the second image in this dataset:
X = load_sample_images()["images"][1]
X = X[None, ...]
print(f"Image shape: {X.shape}")

pe = image.PatchExtractor(patch_size=(10, 10))
pe_trans = pe.transform(X)
print(f"Patches shape: {pe_trans.shape}")

X_reconstructed = image.reconstruct_from_patches_2d(pe_trans, X.shape[1:])
print(f"Reconstructed shape: {X_reconstructed.shape}")

There is no need to call fit before transform. Our docstring tests will assert that the reconstructed shape is correct.

@Song-Pingfan Are you interested in opening a pull request to update the docstring?
