Implement `to_numpy` method to speed up matplotlib with PyTorch arrays #101795

patel-zeel · 2023-05-18T09:59:10Z

🚀 The feature, motivation and pitch

Hi,

As discussed in this issue and the corresponding PR on matplotlib, plotting PyTorch arrays directly with matplotlib can be very slow. This is because matplotlib has no easy way to convert PyTorch arrays to NumPy arrays before plotting; instead, it expects other libraries to provide a to_numpy() method. I think a to_numpy() implementation in PyTorch would help users who pass PyTorch arrays directly to matplotlib without knowing how slow that can be.

Alternatives

As discussed in the matplotlib PR, we considered adding a specific check for inputs of type torch.Tensor and converting them to NumPy with the .numpy() method, but a string-based type check does not seem like a good idea.
We also tried using the __array__ method to convert both JAX and PyTorch arrays to NumPy, but it does not work well with some other objects that have an __array__ method.
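To make the duck-typing idea concrete, here is a minimal sketch of how a plotting library could look for a to_numpy() method instead of checking for a specific type. The names `FakeArray` and `unpack_to_numpy` are hypothetical and used only for illustration; this is not matplotlib's actual implementation.

```python
import numpy as np


class FakeArray:
    """Hypothetical stand-in for a third-party array type (not a real library class)."""

    def __init__(self, data):
        self._data = list(data)

    def to_numpy(self):
        # Cheap, explicit conversion that the plotting side can rely on.
        return np.asarray(self._data)


def unpack_to_numpy(obj):
    """Hypothetical helper: return a NumPy array if obj offers to_numpy(), else obj unchanged."""
    if isinstance(obj, np.ndarray):
        return obj
    to_numpy = getattr(obj, "to_numpy", None)
    if callable(to_numpy):
        converted = to_numpy()
        # Only trust the result if it really is an ndarray.
        if isinstance(converted, np.ndarray):
            return converted
    return obj


converted = unpack_to_numpy(FakeArray([1.0, 2.0, 3.0]))  # goes through to_numpy()
untouched = unpack_to_numpy([1.0, 2.0, 3.0])             # plain list is passed through
```

A check like this avoids importing torch or comparing type names as strings, which is why the presence of to_numpy() matters to the plotting side.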

Additional context

Here is the code to reproduce the plotting delay issue:

from time import time
import numpy as np
import matplotlib.pyplot as plt

import torch

torch_array = torch.randn(1000, 150)

def plot_hist(array):
    init = time()
    plt.figure()
    plt.hist(array)
    print(f"Time to plot: {time() - init:.2f} s")
    plt.show()

plot_hist(torch_array.ravel())  # Takes around 2 seconds
plot_hist(np.array(torch_array.ravel()))  # Takes around 0.04 seconds

I am open to a diverse set of suggestions to fix this issue.

cc @mruberry @rgommers


malfet · 2023-05-18T13:56:21Z

Wouldn't adding torch.Tensor.to_numpy = torch.Tensor.numpy after import torch address it?
But adding this alias permanently sounds reasonable to me, as it would make PyTorch consistent with, say, pandas, unless @mruberry knows a reason why it should be called numpy rather than to_numpy.
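A minimal sketch of the user-side alias suggested above, assuming it is applied in user code right after importing torch. The `hasattr` guard is an addition for illustration, so that the patch does not overwrite anything if such an attribute already exists.

```python
import torch

# Assumption: this patches the class at runtime in user code; it is not part of PyTorch.
if not hasattr(torch.Tensor, "to_numpy"):
    torch.Tensor.to_numpy = torch.Tensor.numpy

t = torch.arange(6, dtype=torch.float32).reshape(2, 3)
arr = t.to_numpy()  # behaves like t.numpy() for this CPU tensor
```

Because the alias simply forwards to .numpy(), it inherits the same restrictions as that method.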

oscargus · 2023-05-18T15:04:23Z

As discussed in matplotlib/matplotlib#22645, at least pandas, xarray, Polars, and pyarrow support to_numpy, so I would just encourage you to consider adding an alias.

patel-zeel · 2023-05-19T02:43:30Z

Right @oscargus, I'd say that adding to_numpy is unlikely to break anything in PyTorch, but finding a way to support all possible array libraries with different APIs is difficult for matplotlib (at least it seems that way to me). This could at least serve as a hotfix until the array libraries agree on a common approach.

lezcano · 2023-05-20T15:53:51Z

cc @rgommers

rgommers · 2023-05-20T16:23:59Z

This is related to gh-36560, which is also "ensure conversion to numpy before plotting". Note that .numpy() or .to_numpy() as an alias is not enough: you'll get exceptions for non-CPU tensors as well as for tensors on CPU that are in an autograd graph.
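The autograd case can be illustrated with a short sketch. The helper name `tensor_to_numpy` is hypothetical; it chains detach(), cpu(), and numpy(), which is one common way to get a NumPy array from a tensor regardless of device or gradient tracking.

```python
import torch


def tensor_to_numpy(t):
    """Hypothetical helper: detach from autograd, move to CPU, then convert."""
    return t.detach().cpu().numpy()


x = torch.ones(3, requires_grad=True)

try:
    x.numpy()  # a plain .numpy() call on a tensor that requires grad
    failed = False
except RuntimeError:
    failed = True  # PyTorch refuses the conversion for this tensor

arr = tensor_to_numpy(x)  # works: the detached tensor no longer tracks gradients
```

A bare alias to .numpy() would raise in the same way, so any plotting-side conversion would still need the detach and device handling shown here.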

There are already several comments on matplotlib/matplotlib#25887 which point out that adding yet another way of converting to a numpy array is not the best idea. So I'd recommend not doing this. There are already too many ways of doing this. I'll comment on the Matplotlib PR, because that has much more relevant context.

patel-zeel · 2025-01-07T19:51:48Z

The original issue is resolved by matplotlib/matplotlib#25887 and thus this issue also stands resolved. Closing it.

malfet added the triage review and enhancement labels on May 18, 2023

patel-zeel closed this as completed Jan 7, 2025
