asarray: device does not propagate from input to output after set_default_device · Issue #150199 · pytorch/pytorch

Open
crusaderky opened this issue Mar 28, 2025 · 9 comments
Assignees
Labels
module: python array api Issues related to the Python Array API module: python frontend For issues relating to PyTorch's Python frontend triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@crusaderky
crusaderky commented Mar 28, 2025

🐛 Describe the bug

The documentation of asarray states:

device (torch.device, optional) – the device of the returned tensor. Default: None, which causes the device of obj to be used. Or, if obj is a Python sequence, the current default device will be used.

The described behaviour is coherent with the specification of the Array API standard.

In practice this works as expected: asarray(x, device=None) propagates the device of x to the output, unless the user has set the default device (even if only to restate the current value explicitly). After that, asarray(x, device=None) disregards the device of x and moves everything to the default device.

In [1]: import torch

In [2]: torch.get_default_device()
Out[2]: device(type='cpu')

In [3]: x = torch.asarray(0, device=torch.device('cuda'))

In [4]: torch.asarray(x).get_device()
Out[4]: 0  # OK

In [5]: torch.set_default_device('cpu')

In [6]: torch.asarray(x).get_device()
Out[6]: -1  # KO

Versions

pytorch 2.6.0 conda-forge linux intel

cc @mruberry @rgommers @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi @albanD

@FFFrog
Collaborator
FFFrog commented Mar 31, 2025

Hey!
From my point of view, this is all expected.

The function get_device() returns the device index of the tensor. Since CPU tensors do not have a device index, it returns -1. You can drop get_device() and inspect the result of torch.asarray(x) directly to test this case.

In [1]: import torch

In [2]: torch.get_default_device()
Out[2]: device(type='cpu')

In [3]: x = torch.asarray(0, device=torch.device('cuda'))

In [4]: torch.asarray(x)
Out[4]: tensor(0, device='cuda:0') # OK

In [5]: torch.set_default_device('cpu')

In [6]: torch.asarray(x)
Out[6]: tensor(0)

In [7]: torch.asarray(x).device
Out[7]: device(type='cpu')  # the index of cpu is -1, and the repr of torch.device has special logic to not display the index in that case
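
For reference, a minimal sketch (illustrative only, assuming a CUDA build) of the difference between Tensor.get_device() and Tensor.device that this explanation relies on:

import torch

# Tensor.get_device() returns an integer index, which is -1 for CPU tensors,
# while Tensor.device returns a torch.device whose .index is None on CPU.
cpu_t = torch.zeros(())
print(cpu_t.get_device())       # -1
print(cpu_t.device)             # device(type='cpu')
print(cpu_t.device.index)       # None

if torch.cuda.is_available():
    cuda_t = torch.zeros((), device='cuda')
    print(cuda_t.get_device())  # 0
    print(cuda_t.device)        # device(type='cuda', index=0)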

@crusaderky
Author
crusaderky commented Mar 31, 2025

I'm sorry, there must be some misunderstanding.
I'm aware that -1 means CPU.
The problem I'm reporting is that calling asarray on a CUDA-backed tensor with device=None moves it to CPU memory, whereas both the torch documentation and the Array API specification dictate that the device of the input tensor must be propagated to the output.
Your own code snippet clearly confirms this problem.

The torch documentation states:

(emphasis mine)

device (torch.device, optional) – the device of the returned tensor. Default: None, which causes the device of obj to be used. Or, if obj is a Python sequence, the current default device will be used.

What is actually happening in the code snippets above:

The default device=None causes the default device to be used instead of the device of obj, which is disregarded.

@FFFrog
Collaborator
FFFrog commented Apr 1, 2025

Sorry, I misunderstood.

I have excerpted the relevant part of the Array API standard below:

If a library has multiple ways of controlling device placement, the most explicit method should have the highest priority. For example:

If device= keyword is specified, that always takes precedence
If device=None, then use the setting from a context manager, if set.
If no context manager was used, then use the global default device/strategy

So I think the behaviour of asarray in PyTorch is expected, but the documentation of asarray does cause some confusion; I will submit a PR to fix it.
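
For illustration, a minimal sketch of the precedence order quoted above (a hypothetical resolve_device helper, not PyTorch's actual implementation):

import torch

def resolve_device(explicit_device=None):
    # Hypothetical helper mirroring the Array API precedence order:
    # 1. an explicit device= argument always wins;
    # 2. otherwise fall back to the ambient default, which in PyTorch covers
    #    both the torch.device context manager and torch.set_default_device,
    #    as reported by torch.get_default_device().
    if explicit_device is not None:
        return torch.device(explicit_device)
    return torch.get_default_device()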

@FFFrog FFFrog self-assigned this Apr 1, 2025
FFFrog added a commit that referenced this issue Apr 1, 2025
As the title stated.

Related Issue:
#150199

ghstack-source-id: 9bdc20e
Pull Request resolved: #150385
@FFFrog FFFrog linked a pull request Apr 1, 2025 that will close this issue
@crusaderky
Author

Hello,

Let me quote the same page, 3 lines above:

Preserve device assignment as much as possible (e.g. output arrays from a function are expected to be on the same device as input arrays to the function).

Also, I need to reiterate that torch respects this guidance, but only if nobody called torch.set_default_device.

@FFFrog
Collaborator
FFFrog commented Apr 1, 2025

Hey, @crusaderky

We can try to simplify this problem and break it into three parts:

  • If no one has called torch.set_default_device beforehand, then the output tensor's device should be the same as the input tensor's when calling torch.asarray with device=None or without explicitly specifying the device, right? (I understand this is where our disagreement lies, but I didn't read it this way before)
  • If someone has called torch.set_default_device to set the default device beforehand, and then calls torch.asarray with device=None or without explicitly specifying the device, then the output tensor's device should be the same as the default device, right?
  • Of course, regardless of whether someone has called torch.set_default_device beforehand, when calling torch.asarray with a specific device like device=cuda, the output tensor should always be on the specified device, right?

Any advice is welcome :D

@crusaderky
Author
  • If no one has called torch.set_default_device beforehand, then the output tensor's device should be the same as the input tensor's when calling torch.asarray with device=None or without explicitly specifying the device, right? (I understand this is where our disagreement lies, but I didn't read it this way before)

Correct.

  • If someone has called torch.set_default_device to set the default device beforehand, and then calls torch.asarray with device=None or without explicitly specifying the device, then the output tensor's device should be the same as the default device, right?

It should be the same as the input tensor's, following the principle of device propagation from input to output:

"output arrays from a function are expected to be on the same device as input arrays to the function"

If the argument is a pure-python array-like (e.g. a list or scalar), it should be the same as the default device.

  • Of course, regardless of whether someone has called torch.set_default_device beforehand, when calling torch.asarray with a specific device like device=cuda, the output tensor should always be on the specified device, right?

Correct.
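
To make the expected semantics concrete, here is a minimal sketch of the three cases (assuming a CUDA build; illustrative, not an existing PyTorch test):

import torch

# Expected semantics per the Array API standard.
x = torch.asarray(0.0, device='cuda')

# 1. No default set: the input tensor's device propagates.
assert torch.asarray(x).device.type == 'cuda'

# 2. Default device set: a tensor input should still propagate its own device ...
torch.set_default_device('cpu')
assert torch.asarray(x).device.type == 'cuda'   # currently fails: the result lands on CPU

# ... while a pure-Python input should follow the default device.
assert torch.asarray([0.0, 1.0]).device.type == 'cpu'

# 3. An explicit device= argument always wins.
assert torch.asarray(x, device='cpu').device.type == 'cpu'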

@FFFrog
Collaborator
FFFrog commented Apr 1, 2025

I see, thanks for your explanation.

But I have two questions about this:

  • Is the above description only for asarray? The similar functions are currently as follows:
    def _device_constructors():
        return {
            # standard ones
            torch.empty,
            torch.empty_permuted,
            torch.empty_strided,
            torch.empty_quantized,
            torch.ones,
            torch.arange,
            torch.bartlett_window,
            torch.blackman_window,
            torch.eye,
            torch.fft.fftfreq,
            torch.fft.rfftfreq,
            torch.full,
            torch.fill,
            torch.hamming_window,
            torch.hann_window,
            torch.kaiser_window,
            torch.linspace,
            torch.logspace,
            torch.nested.nested_tensor,
            # This function doesn't actually take a device argument
            # torch.normal,
            torch.ones,
            torch.rand,
            torch.randn,
            torch.randint,
            torch.randperm,
            torch.range,
            torch.sparse_coo_tensor,
            torch.sparse_compressed_tensor,
            torch.sparse_csr_tensor,
            torch.sparse_csc_tensor,
            torch.sparse_bsr_tensor,
            torch.sparse_bsc_tensor,
            torch.tril_indices,
            torch.triu_indices,
            torch.vander,
            torch.zeros,
            torch.asarray,
            # weird ones
            torch.tensor,
            torch.as_tensor,
            torch.scalar_tensor,
            torch.asarray,
        }
  • In theory, we can easily implement this, but it will introduce a BC break

@albanD @ezyang Any suggestions?

@crusaderky
Author
crusaderky commented Apr 1, 2025
  • Is the above description only for asarray? The similar functions are currently as follows:

All functions should behave the same.
I mentioned only asarray because in array-api-compat we're wrapping around torch; all other data creation functions are already wrapped and behave the way I described. asarray is the only one we haven't wrapped, because until now it respected the Standard's specs.

  • In theory, we can easily implement this, but it will introduce a BC break

If you would rather retain the current behaviour due to backwards compatibility concerns, we can wrap asarray too in array-api-compat and call it a day. Although in the long term I would hope the need for array-api-compat eventually disappears?
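
For context, a minimal sketch of what such a wrapper could look like (hypothetical; not the actual array-api-compat code):

import torch

def asarray_compat(obj, *, dtype=None, device=None, copy=None):
    # Hypothetical wrapper restoring Array API semantics: when device is not
    # given and the input is already a tensor, reuse its device so it is not
    # pulled onto the default device set via torch.set_default_device.
    if device is None and isinstance(obj, torch.Tensor):
        device = obj.device
    return torch.asarray(obj, dtype=dtype, device=device, copy=copy)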

@zou3519 zou3519 added triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module module: python array api Issues related to the Python Array API module: python frontend For issues relating to PyTorch's Python frontend labels Apr 2, 2025
@crusaderky
Author

I went on and tested all the functions you listed.

  • Is the above description only for asarray? The similar functions are currently as follows:

These work fine today, as there is no option to pass an input array:

    torch.empty,
    torch.ones,
    torch.arange,
    torch.eye,
    torch.fft.fftfreq,
    torch.fft.rfftfreq,
    torch.full,
    torch.linspace,
    torch.ones,
    torch.zeros,

These work fine today, as the input's device is correctly propagated to the output:

    torch.ones_like,
    torch.full_like,
    torch.zeros_like,

This DOES NOT propagate the input's device when the user has set the default device:

    torch.asarray,

These are not covered by the Array API standard so I don't have an opinion:

    torch.empty_permuted,
    torch.empty_strided,
    torch.empty_quantized,
    torch.bartlett_window,
    torch.blackman_window,
    torch.fill,
    torch.hamming_window,
    torch.hann_window,
    torch.kaiser_window,
    torch.logspace,
    torch.nested.nested_tensor,
    torch.rand,
    torch.randn,
    torch.randint,
    torch.randperm,
    torch.range,
    torch.sparse_coo_tensor,
    torch.sparse_compressed_tensor,
    torch.sparse_csr_tensor,
    torch.sparse_csc_tensor,
    torch.sparse_bsr_tensor,
    torch.sparse_bsc_tensor,
    torch.tril_indices,
    torch.triu_indices,
    torch.vander,
    torch.tensor,
    torch.as_tensor,
    torch.scalar_tensor,
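
To illustrate the inconsistency in the classification above concretely, a short repro (assuming a CUDA build):

import torch

x = torch.zeros((), device='cuda')
torch.set_default_device('cpu')

print(torch.zeros_like(x).device)  # cuda:0 -- the input's device propagates
print(torch.asarray(x).device)     # cpu    -- the input's device is overridden by the default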

Divigroup-RAP pushed a commit to Divigroup-RAP/PYTORCH that referenced this issue Apr 22, 2025
As the title stated.

Related Issue:
pytorch/pytorch#150199

ghstack-source-id: 1000e3d
Pull Request resolved: pytorch/pytorch#150385