When I use TorchSharp and perform a gradient calculation before the CUDA binaries have been loaded, TorchSharp no longer recognizes that CUDA is available.
For example:
var lin = torch.nn.Linear(10, 1, false);
lin.forward(torch.rand(10)).backward();
Console.WriteLine(torch.cuda.is_available()); // False
On the other hand, if I call a CUDA function before the backward() call, it works:
Console.WriteLine(torch.cuda.is_available()); // True
var lin = torch.nn.Linear(10, 1, false);
lin.forward(torch.rand(10)).backward();
Console.WriteLine(torch.cuda.is_available()); // True
I've traced this issue back to LibTorch, over here. The exact quote is:
// NB: The once_flag here implies that if you try to call any CUDA
// functionality before libATen_cuda.so is loaded, CUDA is permanently
// disabled for that copy of ATen.
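To make that implication concrete, here is a minimal C# analogy of the once-and-cache pattern the comment describes (not the actual ATen code; ProbeCudaBackend is a hypothetical stand-in for the native check):

using System;

static class OnceFlagAnalogy
{
    // The availability answer is computed exactly once, on first use, and then
    // cached for the lifetime of the process, just like ATen's once_flag.
    private static readonly Lazy<bool> _cudaAvailable =
        new Lazy<bool>(ProbeCudaBackend);

    // Hypothetical stand-in for the native probe that only succeeds once the
    // CUDA binaries have actually been loaded.
    private static bool ProbeCudaBackend() =>
        Environment.GetEnvironmentVariable("FAKE_CUDA_LOADED") == "1";

    public static bool IsAvailable => _cudaAvailable.Value;

    static void Main()
    {
        Console.WriteLine(IsAvailable); // False: probed before the "binaries" were loaded, and cached
        Environment.SetEnvironmentVariable("FAKE_CUDA_LOADED", "1");
        Console.WriteLine(IsAvailable); // Still False: the cached answer never changes
    }
}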
@NiklasGustafsson How do you think this should be handled?
In PyTorch, I'm pretty sure the CUDA binaries are loaded on the first import torch call. This would be easy to add to TorchSharp, but is that the desired behavior?
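To illustrate what I mean, here is a minimal sketch of forcing the load from user code today, assuming torch.cuda.is_available() is enough to trigger loading the CUDA binaries (the ModuleInitializer hook is just one possible place to put it):

using System.Runtime.CompilerServices;
using TorchSharp;

internal static class CudaEagerLoad
{
    // Runs once when the assembly's module is initialized, before any user code.
    // Touching a CUDA query API here is assumed to load the native CUDA binaries
    // early enough that ATen's once_flag sees them, mimicking what PyTorch gets
    // from "import torch".
    [ModuleInitializer]
    internal static void Init()
    {
        _ = torch.cuda.is_available(); // cheap query; harmless if no CUDA device exists
    }
}

A library-side fix would presumably just do something equivalent during TorchSharp's own static initialization.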