Support `torch.linalg.trace` by asi1024 · Pull Request #62714 · pytorch/pytorch · GitHub
Support torch.linalg.trace #62714

Closed · wants to merge 23 commits

Changes from 1 commit
Add torch.linalg.trace
asi1024 committed Sep 29, 2021
commit 0b91b4b4183525dea9dec7a60636df7231c9b7c9
20 changes: 20 additions & 0 deletions aten/src/ATen/native/ReduceOps.cpp
@@ -1079,6 +1079,19 @@ Tensor trace_cpu(const Tensor& self) {
  return result;
}

// TODO: this routine should be implemented without diag and sum, for performance reasons;
// see https://github.com/pytorch/pytorch/pull/47305,
Tensor linalg_trace(const Tensor& self, int64_t offset) {
  return at::diagonal(self, offset, -2, -1).sum(-1);
}

Collaborator (on the TODO comment): File a new issue to improve the performance of linalg.trace and point to that issue in this comment.

Collaborator: Actually (see below), I recommend we address this in this PR and not file a follow-up issue.

Collaborator: Actually, I'd say this would take quite some effort to do. I don't think that this function is going to be super popular, so I'd propose we stick with the current implementation.

Collaborator (on the linalg_trace signature): This is also missing a TORCH_CHECK(self.dim() >= 2, ...).

Collaborator: I think it'd also be nice to check that offset is in the range [-dim, dim) and throw a nice error message otherwise.
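To make the two requested checks concrete, here is a minimal Python sketch of the validation the reviewers describe (linalg_trace_reference is a hypothetical name, and the exact accepted offset range is an assumption; the review only asks for a nice error message):

    import torch

    def linalg_trace_reference(t: torch.Tensor, offset: int = 0) -> torch.Tensor:
        # Hypothetical sketch of the checks requested in review, not the PR's code.
        if t.dim() < 2:
            raise RuntimeError(
                f"linalg.trace: expected a tensor with at least 2 dimensions, got {t.dim()}"
            )
        # Reject offsets that select an empty diagonal of the last two dims
        # (torch.diagonal itself tolerates them and returns an empty tensor).
        if not -t.size(-2) < offset < t.size(-1):
            raise RuntimeError(
                f"linalg.trace: offset {offset} is out of range for a "
                f"{t.size(-2)}x{t.size(-1)} matrix"
            )
        # The same composite the PR uses: diagonal of the last two dims, summed.
        return torch.diagonal(t, offset=offset, dim1=-2, dim2=-1).sum(-1)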

Tensor linalg_trace_backward(const Tensor& grad, IntArrayRef input_sizes, int64_t offset) {
  auto grad_input = at::zeros(input_sizes, grad.options());
  auto diag = grad_input.diagonal(offset, -2, -1);
  diag.copy_(grad.unsqueeze(-1));
  return grad_input;
}
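A quick check with released ops shows what this backward computes, and why the later CompositeImplicitAutograd suggestion would make it unnecessary: autograd can already differentiate through diagonal and sum, and the gradient of a trace is just ones scattered along the chosen diagonal.

    import torch

    t = torch.randn(3, 3, requires_grad=True)
    # The same composite expression linalg_trace lowers to in this PR.
    torch.diagonal(t, offset=0, dim1=-2, dim2=-1).sum(-1).backward()
    print(t.grad)
    # tensor([[1., 0., 0.],
    #         [0., 1., 0.],
    #         [0., 0., 1.]])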

void impl_func_prod(
    const Tensor& self,
    IntArrayRef dims,
@@ -1093,6 +1106,13 @@
  }
}

Tensor prod(const Tensor& self, int64_t dim, bool keepdim, c10::optional<ScalarType> opt_dtype) {
  ScalarType dtype = get_dtype_from_self(self, opt_dtype, true);
  Tensor result = create_reduction_result(self, dim, keepdim, dtype);
  native::prod_out_impl(result, self, dim, keepdim, dtype);
  return result;
}

Collaborator (on prod): Is this change needed in this PR? The function prod(Tensor self, int dim, bool keepdim=False, *, ScalarType? dtype=None) -> Tensor should be autogenerated with

    structured_delegate: prod.int_out

TORCH_IMPL_FUNC(prod_out)
(const Tensor& self,
int64_t dim,
12 changes: 12 additions & 0 deletions aten/src/ATen/native/native_functions.yaml
@@ -6230,6 +6230,18 @@
    CPU: trace_cpu
    CUDA: trace_cuda

- func: linalg_trace(Tensor self, int offset=0) -> Tensor
  python_module: linalg
  variants: method, function
  dispatch:
    CPU, CUDA: linalg_trace
    CompositeExplicitAutograd: linalg_trace

Collaborator (on `CPU, CUDA: linalg_trace`): This line is overwritten by CompositeExplicitAutograd: linalg_trace. The code in the linalg_trace function is independent of the device, so the CPU, CUDA specialization is not needed here; CompositeExplicitAutograd is the correct choice of dispatch key.

Suggested change: delete the line `CPU, CUDA: linalg_trace`.

Collaborator: On second thought, using CompositeImplicitAutograd would be better; then the backward function is not needed.

- func: linalg_trace_backward(Tensor grad, int[] sizes, int offset) -> Tensor
  variants: function
  device_check: NoCheck
  device_guard: False

Collaborator (on linalg_trace_backward): Could you please remove this entry from native_functions.yaml? Most of the backward functions in PyTorch are placed in torch/csrc/autograd/FunctionsManual.cpp and torch/csrc/autograd/FunctionsManual.h, so let's move linalg_trace_backward there from ReduceOps.cpp.

- func: trace_backward(Tensor grad, int[] sizes) -> Tensor
  variants: function
  device_check: NoCheck
3 changes: 3 additions & 0 deletions tools/autograd/derivatives.yaml
@@ -1412,6 +1412,9 @@
  self: trace_backward(grad, self.sizes())
  result: auto_linear

- name: linalg_trace(Tensor self, int offset=0) -> Tensor
  self: linalg_trace_backward(grad, self.sizes(), offset)
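One way to sanity-check this derivative entry, using only released ops (the diagonal-plus-sum composite stands in for the proposed linalg_trace):

    import torch

    t = torch.randn(3, 4, dtype=torch.double, requires_grad=True)
    # Compare the analytic gradient against finite differences, with a nonzero offset.
    fn = lambda t: torch.diagonal(t, offset=1, dim1=-2, dim2=-1).sum(-1)
    assert torch.autograd.gradcheck(fn, (t,))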

- name: transpose.int(Tensor(a) self, int dim0, int dim1) -> Tensor(a)
  self: grad.transpose(dim0, dim1)
  result: auto_linear
16 changes: 16 additions & 0 deletions torch/linalg/__init__.py
@@ -2042,3 +2042,19 @@
>>> torch.dist(Q.transpose(-2, -1) @ Q, torch.eye(4))
tensor(6.2158e-07)
""")

trace = _add_docstr(_linalg.linalg_trace, r"""
trace(input, offset=0) -> Tensor

Returns the sum of the elements of the diagonal.

Example::

    >>> x = torch.arange(1., 10.).view(3, 3)
    >>> x
    tensor([[ 1., 2., 3.],
            [ 4., 5., 6.],
            [ 7., 8., 9.]])
    >>> torch.linalg.trace(x)
    tensor(15.)
""")

Collaborator (on the signature line): input, *, offset=0

Collaborator (on "Returns the sum of the elements of the diagonal."): What diagonal? Given that we have the parameter offset, this should probably read:

    Returns the sum of the elements of a diagonal.

followed by an explanation of how the offset parameter chooses a diagonal.

Collaborator (on the example): When working with a single tensor, prefer t for the name; when working with multiple tensors, prefer a, b, c... when they don't already have natural names.

Collaborator: This is cool! Add an example showing how to use offset, too.
1 change: 1 addition & 0 deletions torch/overrides.py
@@ -958,6 +958,7 @@ def get_testing_overrides() -> Dict[Callable, Callable]:
    torch.tile: lambda input, dims: -1,
    torch.topk: lambda input, k, dim=-1, descending=False, out=None: -1,
    torch.trace: lambda input: -1,
    torch.linalg.trace: lambda input, offset=0: -1,
    torch.transpose: lambda input, dim0, dim1: -1,
    torch.trapz: lambda y, x=None, dim=-1: -1,
    torch.trapezoid: lambda y, x=None, dim=-1: -1,