Implementing NumPy-like function torch.heaviside() #42523
Hi, @mruberry! This PR implements a new NumPy-like function, torch.heaviside().
aten/src/ATen/native/UnaryOps.cpp
Outdated
TORCH_CHECK(result.is_floating_point(),
    "heaviside is only implemented for floating point types output.");
result = result.to(kFloat);
Interestingly, in NumPy we always get a float64 result no matter what the dtype of the input parameters is. So I imitated NumPy in my function and kept its results in float32, which is the default floating-point dtype in PyTorch. Does this make sense?
>>> import numpy as np
>>> input_np=np.array([-1,0,1], dtype=np.int)
>>> input_np
array([-1, 0, 1])
>>> input_np.dtype
dtype('int32')
>>> val_np=np.array([2,3,4], dtype=int)
>>> val_np
array([2, 3, 4])
>>> val_np.dtype
dtype('int32')
>>> np.heaviside(input_np, val_np)
array([0., 3., 1.])
>>> np.heaviside(input_np, val_np).dtype
dtype('float64')
>>> import torch
>>> input=torch.tensor([-1,0,1], dtype=torch.int)
>>> input
tensor([-1, 0, 1], dtype=torch.int32)
>>> val=torch.tensor([2,3,4], dtype=torch.int)
>>> val
tensor([2, 3, 4], dtype=torch.int32)
>>> torch.heaviside(input, val)
tensor([0., 3., 1.])
>>> torch.heaviside(input, val).dtype
torch.float32
Thank you for being so thorough in your review, @Kiyosora! I find it hard to believe this behavior is intended in NumPy, however, and looking at NumPy's heaviside PR it doesn't appear to be. See this final comment:
cc @rgommers as an fyi.
I would implement standard PyTorch binary op type promotion (which you'll get automatically if you use the binary_op function to construct your TensorIterator).
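A minimal sketch of that structure (assuming the TensorIterator API of the time; heaviside_stub follows the usual ATen stub pattern and is not necessarily this PR's exact code):

// binary_op configures broadcasting, type promotion, and output allocation;
// the promoted dtype then matches torch.result_type(self, values).
Tensor& heaviside_out(Tensor& result, const Tensor& self, const Tensor& values) {
  auto iter = TensorIterator::binary_op(result, self, values);
  heaviside_stub(iter.device_type(), iter);
  return result;
}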
aten/src/ATen/native/UnaryOps.cpp
Outdated
}

Tensor heaviside(const Tensor& self, Scalar val) {
  Tensor result = at::empty({0}, self.options().dtype(kFloat));
When you support type promotion you'll need to construct this as Tensor result; since you won't know its dtype at this point.
Addressed!
aten/src/ATen/native/UnaryOps.cpp
Outdated
  return result;
}

Tensor& heaviside_scalar_out(Tensor& result, const Tensor& self, Scalar val) {
Can you wrap the scalar value in a tensor and just call the Tensor, Tensor, Tensor variant of this function?
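A hypothetical forwarding overload along those lines (a sketch, not the PR's actual code):

// Wrap the Scalar in a 0-dim tensor and reuse the tensor-tensor variant;
// TensorIterator broadcasts the 0-dim tensor across self.
Tensor& heaviside_scalar_out(Tensor& result, const Tensor& self, Scalar val) {
  return at::heaviside_out(result, self, at::scalar_tensor(val, self.options()));
}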
Addressed!
aten/src/ATen/native/UnaryOps.cpp
Outdated
@@ -298,6 +298,59 @@ Tensor& sigmoid_out(Tensor& result, const Tensor& self) { return unary_op_impl_o
Tensor sigmoid(const Tensor& self) { return unary_op_impl(self, at::sigmoid_out); }
Tensor& sigmoid_(Tensor& self) { return unary_op_impl_(self, at::sigmoid_out); }

Tensor& heaviside_tensor_out(Tensor& result, const Tensor& self, const Tensor& val) {
Heaviside is a binary, not unary, op, so it goes in BinaryOps.cpp.
Addressed, moved to BinaryOps.cpp.
AT_DISPATCH_FLOATING_TYPES(iter.dtype(), "heaviside_tensor_cpu", [&]() {
  cpu_kernel(iter, [=](scalar_t self, scalar_t val) -> scalar_t {
    if (self > 0) {
      return 1.0;
This will need an update to handle scalar types properly. Same with its dispatch.
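One possible shape for the updated dispatch (a sketch; the exact macro and the set of covered types are assumptions here, not the PR's final code):

AT_DISPATCH_ALL_TYPES_AND2(kBool, kHalf, iter.dtype(), "heaviside_cpu", [&]() {
  cpu_kernel(iter, [](scalar_t self, scalar_t val) -> scalar_t {
    // Compute in scalar_t so integral and bool outputs stay exact.
    if (self > 0) {
      return scalar_t(1);
    } else if (self == 0) {
      return val;
    }
    return scalar_t(0);
  });
});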
Addressed!
  });
}

static void heaviside_scalar_kernel(TensorIterator& iter, Scalar val) {
I don't think you need a separate kernel for a scalar val because TensorIterator will handle the broadcasting for you.
Yes, we no longer need a separate kernel for a scalar val; deleted.
  });
}

void heaviside_scalar_kernel_cuda(TensorIterator& iter, Scalar val) {
Probably don't need this kernel.
Same here; we no longer need a separate kernel for a scalar val, deleted.
@@ -173,6 +173,35 @@ void clamp_max_kernel_cuda(TensorIterator& iter, Scalar max_value) {
  });
}

void heaviside_tensor_kernel_cuda(TensorIterator& iter) {
  AT_DISPATCH_FLOATING_TYPES(iter.dtype(), "heaviside_tensor_cuda", [&]() {
Dispatch + scalar handling updates here, too.
Addressed!
void heaviside_tensor_kernel_cuda(TensorIterator& iter) {
  AT_DISPATCH_FLOATING_TYPES(iter.dtype(), "heaviside_tensor_cuda", [&]() {
    gpu_kernel(iter, [=]GPU_LAMBDA(scalar_t self, scalar_t val) -> float {
      if (self > 0) {
Maybe an inner loop like self == 0 ? val : (self > 0) would be less code and possibly faster, since comparing to zero is generally faster than other comparisons.
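As a GPU lambda, the suggestion would look roughly like this (sketch):

gpu_kernel(iter, []GPU_LAMBDA(scalar_t self, scalar_t val) -> scalar_t {
  // val where self == 0; otherwise the 0/1 result of the comparison.
  return self == 0 ? val : static_cast<scalar_t>(self > 0);
});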
Addressed, thanks for the advice!
- func: heaviside.Tensor(Tensor self, Tensor val) -> Tensor
  use_c10_dispatcher: full
  variants: function, method
  device_guard: False
I wouldn't bother trying to manually manipulate device guards here or in the following function.
Addressed!
test/test_torch.py
Outdated
input_dtype = dtypes[0]
val_dtype = dtypes[1]

val = torch.empty(5, 5, device=device)
device=device, dtype=dtype
Addressed! It has been rewritten.
@unittest.skipIf(not TEST_NUMPY, "Numpy not found")
@dtypes(*list(product(torch.testing.get_all_dtypes(include_complex=False),
                      torch.testing.get_all_dtypes(include_complex=False))))
def test_heaviside(self, device, dtypes):
This is a pretty cool test, but I think you can simplify it by generating an array with NumPy's integers and then a second NumPy array with integers or rand depending on the dtype of the rhs argument. Then you can run NumPy's heaviside on these args to get your expected values.
For your actual values you can torch.from_numpy(lhs/rhs).to(device=device, dtype=dtype) the NumPy arrays and run your torch.heaviside. Then you can compare the results with self.assertEqual(actual, expected, exact_dtype=False), and verify the torch.heaviside function produced the correct dtype by checking the dtype of the actual tensor against the result of torch.result_type.
This should let you write a thorough but compact test, I think. What are your thoughts?
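A rough sketch of that suggested structure (an editorial illustration; the sizes, value ranges, and variable names are assumptions, not the PR's final test):

import numpy as np
import torch

def test_heaviside(self, device, dtypes):
    input_dtype, values_dtype = dtypes

    # lhs: NumPy integers; rhs: integers or rand, matching the rhs dtype kind.
    lhs = np.random.randint(-10, 10, size=(5, 5))
    if values_dtype.is_floating_point:
        rhs = np.random.rand(5, 5)
    else:
        rhs = np.random.randint(-10, 10, size=(5, 5))
    expected = np.heaviside(lhs, rhs)

    # Actual values: the same data as torch tensors on the test device.
    a = torch.from_numpy(lhs).to(device=device, dtype=input_dtype)
    b = torch.from_numpy(rhs).to(device=device, dtype=values_dtype)
    actual = torch.heaviside(a, b)

    # Compare against NumPy (which always promotes to float64), then check
    # that the result dtype follows torch's type promotion rules.
    self.assertEqual(actual, torch.from_numpy(expected), exact_dtype=False)
    self.assertEqual(actual.dtype, torch.result_type(a, b))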
Wow, LGTM. I will improve my test case, thanks for your advice! 👍
Tests look good. This first test will need a tweak depending on how you decide to handle type promotion. It's probably easiest to require the types to be the same; then you could just check for that case and that a runtime error is thrown.
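In that case the mixed-dtype branch of the test could reduce to something like this (hypothetical names, reusing the tensors from the sketch above):

if input_dtype != values_dtype:
    # Mixed dtypes are rejected rather than promoted.
    with self.assertRaises(RuntimeError):
        torch.heaviside(a, b)
    return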
test/test_torch.py
Outdated
@unittest.skipIf(not TEST_NUMPY, "Numpy not found")
@dtypes(*(torch.testing.get_all_int_dtypes() + [torch.bool]))
def test_heaviside_non_float_output(self, device, dtype):
This test will have to be updated with the type promotion changes. Maybe removed?
Yes, we no longer need this case, deleted.
torch/_torch_docs.py
Outdated
@@ -5110,6 +5110,41 @@ def merge_dicts(*dicts):

""".format(**common_args))

add_docstr(torch.heaviside,
           r"""
heaviside(input, val, out=None) -> Tensor
(input, val, *, out=None)
Addressed!
torch/_torch_docs.py
Outdated
heaviside(input, val, out=None) -> Tensor

Returns a new tensor with each of the elements of :attr:`input`
computed with Heaviside step function.
Computes the Heaviside step function for each element in :attr:`input`.
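For reference, the function being documented, with val supplying the value at zero (the standard definition, matching np.heaviside):

\text{heaviside}(input, val) =
\begin{cases}
    0,   & \text{if } input < 0 \\
    val, & \text{if } input = 0 \\
    1,   & \text{if } input > 0
\end{cases}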
Addressed!
Great start, @Kiyosora!
I made some simplifying comments and suggestions. The one big change is, I think, supporting PyTorch-style type promotion instead of always returning a float tensor.
Looking forward to an update!
Nice work, Kiyosora!
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thanks for taking the time to work on this PR and be so diligent, @Kiyosora. Whenever you'd like you can ping me (on this PR, for example, or Slack, or the NumPy rollup issue, etc.) and we can find a new project that fits your interests.
Codecov Report
@@            Coverage Diff             @@
##           master   #42523      +/-   ##
==========================================
+ Coverage   69.31%   69.40%    +0.09%
==========================================
  Files         378      378
  Lines       46745    46613      -132
==========================================
- Hits        32403    32354       -49
+ Misses      14342    14259       -83
Continue to review full report at Codecov.
Hey @Kiyosora, a bit of bad news: this hit a merge conflict that wasn't detected by the automated tools and had to be reverted. It will need to be rebased. The signature of TensorIterator's binary operation constructor changed.
@Kiyosora you can simply rebase your change on top of main trunk and try it again.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This fixes a `katex` error I was getting trying to build the docs:

```
ParseError: KaTeX parse error: Undefined control sequence: \0 at position 55: …gin{cases}
```

This failure was introduced in #42523.
Pull Request resolved: #44481
Reviewed By: colesbury
Differential Revision: D23627700
Pulled By: mruberry
fbshipit-source-id: 9cc09c687a7d9349da79a0ac87d6c962c9cfbe2d