OpInfo: `fmod` and `remainder` by krshrimali · Pull Request #57941 · pytorch/pytorch · GitHub

OpInfo: fmod and remainder #57941


Closed

Conversation

krshrimali
Contributor
@krshrimali krshrimali commented May 10, 2021

@facebook-github-bot
Contributor
facebook-github-bot commented May 10, 2021

💊 CI failures summary and remediations

As of commit c3273f7 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-scanned failure(s)

ci.pytorch.org: 1 failed



@krshrimali
Contributor Author
krshrimali commented May 10, 2021

There are tests failing for test_variant_consistency_jit and test_variant_consistency_eager. I'm taking a look at what's going wrong; will ping once ready.

Still WIP for remainder; will push those changes by tonight.

@mruberry
Collaborator

This seems like a good opportunity to talk about fmod and remainder. See their documentation:

https://pytorch.org/docs/master/generated/torch.fmod.html?highlight=fmod#torch.fmod
https://pytorch.org/docs/master/generated/torch.remainder.html?highlight=remainder#torch.remainder

The documentation for these operations is weird. Like, really weird. Let's talk about why:

  • the sentence "The dividend and divisor may contain both for integer and floating point numbers." doesn't make grammatical sense; I think it's trying to say that we support integer and floating point inputs, but this is explained later in the "Supports..." statement anyway, so we can get rid of this sentence
  • the difference between the functions (if any) is unclear
  • the note for remainder references fmod, and the two notes contain different information

So what's going on here?

The kernels are different:

void remainder_kernel(TensorIterator& iter) {

AT_DISPATCH_FLOATING_TYPES_AND(kHalf, iter.common_dtype(), "fmod_cpu", [&]() {

but the difference isn't immediately obvious from the code, either.
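
For reference, here's a rough Python sketch of the two semantics (this is not the actual kernel code, and the helper names are just illustrative): fmod keeps the sign of the dividend, while remainder folds the result so it takes the sign of the divisor.

```
import math

def fmod_like(a, b):
    # C-style fmod: the result keeps the sign of the dividend a
    return math.fmod(a, b)

def remainder_like(a, b):
    # Python-style modulus: fold the fmod result so it takes the sign of the divisor b
    r = math.fmod(a, b)
    if r != 0 and (r < 0) != (b < 0):
        r += b
    return r

print(fmod_like(-3, 2), remainder_like(-3, 2))   # -1.0 1.0
print(fmod_like(3, -2), remainder_like(3, -2))   #  1.0 -1.0
```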

So here's what I'd like to suggest:

  • clarify the docs for fmod and remainder by removing the nonsense sentence and adding a clear description of what each does, a seealso section explaining how they're different from each other, a note about their handling of zero divisors, and a separate note about their lack of support for complex tensors
  • update the example for each to be a common example that demonstrates what each function does and how they're different
  • merge their sample inputs into a single sample inputs func
  • ensure the divisor doesn't contain zero for that sample inputs func

For that last bullet I want to suggest expanding make_tensor() to take a new kwarg, include_zero, which is True by default; if False, zeros are replaced with some epsilon value.
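
Roughly, a minimal sketch of what I have in mind (floating-point dtypes only; the real make_tensor handles every dtype, and the epsilon choice here is just a placeholder):

```
import torch

def make_tensor(size, device, dtype, *, low=-9, high=9, requires_grad=False,
                include_zero=True):
    # Sketch only: construct the tensor as usual...
    result = torch.empty(size, device=device, dtype=dtype).uniform_(low, high)
    if not include_zero:
        # ...then replace exact zeros with a tiny nonzero value so binary ops
        # such as fmod/remainder never see a zero divisor in their sample inputs.
        result[result == 0] = torch.finfo(dtype).eps
    return result.requires_grad_(requires_grad)
```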

cc @pmeier because make_tensor() supporting an include_zero argument is also something we need for automated testing of binary elementwise operations (like fmod and remainder are).

It's OK to pursue all of these changes in one PR; the sample inputs for fmod and remainder will be a good test case for the update to make_tensor.

What do you think @krshrimali? @kshitij12345, does this approach make sense to you?

@krshrimali
Contributor Author

This seems like a good opportunity to talk about fmod and remainder. […] What do you think @krshrimali? @kshitij12345, does this approach make sense to you?

Hi, @mruberry - Thanks for the detailed explanation. It's a good point; I went through the implementations of fmod and remainder. Here is the difference as I understand it from the documentation and other resources (e.g., the documentation of math.fmod: https://docs.python.org/3/library/math.html#math.fmod):

torch.remainder makes sure that the sign of the output matches the sign of the divisor (like Python's % operator), while torch.fmod keeps the sign of the dividend (like C's fmod). The example in the documentation does hint at this, though it's somewhat cryptic and people may not notice it instantly:

>>> torch.fmod(torch.tensor(-3), 2)
tensor(-1)
>>> torch.remainder(torch.tensor(-3), 2)
tensor(1)
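
A quick cross-check against Python and NumPy (just for reference, not part of this PR): torch.fmod matches math.fmod / np.fmod (sign of the dividend), while torch.remainder matches Python's % and np.remainder (sign of the divisor).

```
import math
import numpy as np
import torch

print(math.fmod(-3, 2), np.fmod(-3, 2), torch.fmod(torch.tensor(-3.0), 2.0))
# -1.0 -1 tensor(-1.)
print(-3 % 2, np.remainder(-3, 2), torch.remainder(torch.tensor(-3.0), 2.0))
# 1 1 tensor(1.)
```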

I agree with all the points mentioned, and this can be done within this PR's scope. The documentation does sound grammatically incorrect, and I'll correct that. On including include_zero as a kwarg to make_tensor: I agree that this will help with testing other binary ops as well. Will ping again once the PR is ready for review. Thanks again for your inputs.

    replace_with = torch.tensor(1, device=device, dtype=dtype)
elif dtype in floating_types_and(torch.half, torch.bfloat16):
    replace_with = torch.tensor(sys.float_info.epsilon, device=device, dtype=dtype)
result[result == 0] = replace_with
Collaborator

add an else clause that asserts to be sure no types are missed
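
Something like this, for example (a sketch only — `_zero_replacement` is an illustrative name, and sys.float_info.epsilon mirrors the value used in the diff above):

```
import sys
import torch
# helper import path may differ across versions; these dtype-group helpers are
# already in scope inside common_utils.py
from torch.testing import integral_types, floating_types_and, complex_types

def _zero_replacement(dtype, device):
    # Cover each dtype group explicitly and assert in the else branch so a
    # missed dtype fails loudly instead of silently falling through.
    if dtype in integral_types():
        return torch.tensor(1, device=device, dtype=dtype)
    elif dtype in floating_types_and(torch.half, torch.bfloat16):
        return torch.tensor(sys.float_info.epsilon, device=device, dtype=dtype)
    elif dtype in complex_types():
        return torch.tensor(complex(sys.float_info.epsilon, 0), device=device, dtype=dtype)
    else:
        raise AssertionError(f"Unexpected dtype {dtype} when replacing zeros in make_tensor")
```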

Contributor Author

Thanks, this should be resolved now.

@@ -1770,10 +1773,17 @@ def make_tensor(size, device: torch.device, dtype: torch.dtype, *, low=None, hig
    result = torch.repeat_interleave(result, 2, dim=-1)
    result = result[..., ::2]

    if not include_zero and dtype not in complex_types():
Collaborator

Should we extend this approach to complex numbers, too?

Contributor Author

Done! Thanks for pointing out.

@kshitij12345
Collaborator
kshitij12345 commented May 12, 2021

cc @pmeier because make_tensor() supporting an include_zero argument is also something we need for automated testing of binary elementwise operations (like fmod and remainder are).

@mruberry @krshrimali
I am wondering how many ops will actually need to set include_zero=False? (We need to set it to False for remainder and fmod because they throw an error on division by zero for integral types.) For most of the other operators we just verify that out-of-domain values (or problematic points) are also consistent with numpy.

Are there other operators which may need this feature, i.e. are there other operators that throw an error like this? (mvlgamma does, and we plan to change that.)

Based on this, do we need to add include_zero to make_tensor, or should this be handled inside the sample_inputs_func?
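
For illustration, a hypothetical sketch of the sample-inputs-local alternative (the helper name is made up; this is not what the PR does):

```
import torch

def _nonzero_divisor(shape, device, dtype):
    # Hypothetical: generate the divisor as usual inside the sample_inputs_func,
    # then patch out exact zeros locally instead of adding a kwarg to make_tensor.
    other = torch.randint(-9, 9, shape, device=device, dtype=dtype)
    other[other == 0] = 1
    return other

divisor = _nonzero_divisor((5, 5), "cpu", torch.int64)
print(bool((divisor == 0).any()))  # False
```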

Note: numpy does not throw

import numpy as np
import torch
np.remainder(np.array(1), np.array(0))  # returns 0 (with a RuntimeWarning); no exception
torch.remainder(torch.tensor(1), torch.tensor(0)) # RuntimeError: ZeroDivisionError
torch.fmod(torch.tensor(1), torch.tensor(0)) # RuntimeError: ZeroDivisionError

facebook-github-bot pushed a commit that referenced this pull request May 26, 2021
Summary:
Context:
The Error message when `broadcasts_input` is marked incorrectly is uninformative [See Error Currently]
#57941 (comment)

Error Currently
```
Traceback (most recent call last):
  File "/home/kshiteej/Pytorch/pytorch_i0_promotion/test/test_ops.py", line 326, in test_variant_consistency_eager
    _test_consistency_helper(samples, variants)
  File "/home/kshiteej/Pytorch/pytorch_i0_promotion/test/test_ops.py", line 310, in _test_consistency_helper
    variant_forward = variant(cloned,
  File "/home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.8/unittest/case.py", line 227, in __exit__
    self._raiseFailure("{} not raised".format(exc_name))
  File "/home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.8/unittest/case.py", line 164, in _raiseFailure
    raise self.test_case.failureException(msg)
AssertionError: RuntimeError not raised
```

Error After PR
```
Traceback (most recent call last):
  File "/home/kshiteej/Pytorch/pytorch_i0_promotion/test/test_ops.py", line 329, in test_variant_consistency_eager
    _test_consistency_helper(samples, variants)
  File "/home/kshiteej/Pytorch/pytorch_i0_promotion/test/test_ops.py", line 313, in _test_consistency_helper
    variant_forward = variant(cloned,
  File "/home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.8/unittest/case.py", line 227, in __exit__
    self._raiseFailure("{} not raised".format(exc_name))
  File "/home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.8/unittest/case.py", line 164, in _raiseFailure
    raise self.test_case.failureException(msg)
AssertionError: RuntimeError not raised : inplace variant either allowed resizing or you have marked the sample SampleInput(input=Tensor, args=(tensor([[[ 2.1750, -8.5027, -3.1403, -6.9942,  3.2609],
         [-2.5057, -5.9123, -5.4633,  6.1203, -8.2124],
         [-3.5802, -8.4869, -6.0700,  2.3431, -8.1955],
         [-7.3316,  1.3248, -6.8661,  7.1483, -8.0719],
         [ 4.5977, -4.0448, -6.2044, -2.1314, -8.4956]],

        [[ 3.2769, -8.4360,  1.2826,  7.1749,  4.7653],
         [-0.2816, -2.5997, -4.7659, -3.7814,  3.9704],
         [-2.1778, -3.8117, -6.0276, -0.8423, -5.9646],
         [ 8.6544, -3.0922,  0.2558, -4.9318, -4.7596],
         [ 4.5583,  4.3830,  5.8793,  0.9713, -2.1481]],

        [[-1.0447,  0.9334,  7.6405, -4.8933, -7.4010],
         [ 7.7168, -8.4266, -5.5980, -6.9368,  7.1309],
         [-8.7720, -5.0890, -0.4975,  1.9518,  1.7074],
         [-8.5783,  8.5510, -8.5459, -3.5451,  8.4319],
         [ 8.5052, -8.9149, -6.6298, -1.2750, -5.7367]],

        [[-6.5625,  8.2795, -4.9311,  1.9501, -7.1777],
         [-8.4035,  1.1136, -7.6418, -7.0726, -2.8281],
         [ 4.2668, -0.2883, -6.2246,  2.3396,  1.2911],
         [ 4.6550, -1.9525,  4.4873, -3.8061, -0.8653],
         [-3.4256,  4.4423,  8.2937, -5.3456, -4.2624]],

        [[ 7.6128, -6.3932,  4.7131, -5.4938,  6.4792],
         [-6.5385,  2.4385,  4.5570,  3.7803, -8.3281],
         [-2.9785, -4.4745, -1.1778, -8.9324,  1.3663],
         [ 3.7437,  3.5171, -6.3135, -8.4519, -2.7033],
         [-5.0568, -8.4630, -4.2870, -3.7284, -1.5238]]], device='cuda:0',
       dtype=torch.float32, requires_grad=True),), broadcasts_input=True) incorrectly with `broadcasts_self=True
```

**NOTE**:
Printing the sample looks very verbose and it may be hard to figure out which sample is incorrectly configured if there are multiple samples with similar input shapes.

Two Options to make this error less verbose
* Don't print the sample and just print `inplace variant either allowed resizing or you have marked one of the sample incorrectly with broadcasts_self=True`
* Have some mechanism to name samples which will be printed in the `repr` (which will need extra machinery)

Pull Request resolved: #58295

Reviewed By: ngimel

Differential Revision: D28627308

Pulled By: mruberry

fbshipit-source-id: b3bdeacac3cf9c0d984f0b85410ecce474291d20
Comment on lines +1836 to +1837
'fmod',
'fmod.autodiffed',
Contributor Author

Had to add both variants of fmod registered in OpInfo here. A minor suggestion related to this is to make the error more verbose (it should mention the op name). Currently, if the op is supported but is not registered in works_list, the following error is raised (see: https://github.com/pytorch/pytorch/blob/master/test/test_jit_fuser_te.py#L1976)

Expected test to fail. If it now works, move op into works_list

The failure will mention the test name (fmod_autodiffed_cpu_float32 - the order of the keywords autodiffed, cpu and float32 may change), but the op name will be fmod.autodiffed (here autodiffed was the variant_name we passed to the OpInfo). Giving the op name in the error will help in general:

f"Expected test to fail. If it now works, move {get_name(op)} into works_list"

Not a big change, but just something I felt might help.

What do you think? @mruberry @kshitij12345

Collaborator

cc @Chillee; we should create an attribute like "supports_nnc" on the OpInfo, like we have supports_grad

@krshrimali
Contributor Author
krshrimali commented May 26, 2021

Sorry for the delay in this PR, @mruberry. The ops had to be registered in works_list (all variants) in `test_jit_fuser_te`.

This PR is ready for review now; the documentation has been updated and all the review comments have been addressed. About the failing test: I don't think it's related to the changes made in this PR.

I'll raise an issue this week to discuss the implementation of remainder (it uses std::fmod rather than std::remainder).

@krshrimali krshrimali requested a review from mruberry May 26, 2021 15:45
@mruberry
Collaborator

Sorry for the delay in this PR, @mruberry. The ops had to be registered in works_list (all variants) in `test_jit_fuser_te`.

cc @Chillee We should kill these lists and add the appropriate metadata to the OpInfo itself so we don't need to hunt down the actual tests when adding OpInfos.

This PR is ready for review now; the documentation has been updated and all the review comments have been addressed. About the failing test: I don't think it's related to the changes made in this PR.

Cool!

I'll raise an issue this week to discuss the implementation of remainder (it uses std::fmod rather than std::remainder).

That would be great; thank you

@@ -24,24 +24,6 @@ void THCTensor_(mul)(THCState *state, THCTensor *self_, THCTensor *src_, scalar_
  THCudaCheck(cudaGetLastError());
}

void THCTensor_(fmod)(THCState *state, THCTensor *self_, THCTensor *src_, scalar_t value)
Collaborator

Awesome that we could remove this. cc @ngimel, who's been looking at killing TH


The dividend and divisor may contain both for integer and floating point
numbers. The remainder has the same sign as the dividend :attr:`input`.
Applies C++'s `std::fmod <https://en.cppreference.com/w/cpp/numeric/math/fmod>`_
Collaborator

This is a great intro

@@ -3288,6 +3288,11 @@ def merge_dicts(*dicts):
on both CPU and GPU; raises ``RuntimeError`` for integer division by
zero on CPU; Integer division by zero on GPU may return any value.

.. note::

Complex inputs are not supported. In some cases, it is not mathematically
Collaborator

Cool note

def sample_inputs_fmod_remainder(op_info, device, dtype, requires_grad, *, autodiffed=False, **kwargs):
    make_arg = partial(make_tensor, dtype=dtype, device=device, requires_grad=requires_grad)

    if autodiffed:
Collaborator

Nice case breakdown

    samples = cases + cases_with_tensor_scalar + cases_with_broadcasting  # type: ignore[assignment]

    def generator():
        for shape, arg_other, broadcasts_input in samples:
Collaborator

nice unpack

@@ -1773,6 +1778,19 @@ def make_tensor(size, device: torch.device, dtype: torch.dtype, *, low=None, hig
    result = torch.repeat_interleave(result, 2, dim=-1)
    result = result[..., ::2]

    if exclude_zero:
Collaborator

This makes sense. The way this has been documented is very clear

Collaborator
@mruberry mruberry left a comment

Nice job, @krshrimali, this was an interesting PR with multiple parts

FYI there's a long weekend for the "Memorial Day" holiday here in the US, but I hope to get caught up on all PR reviews during that time

@facebook-github-bot
Contributor

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mruberry merged this pull request in 0c1420a.

deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary: same change as the commit above (the full #58295 commit message is quoted earlier in this thread).

Pull Request resolved: pytorch#58295

Reviewed By: ngimel

Differential Revision: D28627308

Pulled By: mruberry

fbshipit-source-id: b3bdeacac3cf9c0d984f0b85410ecce474291d20
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary:
See pytorch#54261

cc: mruberry Lezcano kshitij12345

Pull Request resolved: pytorch#57941

Reviewed By: mrshenli

Differential Revision: D28744464

Pulled By: mruberry

fbshipit-source-id: 19847277d4f8d3a39a706c2b3c9eddf0dedcb20c
Labels
cla signed · Merged · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
7 participants