Fix lerp weight type promotion #141117
Conversation
🔗 Helpful links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141117.
Note: links to docs will display an error until the docs builds have completed.
✅ You can merge normally! (4 unrelated failures) As of commit af28e9e with merge base b5655d9. FLAKY: the following jobs failed but were likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 0f5ea29 to 8b29276.
@zeshengzong thanks for taking a stab at this! lmk when the PR is ready for review -- I've triggered the CI based on your current changes.
Sorry for the late reply -- there is still some small inconsistent behavior I need to confirm, and I will open the PR for review once everything works. About the promotion, I think it currently only works for
Thanks! @janeyx99
Ah, great question -- I am noticing now that this PR would not fix the issue at hand. The issue is referring to the Scalar weight overload (not the Tensor overload). So I would anticipate changes to the CUDA and CPP impls relating to that overload. Thanks for asking for clarification!
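For readers, a quick sketch of the distinction being drawn here (an illustration, not code from the PR; overload names follow the ATen schema, and the pre-PR behavior is as described in the linked issue):

```python
import torch

start = torch.ones(2, 2, dtype=torch.float16)
end = torch.ones(2, 2, dtype=torch.float16)

# Scalar-weight overload (lerp.Scalar): weight is a Python number.
start.lerp(end, 2.2)

# Tensor-weight overload (lerp.Tensor): weight is a tensor. Before this PR,
# a 0-dim float32 weight raised a dtype mismatch error with float16 inputs.
start.lerp(end, torch.tensor(2.2))
```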
Force-pushed from 23a00fc to 2b89afe.
@janeyx99 Hello, please help me trigger CI and review the change when it's available, thanks!
@janeyx99 Hello, please review the new updates, thanks!
CI failures are real
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased |
Force-pushed from 784dd96 to b4c2e5c.
Force-pushed from b4c2e5c to a715af2.
After setting it (see pytorch/aten/src/ATen/TensorIterator.cpp, lines 543 to 547 at aaf5615), it seems no extra logic is needed in the kernel for `lerp` to have CPU scalar tensors alongside CUDA inputs.

Hi @janeyx99, please check whether the current implementation works, thanks!
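A quick check of that claim (a sketch; it assumes a CUDA device is available and this PR's changes are applied):

```python
import torch

if torch.cuda.is_available():
    a = torch.rand(3, device="cuda")
    b = torch.rand(3, device="cuda")
    w = torch.tensor(0.5)  # 0-dim weight tensor on CPU
    # allow_cpu_scalars(true) lets TensorIterator accept the CPU scalar
    # tensor alongside CUDA inputs without extra kernel logic.
    print(torch.lerp(a, b, w))
```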
Thanks for looking into this! This approach looks better :)
One nit, and waiting on CI
aten/src/ATen/native/Lerp.cpp (outdated)

```cpp
bool promote_weight = weight.dim() == 0 && self.dtype() != weight.dtype();
if (!promote_weight) {
```
```diff
- bool promote_weight = weight.dim() == 0 && self.dtype() != weight.dtype();
+ bool promote_weight = weight.dim() == 0;
  if (!promote_weight) {
```
what if we don't duplicate the dtype check?
Originally I wrote it like this:

```cpp
if (weight.dim() != 0) {
  TORCH_CHECK(self.dtype() == weight.dtype(), "expected dtype ", self.dtype(),
              " for `weight` but got dtype ", weight.dtype());
}
build(at::TensorIteratorConfig()
    .allow_cpu_scalars(true)
    .promote_inputs_to_common_dtype(weight.dim() == 0 && self.dtype() != weight.dtype())
```
Introducing the variable `promote_weight` might make it easier for others to grasp the `lerp` promotion rule when reading this code. Their reading experience would be:

- the promotion condition is `weight.dim() == 0 && self.dtype() != weight.dtype()`
- if not promoting, check that the dtypes are equal
- do the weight promotion if the condition matches

In both versions the dtype is checked twice, because we need to avoid promotion in the other cases; setting `.promote_inputs_to_common_dtype(true)` directly would change the `out` behavior.

If the original version (without the `promote_weight` variable) is better, I can change it back. Thanks!
But why don't we just promote inputs to common dtype whenever weight is a scalar (no dtype check)? The previous checks should prevent the other inputs from getting promoted, right?
Hi, after some tests I found that if we have code like this, without checking the dtype when setting `promote_inputs_to_common_dtype`:

```cpp
if (weight.dim() != 0) {
  TORCH_CHECK(self.dtype() == weight.dtype(), "expected dtype ", self.dtype(),
              " for `weight` but got dtype ", weight.dtype());
}
build(at::TensorIteratorConfig()
    .allow_cpu_scalars(true)
    .promote_inputs_to_common_dtype(weight.dim() == 0)
    .add_output(maybe_get_output())
    .add_const_input(self)
    .add_const_input(end)
    .add_const_input(weight));
}
```
then in the case where all inputs are the same type but the `out` param is not, `lerp` behaves differently than before:
```python
import torch
a=torch.tensor([[ 0.5385,  8.4653, -8.5042, -6.7041,  0.9973],
        [-1.5006,  5.3119, -7.8279, -8.0691,  0.9812],
        [ 2.6690,  1.3635, -3.8211,  1.0685,  5.1207],
        [-8.9332, -5.6855,  4.2723,  8.9549, -6.6269],
        [ 8.5845, -4.1670,  6.6996, -2.9766, -7.7093]], device='cuda:0')
b=torch.tensor([ 8.2217,  3.2300,  7.5432, -5.6094, -0.2661], device='cuda:0')
c=torch.rand((), dtype=torch.float, device='cuda:0')
d=torch.tensor([[-1,  5,  7, -4, -7],
        [-5,  3, -5, -5, -4],
        [ 9,  2, -3,  4, -9],
        [ 7, -6,  7,  1,  5],
        [ 2, -5, -2, -1, -8]], dtype=torch.long, device='cuda:0')

# Before the change this raises an error about out=d
>>> torch.lerp(a,b,c,out=d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Found dtype Long but expected Float

# After the change it passes, and the result looks wrong
>>> torch.lerp(a,b,c,out=d)
tensor([[ 7,  3,  6, -5,  0],
        [ 7,  3,  7, -5,  0],
        [ 8,  3,  7, -5,  0],
        [ 7,  2,  7, -5,  0],
        [ 8,  2,  7, -5,  0]], device='cuda:0')
```
I think it is better to keep the `out` param check behavior unchanged when `dim == 0`, so I added the dtype check when setting `promote_inputs_to_common_dtype`. Thanks!
Ah, thanks for the clarification, and I totally agree with you, but the code currently does not fix the concern completely! For example, the new code would NOT error for the following:
```python
a=torch.tensor([[ 0.5385,  8.4653, -8.5042, -6.7041,  0.9973],
        [-1.5006,  5.3119, -7.8279, -8.0691,  0.9812],
        [ 2.6690,  1.3635, -3.8211,  1.0685,  5.1207],
        [-8.9332, -5.6855,  4.2723,  8.9549, -6.6269],
        [ 8.5845, -4.1670,  6.6996, -2.9766, -7.7093]], device='cuda:0')
b=torch.tensor([ 8.2217,  3.2300,  7.5432, -5.6094, -0.2661], device='cuda:0')
c=torch.rand((), dtype=torch.double, device='cuda:0') # changed this line
d=torch.tensor([[-1,  5,  7, -4, -7],
        [-5,  3, -5, -5, -4],
        [ 9,  2, -3,  4, -9],
        [ 7, -6,  7,  1,  5],
        [ 2, -5, -2, -1, -8]], dtype=torch.long, device='cuda:0')
```
even though it would have errored beforehand.

So it'd be more important to figure out where the code is promoting `out`, and to be even more precise.
Hello, I've removed the repeated `weight.dtype` check, and found that the `enforce_safe_casting_to_output` flag guards `out.dtype`, avoiding the wrong output in the above use case. The check is here:

pytorch/aten/src/ATen/TensorIterator.cpp, lines 508 to 512 at 1ce5338:
```cpp
if (config.enforce_safe_casting_to_output_ && op.is_output && op.current_dtype != common_dtype_) {
  TORCH_CHECK(canCast(common_dtype_, op.current_dtype),
              "result type ", common_dtype_, " can't be cast to the "
              "desired output type ", op.current_dtype);
}
```
It will raise an error:
```python
import torch
a=torch.tensor([[ 0.5385,  8.4653, -8.5042, -6.7041,  0.9973],
        [-1.5006,  5.3119, -7.8279, -8.0691,  0.9812],
        [ 2.6690,  1.3635, -3.8211,  1.0685,  5.1207],
        [-8.9332, -5.6855,  4.2723,  8.9549, -6.6269],
        [ 8.5845, -4.1670,  6.6996, -2.9766, -7.7093]], device='cuda:0')
b=torch.tensor([ 8.2217,  3.2300,  7.5432, -5.6094, -0.2661], device='cuda:0')
c=torch.rand((), dtype=torch.double, device='cuda:0') # changed this line
d=torch.tensor([[-1,  5,  7, -4, -7],
        [-5,  3, -5, -5, -4],
        [ 9,  2, -3,  4, -9],
        [ 7, -6,  7,  1,  5],
        [ 2, -5, -2, -1, -8]], dtype=torch.long, device='cuda:0')
torch.lerp(a,b,c,out=d)
print(d.dtype)
```

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: result type Float can't be cast to the desired output type Long
```
But in the case where `out` can be safely promoted, the behavior differs from before:
```python
import torch
a=torch.tensor([[ 0.5385,  8.4653, -8.5042, -6.7041,  0.9973],
        [-1.5006,  5.3119, -7.8279, -8.0691,  0.9812],
        [ 2.6690,  1.3635, -3.8211,  1.0685,  5.1207],
        [-8.9332, -5.6855,  4.2723,  8.9549, -6.6269],
        [ 8.5845, -4.1670,  6.6996, -2.9766, -7.7093]], device='cpu', dtype=torch.double)
b=torch.tensor([ 8.2217,  3.2300,  7.5432, -5.6094, -0.2661], device='cpu', dtype=torch.double)
c=torch.rand((), dtype=torch.double, device='cpu')
d=torch.tensor([[-1,  5,  7, -4, -7],
        [-5,  3, -5, -5, -4],
        [ 9,  2, -3,  4, -9],
        [ 7, -6,  7,  1,  5],
        [ 2, -5, -2, -1, -8]], dtype=torch.float, device='cpu')
torch.lerp(a,b,c,out=d)
print(d.dtype)

# Before, this raised an error:
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# RuntimeError: Found dtype Float but expected Double

# After, no error:
# torch.float32
```
As the `out` usage description here says:

> A "safe copy" is different from PyTorch's regular copy. For operations that do not participate in type promotion the device and dtype of the source and destination tensors must match. For operations that do participate in type promotion the copy can be to a different dtype, but the destination of the copy cannot be a lower "type kind" than the source. PyTorch has four type kinds: boolean, integer, float, and complex, in that order. So, for example, an operation like add (which participates in type promotion) will throw a runtime error if given float inputs but an integer out= tensor.
Since `weight.dim() == 0` now triggers the promotion, I think this change in `out` behavior is consistent with the description.

Please check whether this works, thanks!
Ah it looks like our CI is going through some infra issues, could you rebase as well?
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here.
Force-pushed from c9d483b to af28e9e.
```cpp
build(at::TensorIteratorConfig()
    .allow_cpu_scalars(true)
    .promote_inputs_to_common_dtype(promote_weight)
    .enforce_safe_casting_to_output(promote_weight)
    .cast_common_dtype_to_outputs(promote_weight)
```
what does this do?
Hello,

- `promote_inputs_to_common_dtype` enables type promotion instead of raising an error directly; it's the main flag used to fix the issue.
- `enforce_safe_casting_to_output` checks `out.dtype` against the others, making sure the common dtype `canCast` to `out.dtype` (which guards use cases like `out.dtype=torch.long` being inconsistent with the other params, as above).

  > For operations that do participate in type promotion the copy can be to a different dtype, but the destination of the copy cannot be a lower "type kind" than the source. PyTorch has four type kinds: boolean, integer, float, and complex, in that order.

  pytorch/aten/src/ATen/TensorIterator.cpp, lines 508 to 512 at d95a6ba:
  ```cpp
  if (config.enforce_safe_casting_to_output_ && op.is_output && op.current_dtype != common_dtype_) {
    TORCH_CHECK(canCast(common_dtype_, op.current_dtype),
                "result type ", common_dtype_, " can't be cast to the "
                "desired output type ", op.current_dtype);
  }
  ```
- `cast_common_dtype_to_outputs` works on the CPU device, creating a temp tensor to cast the output:

  pytorch/aten/src/ATen/TensorIterator.cpp, lines 516 to 540 at d95a6ba:
  ```cpp
  if (common_device == kCPU) {
    // Casts to outputs by creating temporaries of the correct dtype (if needed)
    // NB: we skip this on is_meta_, because the temporary allocation here is
    // unnecessary if we aren't going to actually do the compute
    if (config.cast_common_dtype_to_outputs_ && op.is_output && op.current_dtype != common_dtype_ && !is_meta_) {
      TORCH_INTERNAL_ASSERT(op.tensor_base().defined());
      // Marker [Output original_tensor is set]
      // NB: do NOT use set_output here, as the temporary is NOT a true output;
      // op.tensor is the true output and it was pre-provided for us.
      // TODO: The logic for cast_outputs will need to be handled by the
      // structured kernels implementation. What probably should happen
      // is that we pass in the inferred dtype into the out kernel, and
      // then after calling the out kernel, do the conversion (which
      // is cast_outputs here), but integrating this with existing
      // TensorIterator will take a little doing
      op.exchange_tensor(c10::MaybeOwned<TensorBase>::owned(
          at::empty_like(op.tensor(),
                         op.tensor_base().options().dtype(common_dtype_),
                         LEGACY_CONTIGUOUS_MEMORY_FORMAT)));
      if (!names_.empty()) {
        namedinference::propagate_names(op.tensor_base(), names_);
      }
      op.current_dtype = common_dtype_;
      op.target_dtype = common_dtype_;
    }
  ```
Other ops that support promotion set all these flags to `true`; for example `lerp_Scalar` (and `add`, `sub`, `mul`, ... as well) use the macro here:

pytorch/aten/src/ATen/TensorIterator.cpp, lines 979 to 1000 at d95a6ba:
```cpp
#define BINARY_OP_CONFIG()                      \
  TensorIteratorConfig()                        \
    .set_check_mem_overlap(true)                \
    .allow_cpu_scalars(true)                    \
    .promote_inputs_to_common_dtype(true)       \
    .cast_common_dtype_to_outputs(true)         \
    .enforce_safe_casting_to_output(true)       \

void TensorIteratorBase::build_binary_op(const TensorBase& out, const TensorBase& a, const TensorBase& b) {
  build(BINARY_OP_CONFIG()
      .add_owned_output(out)
      .add_owned_const_input(a)
      .add_owned_const_input(b));
}

void TensorIteratorBase::build_borrowing_binary_op(
    const TensorBase& out, const TensorBase& a, const TensorBase& b) {
  build(BINARY_OP_CONFIG()
      .add_output(out)
      .add_const_input(a)
      .add_const_input(b));
}
```
The default value of these flags is `false`; currently they are only enabled when `weight.dim() == 0`. Thanks!
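As an aside, the user-visible effect of `enforce_safe_casting_to_output` can be reproduced with an op that already sets all these flags, e.g. `add` (a small sketch, not code from the PR):

```python
import torch

out = torch.empty(2, dtype=torch.long)
# add participates in type promotion with safe casting enforced, so a
# float result cannot be written into an integer out tensor. This raises:
# RuntimeError: result type Float can't be cast to the desired output type Long
torch.add(torch.tensor([1.5, 2.5]), torch.tensor([0.5, 0.5]), out=out)
```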
Yes! This approach def looks the best of all! Just one more q
Approving workflows to see what CI thinks
thanks for the diligence!
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@janeyx99 Thank you for your time and patience! 🎉
Fixes pytorch#140601

Enable `promote_inputs_to_common_dtype` when tensors are not the same dtype when invoking the `lerp` function.

For `lerp_Tensor`:
- Check whether the tensors have the same `dtype`, and enable promotion if not
- Remove the type check assert

For `lerp_Scalar`:
- It already enables `promote_inputs_to_common_dtype` by default, so just remove the type check, making the promotion behavior consistent with `lerp_Tensor`.

`lerp_Scalar` gets its TensorIteratorConfig from here:
https://github.com/pytorch/pytorch/blob/c37185c76ae4068899869e48a8388e78437508e8/aten/src/ATen/TensorIterator.cpp#L979-L985

**Test Result**

The test case in the issue passes:

```python
>>> import torch
>>>
>>> x = torch.ones(2, 2, dtype=torch.float64)
>>> w = torch.ones(2, 2, dtype=torch.float64)
>>> s = torch.tensor(2.2)
>>> x.lerp_(w, s)
tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)
>>> x = torch.ones(2, 2, dtype=torch.float16)
>>> w = torch.ones(2, 2, dtype=torch.float16)
>>> s = torch.tensor(2.2)
>>> x.lerp_(w, s)
tensor([[1., 1.],
        [1., 1.]], dtype=torch.float16)
```

```bash
$ pytest test/test_binary_ufuncs.py -k 'test_lerp_tensor_type_promotion or test_lerp_scalar_type_promotion'
```



```bash
$ lintrunner
```



Pull Request resolved: pytorch#141117
Approved by: https://github.com/janeyx99

Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>
@janeyx99 Does this then fix this? Or does it only fix weight type promotion, without start/end input type promotion? Could it somehow reuse the kernel / type promotion from the op which asked for this?

Also, does lerp support a BoolTensor weight? Then I guess it could decay to doing torch.where.

It's also useful to allow broadcasting between start / end.
@vadimkantorov Your observation is correct that this PR does not allow type promotion between start and end, as they're not scalar tensors. I would think torch.where promotion would be different (as the true vs false branches do not interact with each other and the start and end here do), so further discussion should go in the original issue. And yes, lerp supports bool tensor weights, which does reduce it to a torch.where in essence, though most lerp use cases I'd imagine would not have such a binary weight.
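A small sketch of that equivalence (illustrative only; it uses a float 0/1 weight rather than relying on bool-dtype kernel support):

```python
import torch

start = torch.tensor([0., 10.])
end = torch.tensor([100., 200.])
mask = torch.tensor([True, False])

# lerp(start, end, w) = start + w * (end - start); with w in {0, 1} it
# selects end where the mask is true and start elsewhere, i.e. torch.where.
w = mask.to(start.dtype)
assert torch.allclose(torch.lerp(start, end, w), torch.where(mask, end, start))
```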
Sometimes supporting a constant python scalar as start/end is also useful.

True, but given the amount of corner cases in promotion, I wonder if the torch.where/torch.lerp kernels could be unified or made templated...
How come? Why wouldn't someone just use the python scalars as is and escape a kernel launch? Is it mostly for compile capturing? Unifying
For python scalars I mean usage like so:

```python
foreground_mask = torch.rand(16, 16)
image = torch.randint(0, 256, (16, 16), dtype = torch.uint8)
torch.lerp(image, 255, foreground_mask)
# TypeError: lerp() received an invalid combination of arguments - got (Tensor, int, Tensor), but expected one of:
#  * (Tensor input, Tensor end, Tensor weight, *, Tensor out = None)
#  * (Tensor input, Tensor end, Number weight, *, Tensor out = None)

image1 = torch.randint(0, 256, (16, 16), dtype = torch.uint8)
image2 = torch.randint(0, 256, (16, 16), dtype = torch.uint8)
torch.lerp(image1, image2, 0.5)
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# RuntimeError: "lerp_kernel_scalar" not implemented for 'Byte'
```
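A workaround under the current API (a sketch: compute in float and cast back, since the scalar-end overload and integral kernels are unavailable):

```python
import torch

foreground_mask = torch.rand(16, 16)
image = torch.randint(0, 256, (16, 16), dtype=torch.uint8)

# Blend toward white (255) by lerping in float32 and casting back to uint8.
white = torch.full_like(image, 255, dtype=torch.float32)
blended = torch.lerp(image.float(), white, foreground_mask).to(torch.uint8)
```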
Oh, I see :( I had assumed they are similar because they both have to read from both passed arguments (named
@janeyx99 I pasted these examples also into the original issue: currently python scalar as

Also, it seems there are no kernels for integral tensors (useful for streamlining lerp / blending for uint8 images, int16 audio, uint16 images).