Update torch::stable::Tensor() default constructor by mikaylagawarecki · Pull Request #159507 · pytorch/pytorch · GitHub
Update on "Update torch::stable::Tensor() default constructor"
Allows things like

```cpp
Tensor cu_seqlens_q;
if (...) {
   cu_seqlens_q = ...
} 
...
```

Also adds `torch::stable::Tensor.defined()`




[ghstack-poisoned]
mikaylagawarecki committed Aug 8, 2025
commit daef220b3d1c17bf3b37213f2318d01137a12b51
```diff
@@ -166,13 +166,13 @@ def fill_infinity(t) -> Tensor:
     return torch.ops.libtorch_agnostic.fill_infinity.default(t)
 
 
-def test_default_constructor(undefined) -> bool:
+def test_default_constructor(defined) -> bool:
     """
     Tests the default constructor for torch::stable::Tensor.
 
     Args:
-        undefined: bool - if True, tests defined tensor; if False, tests undefined tensor
+        defined: bool - if True, tests defined tensor; if False, tests undefined tensor
 
     Returns: bool - result of calling .defined() on the tensor
     """
-    return torch.ops.libtorch_agnostic.test_default_constructor.default(undefined)
+    return torch.ops.libtorch_agnostic.test_default_constructor.default(defined)
```
```diff
@@ -222,11 +222,52 @@ def test_fill_infinity(self, device):
     def test_default_constructor(self):
         import libtorch_agnostic
 
-        is_defined_defined = libtorch_agnostic.ops.test_default_constructor(True)
-        self.assertTrue(is_defined_defined)
+        defined_tensor_is_defined = libtorch_agnostic.ops.test_default_constructor(
+            True
+        )
+        self.assertTrue(defined_tensor_is_defined)
+
+        undefined_tensor_is_defined = (
+            libtorch_agnostic.ops.test_default_constructor(False)
+        )
+        self.assertFalse(undefined_tensor_is_defined)
+
+    @onlyCPU
+    def test_default_constructor_memory_cleanup(self):
+        import subprocess
+        import sys
+
+        import libtorch_agnostic
+
+        def run_in_subprocess():
+            import gc
+            import os
+
+            import psutil
+
+            process = psutil.Process(os.getpid())
+            gc.collect()
+
+            initial_memory = process.memory_info().rss
+
+            for _ in range(100):
+                libtorch_agnostic.ops.test_default_constructor(False)
+
+            final_memory = process.memory_info().rss
+            memory_increase = final_memory - initial_memory
```
**Contributor** commented on `memory_increase = final_memory - initial_memory`:

woah, a way to test CPU memory! How confident are you that this test is testing the right thing? This looks okay to me, but I am not an expert.

**@mikaylagawarecki** (Contributor, Author) replied on Aug 8, 2025:

I tried creating a list of tensors on CPU and the resident set size increased, so I felt somewhat confident this was doing the right thing.

cc @albanD, do you know :P

**Collaborator** replied:

This will most likely be flaky, as we don't control what else might grab CPU memory. I would recommend running this test on CUDA and checking CUDA memory, which is very reliable for this kind of thing. Don't forget to add the `@serialTest()` decorator to avoid cross-pollution, of course ;)

**@mikaylagawarecki** (Contributor, Author) replied on Aug 8, 2025:

This test checks that the following code doesn't leak memory:

`stable::Tensor a;`

I don't think we can make this use any CUDA memory, unfortunately. Should we just remove the test? @janeyx99
```diff
+            return memory_increase
+
+        # Run the memory test in a subprocess
+        result = subprocess.run(
+            [sys.executable, "-c", f"print({run_in_subprocess()})"],
+            capture_output=True,
+            text=True,
+        )
 
-        is_undefined_defined = libtorch_agnostic.ops.test_default_constructor(False)
-        self.assertFalse(is_undefined_defined)
+        # Check the result from the subprocess
+        memory_increase = int(result.stdout.strip())
+        self.assertEqual(memory_increase, 0)
 
 instantiate_device_type_tests(TestLibtorchAgnostic, globals(), except_for=None)
```
8 changes: 5 additions & 3 deletions torch/csrc/stable/tensor.h

```diff
@@ -29,11 +29,13 @@ class Tensor {
   std::shared_ptr<AtenTensorOpaque> ath_;
 
  public:
+  // Construct a stable::Tensor with an uninitialized AtenTensorHandle (ATH)
+  // Steals ownership from the ATH
   Tensor() {
     AtenTensorHandle ret;
-    AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_new_uninitialized_tensor(&ret));
+    TORCH_ERROR_CODE_CHECK(aoti_torch_new_uninitialized_tensor(&ret));
     ath_ = std::shared_ptr<AtenTensorOpaque>(ret, [](AtenTensorHandle ath) {
-      AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_delete_tensor_object(ath));
+      TORCH_ERROR_CODE_CHECK(aoti_torch_delete_tensor_object(ath));
     });
   }
```
```diff
@@ -123,7 +125,7 @@ class Tensor {
 
   bool defined() const {
     bool defined;
-    AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_is_defined(ath_.get(), &defined));
+    TORCH_ERROR_CODE_CHECK(aoti_torch_is_defined(ath_.get(), &defined));
     return defined;
   }
```

**Contributor** commented on `bool defined() const {`:

This is the same semantic as the one in TensorBase.h, right?

And to make sure I'm understanding correctly: an undefined tensor is an uninitialized tensor, which I know has no memory, but is it basically a nullptr?

**@mikaylagawarecki** (Contributor, Author) replied on Aug 8, 2025:

Yes to the first.

```cpp
AOTITorchError aoti_torch_new_uninitialized_tensor(AtenTensorHandle* ret) {
  AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE({
    at::Tensor* out_tensor = new at::Tensor();
    *ret = tensor_pointer_to_tensor_handle(out_tensor);
  });
}
```

```cpp
Tensor() = default;
```

```cpp
TensorBase() = default;
```

My understanding is that the `TensorBase() = default` constructor would indeed initialize its `c10::intrusive_ptr<TensorImpl, UndefinedTensorImpl> impl_` to a nullptr.

I think there is a special case where `UndefinedTensorImpl` is not a nullptr but defines its `bool()` method to return false when `defined()` is called, but we're not using that here.