[Nested Tensor] Support NT construction inside PT2 graph · Issue #118446 · pytorch/pytorch · GitHub

[Nested Tensor] Support NT construction inside PT2 graph #118446

Closed
davidberard98 opened this issue Jan 27, 2024 · 3 comments
Assignees: davidberard98
Labels: feature (A request for a proper, new feature.), module: nestedtensor (NestedTensor tag, see issue #25032), oncall: pt2, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

davidberard98 (Contributor) commented Jan 27, 2024

🚀 The feature, motivation and pitch

Repro:

import torch
# Internal helpers for building a jagged NestedTensor from a dense buffer of values.
from torch.nested._internal.nested_tensor import ViewNestedFromBuffer, buffer_from_jagged

def fn(values, offsets):
    # return values.cos().sin()
    # Construct the NT *inside* the function, so torch.compile has to trace the construction.
    return ViewNestedFromBuffer.apply(values.cos(), offsets).sin()

fn_c = torch.compile(fn, backend="aot_eager")
# values = torch.rand((12, 8), requires_grad=True)
values = torch.rand((12, 8))
offsets = torch.tensor([0, 1, 2, 5, 8, 9, 12])
lengths = torch.tensor([1, 1, 3, 3, 1, 3])

# Eager construction works; compiling fn (NT constructed in-graph) is what hits the issues below.
nt = ViewNestedFromBuffer.apply(values, offsets)
fn_c(values, offsets)

  • ViewNestedFromBuffer hits skipfiles (see #118334: using autograd.Functions defined in torch/ causes graph breaks).
  • Dynamic shapes don't really work with this if any part of the values tensor has dynamic shapes in the non-batch dimensions:
    • If values is dynamic, then when we construct self._strides = (ragged_size * stride[self._ragged_idx - 1], *stride), where stride is values.stride(): if stride[self._ragged_idx - 1] is symbolic, then ragged_size * stride[...] multiplies a singleton symint (nested int) by an ordinary symbolic symint, which is not supported (see the sketch after this list).
  • Symbolic strides don't work correctly:
    • __tensor_unflatten__ needs values to be symbolic in order to identify that we can update the _tensor_symint_registry. When an NT is an input, we know about its dynamism properties because of the mark_dynamic call in the NT constructor. However, when the NT is constructed inside the graph, we don't know about those dynamism properties until after we trace through it (and we probably need to check trace order etc. to make sure the dynamism is marked at a point early enough...).
      • Why do we actually need these to be symbolic to do the jaggedness stuff?
    • Suggestion from Jeffrey and Joel: just use values instead of offsets, and construct the symbolic int stuff before tracing starts.
      • I think we'll still hit the same issue; we need values to be partially dynamic.
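
To make the stride construction above concrete, here is a minimal sketch in plain Python (build_njt_strides and its arguments are hypothetical names, not the actual NestedTensor internals). It mirrors the formula from the bullet above with concrete ints; in the real code, ragged_size is the nested (singleton) symint, and the failure occurs when the stride entry it is multiplied with is itself a symbolic SymInt:

def build_njt_strides(ragged_size, values_stride, ragged_idx=1):
    # Mirrors self._strides = (ragged_size * stride[self._ragged_idx - 1], *stride).
    # With plain ints this is fine; with a nested/singleton symint times a
    # symbolic SymInt, this exact multiplication is what is unsupported.
    return (ragged_size * values_stride[ragged_idx - 1], *values_stride)

# values of shape (12, 8) is contiguous, so values.stride() == (8, 1).
print(build_njt_strides(ragged_size=3, values_stride=(8, 1)))  # -> (24, 8, 1)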

Alternatives

No response

Additional context

No response

cc @cpuhrsch @jbschlosser @bhosmer @drisspg @soulitzer @ezyang @msaroufim @bdhirsh @anijain2305 @zou3519 @chauhang

@davidberard98 self-assigned this Jan 27, 2024
@davidberard98 changed the title from "[Nested Tensor] Support NT construction inside graph" to "[Nested Tensor] Support NT construction inside PT2 graph" Jan 27, 2024
@colesbury added the module: nestedtensor and oncall: pt2 labels Jan 29, 2024
jbschlosser (Contributor) commented:

Why do we actually need these to be symbolic to do the jaggedness stuff?

This is a good question. Symbolic SymInts support more operations than non-symbolic SingletonSymInts do (e.g. multiplication). It might be theoretically possible to add any support needed by PT2 downstream to non-symbolic SingletonSymInts; it's just a decent amount of work.

cc @soulitzer in case he is aware of any theoretical issues preventing us from adding this support
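
As a rough illustration of the first half of that point (ordinary symbolic SymInts support multiplication), here is a minimal sketch using torch._dynamo.mark_dynamic and the aot_eager backend; it is an assumed example for illustration, not code from this thread:

import torch
import torch._dynamo

def fn(x):
    # Once dim 0 is marked dynamic, x.size(0) is traced as an ordinary symbolic
    # SymInt (e.g. s0), and multiplying it by another size works fine.
    return x.new_zeros(x.size(0) * x.size(1))

x = torch.rand(12, 8)
torch._dynamo.mark_dynamic(x, 0)  # dim 0 becomes a symbolic SymInt under compile
out = torch.compile(fn, backend="aot_eager")(x)
print(out.shape)  # torch.Size([96])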

davidberard98 (Contributor, Author) commented Jan 29, 2024

@jbschlosser what I meant was whether we can do this: #118577 (still waiting on CI to see if anything fails...)

(edit: the PR has changed since I originally wrote this comment... originally it just removed the check for symbolic sizes, but that fails; now the PR is testing other stuff)

@mlazos added the triaged label Jan 30, 2024
@yf225 added the feature label Mar 29, 2024
soulitzer added a commit that referenced this issue Apr 22, 2024: "…ph using offsets from inputs"

Creating symbolic nested ints within the graph is difficult. Using unbacked symints should solve the most important(?) cases in the meantime.

See #118446

Known gaps:
- creating NJT from intermediate offsets (offsets created within the graph, as opposed to offsets passed in as inputs)
- when the same offsets is also passed in as an input to the graph: we are not smart enough to realize that the offsets from that input are the same, and therefore fail when the sizes are compared ("s0 cannot be compared with u0")

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang
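
For context on the pattern that commit targets (constructing the NJT in-graph from offsets passed in as inputs), here is a minimal sketch using the public torch.nested.nested_tensor_from_jagged API available in recent PyTorch releases, instead of the internal ViewNestedFromBuffer from the repro; this is illustrative only, not the test from the PR:

import torch

def fn(values, offsets):
    # The jagged NT is constructed inside the compiled region; offsets is a
    # graph input. Per the known gaps above, offsets created *within* the
    # graph are a separate, harder case.
    nt = torch.nested.nested_tensor_from_jagged(values.cos(), offsets)
    return nt.sin()

values = torch.rand(12, 8)
offsets = torch.tensor([0, 1, 2, 5, 8, 9, 12])
out = torch.compile(fn, backend="aot_eager")(values, offsets)
print(out.shape)  # e.g. torch.Size([6, j1, 8]) with a jagged middle dimension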

IvanKobzarev (Contributor) commented:

Doing issue scrubbing. The repro does not fail; closing.
