Fix get_source_partitions when weights are tied #142446
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142446
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit 5134ff9 with merge base 45411d1, the following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D66998592
Summary:

Fix pytorch#142035

When Linear module params are tied to another parameter, like this:

```
class SimpleLinearModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(SimpleLinearModel, self).__init__()
        # Define a linear layer
        self.linear = nn.Linear(input_size, output_size)
        self.tied_weight = self.linear.weight

    def forward(self, x):
        # Forward pass through the linear layer
        b = self.tied_weight + 1
        return self.linear(x), b
```

we get a graph like the one below:

```
graph():
    %p_tied_weight : [num_users=0] = placeholder[target=p_tied_weight]
    %p_linear_weight : [num_users=2] = placeholder[target=p_linear_weight]
    %p_linear_bias : [num_users=1] = placeholder[target=p_linear_bias]
    %x : [num_users=1] = placeholder[target=x]
    %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%p_linear_weight, 1), kwargs = {})
    %linear : [num_users=1] = call_function[target=torch.ops.aten.linear.default](args = (%x, %p_linear_weight, %p_linear_bias), kwargs = {})
    return (linear, add)
```

Notice the `%p_linear_weight : [num_users=2]` entry: because of the tie, the weight placeholder has two users. When we get source partitions, we should exclude attribute nodes like `p_linear_weight` from the outputs. A real-world example where people do something like this is in pytorch#142035.

Test Plan:

```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r test_module_partitioner_weight_tied
```

Differential Revision: D66998592
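For context, here is a minimal sketch (not part of the PR) of how the behavior can be observed with `get_source_partitions`. It assumes the returned dictionary is keyed by `nn.Linear` and uses the tied-weight module from the summary above; all names are illustrative.

```python
import torch
import torch.nn as nn
from torch.fx.passes.utils.source_matcher_utils import get_source_partitions


class SimpleLinearModel(nn.Module):
    # Same tied-weight module as in the summary above.
    def __init__(self, input_size, output_size):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)
        self.tied_weight = self.linear.weight  # tied to the Linear weight

    def forward(self, x):
        return self.linear(x), self.tied_weight + 1


model = SimpleLinearModel(4, 2)
ep = torch.export.export(model, (torch.randn(1, 4),))

# Look up the Linear partitions in the exported graph.
partitions = get_source_partitions(ep.graph_module.graph, [nn.Linear])

for partition in partitions.get(nn.Linear, []):
    # Before this fix, the tied weight placeholder (p_linear_weight) could show
    # up in output_nodes because it has a user outside the partition; with the
    # fix, only the actual linear result should be reported as an output.
    print(partition.output_nodes)
```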
a18a945 to fa70cd0 (Compare)
fa70cd0 to 7098027 (Compare)
7098027 to 4799d59 (Compare)
4799d59 to 9a2e624 (Compare)
cc @andrewor14 @tarun292 I'm not sure if there's some partitioner relying on this behavior, but it looks like CI is green so far
Summary: unchanged, now with `#build_targets_regex[fbcode//bolt/nn/.*]`.

Test Plan: unchanged.

Reviewed By: angelayi

Differential Revision: D66998592
9a2e624 to 5134ff9 (Compare)
@pytorchbot merge -i (Initiating merge automatically since Phabricator Diff has merged, merging with -i because oss signals were bypassed internally)
2 similar comments
Merge started. Your change will be merged while ignoring the following 1 check: Lint / Test. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge -i (Initiating merge automatically since Phabricator Diff has merged, merging with -i because oss signals were bypassed internally)
18 similar comments
Merge started. Your change will be merged while ignoring the following 1 check: Lint / Test. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary:
Fix #142035 and #143621
When Linear module params are tied to another parameter, like this:
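```
class SimpleLinearModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(SimpleLinearModel, self).__init__()
        # Define a linear layer
        self.linear = nn.Linear(input_size, output_size)
        self.tied_weight = self.linear.weight

    def forward(self, x):
        # Forward pass through the linear layer
        b = self.tied_weight + 1
        return self.linear(x), b
```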
We get a graph like the one below:
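```
graph():
    %p_tied_weight : [num_users=0] = placeholder[target=p_tied_weight]
    %p_linear_weight : [num_users=2] = placeholder[target=p_linear_weight]
    %p_linear_bias : [num_users=1] = placeholder[target=p_linear_bias]
    %x : [num_users=1] = placeholder[target=x]
    %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%p_linear_weight, 1), kwargs = {})
    %linear : [num_users=1] = call_function[target=torch.ops.aten.linear.default](args = (%x, %p_linear_weight, %p_linear_bias), kwargs = {})
    return (linear, add)
```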
Notice that `%p_linear_weight : [num_users=2]`. When we get source partitions, we should exclude attribute nodes like `p_linear_weight` from the outputs. A real-world example where people do something like this is in #142035.
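This is not the literal change in this PR, but a minimal sketch of the kind of filtering implied, assuming a hypothetical helper that decides which nodes of a partition count as outputs:

```python
from typing import List

from torch.fx import Node


def non_attribute_outputs(partition_nodes: List[Node]) -> List[Node]:
    """Hypothetical helper: report only computation nodes as partition outputs.

    A node counts as an output if some user lies outside the partition, but
    placeholder/get_attr nodes (parameters such as p_linear_weight) are
    skipped even when a tied weight gives them extra users in the graph.
    """
    partition = set(partition_nodes)
    outputs = []
    for node in partition_nodes:
        if node.op in ("placeholder", "get_attr"):
            continue  # tied parameters are inputs to the partition, not outputs
        if any(user not in partition for user in node.users):
            outputs.append(node)
    return outputs
```

The actual fix lives in `get_source_partitions`; the sketch only illustrates the intended output-node semantics.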
Test Plan:
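```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r test_module_partitioner_weight_tied
```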
Differential Revision: D66998592
cc @ezyang @SherlockNoMad @EikanWang @jgong5 @wenzhe-nrv