[inductor] weight prepack for _convolution_transpose_pointwise by chunyuan-w · Pull Request #90266 · pytorch/pytorch · GitHub
[inductor] weight prepack for _convolution_transpose_pointwise #90266


Closed
wants to merge 19 commits into from

Conversation

@chunyuan-w chunyuan-w commented Dec 6, 2022

pytorch-bot bot commented Dec 6, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90266

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 Failures

As of commit 7ccfae3:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…wise"

cc VitalyFedyunin jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
Comment on lines 669 to 677
```cpp
// aten tensor with the shape of the prepacked tensor
at::Tensor origin_weight_t;
if (groups > 1) {
  origin_weight_t = weight.transpose(0, 1).reshape(weight_IOHW_sizes);
} else {
  origin_weight_t = weight.transpose(0, 1);
}
w = itensor_from_tensor(origin_weight_t);
w.transpose_(0, 1);
```
Collaborator:

Do we have a case where we want to handle this "aten tensor with the shape of the prepacked tensor", or do we always model the prepacked weight as an "MKLDNNTensor"?

Collaborator Author:

When compiling the graph, each node is run with FakeTensor inputs; in this case, the weight tensor is an aten tensor with the shape of the prepacked tensor.

Collaborator Author:

For Meta tensors: added an implementation, mkldnn_convolution_transpose_pointwise_meta.

For MKLDNN and CPU tensors: the computation for both is supported in the original implementation, mkldnn_convolution_transpose_pointwise.

…wise"



This PR implements weight prepack for `_convolution_transpose_pointwise`, similar to #88988.

Unlike Conv2d, the ConvTranspose weight size changes once the weight has been prepacked:
- Original weight size: `[i, o, ...]`
- Prepacked size:

  - Groups > 1:  `[g*o, i/g, ...]`
  - Groups == 1: `[o, i, ...]`

The `_convolution_transpose_pointwise` kernel handles the following two situations:

- During compilation, when running the nodes of the FX graph, the kernel gets a public weight tensor with the prepacked size. The kernel converts the weight back to the original weight size for computation.
- During execution, it gets an MKLDNN tensor and uses it directly for computation.
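The size mapping above can be sketched in plain Python. This is an illustrative sketch only: the helper name and the pure-Python shape arithmetic are not part of this PR, which performs the prepacking in C++ via oneDNN.

```python
def prepacked_deconv_weight_size(weight_size, groups):
    """Sketch of the prepacked ConvTranspose weight size described above.

    weight_size is the original [i, o, *kernel] size; the mapping is
    [g*o, i/g, ...] for groups > 1 and [o, i, ...] for groups == 1.
    """
    i, o = weight_size[0], weight_size[1]
    kernel = list(weight_size[2:])
    if groups > 1:
        return [groups * o, i // groups] + kernel
    return [o, i] + kernel

# Original weight [i=8, o=4, 3, 3]:
print(prepacked_deconv_weight_size([8, 4, 3, 3], groups=1))  # [4, 8, 3, 3]
print(prepacked_deconv_weight_size([8, 4, 3, 3], groups=2))  # [8, 4, 3, 3]
```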

cc VitalyFedyunin jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]

```cpp
const ideep::tensor x = itensor_from_tensor(input);

ideep::tensor w = itensor_from_tensor(weight);;
```
Collaborator:

Suggested change:

```diff
-ideep::tensor w = itensor_from_tensor(weight);;
+ideep::tensor w = itensor_from_tensor(weight);
```

Collaborator Author:

Fixed the typo.

```diff
@@ -709,6 +725,8 @@ Tensor mkldnn_convolution_transpose_pointwise(
     torch::List<c10::optional<at::Scalar>> scalars,
     c10::optional<c10::string_view> algorithm) {
   c10::impl::ExcludeDispatchKeyGuard edkg(c10::autograd_dispatch_keyset);
+  bool use_channels_last =
+      weight_t.is_mkldnn() || mkldnn_conv_use_channels_last(input_t, weight_t);
```
Collaborator:

The logic is OK. Not related to this PR, but would it make more sense for mkldnn_conv_use_channels_last to prefer channels last if the input is a strided tensor and the weight is mkldnn? @XiaobingSuper

Comment on lines 3353 to 3355

```python
def _conv_input_size(
    output_size, weight_size, padding, output_padding, stride, dilation, groups
):
```
Collaborator:

How about we keep this function local to _prepare_convolution_fusion_create first? It is only used by this function.

Collaborator Author:

Made this function local to _prepare_convolution_fusion_create.
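For context on the size arithmetic `_conv_input_size` is involved in: the standard ConvTranspose spatial-size formula (as documented for torch.nn.ConvTranspose2d) can be sketched as below. The helper name is illustrative, not the actual inductor code.

```python
def conv_transpose_output_size(input_size, kernel_size, stride, padding,
                               output_padding, dilation):
    # Per spatial dim:
    #   out = (in - 1)*stride - 2*padding + dilation*(kernel - 1)
    #         + output_padding + 1
    return [
        (i - 1) * s - 2 * p + d * (k - 1) + op + 1
        for i, k, s, p, op, d in zip(input_size, kernel_size, stride,
                                     padding, output_padding, dilation)
    ]

# 5x5 input, 3x3 kernel, stride 2, padding 1, output_padding 1, dilation 1:
print(conv_transpose_output_size([5, 5], [3, 3], [2, 2], [1, 1], [1, 1], [1, 1]))
# [10, 10]
```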

Comment on lines 3381 to 3384

```python
def _original_deconv_weight_size(
    prepacked_weight,
    groups,
):
```
Collaborator:

ditto.

Collaborator Author:

Fixed.
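The helper discussed here recovers the original deconv weight size from the prepacked one. A hedged pure-Python sketch of that inverse mapping follows; the name mirrors the PR's helper but this version operates on plain size lists rather than the actual prepacked tensor.

```python
def original_deconv_weight_size(prepacked_size, groups):
    # Inverse of the prepack mapping described in this PR:
    #   groups > 1:  prepacked [g*o, i/g, ...] -> original [i, o, ...]
    #   groups == 1: prepacked [o, i, ...]     -> original [i, o, ...]
    kernel = list(prepacked_size[2:])
    if groups > 1:
        o = prepacked_size[0] // groups
        i = prepacked_size[1] * groups
    else:
        o, i = prepacked_size[0], prepacked_size[1]
    return [i, o] + kernel

print(original_deconv_weight_size([6, 5, 3, 3], groups=3))  # [15, 2, 3, 3]
```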

@chunyuan-w chunyuan-w requested a review from jgong5 December 13, 2022 02:03
chunyuan-w added a commit to chunyuan-w/pytorch that referenced this pull request Dec 13, 2022
…wise"



This PR implements weight prepack for `_convolution_transpose_pointwise`, similar to #88988.

In addition, we add a kernel for Meta tensor input to reduce the compilation time.

cc VitalyFedyunin jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
@chunyuan-w chunyuan-w marked this pull request as ready for review December 15, 2022 01:33
@chunyuan-w chunyuan-w added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 15, 2022
chunyuan-w added a commit to chunyuan-w/pytorch that referenced this pull request Jan 10, 2023
@chunyuan-w
Collaborator Author

Re-opened in #91955

@chunyuan-w chunyuan-w closed this Jan 11, 2023
chunyuan-w added a commit that referenced this pull request Jan 18, 2023
…for _convolution_transpose_pointwise"


Re-open #90266 since an earlier PR on that stack got reverted.
Depends on an internal ideep upgrade.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
chunyuan-w added a commit that referenced this pull request Jan 31, 2023
…for _convolution_transpose_pointwise"


Re-open #90266 since an earlier PR on that stack got reverted.
Depends on an internal ideep upgrade.
[Update]: the internal ideep upgrade issue is resolved in #92239.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Jan 31, 2023
…pointwise (#91955)

Re-open #90266 since an earlier PR on that stack got reverted.
Depends on an internal ideep upgrade.
[Update]: the internal ideep upgrade issue is resolved in #92239.

Pull Request resolved: #91955
Approved by: https://github.com/jgong5, https://github.com/desertfire
@facebook-github-bot facebook-github-bot deleted the gh/chunyuan-w/17/head branch June 8, 2023 15:56
Labels
ciflow/inductor · ciflow/trunk (Trigger trunk jobs on your pull request) · module: cpu (CPU specific problem, e.g. perf, algorithm) · module: inductor · open source · topic: not user facing
Projects
Status: Done
Development


4 participants