add max_and_min function and cpu kernel to speed up observers by vkuzo · Pull Request #41570 · pytorch/pytorch

Closed · 5 commits

Conversation

@vkuzo (Contributor) commented Jul 17, 2020

Stack from ghstack:

Summary:

For min/max based quantization observers, calculating the min and max of a tensor
takes most of the runtime. Since both statistics are computed over the same
tensor, we can speed this up by reading the tensor only once and reducing with
two outputs.
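
To make the idea concrete, here is the single-pass, two-output reduction in plain Python. This is an illustration only; the actual implementation is a vectorized C++ CPU kernel:

```
# Illustration only: the fused reduction reads the data once and
# tracks both extremes, instead of two full passes over the tensor.
def min_and_max(values):
    lo = hi = values[0]
    for v in values[1:]:
        if v < lo:
            lo = v
        if v > hi:
            hi = v
    return lo, hi
```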

One question I had is whether we should put this into the quantization
namespace, since the use case is pretty specific.
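
For context, the observers in question update a running min/max on every forward call; a minimal sketch using the existing torch.quantization.MinMaxObserver:

```
import torch
from torch.quantization import MinMaxObserver

obs = MinMaxObserver()
obs(torch.randn(128, 128))       # each forward reduces the input to update min/max
print(obs.min_val, obs.max_val)  # the running extremes this PR speeds up computing
print(obs.calculate_qparams())   # scale and zero_point derived from min/max
```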

This PR implements the easier CPU path to get an initial validation.
Additional work is needed in future PRs, which @jpgraham will take a look at:

  • CUDA kernel and tests
  • making this work per channel
  • benchmarking on observer
  • benchmarking impact on QAT overhead

Test Plan:

```
python test/test_torch.py TestTorch.test_min_and_max
```

quick bench (not representative of a real-world use case):
https://gist.github.com/vkuzo/7fce61c3456dbc488d432430cafd6eca
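
The linked gist is the actual script; the sketch below is a hypothetical reconstruction of its shape. `torch.max_and_min` is a placeholder for the op this PR adds (the final name and namespace were still under discussion):

```
import time
import torch

x = torch.randn(50_000_000)  # large tensor so the reduction dominates

def bench(fn, iters=100):
    t0 = time.time()
    for _ in range(iters):
        fn()
    return time.time() - t0

t_separate = bench(lambda: (torch.min(x), torch.max(x)))

# torch.max_and_min is the op added by this PR; the name was still
# under review, so guard against builds where it is absent.
fused = getattr(torch, 'max_and_min', None)
if fused is not None:
    t_combined = bench(lambda: fused(x))
    print('min and max separate', t_separate)
    print('min and max combined', t_combined)
    print('% decrease', 1 - t_combined / t_separate)
```

As the output below shows, "% decrease" is reported as a fraction: the fused reduction cuts min/max time by roughly 45-47% across thread counts.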

```
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=1 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.0390) tensor(-5.4485) tensor([-5.4485,  5.0390])
min and max separate 11.90243935585022
min and max combined 6.353186368942261
% decrease 0.466228209277153
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=4 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.5586) tensor(-5.3983) tensor([-5.3983,  5.5586])
min and max separate 3.468616485595703
min and max combined 1.8227086067199707
% decrease 0.4745142294372342
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=8 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.2146) tensor(-5.2858) tensor([-5.2858,  5.2146])
min and max separate 1.5707778930664062
min and max combined 0.8645427227020264
% decrease 0.4496085496757899
```


Differential Revision: D22589349

vkuzo added a commit that referenced this pull request Jul 17, 2020
ghstack-source-id: 0f7ed56
Pull Request resolved: #41570
@vkuzo requested a review from @ngimel · Jul 17, 2020 00:38
@dr-ci (bot) commented Jul 17, 2020

💊 CI failures summary and remediations

As of commit 4944b3f (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

@vadimkantorov (Contributor) commented Jul 17, 2020

About naming: PyTorch already has torch.std_mean and torch.var_mean (they do not use "and"), so torch.min_max would also make perfect sense IMO.

There was a discussion on np.minmax, but no concrete results: numpy/numpy#9836

@vkuzo (Contributor, Author) commented Jul 17, 2020

> numpy/numpy#9836

sure, I'm flexible. min_max does read better

@ngimel (Collaborator) commented Jul 17, 2020

Also, keeping a similar structure to var_mean, it would be good to return 2 tensors and not one. Another thing: since at this stage it is not intended as a fully usable API (e.g. it does not support dim the way min/max do, it does not work on the GPU, it does not support autograd, it does not have docs) and it is unclear if it ever will, can you start the name with _? _min_max

@vkuzo (Contributor, Author) commented Jul 17, 2020

> would be good to return 2 tensors and not one

ah, great, didn't know returning multiple tensors was supported in native_functions.yaml, will do

> Another thing, since at this stage it is not intended as a fully usable API (e.g. it does not support dim the way min/max do, it does not work on the GPU, it does not support autograd, it does not have docs)

sure thing. For quantization we'll add CUDA and indices in future PRs, just to keep this PR's size manageable. But yeah, unless this is actually needed outside of quantization, keeping it private would be good.
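
For reference, torch.var_mean is the existing precedent for this pattern: a single op returning two tensors as a tuple (its native_functions.yaml schema has a (Tensor, Tensor) return). A quick demo:

```
import torch

x = torch.randn(1000)
var, mean = torch.var_mean(x)  # one op, two 0-dim tensors returned as a tuple
print(var, mean)
```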

@ngimel (Collaborator) commented Jul 17, 2020

I'm pretty sure it will be usable outside of quantization, so perhaps we should work on making it a first-class citizen. Would it be a lot of trouble for you guys to start as _min_max and then switch to min_max if it becomes stable and public?

@vkuzo (Contributor, Author) commented Jul 17, 2020

> I'm pretty sure it will be usable outside of quantization, so perhaps we should work on making it a first-class citizen. Would it be a lot of trouble for you guys to start as _min_max and then switch to min_max if it becomes stable and public?

Sounds good. Docs we can also add ourselves. For autograd support, I haven't looked into it tbh.

vkuzo added a commit that referenced this pull request Jul 17, 2020
ghstack-source-id: 6d6a101
Pull Request resolved: #41570

@ngimel (Collaborator) left a review

Ok, this looks good for starters. Looking forward to the CUDA and dim implementations.

The review included an inline comment on this fragment of the new CPU kernel:

```
output2.fill_(result.second);
}

template <typename scalar_t, typename func_t, typename vec_func_t1, typename vec_func_t2>
```
@ngimel: nit: vec_func_t1 and vec_func_t2 are the same type, so you don't need 2 template args here

@vkuzo: hmm, so I tried this a few days ago and the compiler wasn't happy. Looks like it would be some non-trivial extra work to go from lambdas to functions which resolve to the same templated type (context: https://stackoverflow.com/questions/7477310/why-cant-i-create-a-vector-of-lambdas-of-the-same-type-in-c11). Thoughts on if it's worth it?

@ngimel: No, if it's nontrivial then it's fine to leave as is.

@vkuzo: sounds good. thanks for the review!

…ers"

Summary:

For min/max based quantization observers, calculating min and max of a tensor
takes most of the runtime. Since the calculation of min and max is done
on the same tensor, we can speed this up by only reading the tensor
once, and reducing with two outputs.

One question I had is whether we should put this into the quantization
namespace, since the use case is pretty specific.

This PR implements the easier CPU path to get an initial validation.
There is some needed additional work in future PRs, which @jpgraham will
take a look at:
* CUDA kernel and tests
* making this work per channel
* benchmarking on observer
* benchmarking impact on QAT overhead

Test Plan:

```
python test/test_torch.py TestTorch.test_min_and_max
```

quick bench (not representative of real world use case):
https://gist.github.com/vkuzo/7fce61c3456dbc488d432430cafd6eca
```
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=1 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.0390) tensor(-5.4485) tensor([-5.4485,  5.0390])
min and max separate 11.90243935585022
min and max combined 6.353186368942261
% decrease 0.466228209277153
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=4 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.5586) tensor(-5.3983) tensor([-5.3983,  5.5586])
min and max separate 3.468616485595703
min and max combined 1.8227086067199707
% decrease 0.4745142294372342
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=8 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.2146) tensor(-5.2858) tensor([-5.2858,  5.2146])
min and max separate 1.5707778930664062
min and max combined 0.8645427227020264
% decrease 0.4496085496757899
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D22589349](https://our.internmc.facebook.com/intern/diff/D22589349)

[ghstack-poisoned]
…ers"

Summary:

For min/max based quantization observers, calculating min and max of a tensor
takes most of the runtime. Since the calculation of min and max is done
on the same tensor, we can speed this up by only reading the tensor
once, and reducing with two outputs.

One question I had is whether we should put this into the quantization
namespace, since the use case is pretty specific.

This PR implements the easier CPU path to get an initial validation.
There is some needed additional work in future PRs, which @jpgraham will
take a look at:
* CUDA kernel and tests
* making this work per channel
* benchmarking on observer
* benchmarking impact on QAT overhead

Test Plan:

```
python test/test_torch.py TestTorch.test_min_and_max
```

quick bench (not representative of real world use case):
https://gist.github.com/vkuzo/7fce61c3456dbc488d432430cafd6eca
```
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=1 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.0390) tensor(-5.4485) tensor([-5.4485,  5.0390])
min and max separate 11.90243935585022
min and max combined 6.353186368942261
% decrease 0.466228209277153
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=4 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.5586) tensor(-5.3983) tensor([-5.3983,  5.5586])
min and max separate 3.468616485595703
min and max combined 1.8227086067199707
% decrease 0.4745142294372342
(pytorch) [vasiliy@devgpu108.ash6 ~/local/pytorch] OMP_NUM_THREADS=8 python ~/nfs/pytorch_scripts/observer_bench.py
tensor(5.2146) tensor(-5.2858) tensor([-5.2858,  5.2146])
min and max separate 1.5707778930664062
min and max combined 0.8645427227020264
% decrease 0.4496085496757899
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D22589349](https://our.internmc.facebook.com/intern/diff/D22589349)

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 21, 2020
ghstack-source-id: 72bf781
Pull Request resolved: #41570
vkuzo added a commit that referenced this pull request Jul 21, 2020
ghstack-source-id: a5e8ecb
Pull Request resolved: #41570

@facebook-github-bot (Contributor) commented

This pull request has been merged in 302e566.

@facebook-github-bot deleted the gh/vkuzo/107/head branch · Jul 25, 2020 14:17