sync with top of pytorch tree #1810

xw285cornell · 2025-01-02T17:40:55Z

I wonder if it's related to the build timeout for the PRs based on this branch, e.g. pytorch#143695

…143332)" This reverts commit a9c753b. Reverted #143332 on behalf of https://github.com/malfet due to Surprisingly failure is caused by this PR ([comment](#143332 (comment)))

ditto Pull Request resolved: #143685 Approved by: https://github.com/kit1980, https://github.com/seemethere, https://github.com/atalman

This PR adds support for export to unwrap/wrap subclasses AOT so that we can trace through subclass parameters. This will resolve the UX issue in torchao where users had to manually unwrap their subclasses before calling export. Differential Revision: [D67531057](https://our.internmc.facebook.com/intern/diff/D67531057) Pull Request resolved: #141941 Approved by: https://github.com/bdhirsh

tlparse PR: pytorch/tlparse#83 Pull Request resolved: #141907 Approved by: https://github.com/ezyang

clearing at dynamo start is an issue because it throws away events from compiled autograd Pull Request resolved: #143175 Approved by: https://github.com/Skylion007, https://github.com/jamesjwu ghstack dependencies: #141907

When we unflatten, the submodules we generate (`InterpreterModule` or `InterpreterModuleDispatcher`) are not related by type to the original submodules `N`. This makes `isinstance(mod, N)` checks fail. Since we do not have the original types after export, the best we can do is expose a `type_name()` method that carries the original type name, which we do carry in `nn_module_stack` entries. Differential Revision: [D67526542](https://our.internmc.facebook.com/intern/diff/D67526542/) Pull Request resolved: #143664 Approved by: https://github.com/tugsbayasgalan
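A minimal sketch of why the `isinstance` check fails and how a name-based check can stand in for it, using plain Python classes rather than the real export machinery. `FakeInterpreterModule` and `is_original_type` are hypothetical names invented for illustration; only the `type_name()` method name comes from the description above.

```python
class N:
    """Stands in for an original user-defined submodule type."""


class FakeInterpreterModule:
    """Hypothetical stand-in for a module generated by unflattening.

    It is unrelated by type to N but remembers N's qualified name.
    """

    def __init__(self, original_type_name: str) -> None:
        self._original_type_name = original_type_name

    def type_name(self) -> str:
        # Return the recorded name of the original module type.
        return self._original_type_name


def is_original_type(mod, cls) -> bool:
    """Name-based substitute for isinstance(mod, cls) (hypothetical helper)."""
    expected = f"{cls.__module__}.{cls.__qualname__}"
    return getattr(mod, "type_name", lambda: None)() == expected


mod = FakeInterpreterModule(f"{N.__module__}.{N.__qualname__}")
print(isinstance(mod, N))        # the generated module is not an N
print(is_original_type(mod, N))  # the recorded name still matches
```

The name comparison is weaker than a real type check (two unrelated classes can share a name), which is consistent with the description calling it "the best we can do" after export.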

Test Plan: Sandcastle Differential Revision: D67549758 Pull Request resolved: #143693 Approved by: https://github.com/huydhn

Pull Request resolved: #141748 Approved by: https://github.com/ezyang

Use set_feature_use for logging aot autograd cache so that dynamo_compile has this data as well as PT2 Compile Events. Differential Revision: [D67536293](https://our.internmc.facebook.com/intern/diff/D67536293/) Pull Request resolved: #143674 Approved by: https://github.com/bobrenjc93

Pull Request resolved: #143548 Approved by: https://github.com/yanboliang, https://github.com/jansel, https://github.com/williamwen42

Pull Request resolved: #143567 Approved by: https://github.com/williamwen42, https://github.com/jansel ghstack dependencies: #143548

Fix #143472 Pull Request resolved: #143491 Approved by: https://github.com/desertfire, https://github.com/jansel, https://github.com/EikanWang

…per in runtime. (#142322) This PR aims to remove the dependency on the Intel Compiler at Inductor runtime. Now we only need SYCL_HOME at runtime to find the SYCL headers and libs. Pull Request resolved: #142322 Approved by: https://github.com/EikanWang, https://github.com/desertfire, https://github.com/albanD ghstack dependencies: #143491

Summary: Emit a CMakeLists.txt with compile and link options when package_cpp_only is specified. After unzipping AOTI generated .pt2 package file, user can manually build the generated model code in their local environment. Pull Request resolved: #143680 Approved by: https://github.com/huydhn

This reverts commit c7d9f29. Reverted #143402 on behalf of https://github.com/huydhn due to The internal diff D67148738 has been reverted ([comment](#143402 (comment)))

…k memory usage (#143347)" This reverts commit efe21ee. Reverted #143347 on behalf of https://github.com/huydhn due to D67118173 has been backed out internally ([comment](#143347 (comment)))

https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmprli4iy/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=100

```
[
  {
    "args": {
      "compile_id": "0/-/-",
      "graph_id": 0
    },
    "cat": "dynamo_timed",
    "name": "compiled_autograd",
    "ph": "B",
    "pid": 0,
    "tid": 0,
    "ts": 1733886868992655.8
  },
  {
    "args": {
      "compile_id": "0/-/-",
      "graph_id": 0
    },
    "cat": "dynamo_timed",
    "name": "compiled_autograd",
    "ph": "E",
    "pid": 0,
    "tid": 0,
    "ts": 1733886869130681.0
  },
  {
    "args": {
      "compile_id": "0/0/0"
    },
    "cat": "dynamo_timed",
    "name": "dynamo",
    "ph": "B",
    "pid": 0,
    "tid": 0,
    "ts": 1733886869134350.5
  },
  {
```

Pull Request resolved: #140964 Approved by: https://github.com/masnesral ghstack dependencies: #141907, #143175

…143693)" This reverts commit ae3d385. Reverted #143693 on behalf of https://github.com/huydhn due to Sorry for reverting this change but it has a conflict with #143639 that is breaking trunk ([comment](#143693 (comment)))

This reverts commit 23ca7c2. Reverted #143639 on behalf of https://github.com/huydhn due to This is failing OSS tests ([comment](#143639 (comment)))

Reuse partial reductions for complete reductions. We could expand this to cover more types of reductions, although we'd have to be a bit more careful about keeping the intermediary, partial reduction in higher precision. Just doing the ops which do not depend on a higher compute_dtype_precision for now to cover the relevant use case initially. Fix for #136267. Longer term, we should make sure cooperative reductions fuse partial and complete reductions. Pull Request resolved: #143600 Approved by: https://github.com/vkuzo
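The idea of reusing a partial reduction can be sketched in plain Python. This is only an illustration of the arithmetic, not Inductor's code generation; the function names are hypothetical, and integers are used so the precision concern mentioned above does not arise.

```python
from typing import List


def partial_row_sums(rows: List[List[int]]) -> List[int]:
    # Partial reduction: reduce each row independently.
    return [sum(row) for row in rows]


def complete_sum_from_partials(partials: List[int]) -> int:
    # Complete reduction: reduce the already-computed partial results
    # instead of re-reading every input element.
    return sum(partials)


data = [[1, 2, 3], [4, 5, 6]]
partials = partial_row_sums(data)
total = complete_sum_from_partials(partials)
print(partials, total)
```

With floating-point inputs the partial results would need to be kept in a wide enough type for the second step to match a direct full reduction, which is the caveat the commit message raises.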

Summary: LLVM-15 has a warning `-Wunused-variable` which we treat as an error because it's so often diagnostic of a code issue. Unused variables can compromise readability or, worse, performance. This diff either (a) removes an unused variable and, possibly, its associated code or (b) qualifies the variable with `[[maybe_unused]]`. - If you approve of this diff, please use the "Accept & Ship" button :-) Test Plan: Sandcastle Pull Request resolved: #143639 Approved by: https://github.com/kit1980, https://github.com/malfet, https://github.com/cyyever

Test Plan: Sandcastle Pull Request resolved: #143693 Approved by: https://github.com/huydhn

This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned audio hash. Pull Request resolved: #143694 Approved by: https://github.com/pytorchbot

…ting a trace like generated kernels and index tensor data (#143430)" This reverts commit 33dd4f1. Reverted #143430 on behalf of https://github.com/huydhn due to The internal diff D58707846 has been backed out ([comment](#143430 (comment)))

# Summary:

Full Context: https://docs.google.com/document/d/1-j5KSbfGFJQcH4sYh7BIeJXso3zYzl5G5yFQqXdKx_o/edit?usp=sharing

tl;dr

This change introduces classes which help determine a dynamic memory budget. This will mostly be helpful for models with many implicit graph breaks.

---

New Classes:

*GraphInfoProvider*
* Takes the joint_graph as well as the input memories and runtimes and parses the graph + values into usable forms for the SolverEvaluator.

*KnapsackEvaluator*
* Provides a function: Given all of the four inputs (solver function as a callable, max_dynamic_memory_budget, min_dynamic_memory_budget, dynamic_memory_budget_pareto_granularity) it returns an approximation of the knee point of the pareto distribution.

# Test Plan:

### LintRunner

LintRunner Output: P1700445547

### Unit Tests

```
$ buck test @mode/opt //caffe2/test/functorch:test_ac_knapsack
`@mode/opt` was specified, but not found. Using file at `//mode/opt`.
This behavior is being deprecated. Please use `"@//mode/opt"` instead
File changed: fbcode//caffe2/.ruff_cache/0.7.4/.tmpB6PmDS
File changed: fbsource//xplat/caffe2/test/functorch/test_ac_knapsack.py
File changed: fbcode//caffe2/.ruff_cache/0.7.4/.tmpyjCiPn
20 additional file change events
Buck UI: https://www.internalfb.com/buck2/414ead46-9ede-4192-8e1a-5d3c52bdb9cc
Test UI: https://www.internalfb.com/intern/testinfra/testrun/6473924710342830
Network: Up: 0B Down: 0B (reSessionID-159794b9-9d61-477e-8e63-9bdeaa537dca)
Analyzing targets. Remaining 0/214
Executing actions. Remaining 0/6933 0.1s exec time total
Command: test. Finished 1 local
Time elapsed: 18.5s
Tests finished: Pass 15. Fail 0. Fatal 0. Skip 0. Build failure 0
```

### Test Run

Updated the config:

```
activation_memory_budget_solver: DYNAMIC_MEMORY_BUDGET_DP
```

Confirming proper execution via: [aps-fb_fm_v4_768_01_dynamic-2a792ba8af](https://www.internalfb.com/mlhub/pipelines/runs/mast/aps-fb_fm_v4_768_01_dynamic-2a792ba8af?job_attempt=0&version=0&env=PRODUCTION)

Pull Request resolved: #143539 Approved by: https://github.com/jansel
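For readers unfamiliar with the "knee point" mentioned in the summary, here is a minimal sketch of one common way to approximate it: normalise both axes and pick the point farthest from the chord joining the two end points of the frontier. This is an assumption for illustration only; the PR does not state which method `KnapsackEvaluator` uses, and `knee_point` and the sample data are hypothetical.

```python
from typing import List, Tuple


def knee_point(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return the point farthest from the chord joining the two end points.

    points: (memory_budget, runtime) pairs; at least two, sorted by budget.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    # Normalise both axes to [0, 1] so neither unit dominates the distance.
    norm = [((x - x_min) / x_span, (y - y_min) / y_span) for x, y in points]
    (x0, y0), (x1, y1) = norm[0], norm[-1]

    def distance_to_chord(p: Tuple[float, float]) -> float:
        # Twice the triangle area; proportional to distance from the chord.
        return abs((x1 - x0) * (y0 - p[1]) - (x0 - p[0]) * (y1 - y0))

    best = max(range(len(points)), key=lambda i: distance_to_chord(norm[i]))
    return points[best]


# Hypothetical frontier: runtime drops quickly, then flattens out.
frontier = [(0.0, 10.0), (0.25, 4.0), (0.5, 3.0), (0.75, 2.5), (1.0, 2.0)]
print(knee_point(frontier))
```

On this sample data the selected point is the one after which extra memory budget buys little further runtime improvement.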

Fixes #ISSUE_NUMBER Pull Request resolved: #141787 Approved by: https://github.com/albanD

Retracing while preserving module call signatures used to be a problem because graph modules don't have submodules at given paths. This led to a number of failing retraceability tests. By not trying to wrap modules with export tracepoints we can pass most of these tests; the only exception is where you do module swapping on retraced programs, which is still not possible. Differential Revision: [D67539304](https://our.internmc.facebook.com/intern/diff/D67539304/) Pull Request resolved: #143676 Approved by: https://github.com/zhxchen17, https://github.com/tugsbayasgalan ghstack dependencies: #143664

This reverts commit 6733045. Reverted #140030 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but my first attempt to fix internal build does not fix all the cases, so let us try again ([comment](#140030 (comment)))

Fixes #ISSUE_NUMBER Pull Request resolved: #143355 Approved by: https://github.com/albanD

Summary: If the module being quantized contains some meta tensors and some tensors on an actual device, we should not fail quantization. Quantization should also not fail if the new quantized module is created on a meta device. Differential Revision: D66895899 Pull Request resolved: #142262 Approved by: https://github.com/iamzainhuda

Fix #143967 Pull Request resolved: #143970 Approved by: https://github.com/EikanWang, https://github.com/jansel

Fixes #136862

1. removed dead code from torch/_dynamo/convert_frame.py
2. ran `lintrunner -a` and all the tests passed.
3. ran the unit tests and everything seems to be in order.

Pull Request resolved: #140938 Approved by: https://github.com/zou3519

Pull Request resolved: #142347 Approved by: https://github.com/gujinghui, https://github.com/albanD

# Motivation Due to the potential for the external SYCL queue to have a low priority, we need to support the low-priority SYCL queue for native XPU Streams to maintain consistency. Pull Request resolved: #141119 Approved by: https://github.com/gujinghui, https://github.com/albanD ghstack dependencies: #142347

# Motivation This PR aims to introduce `torch.xpu.ExternalStream`, which wraps a SYCL queue created in another library so that it can be used in PyTorch. # Additional Context Pull Request resolved: #141123 Approved by: https://github.com/albanD, https://github.com/EikanWang ghstack dependencies: #142347, #141119

Pull Request resolved: #143799 Approved by: https://github.com/albanD, https://github.com/EikanWang ghstack dependencies: #142347, #141119, #141123

# Motivation As mentioned in #141119 (comment), we properly handle the priority value if it is outside of the priority range. # Additional Context If the value falls outside of the allowed priority range, it will automatically be mapped to the nearest valid priority (either lowest or highest). Pull Request resolved: #143849 Approved by: https://github.com/albanD, https://github.com/EikanWang ghstack dependencies: #142347, #141119, #141123, #143799
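The mapping described above amounts to clamping. A minimal sketch in Python, assuming the two bounds are known; `clamp_priority` and the example bounds are hypothetical and are not taken from the PyTorch sources.

```python
def clamp_priority(priority: int, lowest: int, highest: int) -> int:
    """Map an out-of-range priority to the nearest valid bound."""
    # Accept the bounds in either numeric order, since some backends
    # use smaller numbers for higher priority.
    lo, hi = min(lowest, highest), max(lowest, highest)
    return max(lo, min(priority, hi))


# Hypothetical range [-1, 0]
print(clamp_priority(5, -1, 0))
print(clamp_priority(-7, -1, 0))
print(clamp_priority(0, -1, 0))
```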

By calling `metal::min` and `metal::max` respectively, with arguments typecast to a common type to avoid ambiguous-call errors. TODO: Implement NaN propagation for both eager and compile, see #143976. `pytest test/inductor/test_torchinductor.py -k _mps` score is 460 failed, 291 passed, 32 skipped. Pull Request resolved: #143977 Approved by: https://github.com/jansel ghstack dependencies: #143948, #143949, #143973

At the moment, by generating multiple MetalLibraries. `pytest test/inductor/test_torchinductor.py -k _mps` score is 434 failed, 317 passed, 32 skipped. Pull Request resolved: #143998 Approved by: https://github.com/jansel, https://github.com/ruidazeng ghstack dependencies: #143948, #143949, #143973, #143977

This reverts commit 135a2d4. Reverted #142350 on behalf of https://github.com/jeanschmidt due to breaking internal signals ([comment](#142350 (comment)))

…de in (#143975)" This reverts commit 7c1c073. Reverted #143975 on behalf of https://github.com/jeanschmidt due to Need to revert in order to be able to revert #139321 feel free to merge it back once conflicts are cleared ([comment](#143975 (comment)))

This reverts commit 9e8d84f. Reverted #139321 on behalf of https://github.com/jeanschmidt due to breaking internal signals ([comment](#139321 (comment)))

See #144006

```py
__________________________________________ CudaReproTests.test_repeated_masked_load __________________________________________
RuntimeError: First class dim doesn't work with python 3.12

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jansel/conda/envs/pytorch/lib/python3.12/unittest/case.py", line 58, in testPartExecutor
    yield
  File "/home/jansel/conda/envs/pytorch/lib/python3.12/unittest/case.py", line 634, in run
    self._callTestMethod(testMethod)
  File "/home/jansel/conda/envs/pytorch/lib/python3.12/unittest/case.py", line 589, in _callTestMethod
    if method() is not None:
       ^^^^^^^^
  File "/home/jansel/pytorch/torch/testing/_internal/common_utils.py", line 3108, in wrapper
    method(*args, **kwargs)
  File "/home/jansel/pytorch/test/inductor/test_cuda_repro.py", line 1678, in test_repeated_masked_load
    from functorch.einops import rearrange
  File "/home/jansel/pytorch/functorch/einops/__init__.py", line 1, in <module>
    from .rearrange import rearrange
  File "/home/jansel/pytorch/functorch/einops/rearrange.py", line 7, in <module>
    from functorch._C import dim as _C
ImportError: initialization failed
```

Pull Request resolved: #144006 Approved by: https://github.com/Skylion007

Fixes #141426 Please see details in the issue. Pull Request resolved: #141427 Approved by: https://github.com/jansel

Pull Request resolved: #143926 Approved by: https://github.com/jansel

As titled, this PR adds a kwarg `src_data_rank` to the `distribute_tensor` API, to allow users to specify a particular rank as the source of the full tensor data. Previously we specified group_rank=0 by default as the source of truth for single-device semantics. This new option:

* gives advanced users the flexibility to choose the source data rank
* allows users to specify None explicitly, which means we skip the communication (scatter/broadcast) for cases that do not care about single-device semantics (e.g. loading from a checkpoint)

Pull Request resolved: #143883 Approved by: https://github.com/XilunWu, https://github.com/tianyu-l
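The effect of the option can be sketched with a small pure-Python simulation. This does not call PyTorch; `simulate_distribute` is a hypothetical helper that models each rank's local copy of the tensor as a list and shows which shard every rank ends up with, under the assumption of an even split along the first dimension.

```python
from typing import List, Optional, Sequence


def simulate_distribute(
    per_rank_data: Sequence[List[int]],
    src_data_rank: Optional[int] = 0,
) -> List[List[int]]:
    """Return the local shard each rank would hold after distribution.

    per_rank_data[r] is the full tensor as rank r sees it locally.
    """
    world_size = len(per_rank_data)

    def shard(full: List[int], rank: int) -> List[int]:
        # Even split along dim 0; assumes len(full) divides by world_size.
        size = len(full) // world_size
        return full[rank * size:(rank + 1) * size]

    if src_data_rank is None:
        # No communication: every rank shards its own local copy.
        return [shard(per_rank_data[r], r) for r in range(world_size)]
    # Scatter: every rank receives its shard of the source rank's data.
    source = per_rank_data[src_data_rank]
    return [shard(source, r) for r in range(world_size)]


local_views = [[0, 1, 2, 3], [10, 11, 12, 13]]
print(simulate_distribute(local_views, src_data_rank=0))
print(simulate_distribute(local_views, src_data_rank=1))
print(simulate_distribute(local_views, src_data_rank=None))
```

When every rank already holds identical data, as after loading a checkpoint, the `None` path gives the same shards as the scatter path without any communication.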

As titled, this PR propagates `src_data_rank` through the TP API, so that module-level APIs can leverage the flexibility to choose `src_data_rank` and avoid the communication when it is not needed. Pull Request resolved: #144005 Approved by: https://github.com/tianyu-l ghstack dependencies: #143883

Follow-up after #143934: this check is no longer necessary, and dropping it fixes a subset of inductor tests. Before, `pytest test/inductor/test_torchinductor.py -k _mps` reports 463 failed, 291 passed, 32 skipped; after, 456 failed, 298 passed, 32 skipped. Pull Request resolved: #144055 Approved by: https://github.com/Skylion007

Fixes #143146 Pull Request resolved: #144030 Approved by: https://github.com/malfet

Change the label to make sure the jobs land on a node which has >= 4 GPUs. Pull Request resolved: #140319 Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/kwen2501

rocm-repo-management-api · 2025-01-02T17:55:57Z

Jenkins build for 8f3eb843730f38d7307228485b1accc69c4aa0f0 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

…3944) Pull Request resolved: #143944 Approved by: https://github.com/aorenste ghstack dependencies: #143943

rocm-repo-management-api · 2025-01-02T18:25:49Z

Jenkins build for 8506a2af9aced8f084a27dbf73811a947a47d3f7 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

# Summary: This also makes updates to different repositories throughout FB code to roll any updates needed for this new release. I was not able to get AsyncMM.cu to build (still trying); Yfiu suggested that I just skip it for now. Test Plan: Have run various build commands to try and expose errors. Pull Request resolved: #143515 Approved by: https://github.com/eqy, https://github.com/Skylion007

rocm-repo-management-api · 2025-01-02T18:55:46Z

Jenkins build for a8c98ce175e20c071a209e7aa69f9f28897cda8b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Add networkx as a dependency for test_bazel

Example failure: https://github.com/pytorch/pytorch/actions/runs/12551752021/job/34996706301

```
INFO: From Testing //:test_bazel:
==================== Test output for //:test_bazel:
Traceback (most recent call last):
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/test/_test_bazel.py", line 33, in <module>
    test_simple_compile_eager()
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/test/_test_bazel.py", line 27, in test_simple_compile_eager
    opt_foo1 = torch.compile(foo, backend="eager")
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/__init__.py", line 2533, in compile
    backend = _TorchCompileWrapper(backend, mode, options, dynamic)
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/__init__.py", line 2342, in __init__
    self.compiler_fn = lookup_backend(backend)
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/backends/registry.py", line 66, in lookup_backend
    _lazy_import()
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/backends/registry.py", line 102, in _lazy_import
    import_submodule(backends)
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/utils.py", line 2797, in import_submodule
    importlib.import_module(f"{mod.__name__}.{filename[:-3]}")
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/execroot/pytorch/external/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/backends/common.py", line 12, in <module>
    from torch._functorch.aot_autograd import (
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_functorch/aot_autograd.py", line 147, in <module>
    from .partitioners import default_partition
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_functorch/partitioners.py", line 31, in <module>
    from ._activation_checkpointing.graph_info_provider import GraphInfoProvider
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_functorch/_activation_checkpointing/graph_info_provider.py", line 3, in <module>
    import networkx as nx
ModuleNotFoundError: No module named 'networkx'
```

No periodic runs on this PR or its main branch commit, but I'm pretty sure it started on https://togithub.com/pytorch/pytorch/pull/143539

Pull Request resolved: #143995 Approved by: https://github.com/huydhn

rocm-repo-management-api · 2025-01-02T19:55:48Z

Jenkins build for bb5e439f2d8a46172b8b7d2fdb7609822b9a97b1 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

pytorchmergebot and others added 30 commits December 21, 2024 00:06

Revert "[logging] A few fixes/updates to record_compilation_metrics (#…

ad7ab5e

…143332)" This reverts commit a9c753b. Reverted #143332 on behalf of https://github.com/malfet due to Surprisingly failure is caused by this PR ([comment](#143332 (comment)))

[BE] Remove gcc-5 workaround for unused args (#143685)

553031f

ditto Pull Request resolved: #143685 Approved by: https://github.com/kit1980, https://github.com/seemethere, https://github.com/atalman

[ca] add compiled autograd to CompileId (#141907)

4ee166b

tlparse PR: pytorch/tlparse#83 Pull Request resolved: #141907 Approved by: https://github.com/ezyang

Fix issue with setAttribute and int8_t vs int32_t variables (#143693)

ae3d385

Test Plan: Sandcastle Differential Revision: D67549758 Pull Request resolved: #143693 Approved by: https://github.com/huydhn

[aot] refactor dynamo source and cudagraphs static idx logic (#141748)

ffd1b53

Pull Request resolved: #141748 Approved by: https://github.com/ezyang

[dynamo] Support user defined dicts (#143548)

4627cfd

Pull Request resolved: #143548 Approved by: https://github.com/yanboliang, https://github.com/jansel, https://github.com/williamwen42

[dynamo] Remove transformers ModelOutput hack (#143567)

0da004f

Pull Request resolved: #143567 Approved by: https://github.com/williamwen42, https://github.com/jansel ghstack dependencies: #143548

[Inductor XPU] Add XPU check for is_big_gpu(). (#143491)

af0e159

Fix #143472 Pull Request resolved: #143491 Approved by: https://github.com/desertfire, https://github.com/jansel, https://github.com/EikanWang

Revert "(MTIA) Move "empty_cache" API (#143402)"

dabc956

This reverts commit c7d9f29. Reverted #143402 on behalf of https://github.com/huydhn due to The internal diff D67148738 has been reverted ([comment](#143402 (comment)))

Revert "[MTIA] (3/n) Implement PyTorch APIs to query/reset device pea…

c7d7eff

…k memory usage (#143347)" This reverts commit efe21ee. Reverted #143347 on behalf of https://github.com/huydhn due to D67118173 has been backed out internally ([comment](#143347 (comment)))

Revert "Fix unused-variable issues in caffe2 (#143639)"

97990f4

This reverts commit 23ca7c2. Reverted #143639 on behalf of https://github.com/huydhn due to This is failing OSS tests ([comment](#143639 (comment)))

Fix issue with setAttribute and int8_t vs int32_t variables (#143693)

9f3c291

Test Plan: Sandcastle Pull Request resolved: #143693 Approved by: https://github.com/huydhn

Fix cppcoreguidelines-pro-type-member-init (#141787)

d7e59c2

Fixes #ISSUE_NUMBER Pull Request resolved: #141787 Approved by: https://github.com/albanD

Enable more C++ warnings (#143355)

daa3ffe

Fixes #ISSUE_NUMBER Pull Request resolved: #143355 Approved by: https://github.com/albanD

etaf and others added 20 commits December 31, 2024 06:28

[AOTI] Not use AOTI_TORCH_CHECK in non AOTI mode. (#143970)

01034e9

Fix #143967 Pull Request resolved: #143970 Approved by: https://github.com/EikanWang, https://github.com/jansel

Refine XPU external Stream (#142347)

39450ae

Pull Request resolved: #142347 Approved by: https://github.com/gujinghui, https://github.com/albanD

Add get_stream_from_external API for CUDA backend (#143799)

3848de5

Pull Request resolved: #143799 Approved by: https://github.com/albanD, https://github.com/EikanWang ghstack dependencies: #142347, #141119, #141123

Revert "Update low prec codegen for div/mod (#142350)"

eec3091

This reverts commit 135a2d4. Reverted #142350 on behalf of https://github.com/jeanschmidt due to breaking internal signals ([comment](#142350 (comment)))

Revert "Fix duplicate pattern error (#139321)"

a174ee2

This reverts commit 9e8d84f. Reverted #139321 on behalf of https://github.com/jeanschmidt due to breaking internal signals ([comment](#139321 (comment)))

Fix a bug for wrong stride in fake tensor (#141427)

8d9ff9c

Fixes #141426 Please see details in the issue. Pull Request resolved: #141427 Approved by: https://github.com/jansel

[dynamo] Separate out GetItemSource and DictGetItemSource (#143926)

dec1a6d

Pull Request resolved: #143926 Approved by: https://github.com/jansel

Enable mkldnn pattern matcher tests for BF16 on AArch64 (#144030)

916b510

Fixes #143146 Pull Request resolved: #144030 Approved by: https://github.com/malfet

ROCm: Enable 4 gpu tests for distributed config (#140319)

8f3eb84

Change the label to make sure the jobs land on a node which has >= 4 GPUs. Pull Request resolved: #140319 Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/kwen2501

xw285cornell requested review from jeffdaily and jithunnair-amd as code owners January 2, 2025 17:40

remove allow-untyped-defs from _export/pass_infra/proxy_value.py (#14…

8506a2a

…3944) Pull Request resolved: #143944 Approved by: https://github.com/aorenste ghstack dependencies: #143943

jeffdaily merged this pull request into ROCm:main Jan 2, 2025
1 check failed
