Tags: NripeshN/pytorch

ciflow/xpu/141479

rebase to 5d6acd5

ciflow/trunk/142447

[Device] Add "mps" to `torch._utils._get_device_attr`

This is a regression introduced by pytorch#141098 that went unnoticed due to pytorch#142206
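
For context, a rough sketch of the shape of the fix (this mirrors how `torch._utils._get_device_attr` dispatches on the available backend; the exact code in the PR may differ):

```
import torch

def _get_device_attr(get_member):
    # Sketch only: approximate structure of torch._utils._get_device_attr
    # with the "mps" branch this change adds; details may differ in the PR.
    device_type = torch._utils._get_available_device_type()
    if device_type and device_type.lower() == "cuda":
        return get_member(torch.cuda)
    if device_type and device_type.lower() == "mps":
        return get_member(torch.mps)  # branch restored by this change
    if device_type and device_type.lower() == "xpu":
        return get_member(torch.xpu)
    return None
```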

Test plan:
```
python test_autograd.py -v -k test_dataparallel_saved_tensors_hooks
```

Before this change it failed with
```
ERROR: test_dataparallel_saved_tensors_hooks (__main__.TestMultithreadAutograd.test_dataparallel_saved_tensors_hooks)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/malfet/git/pytorch/pytorch/torch/testing/_internal/common_utils.py", line 3108, in wrapper
    method(*args, **kwargs)
    ~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/malfet/git/pytorch/pytorch/test/test_autograd.py", line 13074, in test_dataparallel_saved_tensors_hooks
    model = torch.nn.DataParallel(Model())
  File "/Users/malfet/git/pytorch/pytorch/torch/nn/parallel/data_parallel.py", line 153, in __init__
    raise RuntimeError("no available devices were found")
RuntimeError: no available devices were found
```

After this change it passes.

ciflow/trunk/142441

[fr] change back vlog(2) to LOG(INFO)

Summary:
Change the log message for future execution back from VLOG(2) to LOG(INFO).
This message is useful for verifying whether Flight Recorder dumps completed successfully.

Test Plan: Tested manually on a MAST job and confirmed that the INFO message appeared as expected.

Differential Revision: D66996439

ciflow/trunk/142271

[Profiler] Add CUDA Overhead to Auto-trace (pytorch#142271)

Summary:

We already have CUDA OVERHEAD events enabled in on-demand profiling, so we should also add them to auto-trace.
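
For reference, a minimal auto-trace collection in which these overhead events would now appear (illustrative workload, not the benchmark below; assumes a CUDA device):

```
import torch
from torch.profiler import ProfilerActivity, profile

# Illustrative auto-trace via torch.profiler; with this change, CUDA
# overhead events are recorded here as they already are in on-demand traces.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    torch.cuda.synchronize()

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```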

Test Plan:
Tested using servicelab and found no performance difference:
kineto_benchmark
    duration_ms: 21668
    number_of_events: 26542
    profiler_prepare_call_duration_us: 970
    profiler_enable_call_duration_us: 616474
    profiling_window_duration_us: 2188525
    profiler_disable_call_duration_us: 148628
    parse_kineto_call_duration_us: 1672536
    function_events_build_tree_call_duration_us: 285939


kineto_benchmark
    duration_ms: 21718
    number_of_events: 26556
    profiler_prepare_call_duration_us: 885
    profiler_enable_call_duration_us: 7037
    profiling_window_duration_us: 1772481
    profiler_disable_call_duration_us: 174122
    parse_kineto_call_duration_us: 1983683
    function_events_build_tree_call_duration_us: 333582

Differential Revision: D66904879

ciflow/trunk/142093

Update on "[dtensor][cp][experiment] add CP experimental API to choose rotate method"


**Summary**
This PR adds a new experimental API `set_rotate_method` for Context Parallel. This API allows users to choose the desired communication method (all-to-all or all-gather) for shard rotation.
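
A minimal usage sketch follows; the import path and the accepted strings are assumptions based on this description rather than a confirmed public surface:

```
# Assumed usage of the experimental API described above; the module path and
# the values ("alltoall" / "allgather") are assumptions, not confirmed.
from torch.distributed.tensor.experimental._attention import set_rotate_method

# Choose the collective used to rotate KV shards during Context Parallel attention.
set_rotate_method("alltoall")  # or "allgather"
```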

**Test**
`pytest test/distributed/_tensor/test_attention.py`

cc H-Huang awgu kwen2501 wanchaol fegin fduwjj wz337 wconstab d4l3k c-p-i-o

[ghstack-poisoned]

ciflow/trunk/141970

Update on "add torchrec collectives to enforce global ordering"

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang aakhundov

[ghstack-poisoned]

ciflow/trunk/141941

Update on "Support tensor subclass unwrapping"


Differential Revision: [D66690419](https://our.internmc.facebook.com/intern/diff/D66690419)

This PR adds support for export to unwrap/wrap subclasses AOT so that we can trace through subclass parameters. This will resolve the UX issue in torchao where users had to manually unwrap their subclasses before calling export.
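
As an illustration of the intended UX (the module and the subclass swap below are hypothetical, described only in comments):

```
import torch

# Hypothetical example: export a module whose parameters may be tensor
# subclasses (e.g. torchao quantized tensors). With AOT unwrapping, export
# can trace through such parameters without a manual unwrap step.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)

m = M()
# In the torchao scenario, m.linear.weight would be a tensor subclass here.
ep = torch.export.export(m, (torch.randn(2, 8),))
print(ep)
```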

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]

ciflow/trunk/141857

Add support for bfloat16 atomic adds in fbcode

ciflow/trunk/141842

Update on "Refactor NJT to hold metadata on nested int"


Design: https://docs.google.com/document/d/1HV9719blS8OJxf8kuW5U3ihaoTR_H7sJJ29U7mT4J1g/edit?tab=t.0#heading=h.w4x2tmi9rtmd
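
For background, the "nested int" is the symbolic size of an NJT's ragged dimension; a small illustrative example:

```
import torch

# Illustrative: a jagged-layout NJT reports its ragged dimension as a nested
# int (e.g. j1) in its shape; this refactor concerns where that symbol's
# metadata is stored.
nt = torch.nested.nested_tensor(
    [torch.randn(2, 5), torch.randn(3, 5)],
    layout=torch.jagged,
)
print(nt.shape)  # torch.Size([2, j1, 5]); j1 is the nested int
```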

cc ezyang SherlockNoMad EikanWang jgong5 wenzhe-nrv voznesenskym penguinwu Guobing-Chen XiaobingSuper zhuhaozhe blzheng jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]

ciflow/trunk/141453

Minor lint refinements
