Update on "[WIP] sync and async torch.distributed.rpc for builtin operators"
Features:
* sync and async RPC for builtin operators
* RpcAgent API
* ProcessGroupAgent implementation
Goal:
* have a minimal working and testable RPC implementation for #23110
* make sure the RpcAgent API is sufficient for future ThriftAgent and TensorPipeAgent implementations
  * For the TensorPipe implementation, the agent might allocate multiple underlying communication channels of different types, and might also use streaming serialization/deserialization for large tensors. To support this requirement, the current implementation only converts a BuiltinOp into a Message, which contains a byte vector and a tensor table. It is up to the RpcAgent implementation to determine how to serialize a Message object.
  * For ThriftAgent, as Thrift has its own request/response matching solution, the Message.id is no longer necessary. Hence the id can be dropped during serialization. All the agent needs to do is pass the response Message object to the Future returned by send(...).
* support blocking and non-blocking RequestCallback
  * blocking means the callback won't return before sending out the response
  * non-blocking can be achieved by enqueuing the `(from, request, RpcAgent&)` tuple and using a different thread to process the tuples. That is why there is an `RpcAgent&` arg in the param list.
Differential Revision: [D15194693](https://our.internmc.facebook.com/intern/diff/D15194693/)
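The blocking and non-blocking RequestCallback modes described in the commit message can be illustrated with a small, self-contained Python sketch. It is not the code from this change: `Message`, `LoopbackAgent`, `reply`, `handle`, `blocking_callback`, and `NonBlockingCallback` are hypothetical names chosen for illustration, the "remote" side runs in the same process, and the operator is faked by upper-casing the payload. The sketch only assumes what the message states: a Message carries a byte vector, a tensor table, and an id; `send(...)` returns a Future; and the callback receives the agent so it can send the response later.

```python
import queue
import threading
from concurrent.futures import Future
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical stand-in for the Message described in the commit message."""
    payload: bytes                               # serialized BuiltinOp (byte vector)
    tensors: list = field(default_factory=list)  # tensor table
    id: int = 0                                  # used to match a response to its request


class LoopbackAgent:
    """Toy in-process agent: requests are handed straight to the callback."""

    def __init__(self, callback):
        self._callback = callback
        self._pending = {}   # message id -> Future waiting for the response
        self._next_id = 1
        self._lock = threading.Lock()

    def send(self, to, request):
        """Register a Future for the request, deliver it, and return the Future."""
        future = Future()
        with self._lock:
            request.id = self._next_id
            self._next_id += 1
            self._pending[request.id] = future
        # The lock is released before the callback runs, so reply() can take it.
        self._callback("caller", request, self)
        return future

    def reply(self, response):
        """Complete the Future whose request id matches the response id."""
        with self._lock:
            future = self._pending.pop(response.id)
        future.set_result(response)


def handle(request):
    """Pretend to run the operator: upper-case the payload, keep the id."""
    return Message(payload=request.payload.upper(), id=request.id)


def blocking_callback(sender, request, agent):
    # Blocking: does not return until the response has been sent.
    agent.reply(handle(request))


class NonBlockingCallback:
    """Non-blocking: enqueue (sender, request, agent) and reply from a worker."""

    def __init__(self):
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def __call__(self, sender, request, agent):
        # Returns immediately; the agent travels with the request so the
        # worker thread knows where to send the response.
        self._queue.put((sender, request, agent))

    def _run(self):
        while True:
            sender, request, agent = self._queue.get()
            agent.reply(handle(request))


if __name__ == "__main__":
    for callback in (blocking_callback, NonBlockingCallback()):
        agent = LoopbackAgent(callback)
        future = agent.send("worker1", Message(payload=b"add"))
        print(future.result(timeout=5).payload)
```

With the blocking callback, the Future is already complete when `send` returns. With the non-blocking callback, `send` returns first and the worker thread completes the Future afterwards, which is why the queued tuple has to carry the agent. An agent built on a transport with its own request/response matching, as the message notes for Thrift, would not need the `id` bookkeeping shown here.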
.circleci/README.md (2 additions & 3 deletions)
@@ -272,13 +272,13 @@ Manywheels are pip packages for linux distros. Note that these manywheels are no

 The entrypoint file `builder/manywheel/build_common.sh` is really really complicated because

-* This used to handle building for several different python versions at the same time. This is why there are loops everywhere
+* This used to handle building for several different python versions at the same time. The loops have been removed, but there's still unneccessary folders and movements here and there.
 * The script is never used this way anymore. This extra machinery could be removed.
 * This used to handle testing the pip packages too. This is why there’s testing code at the end that messes with python installations and stuff
 * The script is never used this way anymore. This extra machinery could be removed.
 * This also builds libtorch packages
 * This should really be separate. libtorch packages are c++ only and have no python. They should not share infra with all the python specific stuff in this file.
-* There is a lot of messing with rpaths. This is necessary, but could be made much much simpler if the loops for libtorch and separate python versions were removed.
+* There is a lot of messing with rpaths. This is necessary, but could be made much much simpler if the above issues were fixed.

 ## Wheels (MacOS pip and libtorch packages)

@@ -307,7 +307,6 @@ Libtorch packages are built in the wheel build scripts: manywheel/build_*.sh for
 * It’s confusinig. Most of those scripts deal with python specifics.
 * The extra conditionals everywhere severely complicate the wheel build scripts
 * The process for building libtorch is different from the official instructions (a plain call to cmake, or a call to a script)
-* For Linux specifically, the job is set up to build all libtorch varieties in a single go. This leads to 9+ hour builds times for CUDA 10.0 libtorch. This is more of a problem with the circleci setup though.