Description
While looking through some models I noticed these operations in use. I don't know their relative importance, but I'm noting them for completeness and tracking.
TorchSharp (reference semantics sketched after this list):
- avgpool1d, 2d, 3d and reverse mode for these (TorchSharp)
- activation functions gelu, silu, hardswish, relu6, hardsigmoid (TorchSharp)
- permute (TorchSharp)
- split based on count (TorchSharp)
- UpSampling1d, 2d, 3d (TorchSharp)
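
To pin down what's being requested, here is a minimal Python sketch of the PyTorch operations these bindings would surface (PyTorch being the reference library linked further down). Tensor shapes are illustrative assumptions, not taken from the issue; reverse mode for the pooling ops comes from autograd.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)                    # N, C, H, W

# avgpool (1d/2d/3d follow the same pattern); autograd supplies reverse mode
pooled = F.avg_pool2d(x, kernel_size=2)        # -> (1, 3, 4, 4)

# the listed activation functions
y = F.gelu(x)
y = F.silu(x)
y = F.hardswish(x)
y = F.relu6(x)
y = F.hardsigmoid(x)

# permute: reorder dimensions by index
xp = x.permute(0, 2, 3, 1)                     # NCHW -> NHWC

# split based on count: in PyTorch this is chunk (split takes a chunk size)
parts = torch.chunk(x, chunks=2, dim=1)        # 2 tensors along channels

# upsampling (UpSampling1d/2d/3d)
up = F.interpolate(x, scale_factor=2, mode="nearest")   # -> (1, 3, 16, 16)
```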
DiffSharp (remaining items sketched after this list):
- avgpool1d, 2d, 3d and reverse mode for these; done pending merge, see feature/avgpool - avgpool1d, avgpool2d, avgpool3d #252
- permute (DiffSharp); see Implement permute #193; done pending merge, see feature/permute - permute #254
- activation functions gelu, silu, hardswish, relu6, hardsigmoid
- split based on count
- LayerNorm functions and model
- mean/sum/stddev of multiple dimensions (was mean/sum/stddev of multiple dimensions #216)
- DepthwiseConv2d
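
For the items unique to the DiffSharp list, the same kind of hedged sketch of the PyTorch semantics; shapes and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 16, 10, 10)       # N, C, H, W

# LayerNorm as both a function and a model/module
ln = nn.LayerNorm([16, 10, 10])
y = ln(x)
y = F.layer_norm(x, normalized_shape=[16, 10, 10])

# mean/sum/stddev over multiple dimensions at once
m = x.mean(dim=(2, 3))               # -> (4, 16)
s = x.sum(dim=(2, 3))
sd = x.std(dim=(2, 3))

# DepthwiseConv2d: a grouped conv with groups == in_channels
dw = nn.Conv2d(16, 16, kernel_size=3, padding=1, groups=16)
y = dw(x)                            # -> (4, 16, 10, 10)
```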
Other things to consider (sketched after this list):
- RMSProp optimizer https://pytorch.org/docs/stable/optim.html#torch.optim.RMSprop
- AdaDelta optimizer
- GlobalAvgPool2d model
- UpSampling2d model
- MaxPool1d/2d/3d model
- ZeroPadding2d function and model
- randn giving mean and stddev
- Embedding
- Use TorchSharp loss functions (binary_cross_entropy etc.)
- max/min along dimensions (was max/min along dimensions #232)
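
A sketch of the PyTorch equivalents for the items above. One point of semantics worth noting: `torch.randn` itself takes no mean/stddev, so "randn giving mean and stddev" maps to `torch.normal`. Names and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 3, 8, 8)

# RMSProp / AdaDelta optimizers
model = nn.Conv2d(3, 8, 3)
opt = torch.optim.RMSprop(model.parameters(), lr=1e-3)
opt = torch.optim.Adadelta(model.parameters())

# GlobalAvgPool2d: adaptive average pooling to a 1x1 output
gap = nn.AdaptiveAvgPool2d(1)
g = gap(x)                           # -> (4, 3, 1, 1)

# MaxPool and UpSampling as modules
mp = nn.MaxPool2d(2)
mpx = mp(x)                          # -> (4, 3, 4, 4)
ups = nn.Upsample(scale_factor=2)
ux = ups(x)                          # -> (4, 3, 16, 16)

# ZeroPadding2d
pad = nn.ZeroPad2d(2)
p = pad(x)                           # -> (4, 3, 12, 12)

# randn with mean and stddev: torch.normal rather than torch.randn
n = torch.normal(mean=5.0, std=2.0, size=(4, 3))

# Embedding: integer indices -> dense vectors
emb = nn.Embedding(num_embeddings=100, embedding_dim=16)
e = emb(torch.tensor([1, 2, 3]))     # -> (3, 16)

# loss functions (binary_cross_entropy expects probabilities in [0, 1])
loss = nn.functional.binary_cross_entropy(torch.sigmoid(x), torch.rand_like(x))

# max/min along a dimension return (values, indices)
vals, idx = x.max(dim=1)
vals, idx = x.min(dim=1)
```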