An implementation of ShuffleNet in PyTorch. ShuffleNet is an efficient convolutional neural network architecture for mobile devices. According to the paper, it outperforms Google's MobileNet by a small percentage.
In one sentence, ShuffleNet is a ResNet-like model that uses residual blocks (called ShuffleUnits), with the main innovation being the use of pointwise, or 1x1, group convolutions as opposed to normal pointwise convolutions.
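To make the saving from group convolutions concrete, here is a minimal sketch in plain Python (not code from this repo) that counts the weights in a 1x1 convolution with and without groups. The function name pointwise_conv_params and the 240-channel example are illustrative assumptions. The count ignores bias terms and assumes both channel counts are divisible by the number of groups.

```python
def pointwise_conv_params(in_channels, out_channels, groups=1):
    """Count the weights of a 1x1 convolution (bias terms ignored).

    Each group connects in_channels/groups inputs to out_channels/groups
    outputs, so the total shrinks by a factor of `groups`.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * (out_channels // groups) * groups


# Illustrative channel count, chosen only because it divides evenly by 3.
dense = pointwise_conv_params(240, 240)
grouped = pointwise_conv_params(240, 240, groups=3)
print(dense, grouped)  # prints: 57600 19200
```

With groups=3, the grouped layer needs a third of the weights of the normal pointwise layer for the same number of input and output channels.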
Clone the repo:
git clone https://github.com/jaxony/ShuffleNet.git
Use the model defined in model.py:
from model import ShuffleNet
# running on MNIST
net = ShuffleNet(num_classes=10, in_channels=1)
Trained on ImageNet (using the PyTorch ImageNet example) with groups=3 and no channel multiplier. On the test set, it reached 62.2% top-1 and 84.2% top-5 accuracy. Unfortunately, this isn't directly comparable to Table 5 of the paper, because the authors don't report a network with these settings. Expressed as top-1 error (100% - 62.2% = 37.8%), the result falls between the paper's network with groups=3 and half the number of channels (42.8% top-1 error) and the network with the same number of channels but groups=8 (32.4% top-1 error). The pretrained state dictionary can be found here, in the following format:
{
'epoch': epoch + 1,
'arch': args.arch,
'state_dict': model.state_dict(),
'best_prec1': best_prec1,
'optimizer' : optimizer.state_dict()
}
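A checkpoint in this format could be loaded roughly as follows. This is a sketch under stated assumptions, not code from this repo: the helper names strip_module_prefix and load_pretrained, and whatever path you pass in, are hypothetical. It also assumes the checkpoint may have been saved from a model wrapped in torch.nn.DataParallel, in which case the keys in state_dict carry a "module." prefix that has to be removed before loading into an unwrapped ShuffleNet.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from keys.

    Keys without the prefix are kept unchanged, so this is safe to call
    whether or not the model was wrapped in DataParallel when saved.
    """
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


def load_pretrained(net, checkpoint_path):
    """Load weights from a checkpoint dict shaped like the one above."""
    # Imported here so strip_module_prefix works without PyTorch installed.
    import torch

    # map_location="cpu" lets the file load on a machine without a GPU.
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    net.load_state_dict(strip_module_prefix(checkpoint["state_dict"]))
    return checkpoint["epoch"], checkpoint["best_prec1"]
```

Because the pretrained weights come from ImageNet training, the model passed to load_pretrained would need to be built with the same groups setting and the matching number of classes and input channels.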
Note: this model was trained with the default settings of the PyTorch ImageNet example, which differ from the training regime described in the paper. A rerun with the paper's settings (and groups=8) is pending.