What's this

A Chainer implementation of wide networks using binary tree architectures [1].

Dependencies

git clone https://github.com/nutszebra/wide_networks_using_binary_tree.git
cd wide_networks_using_binary_tree
git submodule init
git submodule update

How to run

python main.py -p ./ -g 0 -trb 4 -teb 4

Details about my implementation

  • Data augmentation
    Train: Images are randomly resized to a side length in the range [32, 36], then a 32x32 patch is cropped at a random position and normalized locally. Horizontal flipping is applied with probability 0.5.
    Test: Images are resized to 32x32 and normalized locally. Total accuracy is computed with single-image testing.

  • Optimization
    Momentum SGD with 0.9 momentum

  • Weight decay
    0.0005

  • Batch size
    128

  • lr
    The initial learning rate is 0.2 and is multiplied by 0.2 at epochs 60, 120, and 160. Training runs for 200 epochs in total.
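The settings above can be sketched in a few lines of NumPy. This is only an illustration of the description, not the code in main.py: the function names are hypothetical, the nearest-neighbour resize stands in for whatever resizer the repository uses, "normalized locally" is assumed to mean per-image zero-mean, unit-variance scaling, and the learning-rate drop is assumed to take effect from the listed epoch onward.

```python
import numpy as np


def learning_rate(epoch, base_lr=0.2, factor=0.2, milestones=(60, 120, 160)):
    """Step schedule: multiply base_lr by factor once per milestone reached.

    Assumption: the drop applies from the milestone epoch onward.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** passed


def resize_nearest(img, size):
    """Nearest-neighbour resize of an HxWxC array to size x size.

    Stand-in for a real image resizer (assumption).
    """
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def normalize_locally(img, eps=1e-8):
    """Per-image standardisation (assumed meaning of 'normalized locally')."""
    img = img.astype(np.float32)
    return (img - img.mean()) / (img.std() + eps)


def augment_train(img, rng, crop=32, max_side=36):
    """Random resize to [crop, max_side], random crop, flip with p=0.5."""
    side = int(rng.integers(crop, max_side + 1))  # upper bound is inclusive
    img = resize_nearest(img, side)
    top = int(rng.integers(0, side - crop + 1))
    left = int(rng.integers(0, side - crop + 1))
    patch = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # horizontal flip
    return normalize_locally(patch)


def preprocess_test(img, size=32):
    """Test-time path: resize to size x size, then normalize."""
    return normalize_locally(resize_nearest(img, size))
```

Under these assumptions, learning_rate(0) is 0.2 and learning_rate(60) is about 0.04, and augment_train(img, np.random.default_rng(0)) returns a 32x32 float32 patch for an HxWx3 input.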

Cifar10 result

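| network           | d | k | n | number of parameters | total accuracy (%) |
|-------------------|---|---|---|----------------------|--------------------|
| [1]               | 4 | 6 | 2 | 1.7M                 | 95.23              |
| my implementation | 4 | 6 | 2 | 1.67M                | 94.82              |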

Figures: loss (loss.jpg) and total accuracy (accuracy.jpg).

References

[1] Truncating Wide Networks using Binary Tree Architectures