Implementation of wide networks using binary trees [1] in Chainer
git clone https://github.com/nutszebra/wide_networks_using_binary_tree.git
cd wide_networks_using_binary_tree
git submodule init
git submodule update
python main.py -p ./ -g 0 -trb 4 -teb 4
Data augmentation
Train: Images are randomly resized to a side length in the range [32, 36], then a 32x32 patch is randomly cropped and locally normalized. Horizontal flipping is applied with probability 0.5.
Test: Images are resized to 32x32 and locally normalized. Total accuracy is computed from a single-image test.
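The pipeline above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code used in this repository: the function names are hypothetical, nearest-neighbour resizing stands in for whatever interpolation the real code uses, and "normalized locally" is assumed to mean per-image standardisation (zero mean, unit variance).

```python
import numpy as np


def resize_nearest(img, size):
    """Nearest-neighbour resize of an HxWxC array to size x size (assumed interpolation)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def normalize_locally(img, eps=1e-8):
    """Per-image standardisation; an assumed reading of 'normalized locally'."""
    img = img.astype(np.float32)
    return (img - img.mean()) / (img.std() + eps)


def augment_train(img, rng, crop=32, max_size=36):
    """Random resize to [crop, max_size], random crop, random horizontal flip, normalise."""
    size = int(rng.integers(crop, max_size + 1))   # side length in [32, 36]
    img = resize_nearest(img, size)
    top = int(rng.integers(0, size - crop + 1))    # random crop offset
    left = int(rng.integers(0, size - crop + 1))
    img = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:                         # flip with probability 0.5
        img = img[:, ::-1]
    return normalize_locally(img)


def preprocess_test(img, size=32):
    """Deterministic test-time preprocessing: resize, then normalise."""
    return normalize_locally(resize_nearest(img, size))
```

For example, `augment_train(image, np.random.default_rng(0))` returns a 32x32 float array for an HxWx3 input image.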
Optimization
Momentum SGD with 0.9 momentum
Weight decay
0.0005
Batch size
128
Learning rate
The initial learning rate is 0.2 and is multiplied by 0.2 at epochs 60, 120, and 160. Training runs for 200 epochs in total.
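The step schedule can be written as a small function. This is a sketch only: the function name is hypothetical, and it assumes epochs are counted from 0 and that each drop takes effect from the listed epoch onwards.

```python
def learning_rate(epoch, base_lr=0.2, gamma=0.2, milestones=(60, 120, 160)):
    """Step decay: multiply base_lr by gamma once for every milestone already reached."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


# Learning rate for each of the 200 training epochs.
schedule = [learning_rate(epoch) for epoch in range(200)]
```

Under these assumptions the rate stays at 0.2 for epochs 0 to 59 and is reduced by a factor of 5 at each of the three milestones.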
network | d | k | n | number of parameters | total accuracy (%) |
---|---|---|---|---|---|
[1] | 4 | 6 | 2 | 1.7M | 95.23 |
my implementation | 4 | 6 | 2 | 1.67M | 94.82 |
[1] Truncating Wide Networks using Binary Tree Architectures