Chainer, developed by PFN, evolves quickly (which is welcome, since it keeps up with the latest developments), but that also means its behavior sometimes changes between versions. I am leaving this here as a memo, because these changes are easy to trip over if you don't notice them.
If your network uses dropout or similar train-only layers, pay attention to how the Evaluator behaves.
With a VGG-based network like the one below, `forward` is called from the Evaluator, so if dropout remains enabled at test time the result changes on every run.
To disable dropout at test time, branch on `chainer.config` (specifically, the `chainer.config.train` flag).
The network definition below is quoted from <https://github.com/chainer/chainer/blob/master/examples/cifar/models/VGG.py>.
That example also shows how to branch on `chainer.config`; see the Example in the repository.
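Conceptually, the fix works like this: `F.dropout` consults a global configuration flag (`chainer.config.train`), and `chainer.using_config` temporarily flips that flag for the duration of a `with` block. Below is a minimal, dependency-free sketch of that pattern; the `config`, `using_config`, and `dropout` names here are illustrative stand-ins, not Chainer's actual implementation:

```python
import random
from contextlib import contextmanager


class _Config:
    train = True  # global mode flag, analogous to chainer.config.train


config = _Config()


@contextmanager
def using_config(name, value):
    # Temporarily override a config attribute, restoring it on exit,
    # analogous to chainer.using_config.
    old = getattr(config, name)
    setattr(config, name, value)
    try:
        yield
    finally:
        setattr(config, name, old)


def dropout(xs, ratio=0.5):
    # Randomly zero elements only in training mode; identity at test time.
    if config.train:
        return [0.0 if random.random() < ratio else x / (1 - ratio) for x in xs]
    return list(xs)


x = [1.0, 2.0, 3.0]
with using_config('train', False):
    a = dropout(x)
    b = dropout(x)
# With the flag off, repeated calls are deterministic: a == b == x,
# and config.train is restored to True after the with block.
```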
```python
# Quoted from chainer/examples/cifar/models/VGG.py.
# Block (convolution + batch normalization + ReLU) is defined in the
# same example file.
import chainer
import chainer.functions as F
import chainer.links as L


class VGG(chainer.Chain):

    """A VGG-style network for very small images.

    This model is based on the VGG-style model from
    http://torch.ch/blog/2015/07/30/cifar.html
    which is based on the network architecture from the paper:
    https://arxiv.org/pdf/1409.1556v6.pdf

    This model is intended to be used with either RGB or greyscale input
    images that are of size 32x32 pixels, such as those in the CIFAR10
    and CIFAR100 datasets.

    On CIFAR10, it achieves approximately 89% accuracy on the test set with
    no data augmentation.

    On CIFAR100, it achieves approximately 63% accuracy on the test set with
    no data augmentation.

    Args:
        class_labels (int): The number of class labels.

    """

    def __init__(self, class_labels=10):
        super(VGG, self).__init__()
        with self.init_scope():
            self.block1_1 = Block(64, 3)
            self.block1_2 = Block(64, 3)
            self.block2_1 = Block(128, 3)
            self.block2_2 = Block(128, 3)
            self.block3_1 = Block(256, 3)
            self.block3_2 = Block(256, 3)
            self.block3_3 = Block(256, 3)
            self.block4_1 = Block(512, 3)
            self.block4_2 = Block(512, 3)
            self.block4_3 = Block(512, 3)
            self.block5_1 = Block(512, 3)
            self.block5_2 = Block(512, 3)
            self.block5_3 = Block(512, 3)
            self.fc1 = L.Linear(None, 512, nobias=True)
            self.bn_fc1 = L.BatchNormalization(512)
            self.fc2 = L.Linear(None, class_labels, nobias=True)

    def forward(self, x):
        # F.dropout is active only while chainer.config.train is True;
        # wrap test-time calls in chainer.using_config('train', False).
        # 64 channel blocks:
        h = self.block1_1(x)
        h = F.dropout(h, ratio=0.3)
        h = self.block1_2(h)
        h = F.max_pooling_2d(h, ksize=2, stride=2)

        # 128 channel blocks:
        h = self.block2_1(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block2_2(h)
        h = F.max_pooling_2d(h, ksize=2, stride=2)

        # 256 channel blocks:
        h = self.block3_1(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block3_2(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block3_3(h)
        h = F.max_pooling_2d(h, ksize=2, stride=2)

        # 512 channel blocks:
        h = self.block4_1(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block4_2(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block4_3(h)
        h = F.max_pooling_2d(h, ksize=2, stride=2)

        # 512 channel blocks:
        h = self.block5_1(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block5_2(h)
        h = F.dropout(h, ratio=0.4)
        h = self.block5_3(h)
        h = F.max_pooling_2d(h, ksize=2, stride=2)

        h = F.dropout(h, ratio=0.5)
        h = self.fc1(h)
        h = self.bn_fc1(h)
        h = F.relu(h)
        h = F.dropout(h, ratio=0.5)
        return self.fc2(h)
```