[Chainer Crash Course] Image classification with Chainer, from defining a custom model to testing

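First published on the WeChat official account 有三AI.

Welcome to the column "Master Open-Source Frameworks in 2 Hours" (《2小时玩转开源框架系列》). This is the eighth installment; the earlier ones covered Caffe, TensorFlow, PyTorch, MXNet, Keras, PaddlePaddle, and CNTK.

This time we look at Chainer. The data and code used in this article are in our official git repository: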

https://github.com/longpeng2008/LongPeng_ML_Course

Author & editor | 汤兴旺

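01 What is Chainer

Chainer is a Python-based deep learning framework that makes it easy and intuitive to write complex neural network architectures.

Most current deep learning frameworks are based on the "Define-and-Run" scheme: the network is defined first, and the user then repeatedly feeds it mini-batches of training data. Because the network is defined statically, all logic must be embedded in the network architecture as data.

Chainer instead adopts the "Define-by-Run" scheme, in which the network is defined dynamically by the actual forward computation. More precisely, Chainer stores the history of computation rather than the programming logic. As a result, conditionals and loops do not have to be built into the network definition. Define-by-Run is Chainer's core idea.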

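02 Preparing for training with Chainer

2.1 Installing Chainer

Installing Chainer is simple. Run the following command in a terminal:

pip install chainer

2.2 Reading data

Reading data in Chainer is straightforward. The data-loading code looks like this: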

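import numpy as np
import os
from PIL import Image
import glob
from chainer.datasets import tuple_dataset

class Dataset():
    def __init__(self, path, width=60, height=60):
        channels = 3
        # One sub-directory per class. Note that this hard-coded pattern
        # replaces whatever was passed in as `path`.
        path = glob.glob('./mouth/*')
        pathsAndLabels = []
        index = 0
        for p in path:
            print(p + "," + str(index))
            # np.asarray stores both fields as strings, so the label is a string here.
            pathsAndLabels.append(np.asarray([p, index]))
            index = index + 1

        # Collect an [image path, label] pair for every image in every class directory.
        allData = []
        for pathAndLabel in pathsAndLabels:
            path = pathAndLabel[0]
            label = pathAndLabel[1]
            imagelist = glob.glob(path + "/*")
            for imgName in imagelist:
                allData.append([imgName, label])
        # Shuffle the samples.
        allData = np.random.permutation(allData)

        imageData = []
        labelData = []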

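Below I explain a few distinctive points about reading data in Chainer. The listing above is an excerpt; the complete code is on GitHub.

In Chainer, datasets are obtained through the chainer.datasets module. The most basic dataset is simply an array: the familiar NumPy and CuPy arrays can be used directly as datasets. In this example we use a tuple dataset, TupleDataset(), to hold the data.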

2.3 Defining the network

Network definition in Chainer is very similar to PyTorch:

from chainer import Chain
# L and F are chainer.links and chainer.functions; see (2) below for the imports.

class MyModel(Chain):
    def __init__(self):
        super(MyModel, self).__init__()
        with self.init_scope():
            # Three 3x3 convolutions with stride 2: 3 -> 12 -> 24 -> 48 channels.
            # bn1-bn3 are registered here but not used in forward() below.
            self.conv1 = L.Convolution2D(
                in_channels=3, out_channels=12, ksize=3, stride=2)
            self.bn1 = L.BatchNormalization(12)
            self.conv2 = L.Convolution2D(
                in_channels=12, out_channels=24, ksize=3, stride=2)
            self.bn2 = L.BatchNormalization(24)
            self.conv3 = L.Convolution2D(
                in_channels=24, out_channels=48, ksize=3, stride=2)
            self.bn3 = L.BatchNormalization(48)
            # Passing None lets Chainer infer the input size on the first call.
            self.fc1 = L.Linear(None, 1200)
            self.fc2 = L.Linear(1200, 128)
            self.fc3 = L.Linear(128, 2)

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        h1 = F.relu(self.conv1(x))
        h2 = F.relu(self.conv2(h1))
        h3 = F.relu(self.conv3(h2))
        h4 = F.relu(self.fc1(h3))
        h5 = F.relu(self.fc2(h4))
        x = self.fc3(h5)
        return x
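To see what input size `L.Linear(None, 1200)` ends up inferring, the feature-map sizes can be worked out by hand. The sketch below assumes 60x60 inputs (the default width and height in the Dataset class) and the default zero padding. The helper `conv_out_size` is a hypothetical name, not a Chainer function.

```python
def conv_out_size(size, ksize, stride, pad=0):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * pad - ksize) // stride + 1


size = 60  # assumed input height and width
sizes = []
for _ in range(3):  # conv1, conv2 and conv3 all use ksize=3, stride=2
    size = conv_out_size(size, ksize=3, stride=2)
    sizes.append(size)
print(sizes)

# conv3 has 48 output channels, so fc1 sees 48 * size * size features.
flattened = 48 * size * size
print(flattened)
```

Under these assumptions the spatial size shrinks 60 -> 29 -> 14 -> 6, so fc1 receives 48 * 6 * 6 = 1728 features per image.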

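MyModel uses the same network structure as in the earlier articles on Caffe, TensorFlow, PyTorch and the other frameworks, so I will not repeat the details here and will focus on what is specific to Chainer.

(1) MyModel(Chain)

Chain is the Chainer class for defining models. We define MyModel as a subclass of Chain, which is analogous to subclassing nn.Module in PyTorch. A Chain can contain links and other chains, so models with deep hierarchies of links can be built in the same way.

(2) Link and Function

In Chainer, each layer of a neural network can be regarded as one of two broad kinds of function: a Link or a Function.

A Function has no learnable parameters, whereas a Link does. A Link can be thought of as a Function bundled with its parameters.

Before using them, we need to import the corresponding modules:

import chainer.links as L
import chainer.functions as F

By convention, L is used as shorthand for links and F for functions, as in L.Convolution2D and F.relu.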

(3) __call__

__call__ makes the chain callable like a function, so model(x) runs the forward computation.
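The mechanism is plain Python and does not depend on Chainer. In the toy sketch below, `Doubler` is a made-up class used only to show the delegation pattern that MyModel follows.

```python
class Doubler:
    """Toy stand-in for a model; it is not a Chainer class."""

    def forward(self, x):
        return 2 * x

    def __call__(self, x):
        # Delegating to forward() lets instances be used like functions.
        return self.forward(x)


model = Doubler()
print(model(21), model.forward(21))
```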

03 Model training

With data loading and the network defined, we can train the model. Here is the code:

import os
import pickle

import chainer.links as L
from chainer import iterators, optimizers, training

import dataset  # assumed: dataset.py holds the Dataset class from section 2.2

model = L.Classifier(MyModel())

# Reuse the cached train/test split if it exists; otherwise build and cache it.
if os.path.isfile('./dataset.pickle'):
    print("dataset.pickle exists. loading...")
    with open('./dataset.pickle', mode='rb') as f:
        train, test = pickle.load(f)
        print("Loaded")
else:
    datasets = dataset.Dataset("mouth")
    train, test = datasets.get_dataset()
    with open('./dataset.pickle', mode='wb') as f:
        pickle.dump((train, test), f)
        print("saving train and test...")

optimizer = optimizers.MomentumSGD(lr=0.001, momentum=0.5)
optimizer.setup(model)

train_iter = iterators.SerialIterator(train, 64)
test_iter = iterators.SerialIterator(test, 64, repeat=False, shuffle=True)

# device=-1 runs on the CPU.
updater = training.StandardUpdater(train_iter, optimizer, device=-1)
# MyModel.__name__ is used here because MyModel.__class__.__name__ evaluates to 'type'.
trainer = training.Trainer(updater, (800, 'epoch'),
                           out='{}_model_result'.format(MyModel.__name__))

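Training in Chainer can be broken down into the following six steps, which I find easy to follow.

Step-01-Dataset

The first step is, of course, loading the dataset. In this example it is loaded as follows:

train, test = datasets.get_dataset()

Step-02-Iterator

Chainer provides several iterators. We typically draw mini-batches from the dataset as follows:

train_iter = iterators.SerialIterator(train, batchsize)
test_iter = iterators.SerialIterator(test, batchsize, repeat=False, shuffle=True)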

Step-03-Model

chainer.links.Classifier is a simple classifier wrapper. It takes several arguments, such as predictor, lossfun and accfun, but only predictor, the model you defined, has to be supplied.

model = L.Classifier(MyModel())

Step-04-Optimizer

Once the model is ready, the next step is optimization. chainer.optimizers includes many common optimizers, for example:

1. chainer.optimizers.AdaDelta

2. chainer.optimizers.AdaGrad

3. chainer.optimizers.Adam

4. chainer.optimizers.CorrectedMomentumSGD

5. chainer.optimizers.MomentumSGD

6. chainer.optimizers.NesterovAG

7. chainer.optimizers.RMSprop

8. chainer.optimizers.RMSpropGraves

...
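For intuition about the MomentumSGD settings used in this article (lr=0.001, momentum=0.5), the following is a NumPy sketch of the standard momentum update rule. It is an illustration, not Chainer's source code, and the function name `momentum_sgd_step` and the toy objective are assumptions made for the example.

```python
import numpy as np


def momentum_sgd_step(w, grad, v, lr=0.001, momentum=0.5):
    """One momentum SGD update; returns the new weights and velocity."""
    v = momentum * v - lr * grad
    return w + v, v


# Minimise f(w) = w^2 (gradient 2w), starting from w = 1.
w = np.array(1.0)
v = np.zeros_like(w)
for _ in range(3):
    w, v = momentum_sgd_step(w, 2 * w, v)
    print(float(w))
```

The velocity v accumulates a decaying sum of past gradients, so consecutive steps in the same direction grow larger than a plain SGD step with the same learning rate.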

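Step-05-Updater

Training a neural network means updating its parameters many times. In Chainer this is the job of an Updater; in this example we use training.StandardUpdater.

Step-06-Trainer

With all of the above in place, what remains is the training itself, which in Chainer is driven by training.Trainer().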

04 Visualization

from chainer.training import extensions

trainer.extend(extensions.dump_graph("main/loss"))
# Evaluate on the test set (on the CPU).
trainer.extend(extensions.Evaluator(test_iter, model, device=-1))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(
    ['epoch', 'main/loss', 'validation/main/loss', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.PlotReport(
    ['main/loss', 'validation/main/loss'], x_key='epoch', file_name='loss.png'))
trainer.extend(extensions.PlotReport(
    ['main/accuracy', 'validation/main/accuracy'], x_key='epoch', file_name='accuracy.png'))
trainer.extend(extensions.ProgressBar())

# Nothing is trained until run() is called.
trainer.run()

Visualization in Chainer is convenient: reporting and plotting are attached to the trainer through trainer.extend(). The available extensions include:

1. chainer.training.extensions.PrintReport

2. chainer.training.extensions.ProgressBar

3. chainer.training.extensions.LogReport

4. chainer.training.extensions.PlotReport

5. chainer.training.extensions.VariableStatisticsPlot

6. chainer.training.extensions.dump_graph

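That concludes this small example of image classification with Chainer. The complete code is in the accompanying git project. The record from a training run looks like this: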