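Chainer may be an unfamiliar name, but what if I mention TensorFlow, Keras, and Caffe? Yes, it is exactly what you are thinking: Chainer is another deep learning framework. It was first released as open source in 2015, and although its GitHub repository is very active, it has not received the attention it deserves from the industry. That says nothing against the quality of the framework: Intel chose Chainer as a development path for AI workloads, hoping to drive demand for its own chips, and the framework is widely used in Japan. What exists has its reasons, so let's start learning Chainer! The official site is Chainer: A Powerful, Flexible, and Intuitive Framework for Neural Networks.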
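At this point Chainer can run on the CPU. To run it on a GPU, you also need the NVIDIA CUDA / cuDNN environment installed on your machine, plus the CuPy package that matches your CUDA version. CuPy is an implementation of NumPy-compatible multi-dimensional arrays on CUDA. Its core is the multi-dimensional array class cupy.ndarray, together with many functions that operate on it, and it supports a subset of the numpy.ndarray interface.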
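Because CuPy mirrors the NumPy interface, array code can often be written once and run against either library. The following is a minimal sketch of that idea, not something Chainer requires: the helper name `standardize` and the fallback logic are illustrative assumptions, and the code falls back to NumPy when CuPy or a usable CUDA device is not available.

```python
import numpy as np

# Pick CuPy when it is importable and a CUDA device is usable;
# otherwise fall back to NumPy. This selection logic is an assumption
# made for illustration only.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np


def standardize(a):
    """Shift to zero mean and scale to unit variance (hypothetical helper)."""
    a = xp.asarray(a, dtype=xp.float32)
    return (a - a.mean()) / a.std()


z = standardize([1.0, 2.0, 3.0, 4.0])
print(type(z).__module__, z.shape)
```

The function body never names NumPy or CuPy directly; it only goes through `xp`, which is what makes the same code usable on both CPU and GPU arrays.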
5. Install CuPy. For more about CuPy, see the linked documentation.
(For CUDA 8.0)
pip install cupy-cuda80
(For CUDA 9.0)
pip install cupy-cuda90
(For CUDA 9.1)
pip install cupy-cuda91
(For CUDA 9.2)
pip install cupy-cuda92
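The package name simply encodes the CUDA version. As a small sketch, the lookup below covers only the four versions listed above; the function name `cupy_package_for` is hypothetical, and other CUDA versions are deliberately not guessed.

```python
# CUDA versions and package names exactly as listed above; other
# CUDA versions are deliberately not guessed here.
CUPY_PACKAGES = {
    "8.0": "cupy-cuda80",
    "9.0": "cupy-cuda90",
    "9.1": "cupy-cuda91",
    "9.2": "cupy-cuda92",
}


def cupy_package_for(cuda_version):
    """Return the CuPy package name for a CUDA version string such as '9.0'."""
    try:
        return CUPY_PACKAGES[cuda_version]
    except KeyError:
        raise ValueError(
            "no CuPy package listed for CUDA {}".format(cuda_version))


print("pip install", cupy_package_for("9.0"))  # prints: pip install cupy-cuda90
```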
6. Uninstall Chainer
pip uninstall chainer
7. An MNIST example with Chainer
Here is the full code:
#!/usr/bin/env python
# coding: utf-8
from __future__ import print_function

import numpy as np
import matplotlib.pyplot as plt

import chainer.functions as F
import chainer.links as L
from chainer import Chain, iterators, optimizers, serializers
from chainer.backends.cuda import to_cpu
from chainer.dataset import concat_examples
from chainer.datasets import mnist

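# Load MNIST as flat 784-dimensional vectors together with their labels
train, test = mnist.get_mnist(withlabel=True, ndim=1)

# Save the first training image to a file as a quick sanity check
x, t = train[0]
plt.imshow(x.reshape(28, 28), cmap="gray")
plt.savefig("5.png")

# Mini-batch iterators; the test iterator makes a single, unshuffled pass
batchsize = 128
train_iter = iterators.SerialIterator(train, batchsize)
test_iter = iterators.SerialIterator(test, batchsize, repeat=False, shuffle=False)
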
# Define the model: a three-layer fully connected network
class MyNetwork(Chain):

    def __init__(self, n_mid_units=100, n_out=10):
        super(MyNetwork, self).__init__()
        with self.init_scope():
            # None lets Chainer infer the input size on the first forward pass
            self.l1 = L.Linear(None, n_mid_units)
            self.l2 = L.Linear(n_mid_units, n_mid_units)
            self.l3 = L.Linear(n_mid_units, n_out)

    def forward(self, x):
        h = F.relu(self.l1(x))
        h = F.relu(self.l2(h))
        return self.l3(h)

model = MyNetwork()

gpu_id = -1  # Set to 0 if you use GPU
if gpu_id >= 0:
    model.to_gpu(gpu_id)

# Define the optimizer
optimizer = optimizers.MomentumSGD(lr=0.01, momentum=0.9)
optimizer.setup(model)

# Start training
max_epoch = 20

while train_iter.epoch < max_epoch:

    # ---------- One iteration of the training loop ----------
    train_batch = train_iter.next()
    image_train, target_train = concat_examples(train_batch, gpu_id)

    # Calculate the prediction of the network
    prediction_train = model(image_train)

    # Calculate the loss with softmax_cross_entropy
    loss = F.softmax_cross_entropy(prediction_train, target_train)

    # Calculate the gradients in the network
    model.cleargrads()
    loss.backward()

    # Update all the trainable parameters
    optimizer.update()
    # --------------------- until here ---------------------

    # Check the validation accuracy of prediction after every epoch
    if train_iter.is_new_epoch:  # True right after the last batch of an epoch

        # Display the training loss (of the last mini-batch only)
        print('epoch:{:02d} train_loss:{:.04f} '.format(
            train_iter.epoch, float(to_cpu(loss.data))), end='')

        test_losses = []
        test_accuracies = []
        while True:
            test_batch = test_iter.next()
            image_test, target_test = concat_examples(test_batch, gpu_id)

            # Forward the test data
            prediction_test = model(image_test)

            # Calculate the loss
            loss_test = F.softmax_cross_entropy(prediction_test, target_test)
            test_losses.append(to_cpu(loss_test.data))

            # Calculate the accuracy
            accuracy = F.accuracy(prediction_test, target_test)
            accuracy.to_cpu()
            test_accuracies.append(accuracy.data)

            if test_iter.is_new_epoch:
                # Rewind the test iterator so it can be reused next epoch
                test_iter.reset()
                break

        print('val_loss:{:.04f} val_accuracy:{:.04f}'.format(
            np.mean(test_losses), np.mean(test_accuracies)))

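# Save the trained model (the parameters after the final epoch)
serializers.save_npz('train_mnist.model', model)

# Use the saved model to predict on new data
model = MyNetwork()
serializers.load_npz('train_mnist.model', model)

# Take the first test image and save it to a file
x, t = test[0]
plt.imshow(x.reshape(28, 28), cmap='gray')
plt.savefig('7.png')
print('label: ', t)

# Predict: add a batch dimension, (784,) -> (1, 784), before the forward pass
print(x.shape, end=' -> ')
x = x[None, ...]
print(x.shape)

y = model(x)
y = y.data

# The predicted label is the index of the largest output score
pred_label = y.argmax(axis=1)
print("predicted label:", pred_label[0])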
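To make the two numbers printed during training more concrete, here is a plain NumPy sketch of what a softmax cross-entropy loss and an accuracy score compute. It assumes the loss is averaged over the mini-batch, and it is an illustration only, not Chainer's implementation; the function names and the toy values are made up for the example.

```python
import numpy as np


def softmax_cross_entropy(logits, targets):
    """Mean over the batch of -log softmax(logits)[target]."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()


def accuracy(logits, targets):
    """Fraction of rows whose highest-scoring class equals the target."""
    return (logits.argmax(axis=1) == targets).mean()


# Toy batch: 3 samples, 4 classes (values are made up for illustration).
logits = np.array([[2.0, 0.5, 0.1, 0.0],
                   [0.0, 0.0, 3.0, 0.0],
                   [1.0, 1.0, 1.0, 1.0]])
targets = np.array([0, 2, 3])

print("loss:", softmax_cross_entropy(logits, targets))
print("accuracy:", accuracy(logits, targets))
```

In the toy batch the first two rows are classified correctly, while the third row has identical scores for every class, so its argmax falls on class 0 and misses the target.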