<p><span style="color:#262626;">A typical training procedure for a neural network is as follows:</span></p>
<ul style="margin-left:0px;"><li>Define the neural network, which has some learnable parameters (or weights)</li><li>Iterate over a dataset of inputs</li><li>Process each input through the network</li><li>Compute the loss (how far the output is from correct)</li><li>Propagate gradients back into the network's parameters</li><li>Update the weights of the network, typically using a simple update rule: <code><span style="color:#6c6c6d;">weight</span> <span style="color:#6c6c6d;">=</span> <span style="color:#6c6c6d;">weight</span> <span style="color:#6c6c6d;">-</span> <span style="color:#6c6c6d;">learning_rate</span> <span style="color:#6c6c6d;">*</span> <span style="color:#6c6c6d;">gradient</span></code></li></ul>
<p>An example network definition:</p>
<pre class="blockcode"><code>import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)</code></pre>
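<p>The weight-update rule listed above can be sketched as a single training iteration. This is a minimal, self-contained sketch: the stand-in linear model, the dummy data, the learning rate, and the choice of <code>nn.MSELoss</code> are illustrative assumptions, and the same steps apply unchanged to the <code>Net</code> defined above.</p>
<pre class="blockcode"><code>import torch
import torch.nn as nn

model = nn.Linear(10, 2)          # stand-in model; any nn.Module works the same way
criterion = nn.MSELoss()

x = torch.randn(4, 10)            # dummy input batch
target = torch.randn(4, 2)        # dummy target, illustrative only

output = model(x)                 # process the input through the network
loss = criterion(output, target)  # how far the output is from correct

model.zero_grad()                 # clear any previously accumulated gradients
loss.backward()                   # propagate gradients back into the parameters

learning_rate = 0.01
with torch.no_grad():             # weight = weight - learning_rate * gradient
    for p in model.parameters():
        p -= learning_rate * p.grad</code></pre>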
<p><span style="color:#262626;">Generally, when you have to deal with image, text, audio, or video data, you can use standard python packages that load the data into a numpy array. You can then convert this array into a <code><span style="color:#6c6c6d;">torch.*Tensor</span></code>.</span></p>
<ul style="margin-left:0px;"><li>For images, packages such as Pillow and OpenCV are useful</li><li>For audio, packages such as scipy and librosa</li><li>For text, raw Python or Cython based loading, or NLTK and SpaCy, are useful</li></ul>
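<p>Converting such a numpy array into a <code>torch.*Tensor</code> can be sketched as follows (a minimal sketch; the array contents are arbitrary):</p>
<pre class="blockcode"><code>import numpy as np
import torch

a = np.random.rand(2, 3).astype(np.float32)  # data loaded by a standard python package
t = torch.from_numpy(a)                      # zero-copy: the tensor shares memory with the array

print(t.shape)  # torch.Size([2, 3])
print(t.dtype)  # torch.float32</code></pre>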
<p><span style="color:#262626;">Specifically for vision, we have created a package called <code><span style="color:#6c6c6d;">torchvision</span></code>, which includes data loaders for common datasets such as Imagenet, CIFAR10, and MNIST, and data transformers for images, namely <code><span style="color:#6c6c6d;">torchvision.datasets</span></code> and <code><span style="color:#6c6c6d;">torch.utils.data.DataLoader</span></code>.</span></p>
<p><span style="color:#262626;">This provides a huge convenience and avoids writing boilerplate code.</span></p>
<h2 style="margin-left:0px;"><span style="color:#262626;">Training an image classifier</span></h2>
<p><span style="color:#262626;">We will do the following steps in order:</span></p>
<ol style="margin-left:0px;"><li>Load and normalize the CIFAR10 training and test datasets using <code><span style="color:#6c6c6d;">torchvision</span></code></li><li>Define a convolutional neural network</li><li>Define a loss function</li><li>Train the network on the training data</li><li>Test the network on the test data</li></ol>
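<p>The first step above can be sketched as follows. This is a minimal sketch; the batch size and the <code>(0.5, 0.5, 0.5)</code> normalization constants (which rescale each channel from [0, 1] to [-1, 1]) are conventional illustrative choices, and <code>download=True</code> fetches the dataset on first use.</p>
<pre class="blockcode"><code>import torch
import torchvision
import torchvision.transforms as transforms

# Convert PIL images to tensors, then shift/scale each channel to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)</code></pre>
<p>The test set is loaded the same way with <code>train=False</code> and <code>shuffle=False</code>.</p>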
<h2>Implementing a network with numpy</h2>
<pre class="blockcode"><code># -*- coding: utf-8 -*-
import numpy as np

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

# Randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2</code></pre>
<p><span style="color:#f33b45;">Numpy is a great framework, but it cannot utilize GPUs to accelerate its numerical computations. For modern deep neural networks, GPUs often provide speedups of </span><a class="reference external" href="https://github.com/jcjohnson/cnn-benchmarks"><span style="color:#f33b45;">50x or greater</span></a><span style="color:#f33b45;">, so unfortunately numpy alone is not enough for modern deep learning.</span></p>
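<p>The numpy loop above ports almost line for line to PyTorch tensors, which can run on a GPU when one is available. This is a sketch following the same structure as the numpy version; only the tensor type and the device selection change, and printing the loss is omitted for brevity.</p>
<pre class="blockcode"><code>import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

x = torch.randn(N, D_in, device=device)
y = torch.randn(N, D_out, device=device)
w1 = torch.randn(D_in, H, device=device)
w2 = torch.randn(H, D_out, device=device)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    loss = (y_pred - y).pow(2).sum().item()

    # Backprop: the same gradient computation as the numpy version
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights: weight = weight - learning_rate * gradient
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2</code></pre>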