<div class="blogpost-body" id="cnblogs_post_body">
<p style="text-align:center;"><span style="font-size:18pt;">tensorflow 函数解析</span></p>
<p>1.optimizer.minimize(loss, var_list)</p>
<p>TensorFlow provides a rich set of optimizers, such as <code>GradientDescentOptimizer</code>. Its <code>minimize()</code> method automatically computes the gradients of the corresponding variables with respect to <code>loss</code>. An example:</p>
<div class="cnblogs_Highlighter">
<pre class="blockcode"><code class="language-python">loss = ...
opt = tf.tf.train.GradientDescentOptimizer(learning_rate=0.1)
train_op = opt.minimize(loss)
init = tf.initialize_all_variables()
with tf.Seesion() as sess:
sess.run(init)
for step in range(10):
session.run(train_op)
</code></pre>
</div>
<p>Let's take a look at the source code of <code>minimize()</code> (some parameters have been removed for clarity):</p>
<div class="cnblogs_code">
<pre class="blockcode"><span style="color:#008080;"> 1</span> <span style="color:#0000ff;">def</span> minimize(self, loss, global_step=None, var_list=None, name=<span style="color:#000000;">None):
</span><span style="color:#008080;"> 2</span>
<span style="color:#008080;"> 3</span> grads_and_vars = self.compute_gradients(loss, var_list=<span style="color:#000000;">var_list)
</span><span style="color:#008080;"> 4</span>
<span style="color:#008080;"> 5</span> vars_with_grad = [v <span style="color:#0000ff;">for</span> g, v <span style="color:#0000ff;">in</span> grads_and_vars <span style="color:#0000ff;">if</span> g <span style="color:#0000ff;">is</span> <span style="color:#0000ff;">not</span><span style="color:#000000;"> None]
</span><span style="color:#008080;"> 6</span> <span style="color:#0000ff;">if</span> <span style="color:#0000ff;">not</span><span style="color:#000000;"> vars_with_grad:
</span><span style="color:#008080;"> 7</span> <span style="color:#0000ff;">raise</span><span style="color:#000000;"> ValueError(
</span><span style="color:#008080;"> 8</span> <span style="color:#800000;">"</span><span style="color:#800000;">No gradients provided for any variable, check your graph for ops</span><span style="color:#800000;">"</span>
<span style="color:#008080;"> 9</span> <span style="color:#800000;">"</span><span style="color:#800000;"> that do not support gradients, between variables %s and loss %s.</span><span style="color:#800000;">"</span> %
<span style="color:#008080;">10</span> ([str(v) <span style="color:#0000ff;">for</span> _, v <span style="color:#0000ff;">in</span><span style="color:#000000;"> grads_and_vars], loss))
</span><span style="color:#008080;">11</span>
<span style="color:#008080;">12</span> <span style="color:#0000ff;">return</span> self.apply_gradients(grads_and_vars, global_step=<span style="color:#000000;">global_step,
</span><span style="color:#008080;">13</span> name=name)</pre>
</div>
<p>From the source code we can see that <code>minimize()</code> actually consists of two steps: <code>compute_gradients</code> and <code>apply_gradients</code>. The former computes the gradients, and the latter applies the computed gradients to update the corresponding variables; <strong><a href="https://blog.csdn.net/NockinOnHeavensDoor/article/details/80632677#%E6%A2%AF%E5%BA%A6%E4%BF%AE%E5%89%AA%E4%B8%BB%E8%A6%81%E9%81%BF%E5%85%8D%E8%AE%AD%E7%BB%83%E6%A2%AF%E5%BA%A6%E7%88%86%E7%82%B8%E5%92%8C%E6%B6%88%E5%A4%B1%E9%97%AE%E9%A2%98">gradient clipping is mainly used to avoid exploding and vanishing gradients during training</a></strong>, as sketched below. The two functions are described in detail in the following sections.</p>
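<p>To make the two-step structure concrete, here is a minimal sketch (not from the original post) of splitting <code>minimize()</code> into its two steps and inserting gradient clipping in between. It assumes a <code>loss</code> tensor is already defined, and the clipping threshold 5.0 is an arbitrary example value:</p>
<div class="cnblogs_code">
<pre class="blockcode"># step 1: compute the gradients explicitly instead of calling minimize()
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
grads_and_vars = opt.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)

# clip the global norm of the gradients to limit exploding gradients
# (5.0 is just an illustrative threshold)
clipped_grads, _ = tf.clip_by_global_norm(grads, 5.0)

# step 2: apply the (clipped) gradients to update the variables
train_op = opt.apply_gradients(list(zip(clipped_grads, variables)))</pre>
</div>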
<p>1.1 compute_gradients(loss, var_list)</p>
<p>Parameter meanings:</p>
<ul><li><strong>loss</strong>: the Tensor to be minimized</li><li><strong>var_list</strong>: Optional list or tuple of <code>tf.Variable</code> to update to minimize <code>loss</code>. Defaults to the list of variables collected in the graph under the key <code>GraphKeys.TRAINABLE_VARIABLES</code>.</li></ul>
<p style="margin-left:30px;">简单说该函数就是用于计算loss对于指定val_list的导数的,最终返回的是元组列表,即[(gradient, variable),...]。</p>
<div class="cnblogs_code">
<pre class="blockcode"><span style="color:#008080;">1</span> x = tf.Variable(initial_value=50., dtype=<span style="color:#800000;">'</span><span style="color:#800000;">float32</span><span style="color:#800000;">'</span><span style="color:#000000;">)
</span><span style="color:#008080;">2</span> w = tf.Variable(initial_value=10., dtype=<span style="color:#800000;">'</span><span style="color:#800000;">float32</span><span style="color:#800000;">'</span><span style="color:#000000;">)
</span><span style="color:#008080;">3</span> y = w*<span style="color:#000000;">x
</span><span style="color:#008080;">4</span>
<span style="color:#008080;">5</span> opt = tf.train.GradientDescentOptimizer(0.1<span style="color:#000000;">)
</span><span style="color:#008080;">6</span> grad =<span style="color:#000000;"> opt.compute_gradients(y, [w,x])
</span><span style="color:#008080;">7</span> <span style="color:#000000;">with tf.Session() as sess:
</span><span style="color:#008080;">8</span> <span style="color:#000000;"> sess.run(tf.global_variables_initializer())
</span><span style="color:#008080;">9</span> <span style="color:#0000ff;">print</span>(sess.run(grad))</pre>
</div>
<div class="cnblogs_code">
<pre class="blockcode">>>> [(50.0, 10. |
|
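<p style="margin-left:30px;">Since y = w*x, the gradient with respect to w is the current value of x (50.0) and the gradient with respect to x is the current value of w (10.0); each gradient is paired in the tuple with the value of its corresponding variable.</p>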