SGD (Stochastic Gradient Descent): y = x*x
With a fixed step size, the direction in which the function value drops toward its minimum fastest is the negative of the current gradient, so each update moves against the gradient.
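As a minimal sketch of that idea (the starting point, step size, and step count below are illustrative choices, not part of the original post), plain gradient descent on the y = x*x from the title looks like this:

# Illustrative sketch only: gradient descent on y = x*x, whose gradient is dy/dx = 2x.
x = 5.0                        # arbitrary starting point (assumption)
learning_rate = 0.1            # fixed step size (assumption)
for step in range(50):
    grad = 2 * x               # gradient of x*x at the current point
    x = x - learning_rate * grad
print(x)                       # approaches 0, the minimizer of y = x*x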

# -*- coding: UTF-8 -*-
import math

import numpy as np

__author__ = 'radio'

# Generate 100 training samples from y = 4 + 3x plus Gaussian noise,
# so the parameters we hope to recover are theta[0] = 4 and theta[1] = 3.
x = 2 * np.random.rand(100, 1)
y = 4 + 3 * x + np.random.randn(100, 1)

# Prepend a column of ones so the bias term is learned as theta[0].
x_b = np.c_[np.ones((100, 1)), x]
print("x_b (first 3 rows):\n{}".format(x_b[0:3]))

n_epochs = 100
t0, t1 = 1, 10
m = len(x_b)  # number of training samples (100)

def learning_schedule(t):
    # Decaying learning rate: starts at t0 / t1 and shrinks as t grows.
    return t0 / (t + t1)

theta = np.random.randn(2, 1)  # random initialization of the parameters

for epoch in range(n_epochs):
    for i in range(m):
        # Pick one random sample and compute the gradient of its squared error.
        random_index = np.random.randint(m)
        x_i = x_b[random_index:random_index + 1]
        y_i = y[random_index:random_index + 1]
        gradients = 2 * x_i.T.dot(x_i.dot(theta) - y_i)
        # Step against the gradient with a decaying learning rate.
        learning_rate = learning_schedule(epoch * m + i)
        theta = theta - learning_rate * gradients
    if epoch % 30 == 0:
        print("theta after epoch {}:\n{}".format(epoch, theta))

print("final theta:\n{}".format(theta))

# Euclidean distance between the estimate and the true parameters (4, 3).
error = math.sqrt((theta[0][0] - 4) ** 2 + (theta[1][0] - 3) ** 2)
print("error\n{}".format(error))
Code walkthrough:
Gradient descent: move in the direction of steepest descent, i.e. the fastest way down the hill.
Least squares principle: least squares minimizes the sum of squared differences between the two sides of the equation, i.e. it finds the minimum of the cost function below.
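The formula that presumably followed that sentence is not in the text; a standard form of the objective and the per-sample SGD update for this linear model (matching the `gradients = 2 * x_i.T.dot(x_i.dot(theta) - y_i)` line in the code above) is:

J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right)^2,
\qquad
\theta \leftarrow \theta - \eta \cdot 2\, x_i^{T} \left( x_i \theta - y_i \right)

where \eta is the decaying step size returned by learning_schedule.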