How Do Machines Learn? An Intuition

Machine learning is immensely popular nowadays. It influences what content we see, what products we buy, and who gets a mortgage approved and who doesn't. But how does it work?

By the end of this article, you will understand how a machine such as a computer can learn. The focus is on developing an intuition for the inner workings of machine learning: not any particular algorithm, but the ideas behind them.

Machine learning and artificial intelligence tackle numerous problems. Here I will focus on one of the most common and popular ones: classification.

Classification is the task of predicting the correct class for a given set of data. We take information that we have and use it to predict something that we don't have.

For example, a person applies to a bank for a mortgage. We know the person's job history, transaction history, age, borrowing amount and the house price. What the bank wants to know is whether approving this mortgage is going to be a good investment or a bad one. In this case, the class to predict is "good investment" or "bad investment".

Classification lets us abstract, group and make better decisions about the world.

So, how can machines learn to do classification? They learn by example. To teach a machine, we don't tell it what to do. We show it examples of what were good investments and bad investments in the past and let it learn from them. We call these examples training data.

Okay, so a machine learns by example. But how does it learn? Here, things vary a bit, but the two main approaches are:

- Learn the decision boundary.
- Learn where the sample to predict falls in the world.

The decision boundary is the line that best divides the examples we give into the classes that we want. At some point, we need to make a decision: to create a rule or set of rules that divides the two classes. That rule is the decision boundary.

A simple example would be classifying buildings as skyscraper or non-skyscraper. The modern definition is that buildings above 150 m are skyscrapers. But let's say we didn't know that. Instead, we have a set of examples of skyscraper and non-skyscraper buildings. If we give a machine this set of example buildings with their height and class, it will try to find the line that best divides them.

[Figure: Simple decision boundary]
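To make the skyscraper example concrete, here is a minimal sketch in Python of learning a decision boundary by trial and error. The building heights and labels are invented for illustration, and `learn_threshold` and `is_skyscraper` are hypothetical names. It shows one simple way to find a dividing line, not the method any particular algorithm uses.

```python
def learn_threshold(heights, labels):
    """Return the height cut-off that misclassifies the fewest training examples."""
    pairs = sorted(zip(heights, labels))
    # Candidate cut-offs: the midpoints between neighbouring heights.
    candidates = [(a[0] + b[0]) / 2 for a, b in zip(pairs, pairs[1:])]

    def errors(threshold):
        # Count examples whose predicted class differs from the known class.
        return sum((height > threshold) != label for height, label in pairs)

    # Trial and error: try every candidate and keep the one with fewest errors.
    return min(candidates, key=errors)


# Hypothetical training data: height in metres and whether it is a skyscraper.
heights = [30, 80, 120, 140, 160, 200, 310, 450]
labels = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(heights, labels)


def is_skyscraper(height):
    """Classify a new building using the learned boundary."""
    return height > threshold


print(threshold)           # learned boundary in metres
print(is_skyscraper(170))  # prediction for an unseen building
```

With these made-up examples the learned cut-off lands between the tallest non-skyscraper and the shortest skyscraper. With different examples it would land somewhere else, which is why the quality of the training data matters.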

That was a very simple example; the decision boundary can be far more complex.

The way the machine finds this line varies with the specific machine learning algorithm: trial and error, decision trees, support vectors, differentiation, and so on. But the principle is the same: find the line that most accurately separates the classes.

The other approach is to find where the sample to predict falls in the world. Here "the world" means the space represented by the input variables of the training data (what we know). In this case, the machine is looking for similar examples.

Things tend to be what is similar to them. If it walks like a duck and quacks like a duck, it is likely to be a duck.

If we had every possible example for every possible combination of variable values and wanted to predict the class of a given sample, we would merely have to look up the class of the sample with those values. But we rarely know every sample in the world.

So we take approaches that approximate this. We can look at the sample's neighbours and predict the most likely class based on them. If most neighbours are class A, the sample we want to classify is likely to be class A as well.
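The neighbour idea can be sketched in a few lines of Python. This is a minimal illustration with made-up two-variable points and a hypothetical `knn_predict` function. It assumes straight-line (Euclidean) distance is a sensible measure of similarity, which is a choice rather than a given.

```python
import math
from collections import Counter


def knn_predict(train, sample, k=3):
    """Return the majority class among the k training points closest to sample."""
    # Sort the training examples by their distance to the new sample.
    nearest = sorted(train, key=lambda item: math.dist(item[0], sample))[:k]
    # Let the k nearest neighbours vote with their class labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (point in a two-variable world, class).
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

print(knn_predict(train, (2.0, 2.0)))  # sample sitting among the class A points
print(knn_predict(train, (6.0, 6.5)))  # sample sitting among the class B points
```

Nothing is learned ahead of time here: the training examples themselves act as the model of the world, and the prediction is a lookup of who lives nearby.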

Another approach involves learning the probability distribution of each input variable for each class and, from there, calculating the most likely class of the sample to predict.

In essence, the machine learns how likely samples with certain values are to belong to a certain class.
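As a rough sketch of the probability approach, the Python below fits one bell curve (normal distribution) per class for a single input variable and picks the class under which a new value is most likely. The heights are invented, `fit` and `predict_class` are hypothetical names, and assuming a normal distribution is a simplification made here for illustration only.

```python
from statistics import NormalDist


def fit(samples):
    """Learn, for each class, how common it is and how its values are distributed."""
    by_class = {}
    for value, label in samples:
        by_class.setdefault(label, []).append(value)
    total = len(samples)
    # For each class: (share of the training data, normal curve fitted to its values).
    return {
        label: (len(values) / total, NormalDist.from_samples(values))
        for label, values in by_class.items()
    }


def predict_class(model, value):
    """Return the class with the highest prior * likelihood for this value."""
    return max(model, key=lambda label: model[label][0] * model[label][1].pdf(value))


# Hypothetical training data: (building height in metres, class).
samples = [
    (30, "non-skyscraper"), (80, "non-skyscraper"),
    (120, "non-skyscraper"), (140, "non-skyscraper"),
    (160, "skyscraper"), (200, "skyscraper"),
    (310, "skyscraper"), (450, "skyscraper"),
]

model = fit(samples)
print(predict_class(model, 100))
print(predict_class(model, 400))
```

With more than one input variable, a common simplification is to fit one curve per variable per class and multiply the likelihoods together, treating the variables as independent.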

Both approaches build an approximate model of the world and then answer the question: where does this new sample fall in the world?

The examples presented here are intentionally simplified, dealing with only one or two variables, to help develop an intuition. In the real world, machine learning can leverage computational power to learn from millions of training samples with complex relations between numerous variables and numerous classes. It can provide predictions much faster than any human can and, in some cases, more accurately.

This is what gives ML its power.

It would be impractical to have a person dedicated to choosing which videos on YouTube are likely to interest you specifically. But a machine? No problem. What class of products might interest you? Just send in some profile information and get the top classes. Predict the price of a house based on size, age, number of bathrooms, location and similar data from previous house sales? Yup, machine learning.

There are a lot of applications of and benefits to machine learning, and some warning signs too. But it is a very powerful technology and it is here to stay.

I hope this article helped you develop an intuition for how machines learn. It's not magic: it is some logical principles, data, and clever mathematics used to find a solution. With machine learning it is possible to tackle complex problems where we don't know the underlying relationships and rules. And it is allowing us to make better decisions about the world.

- Machines learn by example.
- We use data that we know to find what we don't know.
- At some point, we need to make a decision. Machines can learn by finding where the decision boundary is.
- Things tend to be what is similar to them. Machines can learn by building a model of the world and checking where a sample falls within it.

Translated from: https://medium.com/swlh/how-do-machines-learn-an-intuition-99465f2dff2f