Supervised learning solves the classification problem on the premise that a labeled sample set is available, but obtaining data labels is often expensive. Moreover, these labels are usually assigned manually, so labeling errors occur from time to time. This has driven the development of unsupervised learning strategies, which can be summarized in one sentence:
Machine learning methods that draw inferences from unlabeled data.
1. Scenarios

Because unsupervised learning does not require prior human judgment, it is generally used as a preliminary step of a learning task to regularize the data; after unsupervised learning, human knowledge must still be added to make the results useful. Figure 1-10 compares the two learning strategies in terms of when human knowledge is introduced.

Figure 1-10 Supervised learning and unsupervised learning
Generally speaking, it is easier for people to interpret data that has been organized by unsupervised learning than to curate labels for the sample data, so unsupervised learning usually requires less human participation.
Unsupervised learning algorithms are abundant; in terms of how they organize data, there are two main branches:
Clustering: the most important family of unsupervised learning methods. It partitions the existing sample data into several subsets, and the resulting model can also be used to assign new samples to those subsets.
Dimensionality reduction (Dimensionality Reduction): converting high-dimensional data into low-dimensional data while preserving the existing distance relationships between samples.
In addition, there are some smaller families of algorithms, such as covariance estimation (Covariance Estimation) and outlier detection (Outlier Detection); a brief sketch of the latter follows this list.
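As a minimal illustration of how covariance estimation and outlier detection connect, the sketch below uses scikit-learn's EllipticEnvelope, which fits a robust covariance estimate and flags points far outside the fitted ellipse; the data points are invented for illustration.

```python
# A minimal sketch of covariance-based outlier detection, assuming
# scikit-learn's EllipticEnvelope; the data points are made up.
import numpy as np
from sklearn.covariance import EllipticEnvelope

# Most points form one compact cloud; the last point is an obvious outlier.
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [1.05, 1.0],
              [0.95, 1.05], [1.0, 0.9], [0.9, 1.1], [1.1, 1.05],
              [20.0, 20.0]])

# fit_predict returns +1 for inliers and -1 for outliers.
detector = EllipticEnvelope(contamination=0.15, random_state=0)
print(detector.fit_predict(X))   # e.g. [ 1  1  1  1  1  1  1  1 -1]
```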

Figure 1-11 gives an example application scenario of clustering, the most important unsupervised learning method. It shows a clustering of bank customers that divides the existing customers into two subsets. Once the clustering model is trained, new customers can also be assigned to the corresponding subset by the existing model.

Figure 1-11 Example of a clustering scenario

Clustering only provides a scheme for partitioning data into subsets; the logical meaning of the partition must be interpreted by humans. In Figure 1-11, the algorithm divides all customers into two groups according to their deposit and loan amounts. For most banks, subset 1 probably corresponds to ordinary customers and subset 2 to important customers.
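To make the scenario of Figure 1-11 concrete, here is a minimal sketch of such a customer clustering, assuming scikit-learn's KMeans; the deposit/loan values and the two-subset interpretation are made up for illustration and are not taken from the figure's data.

```python
# A minimal sketch of the bank-customer clustering scenario, assuming
# scikit-learn; the feature values are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans

# Each row is one customer: [deposit amount, loan amount] (in thousands).
customers = np.array([
    [5, 2], [8, 3], [6, 1],           # small deposits and loans
    [120, 80], [150, 60], [110, 90],  # large deposits and loans
], dtype=float)

# Ask for two subsets, matching the two groups in the figure.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(customers)
print(labels)                       # e.g. [0 0 0 1 1 1]

# A new customer can be assigned to an existing subset by the trained model.
print(model.predict([[130, 70]]))   # e.g. [1] -> the "important customer" group
```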
2. Clustering algorithms
Clustering is still a developing field, and its methods are varied. This book focuses on several clustering strategies that are currently mature:
Partition methods (Partition Methods): the most basic family of algorithms, which cluster samples according to the distances between their features. The representative algorithms are K-means and its derivatives.
Density methods (Density Methods): subsets are formed by specifying the minimum number of members in each subset and the maximum distance between members. The most typical algorithm is DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
Model methods (Model Methods): represented mainly by probability models, typified by the Gaussian mixture model (GMM, Gaussian Mixture Model), and neural network models, typified by self-organizing maps (SOM, Self-Organizing Maps). Their characteristic is that a sample is not definitively assigned to a single subset; instead, the model gives the probability that the sample belongs to each subset (see the sketch after this list).
Hierarchical methods (Hierarchical Methods): unlike the other families, which divide the population into subsets of equal status, hierarchical methods divide the data set into a tree structure with parent-child relationships. This makes it possible to study the relationships between subclasses while clustering. The typical model is BIRCH.
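To illustrate the hard-versus-soft assignment distinction mentioned for model methods, the sketch below contrasts KMeans (a partition method) with GaussianMixture (a model method), assuming scikit-learn; the data points are invented for illustration.

```python
# A minimal sketch contrasting hard assignment (partition method) with
# soft assignment (model method), assuming scikit-learn; data are made up.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]])

# K-means assigns each sample to exactly one of the two subsets.
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))

# A Gaussian mixture model instead reports, for every sample, the
# probability of belonging to each subset (each row sums to 1).
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.predict_proba(X).round(3))
```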
3. Dimensionality reduction algorithms
As mentioned above, dimensionality reduction is usually used to compress the number of features before subsequent processing; compared with clustering, it is somewhat more abstract. This book introduces two types of dimensionality reduction strategies:
Linear dimensionality reduction: as the name suggests, it handles linear problems, and the models are simple. Common examples include principal component analysis (PCA, Principal Component Analysis) and linear discriminant analysis (LDA, Linear Discriminant Analysis).
Manifold learning (Manifold Learning): a hot topic in academic circles in recent years, it can handle nonlinear dimensionality reduction. Relatively mature algorithms include Isomap and locally linear embedding (LLE, Locally Linear Embedding); see the short sketch after this list.
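As a brief illustration of the two strategies, the sketch below reduces a 3-D data set to 2-D once with PCA and once with Isomap, assuming scikit-learn; the S-curve data set is only an illustrative stand-in, not data from the book.

```python
# A minimal sketch of linear vs. manifold dimensionality reduction,
# assuming scikit-learn; the S-curve data set is only for illustration.
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

X, _ = make_s_curve(n_samples=300, random_state=0)   # 3-D nonlinear manifold

# PCA: a linear projection onto the 2 directions of largest variance.
X_pca = PCA(n_components=2).fit_transform(X)

# Isomap: a manifold-learning method that tries to preserve distances
# measured along the curved surface while reducing to 2 dimensions.
X_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

print(X.shape, X_pca.shape, X_iso.shape)   # (300, 3) (300, 2) (300, 2)
```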
Chapters 4 and 5 of this book discuss the main clustering and dimensionality reduction algorithms in detail.
