KL divergence

KL divergence (Kullback–Leibler divergence) measures how well one probability distribution matches another: the greater the difference between the two distributions, the greater the KL divergence.

definition:

D_KL(p ‖ q) = Σ_x p(x) · log( p(x) / q(x) )

where p(x) is the true distribution and q(x) is the target distribution (the distribution used to model it). If the two distributions match exactly, then D_KL(p ‖ q) = 0.

 
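As a quick numerical illustration of the definition (a minimal sketch; the two toy distributions below are made up for illustration), the divergence is zero when the distributions are identical and is not symmetric in its arguments:

```python
import math

def kl(p, q):
    # D(p || q) = sum over x of p(x) * log(p(x) / q(x))
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl(p, p))              # 0.0 -- identical distributions lose no information
print(kl(p, q) > 0)          # True
print(kl(p, q) != kl(q, p))  # True -- KL divergence is not symmetric
```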

first group: We collected the ages of 100 people, shown in the table below, and use KL divergence to find which distribution type fits the data best.

age   | 0 | 1 | 2 | 3  | 4  | 5  | 6  | 7  | 8 | 9 | 10 | total
count | 3 | 6 | 7 | 11 | 13 | 18 | 15 | 11 | 7 | 5 | 4  | 100
attempt 1: modeling with a uniform distribution

Visualization: the yellow bars are the target uniform distribution model, compared against the true distribution in blue.

 

attempt 2: modeling with a Gaussian distribution

Visualization (the red dotted line is a normal distribution curve fitted with the same mean and standard deviation; the blue bar chart shows the probability density):

Computational analysis:

The naked eye cannot reliably judge whether the true distribution is closer to the uniform or the Gaussian model. KL divergence measures the information lost when the target distribution is used to approximate the true distribution, so the two models can be compared quantitatively to determine which distribution is closer.

1. Computing the KL divergence against the uniform distribution:

```python
import math
import numpy as np

count = np.array([3, 6, 7, 11, 13, 18, 15, 11, 7, 5, 4])
count_rate = count / 100   # empirical probabilities p(x)
balance_rate = 1 / 11      # uniform model q(x) over the 11 ages
kl = 0.0
for i in range(11):
    kl += count_rate[i] * math.log(count_rate[i] / balance_rate)
print(kl)
```

The result is: 0.12899493763053263
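The same loop can also be written in vectorized form. As a sketch (the helper name kl_divergence is my own), this reproduces the figure above:

```python
import numpy as np

def kl_divergence(p, q):
    """D(p || q) for discrete distributions given as equal-length arrays."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([3, 6, 7, 11, 13, 18, 15, 11, 7, 5, 4]) / 100
q = np.full(11, 1 / 11)
print(kl_divergence(p, q))  # same value as the loop above
```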

 

2. Computing the KL divergence against the Gaussian distribution:

```python
import math
import numpy as np

def gaosi(x):
    # normal probability density with the given mean and standard deviation
    mu = 5.03
    sigma = 2.4349743325135895
    t1 = 1 / (sigma * math.sqrt(2 * math.pi))
    t2 = ((x - mu) ** 2) / (2 * sigma * sigma)
    return math.exp(-t2) * t1

count = np.array([3, 6, 7, 11, 13, 18, 15, 11, 7, 5, 4])
count_rate = count / 100
kl = 0.0
for i in range(11):
    kl += count_rate[i] * math.log(count_rate[i] / gaosi(i))
print(kl)
```

The result is: 0.03997441345364968
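The hard-coded mu and sigma appear to be the sample mean and standard deviation of the age data; assuming that is how they were obtained, they can be recomputed directly from the counts:

```python
import numpy as np

count = np.array([3, 6, 7, 11, 13, 18, 15, 11, 7, 5, 4])
ages = np.arange(11)
n = count.sum()

# weighted mean and (population) standard deviation of the 100 ages
sample_mean = np.sum(ages * count) / n
sample_std = np.sqrt(np.sum(count * (ages - sample_mean) ** 2) / n)
print(sample_mean, sample_std)
```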

 

conclusion:

Considering only the uniform and Gaussian models, fitting the data with a Gaussian distribution loses the least information, so the distribution of this data set is better described by a Gaussian.
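Pulling the two computations together (a minimal sketch; the names kl, q_uniform, and q_gauss are my own), the comparison in the conclusion amounts to choosing the candidate model with the smaller divergence:

```python
import math
import numpy as np

p = np.array([3, 6, 7, 11, 13, 18, 15, 11, 7, 5, 4]) / 100
x = np.arange(11)

# candidate models over the 11 ages
q_uniform = np.full(11, 1 / 11)
mu, sigma = 5.03, 2.4349743325135895
q_gauss = np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

scores = {"uniform": kl(p, q_uniform), "gaussian": kl(p, q_gauss)}
best = min(scores, key=scores.get)
print(best)  # the model that loses the least information
```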
