- 2020-07-28 20:06
- Multivariate statistical analysis

A multivariate linear regression model is usually used to study the relationship between a dependent variable and multiple independent variables. If that relationship can be described in linear form, a multivariate linear model can be established for the analysis.

1. Introduction to the model

1.1 The structure of the model

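A multiple linear regression model describes a random linear relationship between a variable y and variables x1, …, xk, namely:

$$ y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon $$

where ε is a random error term. If n observations are made on y and x, giving n sets of observed values (yi, x1i, …, xki), i = 1, 2, …, n, they satisfy the relationship:

$$ y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + \varepsilon_i, \quad i = 1, 2, \ldots, n $$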

1.2 Test of model parameters

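Under the normality assumption, if the design matrix X (an n × (k+1) matrix whose first column is all ones and whose remaining columns hold the observed regressors) has full column rank, the least-squares estimate of the parameters of the ordinary linear regression model is:

$$ \hat{\beta} = (X^{\top} X)^{-1} X^{\top} y $$

where y is the vector of the n observed responses. The estimated (fitted) value of y is therefore:

$$ \hat{y} = X \hat{\beta} = X (X^{\top} X)^{-1} X^{\top} y $$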
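As a minimal sketch of the least-squares estimate, the R code below solves the normal equations by hand and compares the result with lm(). The data are simulated; the sample size, the true coefficients, and all variable names are assumptions made only for illustration and have nothing to do with the data set used later in this post.

```r
# Simulated data; sample size and true coefficients are arbitrary assumptions
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

# Design matrix with a leading column of ones for the intercept
X <- cbind(1, x1, x2)

# Least-squares estimate: solve (X'X) beta = X'y instead of inverting X'X
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Fitted values: y_hat = X beta_hat
y_hat <- X %*% beta_hat

# lm() should reproduce the same coefficients
fit <- lm(y ~ x1 + x2)
print(cbind(manual = drop(beta_hat), lm = coef(fit)))
```

Solving the linear system with solve(A, b) is generally preferred over forming the inverse explicitly; lm() itself uses a QR decomposition, so the two results agree only up to rounding error.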

（1） Significance test of regression equation
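In its usual textbook form, this test asks whether all slope coefficients are zero at the same time. The symbols SSR and SSE below are introduced here for illustration; k is the number of independent variables, n the number of observations, and β1, …, βk the slope coefficients.

$$ H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0 $$

$$ F = \frac{SSR / k}{SSE / (n - k - 1)} \sim F(k,\, n - k - 1) \quad \text{under } H_0 $$

$$ SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$

H0 is rejected, and the regression equation is judged significant, when the p-value of F falls below the chosen significance level (0.05 is the level used later in this post).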

（2） Significance test of regression coefficient
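The test for a single coefficient is usually written as follows. The symbols c_jj and σ̂ are notation introduced here; X is the design matrix with a leading column of ones, and SSE is the residual sum of squares.

$$ H_0: \beta_j = 0 $$

$$ t_j = \frac{\hat{\beta}_j}{\hat{\sigma} \sqrt{c_{jj}}} \sim t(n - k - 1) \quad \text{under } H_0, \qquad \hat{\sigma}^2 = \frac{SSE}{n - k - 1} $$

Here c_jj is the diagonal element of (X^T X)^{-1} that corresponds to β_j. A small p-value for t_j indicates that x_j contributes significantly given the other variables in the model; these are the values shown in the coefficient table of summary() for an lm fit.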

2. Modeling steps

（1） Build the regression model from the data

（2） Test the model for significance

（3） Perform regression diagnostics on the model

3. Modeling

（1） Building the regression model

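    library(car)
    # Read the comma-separated data file; the first row holds the column names
    a = read.table("C:/Users/MrDavid/data_TS/reg.csv", sep = ",", header = TRUE)

    a   # print the data
    # Fit the full model with all four independent variables
    lm.salary = lm(Fuxian ~ x1 + x2 + x3 + x4, data = a)
    summary(lm.salary)   # note: the garbled variable name in the output is just y

The coefficients of x2, x3, and x4 are not significant.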

（2） Selecting variables

    # Stepwise selection by AIC, allowing variables to be dropped and re-added
    lm.step = step(lm.salary, direction = "both")

If the variable x2 is removed, the AIC is 648.49; if x3 is removed, the AIC is 650.85; if x1 is removed, the AIC is 715.19. Removing x2 gives the lowest AIC, so x2 is removed here.
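To make the AIC comparison concrete, here is a small sketch on simulated data (the data frame d, its columns, and the coefficients are assumptions for illustration, not the post's data). drop1() produces a table of the same kind that step() prints at each round: the AIC of the model after deleting each term in turn.

```r
# Toy data in which only x1 truly affects y (an assumption for illustration)
set.seed(2)
n <- 60
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 3 + 2 * d$x1 + rnorm(n)

full <- lm(y ~ x1 + x2 + x3, data = d)

# AIC after deleting each term in turn; the "<none>" row is the current model
tab <- drop1(full)
print(tab)

# For an lm fit this AIC equals n * log(RSS / n) + 2 * (number of coefficients)
rss      <- sum(residuals(full)^2)
aic_full <- n * log(rss / n) + 2 * length(coef(full))

# The term whose removal yields the lowest AIC is the candidate to drop
candidates <- tab[-1, ]
print(rownames(candidates)[which.min(candidates$AIC)])
```

A term is only worth dropping if its row has a lower AIC than the "<none>" row; step() stops when no deletion or addition lowers the AIC any further.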

Carry out the next round of calculation ：

    # Refit without x2 and run the stepwise selection again
    lm.salary = lm(Fuxian ~ x1 + x3 + x4, data = a)
    lm.step = step(lm.salary, direction = "both")

Removing x3 gives an AIC of 647.64, so x3 is removed.

Fit the model with only x1 and x4:

    lm.salary = lm(Fuxian ~ x1 + x4, data = a)
    summary(lm.salary)

The p-value of the F test is less than 0.05, so the regression is significant, and each coefficient is also significant.

（3） Diagnosing the residuals of the regression model above

Calculate the standardized residuals of the model:

    library(TSA)
    y.rst = rstandard(lm.step)   # standardized residuals of the stepwise model
    y.rst

Draw the residual scatter plot:

Observations 4 and 35 are clearly outliers, so these two points are removed.
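The outliers above were read off the plot. As a hedged sketch of a numeric check, the code below flags observations whose standardized residual exceeds a cutoff in absolute value. The cutoff of 2 is a common rule of thumb and an assumption here, not something stated in this post; the data are simulated, with two outliers planted at rows 4 and 35 purely to mirror the example.

```r
# Simulated data with two planted outliers (illustration only)
set.seed(3)
n <- 40
d <- data.frame(x = rnorm(n))
d$y <- 1 + 2 * d$x + rnorm(n)
d$y[c(4, 35)] <- d$y[c(4, 35)] + 8

fit <- lm(y ~ x, data = d)
r   <- rstandard(fit)

# Row indices whose standardized residual exceeds the assumed cutoff
cutoff  <- 2
flagged <- which(abs(r) > cutoff)
print(flagged)

# Refit without the flagged rows, as done for rows 4 and 35 in the text
fit2 <- lm(y ~ x, data = d[-flagged, ])
print(summary(fit2)$sigma)
```

Flagged rows deserve a look before deletion: a large residual can also point to a missing variable or a needed transformation rather than a bad observation.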

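    # Refit on log(Fuxian), leaving out observations 4 and 35
    lm.salary = lm(log(Fuxian) ~ x1 + x2 + x3 + x4, data = a[-c(4, 35), ])
    lm.step = step(lm.salary, direction = "both")

    # Plot the standardized residuals against the fitted values
    y.rst = rstandard(lm.step)
    y.fit = predict(lm.step)
    plot(y.rst ~ y.fit)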

The result after removing two points ：

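Draw the model diagnostic plots:

    par(mfrow = c(2, 2))          # arrange the four diagnostic plots in a 2 x 2 grid
    plot(lm.step)
    influence.measures(lm.step)   # table of influence statistics per observation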

The residuals-versus-fitted plot shows a basically random pattern, and the points in the normal Q-Q plot fall roughly on a straight line, which indicates that the residuals follow a normal distribution. In the scale-location plot and the residuals-versus-leverage plot, the points form a single group that is not far from the center. The plots also suggest that observations 3, 4, and 35 may be outliers and strongly influential points.
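The output of influence.measures() can also be queried directly instead of being read by eye. The sketch below uses simulated data with one planted high-leverage point at row 3 (an assumption for illustration) and lists the rows that R marks as influential under any of its built-in criteria.

```r
# Simulated data with one planted influential point (illustration only)
set.seed(4)
n <- 40
d <- data.frame(x = rnorm(n))
d$y <- 1 + 2 * d$x + rnorm(n)
d$x[3] <- 6     # far from the other x values: high leverage
d$y[3] <- -5    # far from the fitted line: large residual

fit  <- lm(y ~ x, data = d)
infl <- influence.measures(fit)

# Rows flagged by at least one criterion (DFBETAS, DFFITS, cov.r, Cook's D, hat)
flagged <- which(apply(infl$is.inf, 1, any))
print(flagged)

# Cook's distance and leverage for a closer look at the top candidates
cd  <- cooks.distance(fit)
lev <- hatvalues(fit)
print(head(sort(cd, decreasing = TRUE), 3))
```

The is.inf component is a logical matrix with one row per observation, so the same line works unchanged for a model such as lm.step.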
