- 2020-07-28 20:06
- multivariate statistical analysis

A multiple linear regression model is typically used to study the relationship between one dependent variable and several independent variables. If that relationship can be described in linear form, a multiple linear model can be fitted and analyzed.

1. Introduction to the model

1.1 The structure of the model

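A multiple linear regression model describes a stochastic linear relationship between the variable y and the variables x1, …, xk, namely, in its standard form:

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$$

If y and x1, …, xk are observed n times, giving n sets of observations yi, x1i, …, xki (i = 1, 2, …, n), these observations satisfy the relationship:

$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + \varepsilon_i, \quad i = 1, 2, \ldots, n$$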

1.2 Test of model parameters

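Under the normality assumption, if the design matrix X (the n × (k+1) matrix made of a column of ones followed by the observed values of x1, …, xk) has full rank, the least squares estimate of the parameters of the ordinary linear regression model is:

$$\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y$$

Therefore, the estimated value of y is:

$$\hat{y} = X \hat{\beta} = X (X^{\top} X)^{-1} X^{\top} y$$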
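As a minimal sketch of the least squares estimate (X'X)^(-1) X'y, the following R code solves the normal equations directly and compares the result with `lm()`. The data are simulated, and the names `n`, `x1`, `x2`, `beta_hat`, `y_hat`, and `fit` are illustrative assumptions, not part of the original analysis.

```r
# Simulated data (an assumption for illustration; not the reg.csv data)
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

# Design matrix with an intercept column
X <- cbind(1, x1, x2)

# Least squares estimate from the normal equations: solve (X'X) b = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Fitted values: y_hat = X %*% beta_hat
y_hat <- X %*% beta_hat

# The same fit with lm()
fit <- lm(y ~ x1 + x2)

# Both approaches should agree up to numerical error
print(cbind(normal_eq = drop(beta_hat), lm = coef(fit)))
```

Solving the linear system with `solve(A, b)` avoids forming the explicit inverse. `lm()` itself uses a QR decomposition, which is numerically more stable than the normal equations.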

（1） Significance test of regression equation

（2） Significance test of regression coefficient
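In R, both tests can be read from `summary()` of a fitted `lm` object: the F statistic tests the regression equation as a whole, and the t statistics test each coefficient separately. A minimal sketch on simulated data follows; the data and all variable names are assumptions for illustration.

```r
# Simulated data (an assumption for illustration)
set.seed(2)
n  <- 60
x1 <- rnorm(n)
x2 <- rnorm(n)                # x2 has no true effect on y
y  <- 3 + 1.5 * x1 + rnorm(n)

fit <- lm(y ~ x1 + x2)
s   <- summary(fit)

# (1) Overall F test: H0 is that all slope coefficients are zero
f   <- s$fstatistic           # named vector: value, numdf, dendf
f_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

# (2) t test for each coefficient: H0 is that the coefficient is zero
coef_table <- s$coefficients  # columns: Estimate, Std. Error, t value, Pr(>|t|)

print(f_p)
print(coef_table)
```

Each t value in the table is the estimate divided by its standard error, and the last column is the corresponding two-sided p-value.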

2. Modeling steps

（1） Fit the regression model to the data

（2） Test the model for significance

（3） Run regression diagnostics on the model

3. Modeling

（1） Fitting the regression model

    library(car)
    a=read.table("C:/Users/MrDavid/data_TS/reg.csv",sep=",",header=T)
    a
    # Note: Fuxian is simply y; the column name is the result of garbled encoding
    lm.salary=lm(Fuxian~x1+x2+x3+x4,data=a)
    summary(lm.salary)

The coefficients of x2, x3, and x4 are not significant.

（2） Selecting variables

lm.step=step(lm.salary,direction="both")

Removing x2 gives an AIC of 648.49, removing x3 gives 650.85, and removing x1 gives 715.19. Removing x2 gives the lowest of these values, so x2 is removed here.
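To see the numbers that `step()` compares in one round, `drop1()` lists the AIC obtained by removing each term in turn. A minimal sketch on simulated data follows; the data and variable names are assumptions for illustration. Note that `step()` and `drop1()` use the AIC from `extractAIC()`, which for linear models differs from the value of `AIC()` by an additive constant, so the ranking of models is the same.

```r
# Simulated data (an assumption for illustration)
set.seed(3)
n  <- 80
x1 <- rnorm(n)
x2 <- rnorm(n)                # no true effect
x3 <- rnorm(n)                # no true effect
y  <- 2 + 3 * x1 + rnorm(n)
d  <- data.frame(y, x1, x2, x3)

fit <- lm(y ~ x1 + x2 + x3, data = d)

# AIC of the current model ("<none>") and of each model with one term removed
tab <- drop1(fit)
print(tab)

# The row with the lowest AIC is the move step() takes next
# ("<none>" would mean that no removal improves the model)
best <- rownames(tab)[which.min(tab$AIC)]
print(best)

# Full stepwise search; trace = 0 suppresses the per-step printout
fit_step <- step(fit, direction = "both", trace = 0)
print(formula(fit_step))
```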

Run the next round:

    lm.salary=lm(Fuxian~x1+x3+x4,data=a)
    lm.step=step(lm.salary,direction="both")

Removing x3 gives an AIC of 647.64, so x3 is removed.

Only x1 and x4 remain, so fit the model with these two variables:

    lm.salary=lm(Fuxian~x1+x4,data=a)
    summary(lm.salary)

The p-value of the F test is below 0.05, so the regression is significant, and each coefficient is also significant.

（3） Diagnosing the regression residuals of the model above

Calculate the standardized residuals of the model:

    library(TSA)
    y.rst=rstandard(lm.step)
    y.rst
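As a sketch of how standardized residuals can be used to screen for outliers, the code below fits a model to simulated data with two deliberately shifted observations and flags residuals larger than 2 in absolute value. The data, the injected shifts, and the cutoff of 2 (a common rule of thumb, not a value stated in this post) are assumptions for illustration.

```r
# Simulated data with two injected outliers (assumptions for illustration)
set.seed(4)
n  <- 40
x1 <- rnorm(n)
x4 <- rnorm(n)
y  <- 5 + 2 * x1 + x4 + rnorm(n)
y[c(4, 35)] <- y[c(4, 35)] + 15
d  <- data.frame(y, x1, x4)

fit <- lm(y ~ x1 + x4, data = d)

# Standardized residuals: residuals scaled by their estimated standard deviation
r_std <- rstandard(fit)

# Flag observations beyond the rule-of-thumb cutoff
flagged <- which(abs(r_std) > 2)
print(flagged)

# Refit on the rows that were not flagged
keep      <- setdiff(seq_len(n), flagged)
fit_clean <- lm(y ~ x1 + x4, data = d[keep, ])
print(summary(fit_clean)$sigma)
```

Selecting rows with `setdiff()` instead of a negative index keeps the refit correct even when nothing is flagged, since `d[-integer(0), ]` would return an empty data frame.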

Draw the residual scatter plot ：

Observations 4 and 35 are clearly outliers, so remove these two points. Note that the refit below also changes the response to log(Fuxian) and starts again from all four variables.

    lm.salary=lm(log(Fuxian)~x1+x2+x3+x4,data=a[-c(4,35),])
    lm.step=step(lm.salary,direction="both")
    y.rst=rstandard(lm.step)
    y.fit=predict(lm.step)
    plot(y.rst~y.fit)

The result after removing the two points:

Draw the model diagnostic plots:

    par(mfrow=c(2,2))
    plot(lm.step)
    influence.measures(lm.step)

The residuals-versus-fitted plot shows an essentially random pattern, and the points of the normal Q-Q plot fall essentially on a straight line, which suggests that the residuals are approximately normally distributed. In the scale-location plot and the residuals-versus-leverage plot, the points form a single group that stays close to the center. The plots indicate that observations 3, 4, and 35 may be outliers and influential points.
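The numeric side of these diagnostic plots can be inspected directly. The following sketch, on simulated data with one high-leverage, shifted observation (an assumption for illustration, as are all variable names), extracts the leverage, the Cook's distance, and the cases that `influence.measures()` marks as influential.

```r
# Simulated data (an assumption for illustration)
set.seed(5)
n  <- 40
x1 <- rnorm(n)
y  <- 1 + 2 * x1 + rnorm(n)
x1[3] <- 6                    # move observation 3 far from the other x values
y[3]  <- -10                  # and far from the true regression line
d <- data.frame(y, x1)

fit <- lm(y ~ x1, data = d)

lev  <- hatvalues(fit)        # leverage of each observation
cook <- cooks.distance(fit)   # Cook's distance of each observation

# influence.measures() marks a case when any of its measures exceeds a cutoff
im <- influence.measures(fit)
influential <- which(apply(im$is.inf, 1, any))

print(which.max(lev))
print(which.max(cook))
print(influential)
```

A point with both high leverage and a large residual, like the constructed observation 3 here, is the kind of case that appears in the corner of the residuals-versus-leverage plot.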

©2020 ioDraw All rights reserved
