Regression analysis model
2022-07-20 07:01:00 【Qingyuan warm song】
Catalog
One 、 Overview of regression analysis
1.1 Deterministic relationship (functional relationship)
1.2 Non-deterministic relationship (correlation)
Two 、 Univariate linear regression
Three 、 Multiple linear regression
4.1 Univariate regression test
4.1.1 Analysis of variance (F Test method )
4.1.2 Correlation coefficient test (r Test method )
4.2 Multiple linear regression test
4.2.1 Goodness of fit test (r)—— Model validity
4.2.2 Significance test of regression equation (F)—— Linear correlation
4.2.3 Significance test of regression parameters (t)—— Influence of respective variables
Five 、 Linearization of nonlinear regression
One 、 Overview of regression analysis
1.1 Deterministic relationship (functional relationship)
A deterministic relationship is one in which, once the values of some variables are fixed, the values of the other variables are completely determined.
1.2 Non-deterministic relationship (correlation)
A correlation is a dependency between variables in which fixing the values of some variables changes the values of the others without determining them completely. In this case the relationship between the variables cannot be expressed exactly by a function.
1.3 Regression analysis
Regression analysis is a method in mathematical statistics for studying the correlation between a response variable (the dependent variable) and one or more predictor variables (the independent variables). A definite functional relationship is used to describe approximately how y depends on x; this function is called the regression equation.
Regression analysis with only one predictor variable is called univariate regression analysis;
regression analysis with more than one predictor variable is called multiple regression analysis.
For example: (1) Independent variable (predictor): father's height; dependent variable (response): son's height
(2) Independent variables (predictors): IQ, time T; dependent variable (response): academic performance
1.4 Least squares method
Suppose y = m(x) + ε, where ε is called the random error. It is usually assumed that
ε ~ N(0, σ²)
The least squares method works as follows: let m(x) = m(x; a1, a2, ..., ak), where a1 to ak are unknown parameters. We choose the parameter values that minimize the sum of squared deviations between the observations yi and the corresponding function values m(xi).
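Written out, the criterion above is the following minimization. The symbol Q for the sum of squared deviations and n for the number of observations are notation assumed here for illustration, and the stationarity conditions hold when m is differentiable in its parameters.

```latex
Q(a_1,\dots,a_k) = \sum_{i=1}^{n} \bigl( y_i - m(x_i; a_1,\dots,a_k) \bigr)^2 \to \min,
\qquad
\frac{\partial Q}{\partial a_j} = 0, \quad j = 1,\dots,k .
```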
Two 、 Univariate linear regression
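For the univariate model y = a + bx + ε, the least squares estimates have the standard closed form b̂ = S_xy / S_xx and â = ȳ - b̂ x̄. The following is a minimal sketch in plain Python; the function name `fit_line` and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least squares estimates (a_hat, b_hat) for the model y = a + b*x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # S_xx and S_xy are the centered sums of squares and cross products
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat


# Hypothetical data lying exactly on y = 1 + 2x
a_hat, b_hat = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

With noisy data the same function returns the line that minimizes the sum of squared vertical deviations.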
Three 、 Multiple linear regression
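According to the least squares method, the coefficients b0, b1, ..., bp are chosen to make the residual sum of squares
Q = Σ (yi - b0 - b1 xi1 - ... - bp xip)²
minimal. Taking the partial derivative of Q with respect to each of b0, ..., bp and setting it equal to 0 gives the normal equations X'X b = X'y, whose solution is b̂ = (X'X)^(-1) X'y.
When X'X is not invertible, or when the independent variables are multicollinear, ordinary least squares cannot be used; principal component analysis followed by regression can be used instead.
(X' is the transpose of X. A matrix A is invertible if there exists a matrix B such that AB = BA = E, where E is the identity matrix.)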
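As a rough illustration of solving X'X b = X'y, here is a minimal sketch in plain Python. It uses Gauss-Jordan elimination rather than an explicit inverse; the function names `solve_linear_system` and `fit_multiple`, the singularity tolerance, and the data are all assumptions made for this example.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, ys):
    """Return [b0, b1, ..., bp] solving the normal equations X'X b = X'y."""
    x = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
coeffs = fit_multiple(rows, ys)
```

When one column is an exact multiple of another (perfect multicollinearity), the solver raises an error, which corresponds to the case where X'X is not invertible.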
Four 、 Regression test
4.1 Univariate regression test
4.1.1 Analysis of variance (F Test method )
Q_total: the total sum of squares, which reflects the overall dispersion of the observations y1, ..., yn
Q_residual: the residual sum of squares, which reflects how far the observations deviate from the regression line; this deviation is caused by random factors such as observation error
Q_regression: the regression sum of squares, which reflects the dispersion of the fitted values; this dispersion is caused by the linear correlation between Y and X
The ratio of Q_regression to Q_residual reflects the relative influence of the linear correlation and of random factors on y: the larger the ratio, the stronger the linear correlation
When H_0: b = 0 is true, F = Q_regression / (Q_residual / (n-2)) ~ F(1, n-2)
Given the significance level α, if F ≥ F_α(1, n-2), reject H_0; that is, the linear relationship is significant. Otherwise, y is considered to have no linear correlation with x, and the linear regression equation has no practical meaning.
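The decomposition and the F statistic can be sketched as follows, assuming the usual univariate form F = Q_regression / (Q_residual / (n-2)). The function name `anova_f` and the data are hypothetical; the critical value is taken from a standard F table rather than computed.

```python
def anova_f(xs, ys):
    """Return (q_total, q_reg, q_res, f_stat) for the fitted line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    fitted = [a_hat + b_hat * x for x in xs]
    q_total = sum((y - y_bar) ** 2 for y in ys)
    q_reg = sum((f - y_bar) ** 2 for f in fitted)
    q_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    # Note: a perfect fit gives q_res == 0 and this division fails
    f_stat = q_reg / (q_res / (n - 2))
    return q_total, q_reg, q_res, f_stat


# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
q_total, q_reg, q_res, f_stat = anova_f(xs, ys)

F_CRIT = 10.13  # F_0.05(1, 3), read from a standard F table
significant = f_stat >= F_CRIT
```

For a least squares line with an intercept, q_total equals q_reg + q_res up to rounding, which is a convenient sanity check.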
4.1.2 Correlation coefficient test (r Test method )
r = S_xy / √(S_xx S_yy)
If the absolute value of r is very small, the linear correlation between y and x is not significant, or there is no linear correlation at all; if the absolute value of r is large (close to 1), the linear correlation is significant. With n observations and critical values taken from a correlation coefficient table:
(1) if |r| ≤ r_0.05(n-2), the linear correlation between y and x is considered not significant, or absent;
(2) if r_0.05(n-2) < |r| ≤ r_0.01(n-2), the linear correlation between y and x is considered significant;
(3) if |r| > r_0.01(n-2), the linear correlation between y and x is considered highly significant.
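The three-way rule above can be sketched as below. The function names `correlation_r` and `classify` and the data are hypothetical; the critical values are not computed but passed in from a correlation coefficient table (the two shown are the tabulated values for n-2 = 3).

```python
import math


def correlation_r(xs, ys):
    """Sample correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)


def classify(r, r_05, r_01):
    """Apply the three-way rule; r_05 and r_01 are r_0.05(n-2) and r_0.01(n-2)."""
    if abs(r) <= r_05:
        return "not significant"
    if abs(r) <= r_01:
        return "significant"
    return "highly significant"


# Hypothetical data with n = 5, so the table is read at n - 2 = 3
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = correlation_r(xs, ys)
verdict = classify(r, r_05=0.878, r_01=0.959)
```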
4.1.3 t Test method
4.2 Multiple linear regression test
4.2.1 Goodness of fit test (r)—— Model validity
Construct an index that shows directly how good the fit is: R² = SSR / SST, where SSR is the regression sum of squares and SST is the total sum of squares.
The larger R² is, the more of the variation in the dependent variable is explained by the independent variables, that is, the larger the share of the total variation that is due to the independent variables, and the more tightly the observations cluster around the regression line. Value range: 0 to 1
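A minimal sketch of computing R² = SSR / SST from observations and fitted values follows. The function name `r_squared` and the numbers are hypothetical; the fitted values are assumed to come from a least squares fit with an intercept, which is what keeps the ratio between 0 and 1.

```python
def r_squared(ys, fitted):
    """R^2 = SSR / SST for fitted values from a least squares fit with intercept."""
    y_bar = sum(ys) / len(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)      # total sum of squares
    ssr = sum((f - y_bar) ** 2 for f in fitted)  # regression sum of squares
    return ssr / sst


# Hypothetical observations and their least squares fitted values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
fitted = [2.04, 4.03, 6.02, 8.01, 10.00]
r2 = r_squared(ys, fitted)
```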
4.2.2 Significance test of regression equation (F)—— Linear correlation
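Test whether the linear relationship between Y and the explanatory variables x1, x2, ..., xk is significant.
Steps of the test:
(1) State the hypotheses: H_0: b1 = b2 = ... = bk = 0 against H_1: not all of b1, ..., bk are 0
(2) Compute the statistic F = (SSR / k) / (SSE / (n-k-1)), where SSE = SST - SSR is the residual sum of squares
(3) Look up the critical value F_α(k, n-k-1) in an F table
(4) Make the decision: if F > F_α(k, n-k-1), reject H_0 and conclude that the linear relationship is significant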
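The four steps can be sketched as below, assuming the usual statistic F = (SSR / k) / (SSE / (n-k-1)), where SSE denotes the residual sum of squares. The function names and the input numbers are hypothetical, and the critical value is read from an F table rather than computed.

```python
def regression_f(ssr, sse, n, k):
    """F statistic for H0: b1 = ... = bk = 0 with n observations and k variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return (ssr / k) / (sse / (n - k - 1))


def reject_h0(f_stat, f_crit):
    """True means the linear relationship is judged significant."""
    return f_stat > f_crit


# Hypothetical sums of squares for n = 8 observations and k = 2 variables
f_stat = regression_f(ssr=90.0, sse=10.0, n=8, k=2)
decision = reject_h0(f_stat, f_crit=5.79)  # F_0.05(2, 5) from a standard table
```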
4.2.3 Significance test of regression parameters (t)—— Influence of respective variables
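Test whether each explanatory variable has a significant effect on Y
(1) State the hypotheses: H_0: bi = 0 against H_1: bi ≠ 0
(2) Construct and compute the statistic Ti
(3) Look up the critical value t_(α/2) in a t table
(4) Make the decision
|Ti| < t_(α/2): accept H_0; |Ti| > t_(α/2): reject H_0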
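One common form of the statistic is Ti = b̂i / sqrt(c_ii · SSE / (n-k-1)), where c_ii is the i-th diagonal element of (X'X)^(-1) and SSE is the residual sum of squares; the source does not spell out its formula, so this form is an assumption. A minimal sketch with hypothetical names and numbers:

```python
import math


def t_statistic(b_i, c_ii, sse, n, k):
    """T_i = b_i / sqrt(c_ii * sigma2_hat), with sigma2_hat = SSE / (n - k - 1)."""
    sigma2_hat = sse / (n - k - 1)
    return b_i / math.sqrt(c_ii * sigma2_hat)


def reject_h0(t_i, t_crit):
    """True means variable i is judged to have a significant effect on Y."""
    return abs(t_i) > t_crit


# Hypothetical inputs: n = 8 observations, k = 2 explanatory variables
t_i = t_statistic(b_i=2.0, c_ii=0.25, sse=10.0, n=8, k=2)
decision = reject_h0(t_i, t_crit=2.571)  # t_0.025(5) from a standard t table
```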
Four 、 Prediction and control
4.1 Prediction
For any given x_0, substitute it into the regression equation to obtain the predicted value
ŷ_0 = â + b̂ x_0
Then perform interval estimation: given the confidence level 1-α, find the confidence interval for y_0, that is, the prediction interval
(ŷ_0 - σ̂ μ_(α/2), ŷ_0 + σ̂ μ_(α/2))
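A minimal sketch of this approximate prediction interval is shown below. It reads μ_(α/2) as the upper α/2 quantile of the standard normal distribution and estimates σ̂ as sqrt(Q_residual / (n-2)); both readings are assumptions, since the source does not define them. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist


def prediction_interval(xs, ys, x0, alpha=0.05):
    """Approximate (1 - alpha) prediction interval for y at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    q_res = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
    sigma_hat = math.sqrt(q_res / (n - 2))   # assumed estimate of sigma
    u = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    y0_hat = a_hat + b_hat * x0
    return y0_hat - sigma_hat * u, y0_hat + sigma_hat * u


# Hypothetical data; predict at x0 = 6 with confidence level 95%
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
low, high = prediction_interval(xs, ys, x0=6)
```

This simple form ignores the extra uncertainty from estimating â and b̂, so it is best regarded as a large-sample approximation.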
4.2 Control
Control is the inverse problem of prediction: given the range in which the observed y is required to lie, determine the corresponding range of x
y_1' = ŷ_1 - σ̂ μ_(α/2)
y_2' = ŷ_2 + σ̂ μ_(α/2)
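Substituting ŷ_1 = â + b̂ x_1 and ŷ_2 = â + b̂ x_2 into the two equations and solving for x_1 and x_2 gives the control range. The sketch below follows that reading; the function name `control_range` and all numbers are hypothetical, and `u` stands for μ_(α/2).

```python
def control_range(a_hat, b_hat, sigma_hat, u, y_low, y_high):
    """Range of x whose approximate prediction band stays inside (y_low, y_high)."""
    margin = sigma_hat * u
    if b_hat == 0:
        raise ValueError("slope is zero; x cannot be used to control y")
    if y_high - y_low <= 2 * margin:
        raise ValueError("target range is narrower than the prediction band")
    # From y_low = a + b*x - margin and y_high = a + b*x + margin
    x_a = (y_low + margin - a_hat) / b_hat
    x_b = (y_high - margin - a_hat) / b_hat
    # A negative slope reverses the order of the two end points
    return min(x_a, x_b), max(x_a, x_b)


# Hypothetical fitted line y = 0 + 2x with sigma_hat = 0.5 and u = 2
x_low, x_high = control_range(0.0, 2.0, 0.5, 2.0, y_low=4.0, y_high=12.0)
```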
Five 、 Linearization of nonlinear regression
Use variable substitution to linearize the nonlinear model
(1) y = a + b sin t; let x = sin t
(2) y = a + bt + ct²; let x_1 = t, x_2 = t², which gives a multiple linear regression
(3) Let y' = 1/y and x' = 1/x
(4) Let x' = ln x
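Substitution (1) can be sketched as follows: transform t to x = sin t, then fit an ordinary straight line in x. The function names and the data are hypothetical, and the line fit uses the standard closed-form least squares estimates.

```python
import math


def fit_line(xs, ys):
    """Least squares estimates (a_hat, b_hat) for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = s_xy / s_xx
    return y_bar - b_hat * x_bar, b_hat


def fit_sine_model(ts, ys):
    """Fit y = a + b*sin(t) through the substitution x = sin(t)."""
    xs = [math.sin(t) for t in ts]
    return fit_line(xs, ys)


# Hypothetical data generated exactly from y = 1 + 2*sin(t)
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1 + 2 * math.sin(t) for t in ts]
a_hat, b_hat = fit_sine_model(ts, ys)
```

The same pattern applies to the other substitutions: transform the variables first, then reuse the linear fit unchanged.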
The analysis can be carried out in Excel, SPSS, or LINGO
Excel: Insert - Chart - Scatter plot
Data - Data Analysis - Regression; set the confidence level to 95%. If α > p, reject H_0. The closer R² is to 1, the more significant the linear correlation; the larger F is, the better the model fits
SPSS: Scatter plot
Analyze - Regression - Linear; the larger R and F are and the smaller Sig is, the more significant the linear correlation
Analysis of variance, for example: if F > F_0.05(2,5) = 5.79, the linear correlation is significant
if F > F_0.01(2,5) = 13.27, the linear correlation is highly significant
Coefficient analysis: read the coefficients from column B of the coefficient table
Draw a scatter plot - Data Analysis - Regression -...