There are (at least) three senses in which a regression can be considered "linear." To distinguish them, let's start with an extremely general regression model
Y=f(X,θ,ε).
To keep the discussion simple, take the independent variables X to be fixed and accurately measured (rather than random variables). They model n observations of p attributes each, giving rise to the n-vector of responses Y. Conventionally, X is represented as an n×p matrix and Y as a column n-vector. The (finite q-vector) θ comprises the parameters. ε is a vector-valued random variable. It usually has n components, but sometimes has fewer. The function f is vector-valued (with n components to match Y) and is usually assumed continuous in its last two arguments (θ and ε).
The archetypal example, of fitting a line to (x,y) data, is the case where X is a vector of numbers (xi,i=1,2,…,n)--the x-values; Y is a parallel vector of n numbers (yi); θ=(α,β) gives the intercept α and slope β; and ε=(ε1,ε2,…,εn) is a vector of "random errors" whose components are independent (and usually assumed to have identical but unknown distributions of mean zero). In the preceding notation,
yi=α+βxi+εi=f(X,θ,ε)i
with θ=(α,β).
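To make this concrete, here is a minimal Python/NumPy sketch (not part of the discussion above, with made-up numbers) that simulates the archetypal model and recovers α and β by least squares:

```python
# Illustrative sketch only: simulate y_i = alpha + beta*x_i + eps_i with made-up
# values and recover (alpha, beta) by a degree-1 least-squares fit.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
eps = rng.normal(0, 1, x.size)            # independent, zero-mean errors
y = 2.0 + 0.5 * x + eps                   # "true" alpha = 2, beta = 0.5

beta_hat, alpha_hat = np.polyfit(x, y, 1) # polyfit returns (slope, intercept) for degree 1
print(alpha_hat, beta_hat)                # roughly 2 and 0.5
```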
The regression function may be linear in any (or all) of its three arguments:
"Linear regression, or a "linear model," ordinarily means that f is linear as a function of the parameters θ. The SAS meaning of "nonlinear regression" is in this sense, with the added assumption that f is differentiable in its second argument (the parameters). This assumption makes it easier to find solutions.
A "linear relationship between X and Y" means f is linear as a
function of X.
A model has additive errors when f is linear in ε. In such cases it is always assumed that E(ε) = 0. (Otherwise, it wouldn't be right to think of ε as "errors" or "deviations" from "correct" values.)
Every possible combination of these characteristics can happen and is useful. Let's survey the possibilities.
(1) A linear model of a linear relationship with additive errors. This is ordinary (multiple) regression, already exhibited above and more generally written as
Y=Xθ+ε.
X has been augmented, if necessary, by adjoining a column of constants, and θ is a p-vector.
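A minimal sketch of this matrix form, assuming NumPy and using made-up data (two explanatory variables plus an adjoined constant column):

```python
# Sketch of Y = X*theta + eps in matrix form; all numbers are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1, x2 = rng.normal(size=(2, n))
eps = rng.normal(0, 0.5, n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + eps                # theta = (1, 2, -3)

X = np.column_stack([np.ones(n), x1, x2])          # n x p design matrix, constant column adjoined
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares
print(theta_hat)                                   # approximately [1, 2, -3]
```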
(2) A linear model of a nonlinear relationship with additive errors. This can be couched as a multiple regression by augmenting the columns of X with nonlinear functions of X itself. For instance,
yi = α + βxi² + εi
is of this form. It is linear in θ=(α,β); it has additive errors; and it is linear in the values (1, xi²) even though xi² is a nonlinear function of xi.
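The same least-squares machinery handles this case; a sketch (again with invented data) regressing y on the columns (1, xi²):

```python
# Sketch: the model is linear in (alpha, beta), so OLS still applies; only the
# regressor column x**2 is a nonlinear function of x. Data are made up.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, 60)
y = 1.0 - 0.7 * x**2 + rng.normal(0, 0.5, 60)      # alpha = 1, beta = -0.7

X = np.column_stack([np.ones_like(x), x**2])       # columns (1, x_i^2)
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta_hat)                                   # approximately [1, -0.7]
```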
(3) A linear model of a linear relationship with nonadditive errors. An example is multiplicative error,
yi=(α+βxi)εi.
(In such cases the εi can be interpreted as "multiplicative errors" when the location of εi is 1. However, the proper sense of location is not necessarily the expectation E(εi) anymore: it might be the median or the geometric mean, for instance. A similar comment about location assumptions applies, mutatis mutandis, in all other non-additive-error contexts too.)
(4) A linear model of a nonlinear relationship with nonadditive errors. E.g.,
yi = (α + βxi²)εi.
(5) A nonlinear model of a linear relationship with additive errors. A nonlinear model involves combinations of its parameters that are not only nonlinear but cannot even be linearized by re-expressing the parameters.
As a non-example, consider
yi = αβ + β²xi + εi.
By defining α′ = αβ and β′ = β², and restricting β′ ≥ 0, this model can be rewritten
yi=α′+β′xi+εi,
exhibiting it as a linear model (of a linear relationship with additive errors).
As an example, consider
yi = α + α²xi + εi.
It is impossible to find a new parameter α′, depending on α, that will linearize this as a function of α′ (while keeping it linear in xi as well).
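To illustrate the distinction, here is a hedged sketch using SciPy's curve_fit for the genuinely nonlinear model yi = α + α²xi + εi; the non-example above would instead reduce to ordinary least squares after the reparameterization α′ = αβ, β′ = β². The data are invented.

```python
# Sketch: y = alpha + alpha**2 * x + eps ties the intercept and slope together through
# the single parameter alpha, so it is fit by nonlinear least squares. Illustrative data.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)
x = rng.uniform(0, 5, 80)
alpha_true = 1.5
y = alpha_true + alpha_true**2 * x + rng.normal(0, 0.3, 80)

def model(x, alpha):
    return alpha + alpha**2 * x

alpha_hat, _ = curve_fit(model, x, y, p0=[1.0])
print(alpha_hat)                                   # approximately [1.5]
```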
(6) A nonlinear model of a nonlinear relationship with additive errors.
yi = α + α²xi² + εi.
(7) A nonlinear model of a linear relationship with nonadditive errors.
yi = (α + α²xi)εi.
(8) A nonlinear model of a nonlinear relationship with nonadditive errors.
yi = (α + α²xi²)εi.
Although these exhibit eight distinct forms of regression, they do not constitute a classification system because some forms can be converted into others. A standard example is the conversion of a linear model with nonadditive errors (assumed to have positive support)
yi=(α+βxi)εi
into a nonlinear model of a nonlinear relationship with additive errors via the logarithm,
log(yi) = μi + log(α + βxi) + (log(εi) − μi).
Here, μi = E(log(εi)), the logarithm of the geometric mean of εi, has been removed from the error terms (to ensure they have zero means, as required) and incorporated into the other terms (where its value will need to be estimated). Indeed, one major reason to re-express the dependent variable Y is to create a model with additive errors. Re-expression can also linearize Y as a function of either (or both) of the parameters and explanatory variables.
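A sketch of this re-expression (illustrative only, assuming SciPy is available): simulate positive multiplicative errors, take logs, and fit on the log scale, where the errors are additive (here E(log εi) = 0 by construction, so μi drops out):

```python
# Sketch: y_i = (alpha + beta*x_i) * eps_i with positive lognormal errors; on the log
# scale the errors become additive, and the (still nonlinear) mean can be fit directly.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)
x = rng.uniform(1, 10, 100)
eps = rng.lognormal(mean=0.0, sigma=0.2, size=100)  # E(log eps) = 0, so mu_i = 0 here
y = (2.0 + 0.5 * x) * eps

def log_mean(x, alpha, beta):
    return np.log(alpha + beta * x)

params, _ = curve_fit(log_mean, x, np.log(y), p0=[1.0, 1.0])
print(params)                                       # approximately [2, 0.5]
```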
Collinearity
Collinearity (of the column vectors in X) can be an issue in any form of regression. The key to understanding this is to recognize that collinearity leads to difficulties in estimating the parameters. Abstractly and quite generally, compare two models Y = f(X,θ,ε) and Y = f(X′,θ,ε′), where X′ is X with one column slightly changed. If this induces enormous changes in the corresponding estimates of θ, then obviously we have a problem. One way in which this problem can arise is in a model that is linear in X (that is, types (1) or (5) above), where the components of θ are in one-to-one correspondence with the columns of X. When one column is a non-trivial linear combination of the others, the estimate of its corresponding parameter can be any real number at all. That is an extreme example of such sensitivity.
From this point of view it should be clear that collinearity is a potential problem for linear models of nonlinear relationships (regardless of the additivity of the errors) and that this generalized concept of collinearity is potentially a problem in any regression model. When you have redundant variables, you will have problems identifying some parameters.
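As a concrete (invented) illustration of the sensitivity described above, in the spirit of the two-model comparison Y = f(X,θ,ε) versus Y = f(X′,θ,ε′):

```python
# Sketch: with a nearly collinear design, a tiny change to one column produces a
# large change in the estimated coefficients. All values are illustrative.
import numpy as np

rng = np.random.default_rng(5)
n = 100
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 1e-6, n)          # x2 is almost an exact copy of x1
y = 3.0 + x1 + rng.normal(0, 0.1, n)

def fit(col):
    X = np.column_stack([np.ones(n), x1, col])
    return np.linalg.lstsq(X, y, rcond=None)[0]

print(fit(x2))                            # coefficients on x1 and x2 are huge and offsetting
print(fit(x2 + rng.normal(0, 1e-6, n)))   # slightly perturbing the column changes them drastically
```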