What is a Non-Linear Regression model?

The first step while starting a machine learning problem is to understand the data available and then to figure out the model suitable for this data. With machine learning, we often start with the understanding of linear and non-linear models. Previously, we have talked at length about linear regression models. In this blog, we will talk about the non-linear regression model. What is a non-linear regression model, how to identify a non-linear regression model, and lastly, how to deal with a non-linear regression model?

Non-Linear Regression Model:

Let’s start with an example. Consider the GDP data of China across the years from 1960 to 2014 as shown in the plot below.

Now ask yourself some questions. Can we predict GDP based on time? Can we use a linear function to model these predictions? If the data is showing a curvy trend it a simple linear regression model will not model it accurately as compared to a non-linear regression model. The plot above shows that GDP and time have a strong relationship but it is clearly not linear.

The growth is slow in the starting years and after 2005 it starts to increase. After 2010 it starts to decline. This relationship better fits as a logistic or an exponential function. It means that the data can be better modeled using a non-linear regression method.

Modeling as an Exponential Function..

Let’s move forward with the assumption to model these data points as an exponential function such that,

Ŷ = Θ₀ + Θ₁Θ₂X

Our next task is to estimate the parameters of this model that are thetas. Then we can use this fitted model to determine the GDP for an unknown period of time.

Different regression forms can be used to fit our model. like quadratic regression, cubic regression, and so on. We can call these polynomial regressions. The relationship between dependent variable Y and independent variable X is Nth degree polynomial in X.

we choose the polynomial regression that best fits our model. Because choosing the best fit is what we want.

Polynomial Regression:

So, what is exactly polynomial regression? In simple words, it fits the curve line to your data. A polynomial relation with degree 3 would be

Ŷ = Θ₀ + Θ₁x + Θ₂x² + Θ₃x³

and we find the best estimates for thetas to best fit our data points.

An interesting thing to note is we can express a polynomial regression model as a linear regression model. Let’s see with an example.

we can define a third-degree polynomial equation as a linear equation by introducing new variables.

x₁ = x

x₂ = x²

x₃ = x³

Ŷ = Θ₀ + Θ₁x₁ + Θ₂x₂ + Θ₃x₃

and this becomes a multiple linear regression model. We can use the same procedure to estimate the parameters and thus solve the problem as we do with linear regression problems.

It means the polynomial regression model can be fitted using least squares.

In the least-squares method, we estimate the parameters by minimizing the sum of the squares of the differences between the observed dependent variables and the predicted variables by the linear model. That is between y and ŷ.

Now, you must be thinking what is non-linear regression exactly? Firstly, it is a method to model a non-linear relationship between the dependent variable and the independent variables. Secondly, for a model to be non-linear ŷ must be a non-linear function of parameters theta and not parameters x.

A non-linear equation cane be exponential, logarithmic, or any other type.

In all these types ŷ is dependent on theta parameters and not only on variables x.

So, a non-linear regression model is non-linear by parameters. We cannot use least squares to fit the data here. In other words, parameter estimation is not that easy.

How to know if a model is non-linear or not?

First, try to figure it out visually. Plot bivariate plots of output variable against each input variable. Calculating the correlation coefficient between dependent and independent variables is also another way to determine the nature of the relation. If it is 0.7 or above for all variables there is a tendency for a linear relation.

Secondly, if we can not accurately fit the model with linear parameters it is time to move on to non-linear regression.

A good source for handling and implementing non-linear regression models is here. Enjoy!