Regression algorithms in Javascript

Algorytm regresji w javascript
Using the knowledge contained in a dataset, allows us to make predictions (forecasting), that is, predictions of values based on certain characteristics and patterns found in the data.

In this article, we will look at how to make predictions based on linear regression algorithms.

When processing data and using machine learning algorithms, we have the opportunity to use many tools available for selected programming languages.

The popularity of such solutions, for example, in Python, provides us with simple methods to build models adapted to solve our current prediction problems.

In Javascript, we can also use such mechanisms or libraries, quite optimally.

This changes the way we look at Javascript, and shows that the ability to use machine learning algorithms does not depend on the environment or programming language, and any mechanisms can be implemented on different platforms without any obstacles.

Regression vs. classification

Classification is a task that aims to match a sample of data with one of the predefined labels.

When using classification, we can therefore calculate with what probability the label determined by the algorithm, will match the selected data.

Klasyfikacja binarna zbioru punktów [Źródło].
Binary classification of a set of points [source].

In the case of regression, we are talking about determining a specific value based on input data.

Such a result, i.e. a forecast, will be a real number, not the value of a predefined class (as in the case of classification). For example, when we consider the prices of apartments, we can make a forecast of their prices (which will be real numbers) based on selected characteristics (such as the number of rooms or the age of the apartment).

In the case of classification, we can easily evaluate the correctness of the model based on the class label. When it is correct, the sample data will be evaluated positively, and in the opposite case – negatively.

In the situation of continuous values, which are the result of regression, we cannot verify the correctness of the model in this way. It is necessary, therefore, to use certain measures, such as the difference between the predicted value and the actual value.

Przykład wyniku algorytmu regresji liniowej [Źródło].
An example of the result of the linear regression algorithm [source].

Linear regression

The task of the linear regression algorithm is to find a straight line that will fit the learning data, and in the future, enable us to make predictions.

From math lessons at school, we know the equation of the straight line expressed by the formula:

f(x) = ax + b

  • where a tells us the slope,

  • and b is a constant value.

Thus, the task of regression is to find the best coefficients of a and b for a given set of data. These values are not chosen at random, they must minimize the error of the cost function. More about the cost function can be found here.

For a linear regression model, the most commonly used cost function is the Sum of Squared Error (SSE), expressed by the formula:

Using the simple gradient algorithm, we find the values of the parameters a and b that best minimize the algorithm’s error.

Algorytm gradientu prostego
Gradient descent algorithm [source].
  • You can read more about the linear regression algorithm here.
  • Read more about the gradient descent algorithm here.

Example of use

So let’s address the use of the linear regression algorithm in practice, using the example of making housing price forecasts.

#1

Dataset

At the very beginning, let’s prepare the dataset on which we will use the regression algorithm. A ready-made dataset on real estate price forecasts can be found here: https://www.kaggle.com/.

This collection presents data on property prices and their parameters, such as age, distance from metro stations and location.

To import data in the form of a CSV file into a Javascript script, we can use the csvtojson library.

We will install it with the command:

npm install csvtojson

For the purposes of the project, we will forecast the price of an apartment based on its age. So when we import the data, we will select only the features we need.

const jsonArray = await csv({
 colParser: {
   "No": "omit",
   "X1 transaction date": "omit",
   "X2 house age": "number",
   "X3 distance to the nearest MRT station": "omit",
   "X4 number of convenience stores": "omit",
   "X5 latitude": "omit",
   "X6 longitude": "omit",
   "Y house price of unit area": "number",
 },
}).fromFile("./Real estate.csv");

So they will be columns:

  • X2 house age,

  • Y house price of unit area.

We can show these values on a graph:

Porównanie wartości wieku do ceny mieszkania. Oś X przedstawia wiek mieszkania, a oś Y jego wartość.
Comparison of the value of the age to the price of the apartment. The X-axis represents the age of the apartment, and the Y-axis represents its value.
#2

Regression algorithms

To use regression algorithms, we can use the regressionjs library: GitHub – Tom-Alexander/regression-js: Curve Fitting in JavaScript.

1. To install it in our project, we can run the command::
npm install regression
2. To properly prepare the data, we can use the following code:

This library, accepts data in the form of an array of arrays, in which the values of each feature are held.

const data = jsonArray.map(features => Object.values(features))
3. To use the algorithm, let's execute the following code:
const result = regression.linear(data);

The result of this function will be the equation of the straight line:

y = -0.25x + 42.41

Such a function will look as follows:

A linear function of the forecast for a data set.

We can now also use the calculation of the forecast for the selected value of the age of the apartment (let’s assume that it will be a value equal to 11).

4. To do this let's perform the function:
console.log(result.predict(11))

The result will be a price of 39.66.

Polynomial regression

Often to improve the results of the forecast, instead of determining the classical linear function, we can use an exponential, logarithmic or polynomial function. This choice depends directly on the type of input data.

A polynomial function is one that we can express with the formula:

f(x) = a n x n + a x – 1 x x – 1 + … + a 1 x + b

So our task is to determine the group of parameters a n, a n-1, … a 1, b in such a way that they best minimize the cost function.

In the regressionjs library, we can use the polynomial function:

const result = regression.polynomial(data);

The result of such an algorithm, will be a parabolic function:

y = 0.04×2 + -1.93x + 53.45

Which looks as follows:

Jak widać, w lepszy sposób reprezentuje ona realne ułożenie danych.
As you can see, it better represents the real arrangement of the data.

The value of the forecast for 11 years will be here: 37.06.

Summary

Using machine learning algorithms is not exclusive to languages like Python. We can successfully use some solutions in other technologies. Making predictions using regression algorithms, therefore, is very easy to implement especially with the use of ready-made Javascript libraries.

Are you interested in practical applications of machine learning and artificial intelligence? Check out the rest of our tutorials:
  1. Semantic analysis in React Native using Tensorflow
  2. Creating native frame processors for Vision Camera in React Native using OpenCV
  3. Real-time pose detection in React Native using MLKit
  1. Burak Kanber: Hands-on Machine Learning with JavaScript, Packt Publishing: (https://github.com/Tom-Alexander/regression-js).

2. Estate price dataset: (https://www.kaggle.com/datasets/quantbruce/real-estate-price-prediction?resource=download).
3. Sandeep Khurana: Linear regression with example (2017): (https://towardsdatascience.com/linear-regression-with-example-8daf6205bd49).

Share this:
Łukasz Kurant

A developer of mobile and web applications with several years of experience. React Native and multi-platform solutions enthusiast.

Leave a comment:

Top