The Data Science Lab
Linear Ridge Regression Using C#
The Predict Method
The Predict() method is simple. The method assumes that the constant / bias is located at index [0] of the coeffs vector. Therefore, input x[i] is associated with coeffs[i+1].
public double Predict(double[] x)
{
// constant at coeffs[0]
double sum = 0.0;
int n = x.Length; // number predictors
for (int i = 0; i < n; ++i)
sum += x[i] * this.coeffs[i + 1];
sum += this.coeffs[0]; // add the constant
return sum;
}
In a non-demo scenario you might want to add normal error checking, such as making sure that the length of the input x vector is equal to 1 less than the length of the coeffs vector.
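A minimal sketch of such a check, placed at the top of Predict(), might be (the exception type and message are illustrative choices, not part of the demo code):

if (x.Length != this.coeffs.Length - 1)
  throw new ArgumentException(
    "x must have Length equal to number of coefficients minus 1");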
The Train Method
The Train() method, shown in Listing 1, is short but dense because most of the work is farmed out to matrix functions in the Utils class. The training code starts with:
public void Train(double[][] trainX, double[] trainY)
{
double[][] DX =
Utils.MatMakeDesign(trainX); // add 1s column
. . .
The MatMakeDesign() function programmatically adds a leading column of 1s to act as inputs for the constant / bias term. An alternative approach is to physically add a leading column of 1s to the training data.
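The implementation of MatMakeDesign() isn't shown in this section. A minimal sketch that copies the training data into a new matrix with a leading column of 1s might look like this (the actual Utils version may differ in details):

// sketch of a design-matrix helper: prepend a column of 1s
static double[][] MatMakeDesign(double[][] m)
{
  int nRows = m.Length;
  int nCols = m[0].Length;
  double[][] result = new double[nRows][];
  for (int i = 0; i < nRows; ++i)
  {
    result[i] = new double[nCols + 1];
    result[i][0] = 1.0;  // input for the constant / bias term
    for (int j = 0; j < nCols; ++j)
      result[i][j + 1] = m[i][j];
  }
  return result;
}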
Next, the DXt * DX matrix is computed:
// coeffs = inv(DXt * DX) * DXt * Y
double[][] a = Utils.MatTranspose(DX);
double[][] b = Utils.MatProduct(a, DX);
Breaking the code down into small intermediate statements is not necessary but makes the code easier to debug, compared to a single statement like double[][] b = Utils.MatProduct(Utils.MatTranspose(DX), DX).
At this point the alpha / noise value is added to the diagonal of the DXt * DX matrix:
for (int i = 0; i < b.Length; ++i)
b[i][i] += this.alpha;
If alpha = 0, then no matrix conditioning or L2 regularization is applied and linear ridge regression reduces to standard linear regression.
The Train() method concludes with:
. . .
double[][] c = Utils.MatInverse(b);
double[][] d = Utils.MatProduct(c, a);
double[][] Y = Utils.VecToMat(trainY, DX.Length, 1);
double[][] e = Utils.MatProduct(d, Y);
this.coeffs = Utils.MatToVec(e);
}
The utility MatInverse() function uses Crout's decomposition algorithm. A significantly different approach is to avoid computing an explicit matrix inverse and instead use an indirect technique called QR decomposition. The QR decomposition approach is used by many machine learning libraries because it is more numerically stable than explicitly inverting the DXt * DX matrix. However, using matrix inverse is a much cleaner design in my opinion.
Notice that in order to multiply by the target y values in trainY, the target values must first be converted from a one-dimensional vector to a two-dimensional matrix with a single column using the VecToMat() function. Then, after the matrix multiplication, the resulting single-column matrix must be converted back to a vector using the MatToVec() function. Details like this can be a major source of bugs during development.
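The implementations of the two conversion helpers aren't shown in this section. Minimal sketches that are consistent with how they're used above might look like this (the actual Utils versions may differ in details):

// sketch: copy a vector into an nRows x nCols matrix, row by row
static double[][] VecToMat(double[] vec, int nRows, int nCols)
{
  double[][] result = new double[nRows][];
  int k = 0;
  for (int i = 0; i < nRows; ++i)
  {
    result[i] = new double[nCols];
    for (int j = 0; j < nCols; ++j)
      result[i][j] = vec[k++];
  }
  return result;
}

// sketch: flatten a matrix back to a vector, row by row
static double[] MatToVec(double[][] m)
{
  int nRows = m.Length;
  int nCols = m[0].Length;
  double[] result = new double[nRows * nCols];
  int k = 0;
  for (int i = 0; i < nRows; ++i)
    for (int j = 0; j < nCols; ++j)
      result[k++] = m[i][j];
  return result;
}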
Because the linear ridge regression training algorithm presented in this article inverts a matrix, the technique doesn't scale to problems with huge amounts of training data. In such situations, it is possible to use stochastic gradient descent (SGD) to estimate model coefficients and constant / bias. You can see an example of SGD training on the data used in my article, "Linear Ridge Regression from Scratch Using C# with Stochastic Gradient Descent."
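As a rough illustration of the SGD idea, not the code from that article, a sketch might look like the following. The learning rate, epoch count and the decision not to penalize the constant term are illustrative choices:

// rough sketch of SGD training for linear ridge regression
static double[] TrainSGD(double[][] trainX, double[] trainY,
  double alpha, double lrnRate = 0.01, int maxEpochs = 1000)
{
  int n = trainX.Length;
  int dim = trainX[0].Length;
  double[] coeffs = new double[dim + 1];  // [0] is the constant / bias
  Random rnd = new Random(0);
  int[] indices = new int[n];
  for (int i = 0; i < n; ++i) indices[i] = i;

  for (int epoch = 0; epoch < maxEpochs; ++epoch)
  {
    // scramble the visit order (Fisher-Yates shuffle)
    for (int i = n - 1; i > 0; --i)
    {
      int j = rnd.Next(i + 1);
      int tmp = indices[i]; indices[i] = indices[j]; indices[j] = tmp;
    }
    foreach (int idx in indices)
    {
      double[] x = trainX[idx];
      double pred = coeffs[0];                 // compute prediction
      for (int k = 0; k < dim; ++k)
        pred += x[k] * coeffs[k + 1];
      double err = pred - trainY[idx];
      coeffs[0] -= lrnRate * err;              // constant is not penalized
      for (int k = 0; k < dim; ++k)            // gradient plus L2 penalty
        coeffs[k + 1] -= lrnRate * (err * x[k] + alpha * coeffs[k + 1]);
    }
  }
  return coeffs;
}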
Wrapping Up
To recap, linear ridge regression is essentially standard linear regression with L2 regularization added to prevent huge model coefficient values that can cause model overfitting. The weakness of linear ridge regression compared to more powerful techniques such as kernel ridge regression and neural network regression is that LRR assumes the relationship between the predictor variables and the target variable is linear. Even when an LRR model doesn't predict well, LRR is still useful to establish baseline results. The main advantage of LRR is simplicity.
The demo program uses strictly numeric predictor variables. LRR can also handle categorical data such as a predictor variable of color with possible values red, blue and green. Each categorical value can be one-hot encoded, for example red = (1, 0, 0), blue = (0, 1, 0) and green = (0, 0, 1). In standard linear regression it's usually a mistake to use one-hot encoding because the encoded columns are linearly dependent (together with the leading column of 1s they always sum to 1), which makes the DXt * DX matrix singular so matrix inversion fails. The alpha value added to the diagonal in ridge regression avoids that problem. Standard linear regression uses a somewhat ugly technique (in my opinion) called dummy coding for categorical predictor variables.
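A minimal one-hot encoding helper for such a predictor might look like this sketch (the category list and ordering are arbitrary choices):

// sketch: one-hot encode a categorical value such as a color
static double[] OneHotEncode(string value, string[] categories)
{
  double[] result = new double[categories.Length];  // all 0.0
  for (int i = 0; i < categories.Length; ++i)
    if (categories[i] == value) { result[i] = 1.0; break; }
  return result;
}

// example: OneHotEncode("blue", new string[] { "red", "blue", "green" })
// returns (0, 1, 0)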
Implementing LRR from scratch requires more effort than using a library such as scikit-learn. But implementing from scratch allows you to customize your code, makes it easier to integrate with other systems, and gives you a complete understanding of how LRR works.
About the Author
Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Azure and Bing. James can be reached at [email protected].