
I am looking for a library to perform multiple linear regression with 4 independent variables.
Can anyone suggest an appropriate library?
Thank you,



Hi,
Use the Linear Algebra library. Here's an adaption of some VB code that I wrote:
Imports MathNet.Numerics.LinearAlgebra.Double
Imports MathNet.Numerics.Distributions
Public Class MyRegression
Public Shared Function QuickRegression(ByVal Yarray() As Double, ByVal XArray(,) As Double) As Double(,)
'Bring in the data and put it into matrices
Dim XMatrix As Matrix = New DenseMatrix(XArray)
Dim YMatrix As Matrix = New DenseVector(Yarray).ToColumnMatrix
Dim YFitted As Matrix
'How many variables are we dealing with?
Dim NumVars As Integer = XArray.GetUpperBound(1)
Dim i As Integer
Dim KeptXR As Matrix
Dim OutputMatrix As Matrix
'Needed for calculating TStats and PValues
Dim SumSqErrors As Double
Dim DOF As Integer
Dim StdD As Double
Dim XXMatrix(,) As Double
Dim TStats(NumVars) As Double
Dim PVals(NumVars) As Double
'What the function will return
Dim OutputArray(2, NumVars) As Double
'Using QR Factorization to solve the problem
Dim MyQR As Factorization.QR = XMatrix.QR()
OutputMatrix = MyQR.Solve(YMatrix)
'Need to get (X'X)^-1 to calculate TStats - equivalent to (R'R)^-1 from the QR factorization of X
KeptXR = MyQR.R
XXMatrix = KeptXR.TransposeThisAndMultiply(KeptXR).Inverse().ToArray
'Calculate the fitted
YFitted = XMatrix * OutputMatrix
'Calculate the Sum of the Squares of the Errors
SumSqErrors = VectSumSq(YFitted.Column(0), YMatrix.Column(0))
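'Degrees of Freedom: observations minus estimated coefficients
'(NumVars is the last column index, so NumVars + 1 coefficients are fitted)
DOF = YFitted.RowCount - (NumVars + 1)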
'Get TDistribution to calculate TStats
Dim MyStudentsT As StudentT = New StudentT(0, 1, DOF)
'Standard Deviation
StdD = Math.Sqrt(SumSqErrors / DOF)
'Calculate the TStats and then the PValues of the regression
For i = 0 To NumVars
TStats(i) = OutputMatrix(i, 0) / (StdD * (XXMatrix(i, i)) ^ 0.5)
PVals(i) = 1 - MyStudentsT.CumulativeDistribution(Math.Abs(TStats(i))) + _
MyStudentsT.CumulativeDistribution(-Math.Abs(TStats(i)))
Next
'Put the whole lot into a single array for the function to return
For i = 0 To NumVars
OutputArray(0, i) = OutputMatrix(i, 0)
OutputArray(1, i) = TStats(i)
OutputArray(2, i) = PVals(i)
Next
Return OutputArray
End Function
Friend Shared Function VectSumSq(ByVal Vector1 As Vector, ByVal Vector2 As Vector) As Double
'Calculates the Sum of the Squares of the difference between two vectors
Dim TempVector As Vector
TempVector = Vector1 - Vector2
Return TempVector.PointwiseMultiply(TempVector).Sum
End Function
End Class
Hope this is useful,
Andrew
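
For readers following the code, the quantities it computes appear to be the standard ordinary least squares ones. The summary below is a sketch under that assumption, not the poster's own derivation. The symbols are introduced here for illustration: n is the number of rows of X, p is the number of columns of X (the number of fitted coefficients), and F_{n-p} is the cumulative distribution function of Student's t with n - p degrees of freedom.

```latex
\begin{aligned}
\hat{\beta} &= (X^{\top}X)^{-1}X^{\top}y, \qquad X = QR \;\Rightarrow\; X^{\top}X = R^{\top}R,\\
s^{2} &= \frac{\sum_{j=1}^{n}(y_j-\hat{y}_j)^{2}}{n-p}, \qquad \hat{y} = X\hat{\beta},\\
t_i &= \frac{\hat{\beta}_i}{s\,\sqrt{\left[(X^{\top}X)^{-1}\right]_{ii}}},\\
p_i &= 1 - F_{n-p}(|t_i|) + F_{n-p}(-|t_i|) = 2\bigl(1 - F_{n-p}(|t_i|)\bigr).
\end{aligned}
```

The last line is the two-sided p-value; the two forms are equal because the t distribution is symmetric about zero.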



Thank you for your response.
I was trying to use the code directly, but it seems some of the functions are obsolete in the new version of the library.
So I could only run it up to the point where OutputMatrix is calculated, and those values do not match the multiple regression output of standard packages (Excel etc.).
Can you share your experience of using this code?
Thank you once again,
Sampath
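
One thing worth checking when the coefficients differ from Excel's: QuickRegression as posted fits exactly the columns passed in XArray and does not add a constant term, whereas Excel's Regression tool fits an intercept unless "Constant is Zero" is ticked. This may or may not be the cause here. A minimal sketch of adding the constant column before calling the function follows; AddInterceptColumn is a hypothetical helper written for this illustration, not part of Math.NET.

```vbnet
Imports System

Public Module RegressionInputs
    ' Hypothetical helper (not part of Math.NET): returns a copy of x with a
    ' leading column of ones, so the first fitted coefficient is the intercept.
    Public Function AddInterceptColumn(ByVal x(,) As Double) As Double(,)
        Dim rows As Integer = x.GetLength(0)
        Dim cols As Integer = x.GetLength(1)
        ' VB array bounds are inclusive, so this allocates cols + 1 columns.
        Dim result(rows - 1, cols) As Double
        For r As Integer = 0 To rows - 1
            result(r, 0) = 1.0
            For c As Integer = 0 To cols - 1
                result(r, c + 1) = x(r, c)
            Next
        Next
        Return result
    End Function
End Module
```

It would be used as MyRegression.QuickRegression(Yarray, AddInterceptColumn(XArray)); column 0 of the returned array then holds the intercept and its t-statistic and p-value.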



Hi Sampath,
I'm not sure about the obsolescence of the functions; I thought I was using a reasonably up-to-date version. Which functions does it not accept?
I did a quick test of my code against Excel and found that the coefficients did agree. You have to remember that in Excel the Linest function
=LINEST(KnownYs, KnownXs)
will give the coefficients in the reverse order so maybe this is where the problem arises.
Andrew
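
To make the ordering point concrete: LINEST returns the slopes from the last X column to the first, followed by the intercept. If the regression was fitted with the constant column first, reversing its coefficient array gives the LINEST order. A small sketch, where ReverseCoefficients is a hypothetical helper name chosen for this example:

```vbnet
Imports System

Public Module LinestOrder
    ' Hypothetical helper: returns the coefficients in reverse order
    ' without modifying the array that was passed in.
    Public Function ReverseCoefficients(ByVal coefficients() As Double) As Double()
        Dim reversed() As Double = CType(coefficients.Clone(), Double())
        Array.Reverse(reversed)
        Return reversed
    End Function
End Module
```

For example, coefficients stored as (intercept, b1, b2, b3, b4) come back as (b4, b3, b2, b1, intercept), which can be compared cell by cell with the first row of the LINEST output.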



Hi Andrew,
There are many functions which I have checked and highlighted in RED in your QuickRegression listing (the same code as in your post above).
Coming to Excel, I am looking at Regression, which is part of Data Analysis (Data menu > Data Analysis > Regression in MS Office 2007).
It's nice to interact with you, though.
Thanks again.
Sampath.



I think the regression in Excel just runs the Linest function to get its results, but again, the coeffs are backwards. I'll have a look at it over the next few days.
In terms of the obsolete functions I think we'll just have to appeal to one of the moderators to clarify that one...

