Multiple Linear regression in C#

Dec 6, 2011 at 8:26 AM

I am looking for a library to perform multiple linear regression with 4 independent variables in C#.

Can anyone suggest an appropriate library or examples?

Thank you,

Apr 24, 2012 at 8:39 PM
Edited Apr 24, 2012 at 8:42 PM

I know this is a bit of a late reply, but this matrix library, Math.NET, is probably the best one, or one of the best. I also considered the licence, the number of people developing it and the quality of the code, as far as I could see.

You may be aware that solving a linear regression, although the maths is straightforward in a stats book, relies on matrix inversion, which has many numerical issues to consider: rounding errors, divisions by very small numbers, etc. I am a bit rusty on this subject right now, so I can't give you a better explanation.
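
For reference, a sketch of the textbook formula in question. The notation is assumed here rather than taken from any linked article: X is the design matrix (assumed to have full column rank), y the observations, beta-hat the fitted coefficients, and kappa_2 the 2-norm condition number.

```latex
\[
\hat{\beta} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} y,
\qquad
\kappa_2(X^{\mathsf T} X) = \kappa_2(X)^2 .
\]
```

The second identity is why the inversion is delicate: forming X^T X squares the condition number, so nearly collinear columns cost roughly twice as many digits of accuracy as the data alone would suggest.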

This library offers five or six different solutions in the example code that comes with it. Many routines also run in parallel, so if you have a big problem with many dimensions this will be of great benefit. There might be libraries specialised in linear regression that do outlier detection and removal of collinear variables for you automatically, but I am not aware of any in C#. 'R' comes to mind, but that is a statistical package/programming language, not a library you can use in a program. Perhaps people have already used Math.NET in projects to remove collinear variables and outliers.

May 3, 2012 at 4:50 PM
Edited May 6, 2012 at 9:03 PM

Actually I found 2 links using Math.NET for a linear regression:

http://mathnetnumerics.codeplex.com/discussions/271868  this one is in VB but should be easy to change.

and another here:

http://sharpstatistics.co.uk/stats/linear-regression-in-c/  this one is in C#.

That should help!

May 5, 2012 at 10:22 AM

Thank you so much :)

May 12, 2012 at 12:29 AM

Note that you should not do what's being done in the C# example.  If you need C# code, translate the VB example.

May 13, 2012 at 5:38 PM

Hi Ted

Thanks, can you elaborate a bit?

Is it the regression maths (inversion) that is not stable, or was it something about the C# code?

I am a bit rusty on regressions myself, but am about to use them again in a project.

G

May 21, 2012 at 2:29 AM

Sure.  Solving the normal equations directly is sometimes sensitive to rounding errors, so solving via the QR factorization has been the standard for many years.  Under certain circumstances, even the QR factorization method can go awry.  In those cases, a method based on the singular value decomposition should be used.

The C# example solves the normal equations; the VB example uses the QR method.  The QR method is usually good enough.  You might check out "Numerical Linear Algebra" by Trefethen and Bau for more details.
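
A minimal sketch of the difference, under stated assumptions. It is written in plain Python rather than C# so that every arithmetic step is visible, and it is not taken from either linked article. The function names (`solve_normal_equations`, `solve_householder_qr`, `back_substitute`) are hypothetical, and the test matrix is a small constructed case with two nearly collinear columns, chosen so that X^T X rounds to an exactly singular matrix in double precision while X itself still has full rank.

```python
import math


def back_substitute(A, n):
    """Solve the upper-triangular system held in the augmented rows A[0..n-1]."""
    beta = [0.0] * n
    for k in reversed(range(n)):
        s = A[k][n] - sum(A[k][c] * beta[c] for c in range(k + 1, n))
        beta[k] = s / A[k][k]
    return beta


def solve_normal_equations(X, y):
    """Least squares by forming X^T X and X^T y, then Gaussian elimination."""
    m, n = len(X), len(X[0])
    # Augmented system [X^T X | X^T y].
    A = [[sum(X[i][r] * X[i][c] for i in range(m)) for c in range(n)]
         + [sum(X[i][r] * y[i] for i in range(m))] for r in range(n)]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        if A[p][k] == 0.0:
            raise ValueError("X^T X is numerically singular")
        A[k], A[p] = A[p], A[k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n + 1):
                A[r][c] -= f * A[k][c]
    return back_substitute(A, n)


def solve_householder_qr(X, y):
    """Least squares by Householder QR applied to the augmented matrix [X | y]."""
    m, n = len(X), len(X[0])
    A = [list(row) + [yi] for row, yi in zip(X, y)]
    for k in range(n):
        norm = math.sqrt(sum(A[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            raise ValueError("X is rank deficient")
        # Householder vector; the sign choice avoids cancellation.
        v = [A[i][k] for i in range(k, m)]
        v[0] += math.copysign(norm, v[0])
        vv = sum(t * t for t in v)
        # Reflect the remaining columns of X and the right-hand side y.
        for c in range(k, n + 1):
            s = 2.0 * sum(v[i] * A[k + i][c] for i in range(m - k)) / vv
            for i in range(m - k):
                A[k + i][c] -= s * v[i]
    # The top n rows now hold R and the first n entries of Q^T y.
    return back_substitute(A, n)


# Constructed example: the exact solution is beta = [1, 1].
eps = 1e-8
X = [[1.0, 1.0], [eps, 0.0], [0.0, eps]]
y = [2.0, eps, eps]

try:
    print("normal equations:", solve_normal_equations(X, y))
except ValueError as err:
    print("normal equations failed:", err)

print("Householder QR:", solve_householder_qr(X, y))
```

On this input the normal-equations route fails because 1 + eps^2 rounds to 1 in double precision, whereas the QR route works on X directly and recovers coefficients close to [1, 1]. On well-conditioned data the two routes agree to many decimal places.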

May 21, 2012 at 11:31 AM

I've updated my C# article to include the QR factorization code, http://sharpstatistics.co.uk/stats/linear-regression-in-c.

In practice I've never seen a meaningful difference between the results of the two methods; I have built software that runs both for exactly this reason. Often the results only disagree after something like the 10th decimal place. Looks like the suggested book is worth a read.

George

May 21, 2012 at 5:22 PM

George --

I think it's really great that you updated the article.  It's rare and to be commended.  Thank you. (BTW, when I hit the site, I still see the old code, but I assume that's a caching problem on my end.)

I found this set of lecture notes that does a reasonable job of explaining the situation:  http://www.cs.usask.ca/~spiteri/M313/notes/Lecture19.pdf  There's an explicit example of where things go off the rails.

Thanks,


Ted

May 21, 2012 at 8:05 PM

Hi Ted,

Always happy to update code when good advice comes along, thanks for taking the time to comment. I've been reorganising my site over the weekend, but the updated code should be there now.

Thanks for the link to the PDF, I'll go through it in detail later. A quick read shows it uses a degree-14 polynomial as an example of when things go wrong, so I think solving the normal equations directly is OK for simple problems, which is what experience has shown me.  I don't know as yet where the line falls between simple and complicated, so it makes sense to use QR. Do you know if there is much difference in the time and resources taken by the two methods?

I've ordered the book you mentioned as it is something I've been meaning to read up on for some time.

Cheers

George