Online Statistics / StdDev

Apr 15, 2011 at 6:39 PM

Hello,

In response to your latest blog: http://numerics.mathdotnet.com/blog/2011/4/12/online-computation-of-statistics.html

Great example post, thank you. However, for a truly online algorithm one needs to be able to submit new samples as they arrive. If you needed to recompute the standard deviation of a growing 800MB dataset every 500ms, it would be difficult to use this method.

Is there a way to "addOne()" sample and have the standard deviation update with this system? That would better reflect the true nature of sampling.
Thank you
JB

PS. I think the blog comment system is broken; it timed out when I tried to submit this comment.

Apr 16, 2011 at 9:17 AM

Hi,

>Is there a way to "addOne()" sample and have the standard deviation update with this system?

I don't think so, since Statistics is a static class and it doesn't store any state. I believe Jurgen was planning on breaking up the DescriptiveStatistics class so that one part would only compute statistics that could be done in a single pass. This would be the perfect place to add AddOne().

Regards,

Marcus

Apr 16, 2011 at 2:52 PM

Great, thank you.

In my project I used these references; maybe they will help:

http://lingpipe-blog.com/2009/07/07/welford-s-algorithm-delete-online-mean-variance-deviation/

http://www.johndcook.com/standard_deviation.html
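
For anyone skimming the thread, the update those two links describe (Welford's method) can be sketched roughly as below. This is only a minimal Python illustration under my own assumptions, not Math.NET code: the class name RunningStats and the method name add_one are hypothetical, and it reports the sample (n - 1) standard deviation.

```python
import math


class RunningStats:
    """Hypothetical accumulator: mean and variance updated one sample at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def add_one(self, x):
        # Welford's update: adjust the mean first, then accumulate the
        # product of the deviation from the old mean and from the new mean.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance (n - 1 denominator); NaN for fewer than two samples.
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)

    @property
    def std_dev(self):
        return math.sqrt(self.variance)


# Usage: feed samples as they arrive; no need to keep the full dataset around.
stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.add_one(sample)
print(stats.count, stats.mean, stats.std_dev)
```

Each call does a constant amount of work and stores only three numbers, so the cost of an update does not depend on how much data has already been seen.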

Best,

JB

Apr 21, 2011 at 3:51 PM

Hey Jared,

You are absolutely right, our classes do not allow for actual online computation of statistics. It is on the to-do list, but we haven't had a chance to look at it just yet. Thanks for the links, though: this is exactly the algorithm we should be using for that.

Your feedback is much appreciated!

Apr 21, 2011 at 3:52 PM
This discussion has been copied to a work item.