Linear Least Squares Fitting with uncertainty
ali8
I have a set of known values (X) and a set of measured values (Y), plus uncertainties (+/- dY).
How do I do a least squares linear fit that takes the uncertainties in Y into account?
Regards,
Ali
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
October 25, 2013 at 09:00 am - Permalink
I have 9 measurements of spectra, each one in the form x +/- dx, e.g. 4023 +/- 0.23 Angstrom.
The standard error (= stdev/sqrt(n)) does not take the dx into account; it simply takes the standard deviation and divides it by the square root of the number of measurements. I think what I have is more like "error bars" on each measurement, and I would like these error bars to be included in the linear fitting algorithm.
October 25, 2013 at 03:17 pm - Permalink
DisplayHelpTopic "Errors in Variables: Orthogonal Distance Regression"
In ODR fitting, the weighting waves are very important.
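A minimal sketch of how such a fit might be invoked (hypothetical wave names; assuming, per that help topic, that /ODR=2 requests orthogonal distance regression, /W and /XW supply the Y and X weighting waves, and /I=1 says those waves contain standard deviations):

// xw, yw hold the data; dxw, dyw hold their per-point standard deviations
CurveFit/ODR=2 line, yw /X=xw /W=dyw /I=1 /XW=dxw /D
Print W_coef, W_sigma    // fit coefficients and their estimated uncertainties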
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
October 28, 2013 at 09:45 am - Permalink
Ali, I think you're confusing terminology here. The standard error formula you're citing is a decent way to report error bars for data where the number you're interested in is the average of the measurements. Say you're trying to measure a single number (forget fitting functions for a second) and you take N identical measurements and get N different values. The measured standard deviation is an estimate (itself subject to measurement error) of the uncertainty of a *single* measurement. But the measured mean of the N measurements has an uncertainty sqrt(N) smaller than the uncertainty of each individual measurement, so the standard error makes a better error bar than the standard deviation. Better in that it improves (as it should) as you take more data points, which is especially important if different data points are reported as averages of different numbers of measurements.
Even better (it's generally agreed) is to calculate a proper confidence interval using the StudentT distribution and a selected confidence level (90%, 95%, and 99% are all common choices), but the StudentT factor approaches a constant value for large N, so the standard error is not a terrible stand-in. That, however, is not actually what you need to do for least squares fitting.
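For completeness, a sketch of such a confidence interval in Igor (a hypothetical helper; assumes the repeated measurements are in a wave named yRepeats and uses the built-in inverse Student-t CDF, StatsInvStudentCDF):

Function MeanCI95(yRepeats)
	Wave yRepeats
	Variable n = numpnts(yRepeats)
	WaveStats/Q yRepeats                                   // sets V_avg, V_sdev
	Variable tFactor = StatsInvStudentCDF(0.975, n - 1)    // two-sided 95% t factor
	return tFactor * V_sdev / sqrt(n)                      // t times the standard error of the mean
End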
Back to least squares fitting: Igor's algorithm is looking for your best estimate of the inherent uncertainty on each Y point that you input. If you don't have any a priori knowledge of that uncertainty, you can estimate it from the individual measurements. If you take 10 measurements of Y at the SAME X value and enter all 10 data points into the fit as X,Y pairs (that's how I would do it...the least squares fitting algorithm doesn't care if you repeat X values), the uncertainty on each data point should be the standard deviation...your estimate of the uncertainty of a single measurement of Y. The improved mean of those 10 measurements will be weighted more heavily by the fact that there are 10 input data points rather than 1.
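A minimal sketch of that approach (hypothetical wave names and made-up numbers; /W supplies the weighting wave and /I=1 says it contains standard deviations):

// repeated X values are fine for a least squares fit
Make/O/D xAll = {1,1,1, 2,2,2, 3,3,3}
Make/O/D yAll = {2.1,2.0,2.2, 4.1,3.9,4.0, 6.2,5.9,6.1}
Make/O/D/N=9 dyAll = 0.1                    // estimated SD of a single Y measurement
CurveFit line, yAll /X=xAll /W=dyAll /I=1 /D
Print W_coef, W_sigma                       // coefficients and their uncertainties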
Alternatively, you can pre-average the Y values collected at the same X value and enter them as single X, mean(Y) pairs. Then the uncertainty you enter should be the standard error of the mean: you have less uncertainty in the mean of N Y measurements than in a single Y measurement. The improved mean will be weighted more heavily by the fact that your standard error goes down as the number of measurements increases.
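A sketch of the pre-averaging route (again hypothetical names and numbers; three repeats at each of three X values, averaged with WaveStats and weighted by the SEM):

Function FitAveragedY()
	Make/O/D yAll = {2.1,2.0,2.2, 4.1,3.9,4.0, 6.2,5.9,6.1}
	Make/O/D xMean = {1, 2, 3}
	Make/O/D/N=3 yMean, ySEM
	Variable i, nRep = 3
	for(i = 0; i < 3; i += 1)
		WaveStats/Q/R=[i*nRep, i*nRep + nRep - 1] yAll
		yMean[i] = V_avg                      // mean of the repeats at this X
		ySEM[i] = V_sdev / sqrt(nRep)         // standard error of that mean
	endfor
	CurveFit line, yMean /X=xMean /W=ySEM /I=1 /D
End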
Either method should give about the same fit result. It's also possible you have come up with uncertainty estimates by some entirely different route than the individual data point measurements: theoretical estimates or previous instrument characterizations, for example. The assumption is still that you're inputting the uncertainty on each individual data point used in the fit, and if you've estimated correctly you should get a reduced chi-square value close to 1. If your uncertainties are all off by a constant scaling factor, you'll still get the same fit results, but the reported uncertainties on the fit parameters will be different and the reduced chi-square will be significantly larger or smaller than 1.
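One quick check along those lines (assuming CurveFit has just run and left the usual V_chisq and V_npnts globals; 2 is the number of line parameters):

Print "reduced chi-square =", V_chisq / (V_npnts - 2)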
October 29, 2013 at 08:49 am - Permalink
I might also add that doing the fit with all the data as XY pairs will give the same answer as doing the fit using pairs of averaged Y values at each X.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
October 29, 2013 at 09:27 am - Permalink
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
October 29, 2013 at 01:24 pm - Permalink
One advantage of averaging Y values (that share the same or neighboring X values) is that if the distribution of individual errors is not Gaussian, then least-squares fitting is not correct (one should do maximum likelihood, for example). However, the Central Limit Theorem implies that the average TENDS to a Gaussian as the number N of averaged points goes to infinity. If the original distribution is not too pathological, the convergence can be reasonably rapid. So fitting to averaged values is more likely to put you in a limit where least-squares fits are valid and, as a bonus, you get an estimate (the SEM) of the weighting for each point.
Of course, if the X values are too different, then you are averaging points whose means vary too much, and that can smooth out features in the data.
John Bechhoefer
Department of Physics
Simon Fraser University
Burnaby, BC, Canada
October 30, 2013 at 08:21 pm - Permalink