comparing chi square values
tdgordon
I'm fitting a Gaussian function to a number of different curves. These curves may have quite different maximum amplitudes; they may be quite different "sizes." I'd like to compare the fit quality across the curves, but the V_chisq values that are returned depend on the overall size of each curve, so I need some way to standardize them before comparing. Is this what the weighting wave is for? I don't quite understand how to implement it. To explain a little more, consider just the maxima of two curves. Assume curve A's max = 100 and curve B's max = 10, and that the fits of A and B have maxima of 90 and 9, respectively. Then at the maximum, curve A contributes V_chisq = (y_hat - y)^2 = (90 - 100)^2 = 100, while curve B contributes (9 - 10)^2 = 1. Curve B's V_chisq is much smaller, so it appears to be a better fit, but of course it is not: each fit is equally good. How do you set up the weighting to account for this?
Thanks,
Tim
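The worked numbers above can be checked directly. This short sketch (plain Python, purely illustrative) shows that the chi-square contribution scales with the square of the amplitude, so dividing by the squared peak height puts the two fits on equal footing:

```python
# Single-point chi-square contributions from the example above.
chisq_A = (90 - 100) ** 2   # curve A: fit 90, data 100
chisq_B = (9 - 10) ** 2     # curve B: fit 9, data 10

# Raw values differ by 100x even though both fits miss by 10%.
print(chisq_A, chisq_B)     # 100 1

# Dividing by the squared amplitude makes them comparable.
print(chisq_A / 100**2, chisq_B / 10**2)   # 0.01 0.01
```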
This may seem pedantic, but it is good to understand what you are saying when you apply a weighting wave.
If it is the amplitude of the curve that is different, then by the same kind of argument, you could justify scaling the chi-square value by the amplitude before the comparison.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 16, 2017 at 11:42 am - Permalink
In response to your last sentence, it is not simply the amplitude that differs between curves. They are going to be subtly different in various ways. They come from actual measured data, so they are not exactly Gaussian. Nevertheless, my previous example of amplitude 100 vs. 10 is relevant. Some peaks are going to be a lot bigger than others. I want to compare the relative quality of each curve's Gaussian fit.
Thanks,
Tim
November 16, 2017 at 12:11 pm - Permalink
It seems to me that, all else being equal, the two parameters of interest are the relative accuracy of the fit and the relative uncertainty (precision) of the fit.
* Relative Accuracy = (A_fit - A_theory)/A_theory
* Relative Precision = \Delta A_fit / A_fit
As I understand it, a properly weighted chi-squared value is a measure of how well the data fit the model. When the relative noise level (e.g., the inverse S/N ratio) is the same AND the model is the same AND the number of points is the same, the properly weighted chi-squared value should be the same. So the only thing chi-squared tells you is how noisy your data are relative to the perfect model.
A more important aspect is making sure the S/N ratio stays the same. When you lower the amplitude from 100 to 10, are you scaling the noise by the same factor? If not, chi-squared will differ between the two cases. My (hopefully correct) guess is that when you drop the signal by 10 and keep the noise the same, the appropriate re-scaling of chi-squared is a factor of 10^2 = 100. I base this on the fact that you would have to increase the dwell time per channel by a factor of 100 when your signal drops by a factor of 10 (since S/N goes as the square root of dwell time).
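The point about proper weighting can be seen numerically. In this sketch (Python, illustrative only; the noise sigmas are assumed known), each residual is divided by its own noise level before squaring, and the resulting chi-square per point comes out near 1 regardless of amplitude, as long as the S/N ratio is held fixed:

```python
import random

random.seed(1)
N = 2000

def weighted_chisq(amplitude, sigma):
    """Chi-square per point of noisy samples against the known model,
    with each residual weighted by its own noise sigma."""
    total = 0.0
    for _ in range(N):
        y = amplitude + random.gauss(0.0, sigma)
        total += ((y - amplitude) / sigma) ** 2
    return total / N

# Same relative noise (S/N = 20) at two very different amplitudes:
rc_big = weighted_chisq(100.0, 5.0)
rc_small = weighted_chisq(10.0, 0.5)
print(round(rc_big, 2), round(rc_small, 2))  # both come out near 1
```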
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
November 16, 2017 at 01:13 pm - Permalink
Tim
November 16, 2017 at 01:24 pm - Permalink
In the example you give, where A = constant*B, sqrt(V_chisq)/W_coef[1] will be the same for both fits. W_coef[1] is the fitted peak height, which scales linearly with the signal.
Note that in your example, the fit to curve B IS a better fit to the data, because your model is much closer to your data in signal space. That's how goodness of fit is measured. Least squares minimization doesn't care about the shape of your fitting function, but rather how close it gets to the data.
November 16, 2017 at 03:52 pm - Permalink
It is possible using chi-square values to decide if two different fits to the same data are significantly different, or one significantly better than the other.
You can use the sigmas on the fit coefficients to decide how well each coefficient has been determined, and to decide whether a fit to one data set has significantly different coefficient values than a fit to another.
I'm not sure I know a way to measure "how close to Gaussian" a data set is. I'm not sure that's a well-defined concept.
The measurement noise might scale with size- such things often do. But it is just as likely that the noise is simply an ever-present random addition to whatever signal it is that you're measuring. In that case, it's likely to be constant, or nearly so. You might be able to estimate the noise from periods between "blips".
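Estimating the noise from the quiet stretches between blips can be sketched like this (Python, purely illustrative; it assumes the leading segment of the record is pure baseline, which is something you would need to verify for your data):

```python
import math, random

random.seed(2)
sigma_true = 0.3

# Fake record: 200 quiet baseline points, then a blip riding on the baseline.
record = [random.gauss(0.0, sigma_true) for _ in range(200)]
record += [5.0 * math.exp(-((i - 50) / 10.0) ** 2) + random.gauss(0.0, sigma_true)
           for i in range(100)]

# Estimate the noise level from the known-quiet leading segment.
baseline = record[:200]
mean = sum(baseline) / len(baseline)
sigma_est = math.sqrt(sum((v - mean) ** 2 for v in baseline) / (len(baseline) - 1))
print(round(sigma_est, 2))  # close to the true 0.3
```

The estimated sigma could then go into the weighting wave (as 1/sigma) so that V_chisq is comparable across records.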
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 16, 2017 at 04:22 pm - Permalink
I would again say that chi-squared will scale as 1/(S/N)^2. In the limit S/N -> infinity, chi-squared goes to zero (a perfect fit of the data to the model); in the limit S/N -> 0, chi-squared diverges. The suggestion that you normalize the peaks in height is a reasonable way to see this in practice. The noise will eventually dominate as the amplitude of the blip decreases; at some point, the data will fit a random peak shape just as well as a Gaussian, and the chi-squared will fall apart (head sporadically toward infinity).
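The 1/(S/N)^2 scaling can be checked numerically. In this sketch (Python, illustrative only), the data are normalized to unit height while the absolute noise level is held fixed; dropping the amplitude by a factor of 10 then raises the unweighted chi-square by roughly a factor of 100:

```python
import random

random.seed(3)
N = 1000
sigma = 1.0  # fixed absolute noise level

def normalized_chisq(amplitude):
    """Unweighted chi-square after scaling the noisy data to unit height."""
    total = 0.0
    for _ in range(N):
        y = (amplitude + random.gauss(0.0, sigma)) / amplitude
        total += (y - 1.0) ** 2
    return total

c_big = normalized_chisq(100.0)   # roughly N*(sigma/100)^2 = 0.1
c_small = normalized_chisq(10.0)  # roughly N*(sigma/10)^2 = 10
print(round(c_small / c_big))     # roughly 100
```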
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
November 16, 2017 at 05:27 pm - Permalink
The way I judge this is by looking at the residual of the fit. If the residual looks wavy, then the shapes of the curves are mismatched. If the residual is a 'random' distribution of points on either side of zero, the data are consistent with the fit.
I have come up with a rough-and-ready way of measuring this (example code below). It simply runs a smooth over the residuals and then calculates the RMS deviation of the result. The idea is that data randomly distributed around zero will smooth toward zero more efficiently than data where many consecutive points are offset in the same direction. So the 'deviation parameter' is small for consistent shapes and increases as the shapes mismatch more. You can play with the smooth width, which is a parameter of this method.
I have no idea about the rigour of this, but it intuitively makes sense.
Here is some code to illustrate this idea (hopefully it is self-explanatory):
Function DemoDeviationParameter()
	Variable vRMS
	Variable vSmth = 7

	// True Gaussian with noise
	Make/O/D/N=(100) wGauss
	wGauss[] = 10000 * StatsNormalPDF(x, 50, 10)
	wGauss[] = wGauss[p] + gnoise(sqrt(wGauss[p]))
	// Plot and fit Gaussian
	PlotAndFit(wGauss)
	// Smooth residual and calculate RMS deviation
	Wave wResData = $("Res_wGauss")
	vRMS = CalcDeviationParameter(wResData, vSmth)
	TextBox/C/N=text0/F=0/A=MC "Deviation Parameter ("+num2str(vSmth)+") = "+num2str(vRMS)

	// LogNormal with noise
	Make/O/D/N=(100) wLogNorm
	wLogNorm[] = 8000 * StatsLogNormalPDF(x, 0.2, -5, 55)
	wLogNorm[] = wLogNorm[p] + gnoise(sqrt(wLogNorm[p]))
	// Plot and fit Gaussian to wLogNorm
	PlotAndFit(wLogNorm)
	// Smooth residual and calculate RMS deviation
	Wave wResData = $("Res_wLogNorm")
	vRMS = CalcDeviationParameter(wResData, vSmth)
	TextBox/C/N=text0/F=0/A=MC "Deviation Parameter ("+num2str(vSmth)+") = "+num2str(vRMS)
End
Function PlotAndFit(wData)
	Wave wData

	Display wData
	ModifyGraph rgb=(0,15872,65280), mode=3, marker=19, msize=0.5
	CurveFit/NTHR=0 gauss wData /D /R
	String sResName = "Res_" + NameOfWave(wData)
	ModifyGraph zero(Res_Left)=1, mode($sResName)=4
	SetAxis/A/E=2 Res_Left
End
Function CalcDeviationParameter(wResData, vSmth)
	Wave wResData
	Variable vSmth

	String sSmthName = NameOfWave(wResData) + "_smth"
	Duplicate/O wResData, $sSmthName
	Wave wResSmth = $sSmthName
	Smooth vSmth, wResSmth
	WaveStats/Q wResSmth
	Print "Deviation Parameter (" + sSmthName + "): ", V_rms
	return V_rms
End
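For reference, the same deviation-parameter idea can be sketched outside Igor (Python, illustrative only; a simple boxcar average stands in for Igor's Smooth operation):

```python
import math, random

def deviation_parameter(residuals, width):
    """Boxcar-smooth the residuals, then return the RMS of the result.
    Random residuals smooth toward zero; systematic wiggles do not."""
    half = width // 2
    n = len(residuals)
    smoothed = []
    for i in range(n):
        window = residuals[max(0, i - half): min(n, i + half + 1)]
        smoothed.append(sum(window) / len(window))
    return math.sqrt(sum(v * v for v in smoothed) / n)

# Quick check: pure noise vs. a wavy (systematic) residual of similar size.
random.seed(4)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
wavy = [math.sin(i / 10.0) for i in range(500)]
dp_noise = deviation_parameter(noise, 7)
dp_wavy = deviation_parameter(wavy, 7)
print(round(dp_noise, 3), round(dp_wavy, 3))  # wavy gives the larger value
```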
Hope this helps,
Kurt
November 17, 2017 at 01:42 am - Permalink
So how do I scale the chi-squared values so that all three fit qualities can be compared?
November 17, 2017 at 08:59 am - Permalink
Your noise level is relatively low; it might be that you could smooth and look at derivatives to decide if the shape is more than one overlapped peak. I think that Multipeak Fit, with suitable tweaking of the auto-detection parameters, would find three peaks in the bottom plot.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 17, 2017 at 09:44 am - Permalink
November 17, 2017 at 10:09 am - Permalink
Are the side peaks always the same width as the main peak? That's what it looks like in your plots. If that is the case, the three-peak fit can be simplified, using just one coefficient for the peak width. My guess is also that for a given width, the side peaks are at a fixed offset from the main peak. That would guide the initial guesses for the three-peak fit.
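One way to set up that constrained model can be sketched as follows (Python, purely illustrative; the coefficient names are made up): three Gaussians sharing a single width coefficient, with the side peaks at a fixed offset on either side of the main peak:

```python
import math

def three_peak_model(x, a_main, a_left, a_right, x0, offset, width):
    """Three Gaussians with one shared width coefficient; the side peaks
    sit at x0 - offset and x0 + offset, assumed fixed relative to the
    main peak. Fewer free coefficients means a more stable fit."""
    def g(amp, center):
        return amp * math.exp(-((x - center) / width) ** 2)
    return g(a_main, x0) + g(a_left, x0 - offset) + g(a_right, x0 + offset)

# At the main-peak center, well-separated side peaks contribute ~nothing:
val = three_peak_model(50.0, 1.0, 0.2, 0.2, 50.0, 30.0, 5.0)
print(round(val, 6))  # essentially a_main = 1.0
```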
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 17, 2017 at 12:04 pm - Permalink
"Are the side peaks always in the same place for a peak of a given width?" I can't see any physical reason for why they would be, so no. I have seen plots where there is a peak on only one side. Yes, in a perfect instrument there are no side peaks within noise. I don't think side peaks are the same width as the main peak.
Doesn't it seem that something like this, but slightly more sophisticated, should work: calculate chi-squared and then normalize by amplitude. For example, the max for the data in the top panel is 1.152 and the max for the data in the middle panel is 0.896, so multiply chi-squared for the middle panel by 1.152/0.896 to compare those two fits. Doing that gives chi-squared for the middle panel = 0.10305, which makes the fit quality look more like it should. I realize this isn't quite right, because it doesn't account for width differences (or shape differences, more generally) between the Gaussians in the two panels. But something like that seems like the right metric.
Tim
November 17, 2017 at 12:31 pm - Permalink
I wonder if a first and/or second derivative can be used to find a "one value" correlation characteristic of perfect and out-of-tune instruments. As side peaks form in the blip, the derivatives of that blip will have more zero crossings, the amplitudes of the intermediate derivative "peaks" will vary, and their shapes will be less symmetrical. The integral of the shape will also be different for symmetrical versus asymmetrical blips. The integral seems a bit more difficult to parse for characteristic parameters though. The method of comparing residuals after fitting seems also worth considering.
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
November 17, 2017 at 01:17 pm - Permalink
- Compute the residuals of your fit.
- Plot a histogram of the residuals with an appropriate bin width to resolve the shape (this also depends on how 'dense' your data are, I guess).
- Fit Gaussians to the histogram and compare your favorite goodness-of-fit parameter.
The idea is that a 'good' choice of model will give a normal distribution of residuals (deviations due to random noise), whereas an inappropriate function may give a non-normal distribution. For example, from an eyeball-and-guess look at your third plot, I'd expect at least a bimodal-looking distribution of residuals, which would likely kill your goodness-of-fit parameter. I'd also guess this approach is less sensitive to signal amplitude. You can try to 'calibrate' the method with known data sets of aligned and mis-aligned measurements to decide on a cut-off value.
Probably pretty simple to write some quick code to implement this.
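As a rough sketch of the idea (Python, illustrative only; excess kurtosis stands in here as one simple single-number indicator of non-normality rather than a full histogram fit):

```python
import random

def excess_kurtosis(data):
    """Rough normality indicator: near 0 for normally distributed
    residuals, clearly negative for e.g. a bimodal distribution."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n
    m4 = sum((v - mean) ** 4 for v in data) / n
    return m4 / m2 ** 2 - 3.0

random.seed(5)
good = [random.gauss(0.0, 1.0) for _ in range(1000)]  # random-noise residuals
bad = [random.gauss(random.choice([-2.0, 2.0]), 0.5)  # bimodal residuals
       for _ in range(1000)]
k_good = excess_kurtosis(good)
k_bad = excess_kurtosis(bad)
print(round(k_good, 2))  # near 0
print(round(k_bad, 2))   # strongly negative
```

A cut-off between the two regimes could then be chosen by running this on known aligned and mis-aligned data sets, as suggested above.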
November 26, 2017 at 07:02 am - Permalink