Student's t-test: are two normal distributions different?
andyfaff
I want to know if two normal distributions are different.
I am fitting two different datasets, obtaining a parameter that I need to compare between the two. Is the parameter different for the two datasets? I fitted the data 1000 times, with Monte Carlo resampling on each fit, and then histogrammed the parameters (the mean and sd of the parameters should be the same as those from the Levenberg-Marquardt fit).
The two parameter distributions from the 1000 fits on each dataset have the following statistics:
D1 - mean = 59.6, sd = 0.6
D2 - mean = 61, sd = 1.0
I was doing:
t = (61 - 59.6) / sqrt(0.6^2/1000 + 1^2 /1000) = 37.96
I then tried using:
print 1 - (studentA(37.96, 998) - 1) / 2
Is this correct? Or is the number of degrees of freedom too large? Should it be 1, since I only have two datasets?
In general, if you weren't doing this with Monte Carlo resampling and you just had the results from a Levenberg-Marquardt fit, how would you go about answering the question?
Dataset1:
w[0] = 59.6 +/- 0.6
Dataset2:
w[0] = 61 +/- 1.0
I can't remeasure the datasets multiple times because they are expensive to collect and make.
A clean way of getting the result is to store the values obtained for each one of the distributions in a wave, e.g., waveD1, waveD2 and then execute:
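StatsTTest waveD1, waveD2	// (presumably) the built-in two-sample Student t-test operation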
Internally, the code would execute essentially the same as what you have indicated above, i.e., compute the individual means and stdv. It also allows you to compute the test based on "pooled variances". The degrees of freedom are computed in three different ways; the simplest is n1+n2-2.
Strictly speaking, this test is appropriate when the two samples came from normal distributions and when their variances are equal.
If you are really comparing distributions you might also consider applying StatsKSTest, which does a bit more than testing the means. Also, don't forget a handy little tool that hides under Analysis Menu->Statistics->Two Sample Tests. If you follow this approach the algorithm actually tests the variances before allowing you to proceed with a t-test, and if the variances are sufficiently different you are led to the appropriate test.
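For example, on the two Monte Carlo parameter waves (a sketch, assuming waveD1 and waveD2 from above):

StatsKSTest waveD1, waveD2	// Kolmogorov-Smirnov comparison of the two empirical distributions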
A.G.
WaveMetrics, Inc.
November 16, 2011 at 03:43 pm
One thing I am worried about, though, is the fact that there are only two original datasets. How would one go about the test if you did a single curve fit on each of the two datasets, getting:
Dataset1:
w[0] = 59.6 +/- 0.36
-and-
Dataset2:
w[0] = 61 +/- 1.0
Can one get the probability that those two parameters are different? Obviously you know the number of points in each dataset, the datasets were only measured once, and you have a Chi2 value.
November 16, 2011 at 04:08 pm
My philosophy is that it is rarely a good idea to run a statistical test on derived data, as it is usually more difficult to estimate the effect of the transformation. In this case (assuming that your w[]'s are the centers of the fitted Gaussians), your transformation is a curve fit, which gives you two estimates of the means of the respective distributions. The process of histogramming/binning followed by fitting is a particularly inaccurate estimate of the mean, especially if the number of samples per bin and the number of bins are not high. In other words, I question the validity of applying a t-test in this way and I would encourage you to use a different test.
Here is an example that shows you just how sensitive the estimate may be as a function of the number of histogram bins:
Function BinSensitivityDemo()
	Variable/G V_fitOptions=4	// suppress the curve-fit progress window
	Make/O/N=1000 ddd=gnoise(10)	// 1000 samples from a Gaussian with mean 0, sd 10
	Variable sampleMean=mean(ddd)

	Make/N=100/O W_Hist
	Histogram/P/C/B=4 ddd,W_Hist	// automatic binning
	Make/O W_coef={0,0.01,0,10}	// initial guesses: y0, A, x0, width
	CurveFit/Q/G/NTHR=0 gauss kwCWave=W_coef, W_Hist
	printf "/B=4\t sampleMean=%g\t fit=%g\r",sampleMean,W_coef[2]	// W_coef[2] is the fitted center

	Make/N=15/O W_Hist	// 15 bins
	Histogram/P/C/B=1 ddd,W_Hist
	Make/O W_coef={0,0.01,0,10}
	CurveFit/Q/G/NTHR=0 gauss kwCWave=W_coef, W_Hist
	printf "/B=1, n=15\t sampleMean=%g\t fit=%g\r",sampleMean,W_coef[2]

	Make/N=20/O W_Hist	// 20 bins
	Histogram/P/C/B=1 ddd,W_Hist
	Make/O W_coef={0,0.01,0,10}
	CurveFit/Q/G/NTHR=0 gauss kwCWave=W_coef, W_Hist
	printf "/B=1, n=20\t sampleMean=%g\t fit=%g\r",sampleMean,W_coef[2]

	Make/N=25/O W_Hist	// 25 bins
	Histogram/P/C/B=1 ddd,W_Hist
	Make/O W_coef={0,0.01,0,10}
	CurveFit/Q/G/NTHR=0 gauss kwCWave=W_coef, W_Hist
	printf "/B=1, n=25\t sampleMean=%g\t fit=%g\r",sampleMean,W_coef[2]
End
If you are willing to accept the results of your fit as valid estimates of the mean and stdv, then you still have to account for the number of data points that contributed to each of the estimates, which brings us back to your initial approach with corrected DF and possibly pooled variances.
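For the record, here is that initial approach as a minimal sketch from the summary statistics alone (PooledT is a hypothetical name; studentA is the same built-in used in the original post):

Function PooledT(m1, s1, n1, m2, s2, n2)
	Variable m1, s1, n1, m2, s2, n2
	Variable sp = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2)/(n1 + n2 - 2))	// pooled sd; assumes equal population variances
	Variable t = (m1 - m2)/(sp*sqrt(1/n1 + 1/n2))
	Variable df = n1 + n2 - 2
	printf "t = %g, DF = %g, two-tailed p = %g\r", t, df, 1 - studentA(abs(t), df)
	return t
End

With the numbers from the original post, PooledT(59.6, 0.6, 1000, 61, 1.0, 1000) reproduces t of about -38 with DF = 1998.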
A.G.
November 16, 2011 at 05:11 pm
Make/O/N=50 d1, d2	// the waves must exist first; 50 points, consistent with the DF=48 mentioned below
d1 = 2*x + 2 + gnoise(5)	// the line fit is y = a + b*x, so a is the intercept and b the slope
d2 = 2.1*x + 2 + gnoise(5)
Display d1, d2
CurveFit/M=2/W=0/TBOX=(0x300) line, d1/D
CurveFit/M=2/W=0/TBOX=(0x300) line, d2/D
// w - d1
// a =3.2403 ± 1.31
// b =1.9384 ± 0.046
// w - d2
// a =3.1017 ± 1.73
// b =2.0291 ± 0.0608
How do I work out the probability/test that tells me if the gradients b_d1 and b_d2 are different?
How do I work out the probability/test that tells me if the intercepts a_d1 and a_d2 are the same?
November 16, 2011 at 07:34 pm
t = (xbar1 - xbar2)/sqrt(var1/N1 + var2/N2)
I would presume that the appropriate N1 and N2 for your case would be 48 (N-M, where M is the number of fit coefficients). The DF for the test is given by
DF = (var1/N1 + var2/N2)^2 / [ (var1/N1)^2/(N1-1) + (var2/N2)^2/(N2-1) ]
I guess the N's here would be the DF's from the fit. I would ask an expert, though.
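Written out as a sketch (WelchT is a hypothetical helper name; it assumes the ± values reported by CurveFit are the standard errors from W_sigma, so each var/N term above becomes a squared standard error):

Function WelchT(b1, se1, df1, b2, se2, df2)
	Variable b1, se1, df1, b2, se2, df2
	Variable v1 = se1^2, v2 = se2^2	// each plays the role of var/N in the formulas above
	Variable t = (b1 - b2)/sqrt(v1 + v2)
	Variable df = (v1 + v2)^2/(v1^2/df1 + v2^2/df2)
	printf "t = %g, DF = %g, two-tailed p = %g\r", t, df, 1 - studentA(abs(t), df)
	return t
End

For the slopes above, WelchT(1.9384, 0.046, 48, 2.0291, 0.0608, 48) gives |t| of about 1.2, which is not significant.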
This test assumes, of course, Gaussian statistics. Did your Monte Carlo test show it to be at least somewhat Gaussian?
Sorry about the hard-to-read equations. The ones in the book (Numerical Recipes) are easier to read. See the chapter Statistical Description of Data, section Student's t-test for Significantly Different Means.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 17, 2011 at 09:24 am
It also occurs to me that a bootstrap method might just be the best approach to the whole mess. You've already got a start on it, as described in your original post. The canonical text on bootstrap methods is An Introduction to the Bootstrap by Efron and Tibshirani, http://www.amazon.com/Introduction-Bootstrap-Monographs-Statistics-Prob…, also available on the Kindle for a little bit less, but still an outrageous price.
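In that spirit, a minimal percentile-interval sketch, assuming the 1000 Monte Carlo parameter values are stored in waves waveD1 and waveD2 (hypothetical names, as earlier in the thread):

Duplicate/O waveD1, wDiff
wDiff = waveD1[p] - waveD2[p]	// differences of randomly paired resamples
Sort wDiff, wDiff
printf "95%% CI of the difference: [%g, %g]\r", wDiff[24], wDiff[974]	// 2.5th and 97.5th percentiles of 1000 values

If that interval excludes zero, the two parameters differ at roughly the 95% level.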
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 17, 2011 at 10:09 am
My reference, Quantitative Chemical Analysis, 7th Ed., by D. C. Harris, gives the equations below for comparing two sets of data with sizes n_1 and n_2. The equations apply under the assumption that the population standard deviations are the same in both sets. The degrees of freedom for the t-test table will be n_1 + n_2 - 2. The F-test is used to determine whether the population standard deviations are different or not.
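In standard form (for equal population variances):

s_pooled = sqrt[ (s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)) / (n_1 + n_2 - 2) ]

t_calc = |xbar_1 - xbar_2| / s_pooled * sqrt( n_1 n_2 / (n_1 + n_2) )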
I was wondering if n_1 and n_2 should be the number of bins rather than the number of fitted parameter values; however, that itself makes no sense. My quick calculation gives s_pooled = 0.82 and t_calc = 37.97 for your data set using n_1 = n_2 = 1000.
Basically, your two results are significantly different at the 95% confidence level.
Fitting linear regression lines and then doing statistical comparisons on the slopes and intercepts is a bit trickier.
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
November 17, 2011 at 01:39 pm
If the issue is really the statistics of linear fits, then the solution is in StatsLinearRegression.
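For example, with the two line fits from earlier in the thread (a sketch; it assumes d1 and d2 carry their x values in the wave scaling):

StatsLinearRegression d1, d2	// regression statistics for each wave, plus a test of whether the slopes are equal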
A.G.
WaveMetrics, Inc.
November 17, 2011 at 05:25 pm
I should have known it was already covered in Igor somehow :-) Thanks.
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
November 18, 2011 at 07:34 am