statistical concepts
introduction to statistics [mathematica: revise; initial]
mean
- it's the average value or the expected value
- $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n = sample size; x_i = the i-th data value in the sample
- phet sim
standard deviation (SD)
- a measure of the variation in the data, where ~ 68% of the data (in a normal distribution) falls within ± 1 SD of the mean
standard error of the mean (SEM)
- SD of the distribution of means
- a measure of the variation in the value of the mean
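The three quantities above can be computed with a few lines of Python. This is a minimal sketch using only the standard library; the sample values are hypothetical, and it assumes the sample SD (dividing by n - 1) together with the usual relation SEM = SD ÷ √n.

```python
import math
import statistics

# Hypothetical sample, chosen only for illustration
sample = [10, 11, 13, 14, 12]

n = len(sample)
mean = statistics.mean(sample)   # sum of the values divided by n
sd = statistics.stdev(sample)    # sample SD (divides by n - 1)
sem = sd / math.sqrt(n)          # SD of the distribution of means

print(f"n = {n}, mean = {mean}, SD = {sd:.3f}, SEM = {sem:.3f}")
```

Because the SEM divides the SD by √n, it shrinks as the sample grows, whereas the SD itself does not.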
regression analysis [for 2 parameter linear equation: pptx; pdf.]
p-value [alternative: pdf; pptx]
types of statistical test
Q-test
- is used to detect an outlier, a data point that differs markedly from the other data
- may be used to delete only a single data point
- a data point is an outlier if Q = | suspect data - nearest data | ÷ (largest data - smallest data) > Qc
Table 1. Q-test values for 90% confidence.
N (sample size)   3     4     5     6     7     8     9     10
Qc                0.94  0.76  0.64  0.56  0.51  0.47  0.44  0.41
For example, is the value of 25 an outlier in the following data set?
10, 11, 13, 14, 25
In this example, N = 5, so Qc = 0.64 (Table 1), and
Q = | 25 - 14 | ÷ (25 - 10) = 0.73 > Qc = 0.64
thus, the value of 25 is an outlier and may be deleted in subsequent data analysis
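The same calculation can be sketched in Python. The function name `q_test` and the dictionary `Q_CRIT_90` are hypothetical names introduced here for illustration; the critical values are copied from Table 1, and the sketch assumes the suspect value is the smallest or largest value in the data.

```python
# Critical values Qc for 90% confidence, copied from Table 1
Q_CRIT_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56,
             7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test(data, suspect):
    """Return (Q, Qc, is_outlier) for a suspect value at either end of the data."""
    values = sorted(data)
    n = len(values)
    if n not in Q_CRIT_90:
        raise ValueError("Table 1 only covers sample sizes 3 to 10")
    # The nearest data point is the neighbour of the suspect in sorted order
    if suspect == values[0]:
        nearest = values[1]
    elif suspect == values[-1]:
        nearest = values[-2]
    else:
        raise ValueError("the suspect value must be the smallest or largest value")
    data_range = values[-1] - values[0]
    if data_range == 0:
        raise ValueError("all values are identical, so Q is undefined")
    q = abs(suspect - nearest) / data_range
    q_crit = Q_CRIT_90[n]
    return q, q_crit, q > q_crit

q, q_crit, is_outlier = q_test([10, 11, 13, 14, 25], suspect=25)
print(f"Q = {q:.2f}, Qc = {q_crit}, outlier: {is_outlier}")
# prints: Q = 0.73, Qc = 0.64, outlier: True
```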
to compare 2 groups of data [pdf, pptx]
group 1            group 2            difference
X1                 Y1                 X1 - Y1
X2                 Y2                 X2 - Y2
etc.
mean of group 1    mean of group 2    mean difference
use the following statistical test [mathematica file that does these tests; requires mathematica]
2-sample t-test (or independent sample t-test or unpaired sample t-test)
- compare the mean of group 1 versus the mean of group 2
1-sample t-test (or correlated sample t-test or paired sample t-test)
- compare the mean of the paired differences (X - Y) to zero
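The two tests differ only in what is averaged, which a short Python sketch may make clearer. The function names and the data are hypothetical, and the unpaired version assumes equal variances in the two groups (the pooled-variance form); other variants exist. Only the t statistic and degrees of freedom are computed here; converting them to a p-value needs the t distribution, for example from a table or from a statistics package such as SciPy (`scipy.stats.ttest_ind` and `scipy.stats.ttest_rel`).

```python
import math
import statistics

def two_sample_t(group1, group2):
    """t statistic and degrees of freedom for an unpaired test (pooled variance)."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    # Pooled variance: weighted average of the two sample variances
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def paired_t(group1, group2):
    """t statistic and degrees of freedom for a paired test on the differences X - Y."""
    if len(group1) != len(group2):
        raise ValueError("paired data must have the same number of values")
    diffs = [x - y for x, y in zip(group1, group2)]
    n = len(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)  # SEM of the differences
    return statistics.mean(diffs) / sem, n - 1    # mean difference compared to zero

# Hypothetical data, chosen only for illustration
group1 = [12, 14, 15, 17, 19]
group2 = [10, 11, 13, 14, 15]

t_unpaired, df_unpaired = two_sample_t(group1, group2)
t_paired, df_paired = paired_t(group1, group2)
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
```

With these numbers the paired t is much larger than the unpaired t, because pairing removes the variation between subjects and leaves only the variation in the differences.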
to compare 3 or more groups of data that differ by a single factor, use [pptx; pdf]
- 1-factor analysis of variance (1-anova) - to detect whether there are any pairwise differences among the groups of data [refer to p-value]
- Tukey's test - to identify the specific pair(s) of data that differ; done if the p-value < 0.05 in the preceding 1-anova
- mathematica file that does this test [requires mathematica]
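As a rough illustration of what the 1-anova computes, the sketch below derives the F statistic by comparing the variation between group means with the variation within groups. The function name and the data are hypothetical, and only the standard library is used; turning F into a p-value requires the F distribution (from a table or a statistics package), and Tukey's test is not shown.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a 1-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical data: three groups that differ by a single factor
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f:.2f}, df = ({df_between}, {df_within})")
```

A large F means the group means are spread out more than the scatter within groups would explain, which corresponds to a small p-value.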
to compare groups of data that differ by two factors, use (optional)
- 2-factor analysis of variance (2-anova) - to detect whether there is any effect of the factor in the column or the factor in the row; it also examines whether there is any interaction between these 2 factors - the interpretation becomes more complicated if the 2 factors interact
- website does calculation; directions on its use (ignore initial portion - it's based on an earlier version of my website)
additional resources: