statistical concepts

introduction to statistics  [mathematica: revise; initial]

mean

standard deviation (SD)

• a measure of the variation in the data, where ~ 68% of the data (in a normal distribution) falls within ± 1 SD of the mean

standard error of the mean  (SEM)

• SD of the distribution of means
• a measure of the variation in the value of the mean
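
The three quantities above can be computed in a few lines. This is a minimal sketch, not the course's Mathematica file. It assumes the sample form of the SD (dividing by N - 1) and the usual estimate SEM = SD ÷ √N; the data values and function names are made up for illustration.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (assumes the N - 1 denominator)."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

def sem(values):
    """Standard error of the mean, estimated as SD / sqrt(N)."""
    return sample_sd(values) / math.sqrt(len(values))

# Hypothetical measurements, for illustration only
data = [10, 11, 13, 14]
print("mean =", mean(data))
print("SD   =", round(sample_sd(data), 3))
print("SEM  =", round(sem(data), 3))
```

The SEM is smaller than the SD because averaging N measurements reduces the scatter in the mean by a factor of √N.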

regression analysis  [for 2 parameter linear equation:  pptx, pdf]

• a method to determine the values of the parameters of a function that best describes the experimental data
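
For the 2 parameter linear equation y = slope · x + intercept, the parameters can be found by least squares. The sketch below assumes ordinary least squares with error only in y; the function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared x deviations and sum of x-y cross deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical experimental data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
print("slope =", round(slope, 3), "intercept =", round(intercept, 3))
```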

p-value [alternative:  pdf, pptx]

• used in various statistical tests to decide whether groups of data are the same or different
• probability of error (of stating that there is a difference when there is no difference)
• convention
  • for 2-tail p-value
    • if p-value < 0.05, then:  mean 1  ≠  mean 2, i.e. "the data is different"
    • if p-value > 0.05, then:  mean 1  =  mean 2, i.e. "the data is the same" (more precisely, no significant difference was detected)
  • for 1-tail p-value
    • if p-value < 0.05, then:  mean 1  <  mean 2  (or mean 1  >  mean 2)
    • if p-value > 0.05, then:  mean 1  ≥  mean 2  (or mean 1  ≤  mean 2)

types of statistical test

Q-test

• used to detect an outlier, i.e. a data point that differs from the rest of the data
• may be used to delete only a single data point from a data set
• a data point is an outlier if  Q  =  | suspect data - nearest data | ÷ (largest data - smallest data)  >  Qc

Table 1. Q-test values for 90% confidence.

N (sample size)    3      4      5      6      7      8      9      10
Qc                 0.94   0.76   0.64   0.56   0.51   0.47   0.44   0.41

For example, is the value of 25 an outlier in the following data set?

10, 11, 13, 14, 25.

In this example,

Q  =  | 25 - 14 |  ÷  (25 - 10)  =  0.73  >  Qc  =  0.64  (for N = 5)

thus, the value of 25 is an outlier and may be deleted in subsequent data analysis.
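
The worked example can be checked with a short script. This is a sketch under stated assumptions: the critical values are the 90% confidence values from Table 1, the suspect value must be the smallest or the largest in the data set, and the names `Q_CRIT_90` and `q_test` are hypothetical.

```python
# Critical Q values for 90% confidence, copied from Table 1 (N = 3 to 10)
Q_CRIT_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56,
             7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test(data, suspect):
    """Return (Q, Qc, is_outlier) for a suspect value at either end of the data."""
    values = sorted(data)
    n = len(values)
    if n not in Q_CRIT_90:
        raise ValueError("Table 1 only covers N = 3 to 10")
    if suspect == values[-1]:
        nearest = values[-2]      # neighbour of the largest value
    elif suspect == values[0]:
        nearest = values[1]       # neighbour of the smallest value
    else:
        raise ValueError("suspect must be the smallest or largest value")
    q = abs(suspect - nearest) / (values[-1] - values[0])
    return q, Q_CRIT_90[n], q > Q_CRIT_90[n]

q, qc, is_outlier = q_test([10, 11, 13, 14, 25], 25)
print("Q =", round(q, 2), "Qc =", qc, "outlier:", is_outlier)
```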

to compare 2 groups of data  [pdf,  pptx]

group 1            group 2            difference
X1                 Y1                 X1 - Y1
X2                 Y2                 X2 - Y2
etc.
mean of group 1    mean of group 2    mean difference

use one of the following statistical tests  [mathematica file that does these tests; requires mathematica]

2-sample t-test (or independent sample t-test or unpaired sample t-test)

• compare the mean of group 1 versus the mean of group 2

1-sample t-test (or correlated sample t-test or paired sample t-test)

• compare the mean difference (to zero)
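
Both tests can be sketched with the standard library alone. This is an illustration, not the course's Mathematica file. It assumes the pooled (equal variance) form of the 2-sample test, and it obtains the 2-tail p-value by numerically integrating the t distribution with Simpson's rule. All function names and data values are hypothetical.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sample_var(v):
    """Sample variance (N - 1 denominator)."""
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tail_p(t, df, steps=2000):
    """2-tail p-value: 1 minus twice the area from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

def two_sample_t(g1, g2):
    """Unpaired t-test, assuming equal variances (pooled variance)."""
    n1, n2 = len(g1), len(g2)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * sample_var(g1) + (n2 - 1) * sample_var(g2)) / df
    t = (mean(g1) - mean(g2)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df, two_tail_p(t, df)

def paired_t(g1, g2):
    """Paired t-test: 1-sample t-test of the differences against zero."""
    diffs = [x - y for x, y in zip(g1, g2)]
    n = len(diffs)
    t = mean(diffs) / math.sqrt(sample_var(diffs) / n)
    return t, n - 1, two_tail_p(t, n - 1)

# Hypothetical data, for illustration only
group1 = [5.1, 4.9, 5.6, 5.8, 6.0]
group2 = [4.1, 4.0, 4.6, 4.9, 5.2]
print("2-sample (t, df, p):", two_sample_t(group1, group2))
print("paired   (t, df, p):", paired_t(group1, group2))
```

With data like these, where each pair differs by nearly the same amount, the paired test gives a much smaller p-value than the 2-sample test, because the scatter within each group no longer contributes to the error.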

to compare 3 or more groups of data that differ by a single factor, use  [pptx, pdf]

• 1-factor analysis of variance (1-anova) - to detect whether there are any pairwise differences among the groups of data  [refer to p-value]
• Tukey's test - to identify the specific pair(s) of groups that differ; done only if the p-value < 0.05 in the preceding 1-anova
• mathematica file that does this test [requires mathematica]
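
The core of the 1-anova is the F statistic, the ratio of the variation between groups to the variation within groups. The sketch below computes only F and its degrees of freedom. Converting F to a p-value requires the F distribution, and Tukey's test is not shown. The function name and the data are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of each value around its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical data for 3 groups, for illustration only
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(groups))
```

A large F means the group means are spread out more than the scatter within the groups would explain.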

to compare groups of data that differ by two factors, use (optional)

• 2-factor analysis of variance (2-anova) - to detect whether there is any effect of the factor in the column or of the factor in the row; it also examines whether there is any interaction between these 2 factors. The interpretation becomes more complicated if the 2 factors interact.
• website does calculation; directions on its use (ignore initial portion - it's based on an earlier version of my website)