Intro to Excel & statistics

background [prelab assignment (21 points) due at the begining of lab; use in-text citation]

1. you will be using MS Excel to simulate rolling a dice (many times). in the context of an educational setting (i) what is a computer simulation ? and (ii) its benefits ?

2. describe how to use MS Excel to determine the (i) mean, (ii) sample standard deviation, SD, and (iii) histogram based on a list of numbers from a simulation. include screen shot & citation as appropriate. include a copy in your lab NB.

3. describe how to use MS Excel to generate a list of the (i) values for rolling a 6-sided dice and (ii) the numbers from a normal distribution. include screen shot & citation as appropriate. include a copy in your lab NB. potential resources in the below resource section.

purpose

introduction to the use of Excel & statistics

1. simulate rolling 6-sided dice. analyze effect of sample size of dice roll or mean of dice rolls, on (i) histogram and (ii) mean and standard deviation. compare to theory. see content of lab report for specific sample size.

2. simulate normal distribution. analyze effect of sample size on (i) histogram and (ii) mean and standard deviation. compare to theory. see content of lab report for specific sample size.

3. analyze SEM of 6-sided dice as a function of sample size. compare to theory. recall that the SEM is the SD in a distribution of the sample mean. see content of lab report for specific sample size.

fulfill various aspects of the NGSS science & engineering standards; including computational thinking / (cs) computer science

materials

computer with MS Excel.

lab report prepared using Excel & Word (or equivalent)

methods / data analysis

work in a group of up to 4 students, where each student addresses at least one question in the purpose for this lab activity.

content of lab report [75 points; group lab report; up to 4 students / group]

6-sided dice simulation; may use partially developed Excel simulation; see below resource section. [20 points]

histograms of the value of dice rolls using sample size, n = 30, 60. there will be two histograms; label these histograms.

histograms of the mean of dice roll (n' = 120), where the sample size, n = 2, 9, 36, 64. there will be four histograms; label these histograms.

compare / describe the 'shape' of the preceding histograms of the value of dice rolls versus the histograms of the mean of dice roll.

calculate the mean based on theory for rolling a 6-sided dice; show your work; include appropriate citation for the formula. compare / discuss the value of the theory-based mean versus the value based on the simulation as a function of sample size of dice roll, n = 2, 9, 36, 64 with 120 replications.

normal distribution simulation; let mean = 100; SD = 20; develop your own Excel simulation. [15 points]

histograms of simulation using sample size = 10, 50, 120. there will be three histograms. compare / describe / label these histograms.

compare / discuss the value of the theory-based mean ± SD versus the value based on the simulation as a function of sample size; e.g. n = 3, 10, 50, 120

what is the law of large numbers ? relate your observations to the law of large numbers in regards to this simulation.

SEM - using (preceding) 6-sided dice simulation results: refer to the prior histograms of the distribution of the mean of dice roll (n' = 120) for each sample size; n = 2, 9, 36, 64. [20 points]

what is the central limit theorem ? describe these histograms in the context of the central limit theorem.

calculate the experimental SEM for each sample size based on the data in the histogram and its definition; show your work or describe / show the use of Excel to get its value.

what is the theoretical value of the SEM for each sample size ? show your work; include the formula. compare this theoretical value versus the preceding experimental value. hint: what's the relationship between the SEM vs. SD vs variance ?

generate two curves on the same graph: (i) theoretical SEM versus sample size and (ii) experimental SEM versus sample size. describe the extrapolataion of the curves to larger sample size - justify / rationalize your response.

reflection: state five different NGSS science & engineering practices (each practice has more than one item; refer to at least two items in each practice that is met by this laboratory activity. include a description of an aspect of the lab / lab report that fulfil each practice and explicitly state the relevant (high school level) item in each practice. recommend using a two-column table fomat to address this prompt, where one column is the NGSS standard and the other column is the coresponding aspect of the lab / lab report meeting this standard. use appropriate in-text citation / bibliography. [20 points]

resources

excel tutorial to generate random numbers for rolling dice; normal distribution

excel tutorial for standard deviation

partial template: dice roll Excel simulation - xlsm (enable VBA to run file); you'll have to develop your own normal distribution Excel simulation.

central limit theorem (there's an error in sketching relative width in histograms for n=4 vs. n = 20)

SEM (see comment comparing central limit theorem versus law of large numbers; applet in video)

for "geeks / nerds"

caveats in using a spreadsheet for statistics; e.g.

The risks of using spreadsheets for statistical analysis. 2013. (IBM promo)

On the accuracy of statistical procedures in Mircosoft Excel 2010. 2011.

Using Excel for statistical analysis 2010

JCE statistics article: abstract & text & (original) supplement

JCE mathematica article: abstract & text