Intro to Excel & statistics - central limit theorem & monte carlo simulation

background [prelab assignment (24 points) due at the begining of lab; reminder: as appropriate, use in-text (basics & subsequent sections) citation and reference list (basics & subsequent sections); have a copy of your prelab & a printout of this lab handout available, so you can refer to it during the lab]

1. you will be using MS Excel to simulate rolling a dice (many times). what is a simulation versus a Monte Carlo simulation ?

2. describe how to use MS Excel to find / generate

i. mean

ii. SD

iii. histogram

iv. simulate rolling a 6-sided dice

as appropriate, include sketches / screen shots to clarify your response.

3. describe the ___; clearly differentiate between the terms, 'sample' versus 'sample of means' in your response.

a. central limit theoreom (with regards to the normal distribution)

b. standard error of the mean (SEM); relate to the standard deviation.

4. what probability distribution describes rolling a dice ? what is the formula for the mean and standard deviation of this distribution ?

purpose

introduction to the use of Excel & statistics

1. simulate rolling 6-, 8-, and 10- sided dice

2. simulate rolling a 6- sided dice, 120 times, to examine the validity / plausibility of the Central Limit Theoreom by comparing the histogram of the distribution of the means consisting of a sample size of 1, 2, 9, versus 36 to the theoretical histogram

3. compare the mean of rolling a 6- versus 10- sided dice using a sample size of 4 versus 16

4. compare the mean of rolling a 6- versus 8- versus 10- sided dice using a sample size of 4 versus 16

fulfill various aspects of the NGSS science & engineering standards; including computational thinking / (cs) computer science

materials

computer with MS Excel

lab report prepared using Excel & Word (or equivalent)

methods / data analysis

refer to the purpose of the lab and the content of the lab report in determining what to do in the lab

content of lab report [45 points; group lab report; up to 2 students / group]

6-sided dice simulation [15 points]

histograms: distribution of the mean of sample size of 1, 2, 9, or 36 of rolling a 6-sided dice; include a description of the data; use 120 simulated runs (or dice rolls) for each sample size

qualitatively, does the preceding histograms support the central limit thereom ? elaborate.

calculate the theoretical value of the SEM (for sample size of 2, 9, & 36), SD, and mean for rolling a 6-sided dice; show your work; include appropriate citation for formula.

statistical analysis (use links in below resource section); include interpreation / implication of statistics test results, as well as, data entered into the statistical tests; include data source / description. [20 points]

t-tests: compare the value of the mean in the 6- versus 10- sided dice simulation; run 2 separate t-tests for a sample size of 4- (mean of duplicates) & 16 (mena of duplicates)

1-anova / Tukey test: compare the value of the mean in the 6-, 8-, & 10- sided dice simulation; run 2 separate t-tests for a sample size of 4 (mean of duplicates) & 16 (mean of duplicates)

skewness calculation: a simple test if the distribution of means is a normal pdf, calculate its skewness. in a normal pdf, its skewness is 0, so the closer the value is to zero, the more closely a distribution is described by the normal pdf. based on this calculation, is the central limit theorem valid in the 6-sided dice simulation ? elaborate.

what is the motivation of examining the mean of duplicates (rather than simply a sample size of 4 or 16) in the preceding t-test and 1-anova ?

remarks: the motivation of examining the validity of the central limit theorem is that the t-test and 1-anova assumes that the samples comes from a normal distribution.

describe / list five NGSS science & engineering practices that correlates to an aspect of this laboratory activity; provide an example in the lab / lab report that meet these practices [10 points]

resources

excel tutorial to generate random numbers for rolling dice

partial template: dice roll Excel simulation - xlsm (enable VBA macro)

mathematica (accessible only on computers in the classroom).

caveats in using spreadsheet software (optional); e.g.

On the accuracy of statistical procedures in Mircosoft Excel 2010. 2011.

Beware of Excel Histograms. (blog; 2012)

_________________________________________________________________

The risks of using spreadsheets for statistical analysis. 2013. (IBM promo)

Using Excel for statistical analysis 2010

Statistical analysis using Microsoft Excel 2008

Using Excel for Statistical Data Analysis - Caveats 2007

counter - point: value of using Excel (optional); e.g.

Teaching Fundamental Skills in Microsoft Excel to First-Year Students in Quantitative Analysis. 2015

Review of Advanced Excel for Scientific Data Analysis; 3rd ed. 2012