Intro to Excel & statistics - central limit theorem & monte carlo simulation

background [prelab assignment (24 points) due at the begining of lab; reminder: as appropriate, use in-text (basics & subsequent sections) citation and reference list (basics & subsequent sections); have a copy of your prelab & a printout of this lab handout available, so you can refer to it during the lab]

1. you will be using MS Excel to simulate rolling a dice (many times). what is a simulation versus a Monte Carlo simulation ?

2. describe how to use MS Excel to find / generate

i. mean

ii. SD

iii. histogram

iv. simulate rolling a 6-sided dice

as appropriate, include sketches / screen shots to clarify your response.

3. describe the ___; clearly differentiate between the terms, 'sample' versus 'sample of means' in your response.

a. central limit theoreom (with regards to the normal distribution)

b. standard error of the mean (SEM); relate to the standard deviation.

4. what probability distribution describes rolling a dice ? what is the formula for the mean and standard deviation of this distribution ?


introduction to the use of Excel & statistics

1. simulate rolling 6-, 8-, and 10- sided dice

2. simulate rolling a 6- sided dice, 120 times, to examine the validity / plausibility of the Central Limit Theoreom by comparing the histogram of the distribution of the means consisting of a sample size of 1, 2, 9, versus 36 to the theoretical histogram

3. compare the mean of rolling a 6- versus 10- sided dice using a sample size of 4 versus 16

4. compare the mean of rolling a 6- versus 8- versus 10- sided dice using a sample size of 4 versus 16

fulfill various aspects of the NGSS science & engineering standards; including computational thinking / (cs) computer science


computer with MS Excel

lab report prepared using Excel & Word (or equivalent)

methods / data analysis

refer to the purpose of the lab and the content of the lab report in determining what to do in the lab

content of lab report [45 points; group lab report; up to 2 students / group]

6-sided dice simulation [15 points]

histograms: distribution of the mean of sample size of 1, 2, 9, or 36 of rolling a 6-sided dice; include a description of the data; use 120 simulated runs (or dice rolls) for each sample size

qualitatively, does the preceding histograms support the central limit thereom ? elaborate.

calculate the theoretical value of the SEM (for sample size of 2, 9, & 36), SD, and mean for rolling a 6-sided dice; show your work; include appropriate citation for formula.

statistical analysis (use links in below resource section); include interpreation / implication of statistics test results, as well as, data entered into the statistical tests; include data source / description. [20 points]

t-tests: compare the value of the mean in the 6- versus 10- sided dice simulation; run 2 separate t-tests for a sample size of 4- (mean of duplicates) & 16 (mena of duplicates)

1-anova / Tukey test: compare the value of the mean in the 6-, 8-, & 10- sided dice simulation; run 2 separate t-tests for a sample size of 4 (mean of duplicates) & 16 (mean of duplicates)

skewness calculation: a simple test if the distribution of means is a normal pdf, calculate its skewness. in a normal pdf, its skewness is 0, so the closer the value is to zero, the more closely a distribution is described by the normal pdf. based on this calculation, is the central limit theorem valid in the 6-sided dice simulation ? elaborate.

what is the motivation of examining the mean of duplicates (rather than simply a sample size of 4 or 16) in the preceding t-test and 1-anova ?

remarks: the motivation of examining the validity of the central limit theorem is that the t-test and 1-anova assumes that the samples comes from a normal distribution.

describe / list five NGSS science & engineering practices that correlates to an aspect of this laboratory activity; provide an example in the lab / lab report that meet these practices [10 points]


excel tutorial to generate random numbers for rolling dice

partial template: dice roll Excel simulation - xlsm (enable VBA macro)

mathematica (accessible only on computers in the classroom).

caveats in using spreadsheet software (optional); e.g.

On the accuracy of statistical procedures in Mircosoft Excel 2010. 2011.

Beware of Excel Histograms. (blog; 2012)

Spreadsheets in the Cloud - not ready yet. 2013


The risks of using spreadsheets for statistical analysis. 2013. (IBM promo)

Using Excel for statistical analysis 2010

Statistical analysis using Microsoft Excel 2008

Using Excel for Statistical Data Analysis - Caveats 2007

counter - point: value of using Excel (optional); e.g.

Teaching Fundamental Skills in Microsoft Excel to First-Year Students in Quantitative Analysis. 2015

Review of Advanced Excel for Scientific Data Analysis; 3rd ed. 2012