# Advanced High School Statistics - 2nd Edition

David Diez, OpenIntro

Christopher Barr, Varadero Capital

Mine Çetinkaya-Rundel, Duke University

Leah Dorazio, San Francisco University High School

Copyright Year: 2019

Publisher: OpenIntro

Language: English

## Conditions of Use

Attribution-ShareAlike

CC BY-SA

## Reviews

This textbook contains the main components necessary to cover the average beginner course in statistics. It doesn’t provide a staggered introduction to statistical concepts but rather jumps right in with demonstrating how statistics can be used in an applied setting. As a standalone, the book covers several topics, but not in depth; it is just enough for a survey course. The real strength of this text is in the video overviews and the informational slides. These supplementary materials give students greater insight into the content through applied or real-life examples. The authors also include examples of how to interpret the numerical findings, as well as how to consider the assumptions behind the various tests.

I have not detected any obvious errors in the content.

The content is current and includes examples from current topics like stem cell research. The examples span several disciplines (e.g., medical, psychological, political) and are not tied to a single software package such as R, SPSS, or Excel; the text can therefore be used across the natural and social sciences with a variety of computing tools.

The text is written in a very straightforward manner and doesn’t overcomplicate the content by exhausting the reader with too many definitions. The authors provide the minimum the reader needs to know in order to follow the examples and complete the exercises. The use of figures is welcome, as it helps develop the reader’s ability, over the course of the textbook, to visualize the data and distributions.

The text is written in a consistent voice.

The text is easily divisible to coincide with the assignment of multiple interacting units. The subsections can be switched around or reorganized to match the instructor’s own preferred organization of the content, as the units are mostly self-contained (except for those with extended examples).

The organization is not like that of a typical stats textbook, where the content is structured by type of statistical test or type of research question. Instead, the organization further confirms the data-centric perspective of this text by focusing, for the most part, on the type of data the student has and what tests can be done within that limited range. It is a slightly different flow, and I find it interesting and logical.

There were no interface issues or navigation problems with the downloaded pdf version. Charts and figures loaded and resized easily.

I have yet to detect major grammatical issues.

The text does not appear obviously offensive or culturally insensitive.

The text is written clearly and the examples are very applied. There are no frills, just good, briefly covered content. While this is written for an advanced high school course, it could easily be used for college or even beginning master’s-level courses. It is also a good resource for those needing a refresher on how to interpret data findings or consider the limitations related to the nature of their data. This text is straight to the point and covers a good range of content.

## Table of Contents

1 Data collection

- 1.1 Case study
- 1.2 Data basics
- 1.3 Overview of data collection principles
- 1.4 Observational studies and sampling strategies
- 1.5 Experiments

2 Summarizing data

- 2.1 Examining numerical data
- 2.2 Numerical summaries and box plots
- 2.3 Considering categorical data
- 2.4 Case study: malaria vaccine (special topic)

3 Probability

- 3.1 Defining probability
- 3.2 Conditional probability
- 3.3 The binomial formula
- 3.4 Simulations
- 3.5 Random variables
- 3.6 Continuous distributions

4 Distributions of random variables

- 4.1 Normal distribution
- 4.2 Sampling distribution of a sample mean
- 4.3 Geometric distribution
- 4.4 Binomial distribution
- 4.5 Sampling distribution of a sample proportion

5 Foundation for inference

- 5.1 Estimating unknown parameters
- 5.2 Confidence intervals
- 5.3 Introducing hypothesis testing
- 5.4 Does it make sense?

6 Inference for categorical data

- 6.1 Inference for a single proportion
- 6.2 Difference of two proportions
- 6.3 Testing for goodness of fit using chi-square
- 6.4 Homogeneity and independence in two-way tables

7 Inference for numerical data

- 7.1 Inference for a mean with the t-distribution
- 7.2 Inference for paired data
- 7.3 Inference for the difference of two means

8 Introduction to linear regression

- 8.1 Line fitting, residuals, and correlation
- 8.2 Fitting a line by least squares regression
- 8.3 Inference for the slope of a regression line
- 8.4 Transformations for skewed data

A Exercise solutions

B Distribution tables

C Distribution Tables

D Calculator reference, Formulas, and Inference guide

## About the Book

We hope readers will take away three ideas from this book in addition to forming a foundation of statistical thinking and methods.

- (1) Statistics is an applied field with a wide range of practical applications.
- (2) You don't have to be a math guru to learn from real, interesting data.
- (3) Data are messy, and statistical tools are imperfect. But, when you understand the strengths and weaknesses of these tools, you can use them to learn about the real world.

**Textbook overview**

The chapters of this book are as follows:

- 1. Data collection. Data structures, variables, and basic data collection techniques.
- 2. Summarizing data. Data summaries and graphics.
- 3. Probability. The basic principles of probability.
- 4. Distributions of random variables. Introduction to key distributions, and how the normal model applies to the sample mean and sample proportion.
- 5. Foundation for inference. General ideas for statistical inference in the context of estimating the population proportion.
- 6. Inference for categorical data. Inference for proportions using the normal and chi-square distributions.
- 7. Inference for numerical data. Inference for one or two sample means using the t distribution, and comparisons of many means using ANOVA.
- 8. Introduction to linear regression. An introduction to regression with two variables.

Instructions are also provided in several sections for using Casio and TI calculators.

## About the Contributors

### Authors

**David Diez** is a Data Scientist at OpenIntro.

**Christopher Barr** is an Investment Analyst at Varadero Capital.

**Dr. Mine Çetinkaya-Rundel** is the Director of Undergraduate Studies and an Associate Professor of the Practice in the Department of Statistical Science at Duke University. She received her Ph.D. in Statistics from the University of California, Los Angeles, and a B.S. in Actuarial Science from New York University’s Stern School of Business. Her work focuses on innovation in statistics pedagogy, with an emphasis on student-centered learning, computation, reproducible research, and open-source education.

**Leah Dorazio** is a Statistics and Computer Science Teacher at San Francisco University High School.