# Mostly Harmless Statistics

Rachel L. Webb, Portland State University

Copyright Year: 2021

Publisher: Portland State University Library

Language: English

## Conditions of Use

Attribution-ShareAlike

CC BY-SA

## Reviews

This OER Textbook includes most concepts that would (and should) be covered in an Introductory (or Elementary) Statistics course. There are, however, a few concepts that I feel were "left out" (whether on purpose or inadvertently) that almost every Introductory (or Elementary) Statistics course covers (at least the ones that I have been required to teach at community colleges or universities anyway). For example, the Types of Measurement Scales are usually an entire section in almost every Statistics textbook; yet, in this OER textbook, the author spends less than one page covering this topic and there are very few examples of the different types. Another concept that is very lacking is the difference between Statistical and Practical Significance. The author discusses it only once, and not until Page 226. Even then, a firm definition is not provided for either type of significance; the definition that is provided is very vague and short. There is also no definition provided for a very important type of sampling, which is that of voluntary response sampling. Another example is the lack of importance put on the definition of bias by this author. While the definition of bias is provided, the word "bias" is not even in bold, as though it is not important. Non-response bias is not even discussed by this author. The author also only mentions the "pitfalls" or reasons why data can be misleading in Section 1.3, instead of going into depth about each different kind of bias with examples of each. In Statistics, bias is an extremely important concept, and the author acts like it is unimportant to the study of statistics. Confounding variables, as well as lurking variables, are not discussed until Page 375, in the regression chapter, when they are supposed to be (and, in other textbooks, are) discussed in the first chapter, in the same section as Observational Studies and Designed Experiments.

Personally, I think the author was very biased in what they chose to incorporate into this OER textbook and how they chose to incorporate it. I would not consider much of the content to be "accurate." One of the definitions of "accurate" in Merriam-Webster's online dictionary, www.m-w.com, is: "going to, reaching, or hitting the intended target : not missing the target." In this sense, I do not think this OER textbook is accurate in many respects. As previously mentioned, there are significant "omissions" of concepts that should have been discussed, but are not, in this Statistics textbook, and there is a serious lack of breadth and depth in this OER textbook regarding many topics. Examples include bias, types of bias and their examples, confounding and lurking variables, statistical versus practical significance, and correlation and regression. While I did not find any errors in calculations, I do not consider a textbook with so many important "omissions" and/or lack of breadth and depth on so many important statistical concepts to be "error-free" and "unbiased." This is why I give this OER textbook only a 3 for accuracy.

I will agree that the content is up-to-date and will not make the text obsolete within a short period of time. I do like the fact that the author brings up and shows several pictures of calculator concepts for the TI-83, 84, and even 89. I think that is extremely helpful for students. However, I do think that when discussing correlation, the author should have brought up the easy way to find the correlation in the calculator. Students simply enter the values for the x-variable in List 1 (L1) and the values for the y-variable in List 2 (L2). Then, instead of calculating 2 variable statistics (2 Var Stats) as the author suggests, students should do the following. Press the 2nd button and then the 0 button (which opens the Catalog). Arrow all the way down to "DiagnosticOn" and press Enter twice. The calculator will say "Done." Then press the STAT button, arrow over to CALC, and go down to #4, which is LinReg(ax + b), or #8, which is LinReg(a + bx). Make sure that List 1 shows up as L1 and List 2 shows up as L2, then arrow down to Calculate, and the calculator will show the values of r and r squared along with the values needed to write the linear regression equation. The author did not provide this information to students and I think this is a major disadvantage, because instead the author simply shows students how to obtain the formula values from the calculator and then tells students to plug those values into the formula. The whole idea of using the calculator is to be able to avoid the use of formulas and make it easier on yourself. So while I think the author's demonstrations of calculator concepts are very beneficial to students, I think there certainly is room for improvement.
However, the author would need to re-organize her entire Chapter 12 on regression and split it into two totally different chapters: one before probability with the non-inference concepts, leaving only the inference concepts regarding regression in Chapter 12. I also think that if the author really wanted ALL students to benefit from using this OER material, they should have incorporated how to use Excel to solve these problems as well, not just how to solve them with calculators. Those updates would be easy to include; however, they would take a long time to make.
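For readers without a TI calculator, the quantities that the LinReg screen reports in the steps above (slope, intercept, r, and r squared) can also be computed with a few lines of code. The following is only a minimal sketch in Python, not something taken from the textbook: the function name `linreg` and the small data set are hypothetical, chosen purely for illustration.

```python
from math import sqrt


def linreg(xs, ys):
    """Least-squares line y = slope*x + intercept, plus r and r squared.

    Hypothetical helper for illustration; it mirrors the values a
    calculator's LinReg(ax + b) screen reports (a = slope, b = intercept).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r, r * r


# Made-up example data: L1 holds the x-values, L2 holds the y-values.
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]
slope, intercept, r, r_squared = linreg(L1, L2)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}, r^2 = {r_squared:.3f}")
```

The point of the sketch is the same as the calculator shortcut: once the two lists are entered, r and the regression equation come out together, with no hand substitution into formulas.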

I think the clarity of this OER text is just "average." According to Merriam-Webster's online dictionary, www.m-w.com, one of the definitions of lucid is, "clear to the understanding." I think that the author means well; however, several of her definitions are very unclear because they do not go into enough breadth and depth. Examples include her definition of bias and her lack of definitions of statistical versus practical significance, which she discusses only once and does not really define at all. She does provide a definition of bias, but it is extremely short and lacks depth, as she does not discuss the different types of bias or give examples of those types in any reasonable detail.

According to https://pressbooks.bccampus.ca/openubcpub/chapter/textbook-design-rules-open-ubc/, "A textbook is an organized body of material useful for the formal study of a subject area. It should be discrete, and well-bounded in scope and the text material should relate to a solid understanding of the subject, usually mixing theory and practice for each topic as it covers the subject domain. The textbook should also use examples and problems to assist the student in better grasping each presented concept by following examples, and then applying the concept in structured exercises or problems. The textbook should have an internally consistent style and there should be little or no surprises for the student in terms of layout and presentation of material. The texts user can get comfortable with the layout, the tempo of presentation, and the pattern of figures, illustrations, examples, and exercises. Once reviewed, the textbook should isolate material that is useful to the future application of subject knowledge in well-organized appendices and tables. Finally, the textbook is a structured resource and is not just a collection of useful material. The textbook is a guide for the student for an order of review that will aid in mastering the subject area. Topics are presented in major parts, chapters, sections, and subsections that are organized in a way that facilitates understanding. This means that the text’s organization is based on the intersection of two requirements. The first of these are the requirements of the subject domain. Since most textbooks are developed by, or based on the contributions of subject matter experts, this requirement is usually well attended to." I honestly do not find this OER material to be "well-bounded in scope" and I certainly do not feel that the "text material...relate(s) to a solid understanding of the subject."
To the contrary, I find that many topics I consider extremely important are "left out" of this OER textbook and, even among the topics that are in this textbook, many certainly do NOT give students "a solid understanding of the subject." Examples include topics such as bias, statistical versus practical significance, confounding and lurking variables, and the levels of measurement. As for "The textbook should also use examples and problems to assist the student in better grasping each presented concept by following examples, and then applying the concept in structured exercises or problems," this OER textbook does not give examples and problems during several of the sections of the textbook, nor does it give any examples and problems after each section of the textbook for students to practice their skills. Instead, the author puts all problems for students to complete AFTER the entire chapter and gives students no chances to practice throughout the Chapter, such as Section Review exercises or Check Your Understanding Problems. Chapter 1 does not even have examples of many things in the Sections of the Chapter. I will say that the section on Misleading Graphs does have examples and they are good ones. The author should consider doing this for EVERY single section in the OER textbook, not just having one section with a comprehensive set of examples. The author also needs to have Section Review Exercises after each section, certainly needs more examples throughout every section (more breadth and depth for significant understanding of the topics), and needs more problems where students can practice their skills throughout sections.

I find the modularity of this text to be reasonable. The author does divide the text into smaller sections; however, some of them are way too small to amount to anything, and some of them, though smaller than in most Statistics textbooks, are too long because they still have too many concepts in one small section that really should be broken up into two or more sections with more breadth and depth. A perfect example is Chapter 1. Section 1-1 basically talks about almost nothing (although, to the author's credit, most Statistics textbooks only use the first section of a textbook as an introduction). However, section 1-2 is way too small and section 1-3 should have been split into at least two (if not three) separate sections. The author does do a good job with subheadings; however, when it comes to definitions, some of the most important ones are not in bold and are not emphasized when they really need to be. As a Statistics teacher, I would have to literally re-arrange many of the sections of this book and supplement many of them with other material, because content that should be covered in an Introductory Statistics course is either not in this book, or in the wrong place, or not emphasized in the way it should be. For that reason, I only give the modularity a two.

Unfortunately, this OER text is not organized nearly as well as regular Statistics textbooks. It leaves a ton of concepts out in the beginning that should be taught in Chapter 1, and does not spend nearly enough time on several topics that are clearly important to Statistics. Not discussing lurking and confounding variables in Chapter 1, so that students first see those terms on Page 375 in the discussion of Regression, is one example of why this text's organization is so poor. The logic and order of how the concepts are presented are poor as well. Waiting until Chapter 12 to discuss Correlation and Regression is very poor organization and a complete disadvantage to students enrolled in Statistics. Correlation and Regression (other than inference tests for Regression) should be taught before the Probability sections. Although it is true that some Statistics textbooks do the same thing the author did and teach Correlation and Regression after Inferences of One and Two Samples involving proportions and means, but before Chi-Squared tests, that does not mean that this is the best thing for students, either. In addition to thinking that the content in Chapter 12 is completely out of place, I find the order in which it is presented very confusing for students. The hypothesis tests are the only things that should be covered in Chapter 12; again, the other topics should have been covered in a separate chapter before probability is covered. In addition to this, Multiple Linear Regression is NEVER taught in an Introductory (or Elementary) Statistics course, as it is never on the SLOs.
Therefore, it should not be in this OER textbook, or, if the author insists on including it, it should be in a different chapter of "Additional Topics of Interest" that students can take a look at if they are interested. The students this OER textbook was designed to serve would never be studying multiple linear regression in an Introductory Statistics course, so it is pointless to have that topic anywhere other than at the end of the textbook.

I think the interface of the OER textbook is just fine. I did not find any navigation problems or distortions of images or charts. However, I do think there are other things that could confuse or distract the reader, such as the serious level of "omissions" and the lack of breadth and depth of certain concepts discussed in the OER textbook. This is why I gave it only a 4 instead of a 5.

While I did not find any grammatical errors, I am not an English teacher; I am a Math and Statistics teacher. What I did find were many "omissions", even if they were inadvertent (though it appeared that the author did pick and choose what they wanted to go into this OER textbook, so the "omissions" appeared to be purposeful). Therefore, I am giving this section a four due to the level of "omissions" that I found.

I think that the level of "omissions" and the lack of breadth and depth in several of the subject areas of Statistics that the author discusses (or omits and never discusses) are actually a lack of respect to anyone attempting to learn Statistics in the first place. If you are going to teach students Statistics, you should teach it in a way, shape, and form that students can understand. The way the author presents this material leaves so many questions unanswered, and there are so many "omissions" and things that should have been discussed that were not. Students should be taught Statistics in detail because it is the most valuable real world Mathematics that any student can and will ever learn in their lifetime. I also think that if the author really wanted to cut costs for students, they should have incorporated how to use Excel to solve statistical problems, since Excel is free for all students with their paid tuition, whereas a calculator is not (even if calculators are provided for each student at the author's institution, that is almost never the case at regular institutions). The way the author presents the topics and/or omits important topics and/or does not provide anywhere near enough detail on certain topics makes me give this OER textbook a "3" rating.

When I first saw that there was an OER textbook designed for Statistics, I was beyond thrilled. I laughed at the title and thought this would be a great resource that I might be able to use while teaching, and that I could go to my department chairs and recommend it for use in our classes, so that we could literally bring down the cost of the Statistics course materials for students. Unfortunately, after looking into this book in more detail, it is one that I would only recommend as a resource for students to "look up" certain concepts. I could never use this OER textbook as a primary textbook in my Statistics courses and I could never recommend that it be the primary resource for students to have access to and use, because there are way too many important "omissions" and the concepts that are covered are not covered in anywhere near enough breadth and depth. The Statistics student needs to understand WHY they are learning the concepts. With what the author has presented, they leave that question unanswered with their omissions. I also do not like the fact that many of the sections do not have any examples of certain concepts, and I definitely do not like the fact that there are no exercises for students to practice and complete until after the entire Chapter rather than after each section. I also found that the structure of the textbook is just not student friendly, since most of the concepts in Chapter 12 should be taught before Probability concepts and they are not. Leaving those concepts until Chapter 12 means that several instructors will not even get to them, and Correlation and Regression are absolute necessities for students to learn in Introductory Statistics courses. I also, as I mentioned earlier, think that the author should consider incorporating how to use Excel to solve certain conceptual problems and not just the calculator.
Not every student can afford a calculator, and online calculators are no longer free (they were during the first two years of COVID). However, every student receives an Office 365 subscription, which includes Excel, as part of their paid tuition every semester. Therefore, the calculator is an added expense for students, whereas Excel is not. So if the author is really trying to cut costs for all students, they should also add how to use Excel to solve statistical problems to this book.

## Table of Contents

- Chapter 1 Introduction to Data
- Chapter 2 Organizing Data
- Chapter 3 Descriptive Statistics
- Chapter 4 Probability
- Chapter 5 Discrete Probability Distributions
- Chapter 6 Continuous Probability Distributions
- Chapter 7 Confidence Intervals for One Population
- Chapter 8 Hypothesis Tests for One Population
- Chapter 9 Hypothesis Tests & Confidence Intervals for Two Populations
- Chapter 10 Chi-Square Tests
- Chapter 11 Analysis of Variance
- Chapter 12 Correlation and Regression
- Chapter 12 Formulas
- Chapter 12 Exercises
- Chapter 13 Nonparametric Tests

## About the Book

This text is for an introductory level probability and statistics course with an intermediate algebra prerequisite. The focus of the text follows the American Statistical Association’s Guidelines for Assessment and Instruction in Statistics Education (GAISE). Software examples are provided for Microsoft Excel and for TI-84 & TI-89 calculators. A formula packet and pdf version of the text are available on the website http://mostlyharmlessstatistics.com. Students new to probability and statistics are sure to benefit from this fully ADA accessible and relevant textbook. The examples resonate with everyday life, and the text is approachable, with a conversational tone that provides an inclusive and easy to read format for students.

## About the Contributors

### Author

**Rachel L. Webb**, Portland State University