Comprehensiveness rating: 5
Professor Downey is an expert writer with over 12 books under his belt. This particular book is very comprehensive. The author guides an engineer with minimal statistical knowledge into the intricacies of statistics. Professor Downey starts the book with the basic concepts of exploratory data analysis, distributions, plotting and effect size, then moves to probability mass functions and cumulative distribution functions. Next he untangles the complicated subject of modeling distributions and probability density functions. From Chapter 7 he starts a journey toward hypothesis testing and regression analysis. These concepts are not simple, so he begins by explaining the relationship between variables, demonstrating the relationships with scatter plots. He moves on to concepts like correlation, covariance and linear dependency (Pearson correlation coefficient). From this chapter, he moves to explaining sampling distributions and sampling bias. By now the student has a strong understanding of sampling distributions and is ready to learn about hypothesis testing. In the chapter on hypothesis testing, he describes the most common methods for comparing two different groups. In chapter 10, the author explains the basic concepts necessary to understand regression, such as least squares, residuals, goodness of fit and weighted resampling. Then, in chapter 11, he describes multiple regression analysis, nonlinear relationships and logistic regression. Finally, he explains more advanced subjects like time series and survival analysis.
Accuracy rating: 5
Professor Downey is a senior engineer and a data scientist. Consequently, accuracy is part of his training, background, and career. This book is highly accurate.
Relevance/Longevity rating: 5
In the age of big data, this book is relevant and essential for any engineer who wants to move into the area of big data. The longevity of the book is unknown because the area is moving very fast, but he is teaching basic concepts, so I expect that the book will be relevant for at least a decade.
Clarity rating: 5
IMHO, the book is very clear for anybody with some background in computer science and programming. On the other hand, for somebody without any knowledge of Python or programming, it could be hard. The author explains in the preface that some experience in programming will be necessary to understand the book. The author has other open textbooks, like "Think Python", that should be read before reading this book.
Consistency rating: 5
The book is extremely consistent. Every chapter starts with an introduction, followed by explanations of methods, examples, and a description of the code used to demonstrate the concepts or to generate the graphics. Also, the author provides code, exercises, and a glossary for every chapter.
Modularity rating: 5
The book is modular in the sense that we can read the sections that we are not familiar with and skip the parts that we already know. Every chapter has multiple sections with subheaders. To give one example, chapter 10 has seven different sub-sections plus the exercises and glossary. However, skipping sections or dividing parts among various students could be confusing, because the flow of the book requires understanding essential concepts before moving to more complex chapters. I don't penalize the author for going from simple concepts to more complex ones, so I will consider the book modular since each chapter has sub-sections.
Organization/Structure/Flow rating: 5
As described in my answers to questions 1 and 6, Professor Downey developed a logical structure where one concept is described, learned and consolidated with the exercises before moving to more complex sections. In other words, the structure goes from basic to complex concepts. I understand that every student is different, and each learns in a different way, but I predict that for many students the logical flow of the book will be an enjoyable experience.
Interface rating: 5
I was unable to find any interface errors.
Grammatical Errors rating: 5
I was unable to find any grammatical errors.
Cultural Relevance rating: 5
The book is culturally neutral. However, Professor Downey teaches statistics with Python, while the majority of biostatisticians use R, and many of them will frown upon the use of Python to teach statistics.
Comments
I definitely recommend this book, but I suggest reading his "Think Python" book first, or at least taking a refresher Python course, before starting it.