Linear Regression Using R: An Introduction to Data Modeling
David Lilja, University of Minnesota
Pub Date: 2016
ISBN 13: 978-1-9461350-0-1
Publisher: University of Minnesota Libraries Publishing
This tutorial covers the basic areas and ideas of linear regression through carefully selected examples. R, the software used to present examples in the text, is open source, which makes it appropriate and convenient for an open textbook. The book provides an effective and complete index and table of contents, with page numbers as links to the text.
The open source software (R) used to present data is as accurate as any commercially available software. The rest of the content is accurate and error-free.
As an introductory text, the content is up-to-date. As a basic topic in regression theory, linear regression is here to stay. With the current growth of data mining, it is difficult to imagine the future of data analytics without linear regression. The text is written and arranged in such a way that important updates will be easy to implement.
The text is clear and accessible to readers with a standard elementary statistical background. It provides explicit guidance for R, and the context for statistical terms is clear. The concepts are well explained.
The exposition is consistently clear and well motivated by examples. The level and presentation are consistent as well. The text uses consistent, standard, and elementary terminology, appropriately introduced, to deal with linear regression models.
The text, not overly self-referential, is presented in eight chapters, each with a hyperlink to the text. Each chapter has short sections. In addition, each page number in the Index is a hyperlink to the text.
The topics in the text are well motivated by examples that should make the subject more interesting to the reader. The organization is excellent, making each topic clear and easy to read.
It would have been nice to have color images in the Figures. Also, Figure 4.1 (CHAPTER 4. MULTI-FACTOR REGRESSION) would be clearer if it showed only a few of the pairwise comparisons for the Int2000 data frame. But these are just two minor issues of display.
I did not find grammatical errors.
The text is not culturally insensitive or offensive in any way. It uses examples that are culturally neutral.
I would use this tutorial in any undergraduate course dealing with linear regression.
Table of Contents
1 Introduction
- 1.1 What is a Linear Regression Model?
- 1.2 What is R?
- 1.3 What’s Next?
2 Understand Your Data
- 2.1 Missing Values
- 2.2 Sanity Checking and Data Cleaning
- 2.3 The Example Data
- 2.4 Data Frames
- 2.5 Accessing a Data Frame
3 One-Factor Regression
- 3.1 Visualize the Data
- 3.2 The Linear Model Function
- 3.3 Evaluating the Quality of the Model
- 3.4 Residual Analysis
4 Multi-factor Regression
- 4.1 Visualizing the Relationships in the Data
- 4.2 Identifying Potential Predictors
- 4.3 The Backward Elimination Process
- 4.4 An Example of the Backward Elimination Process
- 4.5 Residual Analysis
- 4.6 When Things Go Wrong
5 Predicting Responses
- 5.1 Data Splitting for Training and Testing
- 5.2 Training and Testing
- 5.3 Predicting Across Data Sets
6 Reading Data into the R Environment
- 6.1 Reading CSV files
8 A Few Things to Try Next
About the Book
Linear Regression Using R: An Introduction to Data Modeling presents one of the fundamental data modeling techniques in an informal tutorial style. Learn how to predict system outputs from measured data using a detailed step-by-step process to develop, train, and test reliable regression models. Key modeling and programming concepts are intuitively described using the R programming language. All of the necessary resources are freely available online.
About the Contributors
David J. Lilja received a Ph.D. and an M.S., both in Electrical Engineering, from the University of Illinois at Urbana-Champaign, and a B.S. in Computer Engineering from Iowa State University in Ames. He is currently the Louis John Schnell Professor of Electrical and Computer Engineering at the University of Minnesota in Minneapolis, where he also serves as a member of the graduate faculties in Computer Science, Scientific Computation, and Data Science. Previously, he served ten years as the head of the ECE department at the University of Minnesota, worked as a research assistant at the Center for Supercomputing Research and Development at the University of Illinois, and as a development engineer at Tandem Computers Incorporated in Cupertino, California. He received a Fulbright Senior Scholar Award to visit the University of Western Australia, and was awarded a McKnight Land-Grant Professorship by the Board of Regents of the University of Minnesota. He has chaired and served on the program committees of numerous conferences, and was a distinguished visitor of the IEEE Computer Society. He was elected a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and a Fellow of the American Association for the Advancement of Science (AAAS) for contributions to the statistical analysis of computer performance. He also is a member of the ACM, and is a registered Professional Engineer. His main research interests include computer architecture, parallel processing, computer systems performance analysis, approximate computing, and storage systems.