Python for Everybody: Exploring Data Using Python 3
Charles Severance, University of Michigan
Copyright Year: 2016
ISBN 13: 9781530051120
Publisher: Charles Severance
Conditions of Use
Dr. Charles R. Severance's book introduces the fundamentals of Python programming in Chapters 1-10, without diving deeply into object-oriented programming. These chapters focus on code examples manipulating text and text files. Given the title, it would have been nice to have examples of other types of data as well, e.g., employee data, species data and income data. Chapters 11–16 branch out to gathering and manipulating text from different sources, including scraping the web. These later chapters also manipulate different types of data, including geographic data. The end of each chapter has a glossary and exercises, with sample code and data files available at the book's website.
Because Chapters 1-10 only touch upon object-oriented programming, definitions and explanations can become convoluted. For example, the definition of immutable in Chapter 6 is confusing compared to the more accurate definition "an object with a fixed value". The book could be more accurate by giving readers a gentle treatment of object-oriented programming from the beginning. For example, this is nicely done in Chapter 8 when explaining string objects versus list objects. There is one small error where print is referred to as a statement instead of a function.
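To make the immutability point concrete, here is a minimal sketch contrasting a string, whose value is fixed, with a list, which can be changed in place. It is not taken from the book; the variable names are hypothetical and chosen only for illustration. It also shows print called as a function, as Python 3 requires.

```python
# Hypothetical example, not from the book:
# strings are immutable, lists are mutable.
fruit = "banana"

try:
    fruit[0] = "B"           # item assignment is not allowed on a str
    string_changed = True
except TypeError:
    string_changed = False   # the original string is left untouched

# "Changing" a string really means building a new string object.
new_fruit = "B" + fruit[1:]

numbers = [1, 2, 3]
numbers[0] = 99              # lists support in-place modification

# print is a function in Python 3, so the parentheses are required.
print(string_changed, new_fruit, numbers)
```

The failed assignment leaves fruit unchanged, which is what "an object with a fixed value" means in practice.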
The content is up-to-date for learning how to program. The reader is given timeless advice on how to approach a problem, debug issues, and deal with aggravations that are typical when programming.
The book is written clearly, although explanations are oversimplified at times. The glossary at the end of each chapter is helpful for clearing up confusion on terminology.
Each chapter is consistent in terms of terminology. As terms are explained in greater detail in subsequent chapters, it would be helpful if the revised definitions could be included in the chapter glossary.
Each chapter in the book was well organized into separate modules.
The chapters are presented in a logical order, although as previously stated, it would be helpful to build upon object-oriented concepts throughout the text, rather than waiting until Chapter 14 to address them fully.
I had no problem with the links at the book's website.
The text contains no grammatical errors as far as I could tell.
The text is not culturally biased as far as I could tell.
The book focuses on basic Python programming, along with advanced topics in Structured Query Language, databases, and visualizing data. The subject matter is clearly explained for all beginners. Good programming practices are reinforced throughout the book.
This book is an approachable introduction to both Python the language and its application to information science -- namely retrieving, cleaning, and storing data for later analysis. Chapters two through ten are based heavily on Allen Downey and Jeff Elkner's excellent book, "Think Python: How to Think Like a Computer Scientist." While Severance has reworked many of the examples in these chapters to better reflect the book's overarching theme of data exploration, Downey and Elkner's clear and concise introduction to the Python language is still prevalent and makes the early material easily accessible for new programmers. Given that the book is written with data exploration in mind, I found it somewhat odd that its treatment of data visualization was fairly light, with only three examples given in Chapter 16. Even odder was that there was no mention of libraries such as Pandas, NumPy, or SciPy for data wrangling, nor of visualization packages such as Matplotlib, Seaborn, Bokeh, or ggplot. The latter, I suspect, is due at least in part to the text's age. The book is also lacking in its coverage of string formatting in Python, discussing only the most basic string formatting features and capabilities of the language while completely eschewing the .format() method and f-strings. Also missing is coverage of useful topics such as comprehensions, generators, and lambda expressions. The word "recursion" only appears in the book once, in the preface, where the author states that the word does not appear in the book at all. Finally, there is essentially no treatment of the Python standard library nor any hint that readers should look into it for the amazing wealth of functionality it provides. Overall, this book serves as an introduction to the basics of the Python programming language and its application to data exploration. It teaches enough Python in the early chapters to support the later ones.
However, it is not an introduction to programming nor an introduction to computer science using Python as the teaching language.
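For readers unfamiliar with the string formatting gap mentioned above, the following sketch shows three ways to build the same string. It is an illustration rather than an excerpt from the book: the % operator stands in here for the "basic" formatting the review alludes to (an assumption on my part), the variable names are hypothetical, and f-strings require Python 3.6 or later.

```python
# Hypothetical values, used only for illustration.
name = "Chuck"
count = 42

# Older %-style format operator.
old_style = "%s has %d messages" % (name, count)

# str.format() method.
format_method = "{} has {} messages".format(name, count)

# f-string (formatted string literal, Python 3.6+).
f_string = f"{name} has {count} messages"

print(old_style)
print(format_method)
print(f_string)
```

All three produce the same text; the newer forms mostly differ in readability once several values are involved.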
The content of the book is accurate given its intended scope, even if it is a little dated in its approach to some material, such as string formatting. I found no typographical or layout errors in the HTML-based version I reviewed. Readers are also welcome to provide corrections/edits to the text via pull requests to its git repository.
The scope of the book is somewhat narrow: an introduction to enough Python to do simple data acquisition, wrangling, and visualization. However, as Python has become a, if not the, leading language in data science and the number and capabilities of related libraries have grown, any text on data exploration that does not at least touch on libraries such as Pandas, NumPy, or Matplotlib is going to quickly find itself becoming less and less relevant in the field.
The chapters based on Downey and Elkner's earlier book are very clear if, again, limited in scope. The later chapters jump in complexity at the expense of clarity in my opinion. However, the author does explore possible errors throughout the text and helps the reader understand what is causing them so as to aid in future debugging.
All the chapters are uniformly formatted and are consistent in their use of terminology.
The book is broken up into logical chapters and each chapter is further divided into meaningful and accessible portions. I did not find any subsection to be overly long and overall each chapter is short enough to be assigned as a single reading. Chapters build upon those that preceded them, as is to be expected in an introductory programming text.
The topics are presented in a logical order.
The text in the browser by default is on the small side but this can be corrected by zooming in on the page. This has the effect, however, of making the section headers overly large. I would also argue that using syntax colorization in the code examples would go a long way towards making the material easier to understand.
I found no grammatical errors in my reading.
I found no offensive content.
As previously stated, this text is not intended to be an introduction to either programming or computer science. Rather, it is an introduction to information science that teaches just enough programming to allow for the topic to be explored. I do think that the book does itself a disservice by not addressing Python's emerging role in data science and by neglecting the many tools the language offers in the form of its impressive library of functionality for that purpose.
The book is a comprehensive and approachable introduction to Python. The first nine chapters are a terse but comprehensive introduction to the language. Given the title, I had expected some discussion of the pandas Python package. The book is more geared toward acquiring data (web, databases and SQL).
I found no issues with the content, but there are a few typographical errors from LaTeX in the text. They are obvious and don't impact understanding.
Python 3 is the current standard, but the relevance is more a consequence of the subject matter than the approach.
The first 9 chapters were very clear, but there seemed to be a sizable jump in difficulty (likely due to the subject matter) when introducing regex and networked programs. It could be jarring for a reader or learner.
Clear and well written with consistent notation and terminology.
I think any of the chapters could fairly easily be turned into a module, so particular chapters could be included or excluded as needed.
As noted above, the sequence is logical and clear, but the difficulty seems to jump considerably at chapters 10 or 11 (tuples and networked programs).
No issues that I noted.
Little to none that I noticed.
I think this is a very good quick-and-dirty introduction to Python, but, as stated above, given the level of the first nine chapters, the remaining chapters might have benefited from taking a slightly lower-level approach. Still a quality book and resource.
This book is a remix of the excellent Think Python book by Allen Downey. The book keeps the clarity of the original while including examples skewed towards data applications, particularly text processing. The remix adds chapters on regular expressions, web services, databases and visualization. It drops topics like algorithm analysis and GUIs, and slims down the discussion of classes significantly. These changes make this a good information science textbook and less of a computer science textbook. Students are led on the path of developing web-scraping programs: programs that can pull raw data from online sources and process it in a useful way. The book does not cover data science, plotting, or Python libraries like pandas. The coverage of the Python language is generally thorough, but misses topics like list comprehensions and lambda expressions. The additions are well thought out and provide students with a useful toolkit that they can start applying right away. The visualization chapter is the only one that is lacking. It provides three well-documented examples of web-scraping programs that use visualization, but it does not provide a general treatment of visualization tools nor a discussion of how to use them effectively.
The overview of the Python language is accurate. The discussion of applications is accurate with regard to common practices of web-scraping programs.
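As a brief illustration of two of the language features noted as missing, the sketch below compares an explicit loop with a list comprehension and uses a lambda expression as a sort key. The word list and variable names are hypothetical and not drawn from the book.

```python
# Hypothetical data, used only for illustration.
words = ["the", "quick", "brown", "fox"]

# Explicit loop, in the style an introductory text typically uses.
lengths_loop = []
for word in words:
    lengths_loop.append(len(word))

# Equivalent list comprehension.
lengths_comp = [len(word) for word in words]

# A lambda expression used as a sort key (sorted() is stable,
# so words of equal length keep their original relative order).
by_length = sorted(words, key=lambda word: len(word))

print(lengths_loop, lengths_comp, by_length)
```

Both forms of the length calculation give the same list; the comprehension simply expresses it in one line.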
The use of Python 3 ensures that chapters regarding syntax and data structures will remain valid for the foreseeable future. Chapters regarding web services, databases and visualization are more at risk. The author plays it conservatively by discussing XML and JSON for web services and SQLite for databases. These are good choices because they are widely used, but increasingly XML is falling by the wayside and tasks that used to be handled with relational databases are instead being run on NoSQL systems. One of the three visualization examples is based on the Gmane interface to mailing lists, which is likely not very relevant for students and Gmane's continued existence is in doubt. These chapters may need to be updated in a few years.
The book does an excellent job of explaining the Python language, always providing a context in which topics are useful. Information is imparted, not just to be comprehensive, but to help the reader be a better programmer. The examples are well-explained and motivated. The author frequently includes interludes on understanding errors and sections on debugging, providing valuable information for a novice programmer.
The chapters have a consistent style and use of terminology. The Python in the book follows the conventions in the Style Guide for Python.
There is a limit to how modular an introductory textbook on programming can be. The book generally strikes a good balance. Chapters do build on each other, but a course could skip some chapters without encountering much loss of continuity. The later chapters that focus on building up to web-scraping programs are not particularly modular and would need to be taught in order. The chapter on visualization is unfortunately dependent on the database chapter. The book would benefit from making visualization stand more on its own.
The book is well-organized and has a coherent flow through the chapters. Some topics, such as exception handling, are introduced earlier than is typical. But these introductions are done with a light touch and with an eye towards why the topic is immediately useful.
The links to code and outside sites worked. Code downloads nicely into a directory with a helpful Readme file.
No grammatical errors were found by this reviewer.
The book doesn't make use of many cultural references. The examples of text processing are clear and straightforward and shouldn't be an issue for readers whose first language is not English.
A clear, well-constructed book that would serve an information science curriculum well.
Table of Contents
- 1 Why should you learn to write programs?
- 2 Variables, expressions, and statements
- 3 Conditional execution
- 4 Functions
- 5 Iteration
- 6 Strings
- 7 Files
- 8 Lists
- 9 Dictionaries
- 10 Tuples
- 11 Regular expressions
- 12 Networked programs
- 13 Using Web Services
- 14 Object-Oriented Programming
- 15 Using databases and SQL
- 16 Visualizing data
- A Contributions
- B Copyright Detail
About the Book
I never seemed to find the perfect data-oriented Python book for my course, so I set out to write just such a book. Luckily at a faculty meeting three weeks before I was about to start my new book from scratch over the holiday break, Dr. Atul Prakash showed me the Think Python book which he had used to teach his Python course that semester. It is a well-written Computer Science text with a focus on short, direct explanations and ease of learning. The overall book structure has been changed to get to doing data analysis problems as quickly as possible and have a series of running examples and exercises about data analysis from the very beginning.
Chapters 2–10 are similar to the Think Python book, but there have been major changes. Number-oriented examples and exercises have been replaced with data-oriented exercises. Topics are presented in the order needed to build increasingly sophisticated data analysis solutions. Some topics like try and except are pulled forward and presented as part of the chapter on conditionals. Functions are given very light treatment until they are needed to handle program complexity rather than introduced as an early lesson in abstraction. Nearly all user-defined functions have been removed from the example code and exercises outside of Chapter 4. The word “recursion” does not appear in the book at all.
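A minimal sketch of what presenting try and except alongside conditionals can look like is shown below. It is an illustration under stated assumptions rather than code from the book: the temperature conversion and the variable names are hypothetical, and a fixed string stands in for user input so the snippet runs without prompting.

```python
# Hypothetical stand-in for user input; a real program might call input().
raw = "seventy"

try:
    # float() raises ValueError when the text is not a number.
    fahr = float(raw)
    cel = (fahr - 32.0) * 5.0 / 9.0
    message = str(cel)
except ValueError:
    # Alternate path, analogous to the else branch of a conditional.
    message = "Please enter a number"

print(message)
```

Seen this way, try and except behave like a conditional whose test is "did the guarded code succeed?", which may be why the topic fits in an early chapter.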
In chapters 1 and 11–16, all of the material is brand new, focusing on real-world uses and simple examples of Python for data analysis including regular expressions for searching and parsing, automating tasks on your computer, retrieving data across the network, scraping web pages for data, object-oriented programming, using web services, parsing XML and JSON data, creating and using databases using Structured Query Language, and visualizing data.
The ultimate goal of all of these changes is to shift from a Computer Science to an Informatics focus and to only include topics in a first technology class that can be useful even if one chooses not to become a professional programmer.
About the Contributors
Charles Severance is a Clinical Associate Professor at the University of Michigan School of Information.