The most recommended data science books

Who picked these books? Meet our 18 experts.

18 authors created a book list connected to data science, and here are their favorite data science books.
Shepherd is reader supported. When you buy books, we may earn an affiliate commission.


Book cover of Biology as Ideology: The Doctrine of DNA

Aubrey Clayton Author Of Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science

From my list for data scientists trying to be ethical people.

Why am I passionate about this?

I studied statistics and data science for years before anyone ever suggested to me that these topics might have an ethical dimension, or that my numerical tools were products of human beings with motivations specific to their time and place. I’ve since written about the history and philosophy of mathematical probability and statistics, and I’ve come to understand how much that historical background matters: the next generation of data scientists needs to know where these ideas come from and what potential they have to do harm. I hope anyone who reads these books avoids getting blinkered by the ideas that data = objectivity and that science is morally neutral.

Aubrey's book list for data scientists trying to be ethical people

Aubrey Clayton Why did Aubrey love this book?

People need less Dawkins in their lives and more Lewontin, whose thought-provoking, accessible writing about evolutionary biology stands in fierce opposition to the trend toward genetic determinism that seems to be the rage nowadays. We are not simply our genes, Lewontin says, because the effects DNA has on our lives are mediated by social and environmental factors, many of which we can influence. While it’s nominally about biology, I also read this as a critique of causal inference, generally. What we consider a “cause” reveals our ideological commitments to certain aspects of the world being maintained, and we should be careful what causal lessons we draw from data.

By Richard C. Lewontin

Why should I read it?

1 author picked Biology as Ideology as one of their favorite books, and they share why you should read it.

What is this book about?

Following in the fashion of Stephen Jay Gould and Peter Medawar, one of the world's leading scientists examines how "pure science" is in fact shaped and guided by social and political needs and assumptions.


Book cover of Jumpstart Snowflake: A Step-by-Step Guide to Modern Cloud Analytics

Valliappa Lakshmanan Author Of Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning

From my list on becoming a data scientist.

Why am I passionate about this?

I started my career as a research scientist building machine learning algorithms for weather forecasting. Twenty years later, I found myself at a precision agriculture startup creating models that provided guidance to farmers on when to plant, what to plant, etc. So, I am part of the movement from academia to industry. Now, at Google Cloud, my team builds cross-industry solutions and I see firsthand what our customers need in their data science teams. This set of books is what I suggest when a CTO asks how to upskill their workforce, or when a graduate student asks me how to break into the industry.

Valliappa's book list on becoming a data scientist

Valliappa Lakshmanan Why did Valliappa love this book?

In industry, your data is very likely to live within a data warehouse such as BigQuery, Redshift, or Snowflake. Therefore, to be an effective data scientist in industry, you should learn to use data warehouses well.

Once you learn data warehousing and SQL with any one of these products, it is quite easy to pick up another. So which one do you start with?

You can use Snowflake on all three of the major public clouds. Because it’s a standalone product, it is the most similar to a “traditional” data warehouse and can be picked up easily even if you are not familiar with cloud computing. That makes it a good data warehouse to start with, and is the reason my second book pick is this book on Snowflake.

BigQuery is also available on all three major public clouds, but it works best (and is used most commonly)…

By Dmitry Anoshin, Dmitry Shirokov, Donna Strok

Why should I read it?

1 author picked Jumpstart Snowflake as one of their favorite books, and they share why you should read it.

What is this book about?

Explore the modern market of data analytics platforms and the benefits of using Snowflake computing, the data warehouse built for the cloud.

With the rise of cloud technologies, organizations prefer to deploy their analytics using cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform. Cloud vendors are offering modern data platforms for building cloud analytics solutions to collect data and consolidate into single storage solutions that provide insights for business users. The core of any analytics framework is the data warehouse, and previously customers did not have many choices of platform to use.

Snowflake was…


Book cover of Python for Everyone

Daniel Zingaro Author Of Learn to Code by Solving Problems: A Python Programming Primer

From my list for a rock-solid Python programming foundation.

Why am I passionate about this?

Some programmers learn through online articles, videos, and blog posts. Not me. I need a throughline—a consistent, expert distillation of the material to take me from where I am to where I want to be. I am not good at patching together information from disparate sources. I need a great book. I have a PhD in computer science education, and I want to know what helps people learn. More importantly, I want to know how we can use such discoveries to write more effective books. The books I appreciate most are those that demonstrate not only mastery of the subject matter but also mastery of teaching.

Daniel's book list for a rock-solid Python programming foundation

Daniel Zingaro Why did Daniel love this book?

I used this book for several years starting in 2013 when the first edition came out. It absolutely holds up today. Learning the Python language (the syntax) is one thing. Learning how to design programs using this syntax is another. We need both but, unfortunately, many books forgo the latter for the former. Not this book! I like the Problem Solving and Worked Example sections: they help learners apply a disciplined, step-by-step strategy to programming projects. There are multiple, varied contexts here as well, which helps capture a broader base of learners. Bonus feature: the Computing & Society boxes.

By Cay S. Horstmann, Rance D. Necaise

Why should I read it?

1 author picked Python for Everyone as one of their favorite books, and they share why you should read it.

What is this book about?

Python for Everyone, 3rd Edition is an introduction to programming designed to serve a wide range of student interests and abilities, focused on the essentials, and on effective learning. It is suitable for a first course in programming for computer scientists, engineers, and students in other disciplines. This text requires no prior programming experience and only a modest amount of high school algebra. Objects are used where appropriate in early chapters and students start designing and implementing their own classes in Chapter 9. New to this edition are examples and exercises that focus on various aspects of data science.


Book cover of How to Lie with Statistics

Bastiaan C. van Fraassen Author Of Philosophy and Science of Risk: An Introduction

From my list on exploring the meaning of probability and risk.

Why am I passionate about this?

I’ve wanted to be a philosopher since I read Plato’s Phaedo when I was 17, a new immigrant in Canada. Since then, I’ve been fascinated with time, space, and quantum mechanics and involved in the great debates about their mysteries. I saw probability coming into play more and more in curious roles both in the sciences and in practical life. These five books led me on an exciting journey into the history of probability, the meaning of risk, and the use of probability to assess the possibility of harm. I was gripped, entertained, illuminated, and often amazed at what I was discovering. 

Bastiaan's book list on exploring the meaning of probability and risk

Bastiaan C. van Fraassen Why did Bastiaan love this book?

I am laughing out loud, even now that I am rereading this book for the umpteenth time. Fraudsters are so clever, and so is advertising. And then there is sloppy journalism with its “wow” statistics.

I like this book enormously, not least because of its witty illustrations. It is subversive, comic, and provocative, and it makes me wise to seductive, misleading practices, all with a light touch.

By Darrell Huff, Irving Geis (illustrator)

Why should I read it?

3 authors picked How to Lie with Statistics as one of their favorite books, and they share why you should read it.

What is this book about?

From distorted graphs and biased samples to misleading averages, there are countless statistical dodges that lend cover to anyone with an ax to grind or a product to sell. With abundant examples and illustrations, Darrell Huff's lively and engaging primer clarifies the basic principles of statistics and explains how they're used to present information in honest and not-so-honest ways. Now even more indispensable in our data-driven world than it was when first published, How to Lie with Statistics is the book that generations of readers have relied on to keep from being fooled.


Book cover of Be Data Literate: The Data Literacy Skills Everyone Needs to Succeed

Jeremy Adamson Author Of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list for data science and analytics leaders.

Why am I passionate about this?

I am a leader in analytics and AI strategy, and have a broad range of experience in aviation, energy, financial services, and the public sector. I have worked with several major organizations to help them establish a leadership position in data science and to unlock real business value using advanced analytics.

Jeremy's book list for data science and analytics leaders

Jeremy Adamson Why did Jeremy love this book?

Not everybody needs to be a data scientist, but everybody does need to be data literate. Without an intentional focus on evangelism and on building a strong data culture in your organization, it will be an uphill battle to make meaningful change. This book helps individuals and leaders understand what data literacy is and how we can build it like any other skill.

By Jordan Morrow

Why should I read it?

1 author picked Be Data Literate as one of their favorite books, and they share why you should read it.

What is this book about?

In the fast moving world of the fourth industrial revolution not everyone needs to be a data scientist but everyone should be data literate, with the ability to read, analyze and communicate with data. It is not enough for a business to have the best data if those using it don't understand the right questions to ask or how to use the information generated to make decisions. Be Data Literate is the essential guide to developing the curiosity, creativity and critical thinking necessary to make anyone data literate, without retraining as a data scientist or statistician. With learnings to show…


Book cover of All-in On AI: How Smart Companies Win Big with Artificial Intelligence

Roger W. Hoerl Author Of Statistical Thinking: Improving Business Performance

From my list on AI and data science that are actually readable.

Why am I passionate about this?

As a professional statistician, I am naturally interested in AI and data science. However, in our current information age, everyone, in all segments of society, needs to understand the basics of AI and data science. These basics include such things as what these disciplines are, what they can contribute to society, and perhaps most importantly, what can go wrong. However, I have found that much of the literature on these topics is highly technical and beyond the reach of most readers. These books are specifically selected because they are readable by virtually everyone, and yet convey the key concepts needed to be data-literate in the 21st century. Enjoy!

Roger's book list on AI and data science that are actually readable

Roger W. Hoerl Why did Roger love this book?

Books on AI often go to extremes, either promoting it as the solution to all the world’s problems, or depicting it as an evil that will destroy humanity.

This book is much more practical and is based on experience using AI in actual business applications. It is the result of considerable research, investigating applications not only in Silicon Valley but also at companies across various business sectors, such as Airbus, Ping, Progressive Insurance, and Capital One Bank.

Don’t let the title fool you; this book is not simply a promotion of AI, but addresses the practical issues that have to be considered if success is to be achieved. For example, they argue that “the most important aspect in AI success is not machinery, but human leadership, behavior, and change.”

By Thomas H. Davenport, Nitin Mittal

Why should I read it?

1 author picked All-in On AI as one of their favorite books, and they share why you should read it.

What is this book about?

A Wall Street Journal bestseller

A Publishers Weekly bestseller

A fascinating look at the trailblazing companies using artificial intelligence to create new competitive advantage, from the author of the business classic, Competing on Analytics, and the head of Deloitte's US AI practice.

Though most organizations are placing modest bets on artificial intelligence, there is a world-class group of companies that are going all-in on the technology and radically transforming their products, processes, strategies, customer relationships, and cultures.

Though these organizations represent less than 1 percent of large companies, they are all high performers in their industries. They have better business…


Book cover of Competing on Analytics: The New Science of Winning

Jeremy Adamson Author Of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list for data science and analytics leaders.

Jeremy Adamson Why did Jeremy love this book?

This is a foundational book on analytics and data science as a business function and helped to shape the development of the practice. It provides a view of the discipline through a business lens and avoids deep technical examinations. Though much has changed in the 15 years since it was originally published, it is still essential reading for a leader in the field. No book since has captured as well the competitive differentiation that analytics provides.

By Thomas H. Davenport, Jeanne G. Harris

Why should I read it?

1 author picked Competing on Analytics as one of their favorite books, and they share why you should read it.

What is this book about?

You have more information at hand about your business environment than ever before. But are you using it to "out-think" your rivals? If not, you may be missing out on a potent competitive tool. In Competing on Analytics: The New Science of Winning, Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon? Analytics: sophisticated quantitative and statistical analysis and predictive modeling. Exemplars of analytics are using new…


Book cover of The Black Swan

Jason McGathey Author Of Riots Of Passage

From Jason's 3 favorite reads in 2023.

Why am I passionate about this?

Author, randomness enthusiast, endlessly energetic, outsider artist, autodidact

Jason's 3 favorite reads in 2023

Jason McGathey Why did Jason love this book?

I've had this on my "want to read" list for eons but finally got around to it. I found this a highly informative book that never really gets bogged down in a bunch of math or other fine details.

In fact, that's kind of the whole point of it. His writing style is also quite hilarious, and I suspect we might hit it off if I were ever fortunate enough to meet the guy. A lot of this stuff sounds like something I would say, though, of course, he thought of it first and went the distance chasing down these thoughts. But yeah, I totally get where he's coming from.

The basic premise is that all these charts, reams of data, prediction models, et cetera are completely useless because one fluke hugely unexpected event arrives (it pretty much always arrives) and totally annihilates any notion of an "average" or anyone…

By Nassim Nicholas Taleb

Why should I read it?

7 authors picked The Black Swan as one of their favorite books, and they share why you should read it.

What is this book about?

The most influential book of the past seventy-five years: a groundbreaking exploration of everything we know about what we don’t know, now with a new section called “On Robustness and Fragility.”

A black swan is a highly improbable event with three principal characteristics: It is unpredictable; it carries a massive impact; and, after the fact, we concoct an explanation that makes it appear less random, and more predictable, than it was. The astonishing success of Google was a black swan; so was 9/11. For Nassim Nicholas Taleb, black swans underlie almost everything about our world, from the rise of religions…


Book cover of Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools

Naomi R. Ceder Author Of The Quick Python Book

From my list on leveling up your Python skills.

Why am I passionate about this?

I’ve been teaching and writing Python code (and managing others while they write Python code) for over 20 years. After all that time Python is still my tool of choice, and many times Python is the key part of how I explore and think about problems. My experience as a teacher also has prompted me to dig in and look for the simplest way of understanding and explaining the elegant way that Python features fit together. 

Naomi's book list on leveling up your Python skills

Naomi R. Ceder Why did Naomi love this book?

I like this book not just because it’s a complete guide to the many ins and outs of data cleaning with Python, but also because David lays out the types of problems and the issues behind them. There are always trade-offs in data cleaning, and this book lays out those trade-offs better than any other I’ve seen. This is one of the few books where, as I go through it, I struggle to think of anything that could have been said better.

By David Mertz

Why should I read it?

1 author picked Cleaning Data for Effective Data Science as one of their favorite books, and they share why you should read it.

What is this book about?

Think about your data intelligently and ask the right questions

Key Features
Master data cleaning techniques necessary to perform real-world data science and machine learning tasks
Spot common problems with dirty data and develop flexible solutions from first principles
Test and refine your newly acquired skills through detailed exercises at the end of each chapter

Book Description

Data cleaning is the all-important first step to successful data science, data analysis, and machine learning. If you work with any kind of data, this book is your go-to resource, arming you with the insights and heuristics experienced data scientists had to learn the…


Book cover of People Skills for Analytical Thinkers

Jeremy Adamson Author Of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list for data science and analytics leaders.

Jeremy Adamson Why did Jeremy love this book?

Since data science is, at its core, people helping people make decisions, it is essential that we can establish productive relationships with our stakeholders. This is a skill that needs to be given the same level of effort as we give to coding or statistics. Gilbert’s book is a great resource to help technically oriented people to advance their people skills.

By Gilbert Eijkelenboom

Why should I read it?

1 author picked People Skills for Analytical Thinkers as one of their favorite books, and they share why you should read it.

What is this book about?

"For the engineer, scientist, or technology professional seeking to communicate better in the business world, this is the book you've been craving your entire career!"
— Douglas Laney, Innovation Fellow, West Monroe, and best-selling author of "Infonomics"

Your analytical skills are incredibly valuable. However, rational thinking alone isn’t enough.

Have you ever: Presented an idea, but then no one seemed to care? Explained your analysis, only to leave your colleague confused? Struggled to work with people who are less analytical and more emotional?

In these situations, people skills make the difference, and research shows these skills are becoming increasingly…