The best books for data scientists trying to be ethical people

Why am I passionate about this?

I studied statistics and data science for years before anyone suggested to me that these topics might have an ethical dimension, or that my numerical tools were products of human beings with motivations specific to their time and place. I’ve since written about the history and philosophy of mathematical probability and statistics, and I’ve come to understand how important that historical background is, and how critical it is that the next generation of data scientists understand where these ideas come from and their potential to do harm. I hope anyone who reads these books avoids being blinkered by the ideas that data equals objectivity and that science is morally neutral.


I wrote...

Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science

By Aubrey Clayton

What is my book about?

There is a logical flaw in the statistical methods used across experimental science. This fault is not a minor academic quibble: it underlies a reproducibility crisis now threatening entire disciplines. In an increasingly statistics-reliant society, this same deeply rooted error shapes decisions in medicine, law, and public policy with profound consequences. The foundation of the problem is a misunderstanding of probability and its role in making inferences from observations.

Aubrey Clayton traces the history of how statistics went astray, beginning with the groundbreaking work of the seventeenth-century mathematician Jacob Bernoulli and winding through gambling, astronomy, and genetics. Clayton recounts the feuds among rival schools of statistics, exploring the surprisingly human problems that gave rise to the discipline and the all-too-human shortcomings that derailed it. 

The books I picked & why

Social Sciences as Sorcery

Why did I love this book?

This book is now 50 years old, but its message is as relevant and important now as when it was written. In a series of witty essays that border on rants, Andreski attacks much of social science as fluff obscured by technical jargon and methodology. In particular, he laments the growth of quantitative methods as an attempt to add objectivity to social science and make it appear “harder.” True objectivity is about more than mechanical number-crunching, he says; it’s about a commitment to fairness and resisting the temptations of wishful thinking – a challenge anyone who works with data concerning people and their lives should take seriously.

By Stanislav Andreski

What is this book about?

"Seldom have the social sciences been subject to quite so comprehensive, yet non-partisan, attack. There can be little doubt SOCIAL SCIENCES AS SORCERY is an uncomfortably important and embarrassingly comprehensive book." -- Times Literary Supplement

"Liberating!" -- Harpers

"Andreski has written a new book that is certain to enrage his colleagues ... He documents his charges and spares few of the luminaries of social science in the process." -- TIME Magazine


Biology as Ideology: The Doctrine of DNA

Why did I love this book?

People need less Dawkins in their lives and more Lewontin, whose thought-provoking, accessible writing about evolutionary biology stands in fierce opposition to the trend toward genetic determinism that seems to be all the rage nowadays. We are not simply our genes, Lewontin says, because the effects DNA has on our lives are mediated by social and environmental factors, many of which we can influence. While it’s nominally about biology, I also read this as a critique of causal inference in general. What we choose to call a “cause” reveals which aspects of the world we are ideologically committed to maintaining, and we should be careful about the causal lessons we draw from data.

By Richard C. Lewontin

What is this book about?

Following in the fashion of Stephen Jay Gould and Peter Medawar, one of the world's leading scientists examines how "pure science" is in fact shaped and guided by social and political needs and assumptions.


The Golem: What You Should Know about Science

Why did I love this book?

The thing you should know about science is that it’s a human enterprise. As a result, it’s dependent on human factors like social consensus and prejudice. In this series of case studies of famously expensive and difficult-to-replicate experiments probing the limits of scientific understanding from biology to theoretical physics, Collins and Pinch show how scientific knowledge gathering is rarely straightforward because there are always alternative explanations available for the data. Was the phenomenon real or was the experiment set up badly? We can never know for sure, but we decide collectively what we believe. Scientists are experts participating in human culture, they argue, not mysterious clergy issuing declarations of absolute truth.

By Harry M. Collins, Trevor Pinch

What is this book about?

Harry Collins and Trevor Pinch liken science to the Golem, a creature from Jewish mythology, powerful yet potentially dangerous, a gentle, helpful creature that may yet run amok at any moment. Through a series of intriguing case studies the authors debunk the traditional view that science is the straightforward result of competent theorisation, observation and experimentation. The very well-received first edition generated much debate, reflected in a substantial new Afterword in this second edition, which seeks to place the book in what have become known as 'the science wars'.


Superior: The Return of Race Science

Why did I love this book?

The fact that race is a social construct and not a biological reality seems to be a lesson that we are destined to learn and re-learn many times. Saini uses a personal, journalistic style to tell the story of the pernicious myth of biological race in the sciences, drawing a continuous line from scientific racists like Francis Galton in the 1800s to present-day medicine and right-wing politics. The story is alternately funny and horrifying, with incredibly timely significance. It should be read by all data-adjacent individuals as a cautionary tale about avoiding the mistakes of the past and present. 

By Angela Saini

What is this book about?

Financial Times Book of the Year
Telegraph Top 50 Books of the Year
Guardian Book of the Year
New Statesman Book of the Year

'Roundly debunks racism's core lie - that inequality is to do with genetics, rather than political power' Reni Eddo-Lodge

Where did the idea of race come from, and what does it mean? In an age of identity politics, DNA ancestry testing and the rise of the far-right, a belief in biological differences between populations is experiencing a resurgence. The truth is: race is a social construct. Our problem is we find this hard to believe.

In…


Data Feminism

Why did I love this book?

If you’ve never thought of “intersectional feminism” or “the gender binary” as essentially data-scientific terms, please allow this book to correct that. Data science is a locus of power, and that power can be wielded in the service of oppression or liberation. This book raises essential questions about the predominantly white, male, technocratic interests served by the traditional narratives of data analysis and what feminism and data science have to offer each other. Bottom line: the data doesn’t speak for itself, never has, and never will.

By Catherine D'Ignazio, Lauren F. Klein

What is this book about?

A new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism.

Today, data science is a form of power. It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic. In…

