66 books like Kafka

By Neha Narkhede, Gwen Shapira, Todd Palino

Here are 66 books that Kafka fans have personally recommended if you like Kafka. Shepherd is a community of 10,000+ authors and super readers sharing their favorite books with the world.

Shepherd is reader supported. When you buy books, we may earn an affiliate commission.

Book cover of Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

Tomasz Lelek Author Of Software Mistakes and Tradeoffs: How to make good programming decisions

From my list on big data processing ecosystem.

Why am I passionate about this?

I am motivated by working on products that many people use. I've been a part of companies that deliver products impacting millions of people. To achieve it, I am working in the Big Data ecosystem and striving to simplify it by contributing to Dremio's Data LakeHouse solution. I worked on projects using Spark, HDFS, Cassandra, and Kafka technologies. I have been working in the software engineering industry for ten years now, and I've tried to share my experience and lessons learned in the Software Mistakes and Tradeoffs book, hoping that it will allow current and the next generation of engineers to create better software, leading to more happy users.

Tomasz's book list on big data processing ecosystem

Tomasz Lelek Why did Tomasz love this book?

Designing Data-Intensive Applications is the best book if you want to learn about the main principles behind every system that is able to store and process big amounts of data.

You'll learn about distributed storage systems, their tradeoffs (availability, consistency, fault-tolerance), streaming processing systems, and main algorithms.

Those are the critical concepts behind almost every successful company that needs to create scalable solutions. 

By Martin Kleppmann,

Why should I read it?

1 author picked Designing Data-Intensive Applications as one of their favorite books, and they share why you should read it.

What is this book about?

Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords? In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain…


Book cover of Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow 3e: Concepts, Tools, and Techniques to Build Intelligent Systems

Tomasz Lelek Author Of Software Mistakes and Tradeoffs: How to make good programming decisions

From my list on big data processing ecosystem.

Why am I passionate about this?

I am motivated by working on products that many people use. I've been a part of companies that deliver products impacting millions of people. To achieve it, I am working in the Big Data ecosystem and striving to simplify it by contributing to Dremio's Data LakeHouse solution. I worked on projects using Spark, HDFS, Cassandra, and Kafka technologies. I have been working in the software engineering industry for ten years now, and I've tried to share my experience and lessons learned in the Software Mistakes and Tradeoffs book, hoping that it will allow current and the next generation of engineers to create better software, leading to more happy users.

Tomasz's book list on big data processing ecosystem

Tomasz Lelek Why did Tomasz love this book?

The Hands-on Machine Learning book presents an end-to-end approach to many problems that can be solved with machine learning.

Every concept and topic is backed up with a running code that you can experiment with and adapt to your real-world problems.

Thanks to this book, you will be able to understand the state of the art of today's machine learning and feel comfortable using the most up-to-date ML methods.

By Géron Aurélien,

Why should I read it?

1 author picked Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow 3e as one of their favorite books, and they share why you should read it.

What is this book about?

Through a recent series of breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This best-selling book uses concrete examples, minimal theory, and production-ready Python frameworks--scikit-learn, Keras, and TensorFlow--to help you gain an intuitive understanding of the concepts and tools for building intelligent systems.

With this updated third edition, author Aurelien Geron explores a range of techniques, starting with simple linear regression and progressing to deep neural networks. Numerous code examples and exercises throughout…


Book cover of Advanced Analytics with Spark: Patterns for Learning from Data at Scale

Tomasz Lelek Author Of Software Mistakes and Tradeoffs: How to make good programming decisions

From my list on big data processing ecosystem.

Why am I passionate about this?

I am motivated by working on products that many people use. I've been a part of companies that deliver products impacting millions of people. To achieve it, I am working in the Big Data ecosystem and striving to simplify it by contributing to Dremio's Data LakeHouse solution. I worked on projects using Spark, HDFS, Cassandra, and Kafka technologies. I have been working in the software engineering industry for ten years now, and I've tried to share my experience and lessons learned in the Software Mistakes and Tradeoffs book, hoping that it will allow current and the next generation of engineers to create better software, leading to more happy users.

Tomasz's book list on big data processing ecosystem

Tomasz Lelek Why did Tomasz love this book?

Apache Spark has a very high point of entry for newcomers to the Big Data ecosystem.

However, it is a key tool that almost everyone is using for running distributed processing. I recommend everyone to read this book before delving into production solutions based on Apache Spark.

This book will allow you to alleviate many spark problems, such as serialization, memory utilization, and parallelization of processing.

By Sandy Ryza, Uri Laserson, Sean Owen , Josh Wills

Why should I read it?

1 author picked Advanced Analytics with Spark as one of their favorite books, and they share why you should read it.

What is this book about?

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques-classification, collaborative filtering, and anomaly detection among others-to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find these patterns useful for…


Book cover of Database Internals: A Deep-Dive Into How Distributed Data Systems Work

Tomasz Lelek Author Of Software Mistakes and Tradeoffs: How to make good programming decisions

From my list on big data processing ecosystem.

Why am I passionate about this?

I am motivated by working on products that many people use. I've been a part of companies that deliver products impacting millions of people. To achieve it, I am working in the Big Data ecosystem and striving to simplify it by contributing to Dremio's Data LakeHouse solution. I worked on projects using Spark, HDFS, Cassandra, and Kafka technologies. I have been working in the software engineering industry for ten years now, and I've tried to share my experience and lessons learned in the Software Mistakes and Tradeoffs book, hoping that it will allow current and the next generation of engineers to create better software, leading to more happy users.

Tomasz's book list on big data processing ecosystem

Tomasz Lelek Why did Tomasz love this book?

The Database Internals will allow you to go one step further in your understanding of how distributed databases work.

The author has a lot of experience with one of the most successful distributed databases - Apache Cassandra and shares his knowledge about low-level details and internals of distributed databases.

By Alex Petrov,

Why should I read it?

1 author picked Database Internals as one of their favorite books, and they share why you should read it.

What is this book about?

When it comes to choosing, using, and maintaining a database, understanding its internals is essential. But with so many distributed databases and tools available today, it's often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internals.

Throughout the book, you'll explore relevant material gleaned from numerous books, papers, blog posts, and the source code of several open source databases. These resources are listed at the end of parts one and two. You'll discover that the most significant distinctions among many…


Book cover of Predict and Surveil: Data, Discretion, and the Future of Policing

Luke Hunt Author Of Police Deception and Dishonesty: The Logic of Lying

From my list on the cluster-f*ck we call policing.

Why am I passionate about this?

I’m an Associate Professor in the University of Alabama’s Department of Philosophy. I worked as an FBI Special Agent before making the natural transition to academic philosophy. Being a professor was always a close second to Quantico, but that scene in Point Break in which Keanu Reeves and Patrick Swayze fight Anthony Kiedis on the beach made it seem like the FBI would be more fun than academia. In my current position as a professor at the University of Alabama, I teach in my department’s Jurisprudence Specialization. My primary research interests are at the intersection of philosophy of law, political philosophy, and criminal justice. I’ve written three books on policing.

Luke's book list on the cluster-f*ck we call policing

Luke Hunt Why did Luke love this book?

I love this book because it reminds us of the many ways that technology can affect justice.

It is tempting to think sophisticated tactics such as “predictive policing” can solve all problems relating to human bias. However, Brayne shows that data and algorithms do not eliminate bias and discretion. Instead, high-tech police tools simply make bias less overt and visible, which erodes the public’s ability to hold the police accountable.

I especially enjoyed how the book flips the script, considering diverse ways to use these tools to help the public. For example, how can municipalities use technology to analyze the underlying factors that contribute to policing problems in the first place?

By Sarah Brayne,

Why should I read it?

1 author picked Predict and Surveil as one of their favorite books, and they share why you should read it.

What is this book about?

The scope of criminal justice surveillance, from the police to the prisons, has expanded rapidly in recent decades. At the same time, the use of big data has spread across a range of fields, including finance, politics, health, and marketing. While law enforcement's use of big data is hotly contested, very little is known about how the police actually use it in daily operations and with what consequences.

In Predict and Surveil, Sarah Brayne offers an unprecedented, inside look at how police use big data and new surveillance technologies, leveraging on-the-ground fieldwork with one of the most technologically advanced law…


Book cover of Privacy Is Power: Why and How You Should Take Back Control of Your Data

Susie Alegre Author Of Freedom to Think: Protecting a Fundamental Human Right in the Digital Age

From my list on how technology affects your human rights.

Why am I passionate about this?

I’ve always been passionate about social justice as a writer and as an international human rights lawyer. I had worked on human rights, surveillance, and privacy for decades around the world, but it was when I first read about Cambridge Analytica back in 2017 that it felt personal – privacy is the gateway to our right to freedom of thought and opinion and Big Tech is increasingly acting as the gatekeeper to all our human rights. These books have all helped me to understand what the risks are and how to tackle them.

Susie's book list on how technology affects your human rights

Susie Alegre Why did Susie love this book?

Privacy Is Power gets to the heart of why we should all be worried about encroachments on our privacy. 

Carissa Veliz is a philosopher and a talented writer who brings complex and profound ideas to life on the page. Some writing about technology can feel dry and detached, but Veliz makes you understand viscerally how the impact of technology is a human, not a technological issue. 

By Carissa Veliz,

Why should I read it?

2 authors picked Privacy Is Power as one of their favorite books, and they share why you should read it.

What is this book about?

An Economist BEST BOOK OF THE YEAR

As the data economy grows in power, Carissa Veliz exposes how our privacy is eroded by big tech and governments, why that matters and what we can do about it.

The moment you check your phone in the morning you are giving away your data. Before you've even switched off your alarm, a whole host of organisations have been alerted to when you woke up, where you slept, and with whom. As you check the weather, scroll through your 'suggested friends' on Facebook, you continually compromise your privacy.

Without your permission, or even…


Book cover of Information is Beautiful

Roger Highfield Author Of The Dance of Life: Symmetry, Cells and How We Become Human

From my list on what big data is and how it impacts us.

Why am I passionate about this?

I’m the Science Director of the Science Museum Group, based at the Science Museum in London, and visiting professor at the Dunn School, University of Oxford, and Department of Chemistry, University College London. Every time I write a book I swear that it will be my last and yet I'm now working on my ninth, after earlier forays into the physics of Christmas and the love life of Albert Einstein. Working with Peter Coveney of UCL, we're exploring ideas about computation and complexity we tackled in our two earlier books, along with the revolutionary implications of creating digital twins of people from the colossal amount of patient data now flowing from labs worldwide.

Roger's book list on what big data is and how it impacts us

Roger Highfield Why did Roger love this book?

Big data can be beautiful and visualisations make for a wonderful coffee-table book. In Information is Beautiful, David McCandless turns dry-as-dust data into pop art to show the kind of world we live in, linking politics to life expectancy, women’s education to GDP growth, and more. Through colourful graphics, we get vivid and novel perspectives on current obsessions, from maps of cliches to the most fashionable colours. A testament to how the power of big data comes from being able to distill information to reveal hidden patterns and discern trends. 

By David McCandless,

Why should I read it?

1 author picked Information is Beautiful as one of their favorite books, and they share why you should read it.

What is this book about?

A visual guide to the way the world really works

Every day, every hour, every minute we are bombarded by information - from television, from newspapers, from the internet, we're steeped in it, maybe even lost in it. We need a new way to relate to it, to discover the beauty and the fun of information for information's sake.
No dry facts, theories or statistics. Instead, Information is Beautiful contains visually stunning displays of information that blend the facts with their connections, their context and their relationships - making information meaningful, entertaining and beautiful.
This is information like you have…


Book cover of An Ugly Truth: Inside Facebook's Battle for Domination

Roger Highfield Author Of The Dance of Life: Symmetry, Cells and How We Become Human

From my list on what big data is and how it impacts us.

Why am I passionate about this?

I’m the Science Director of the Science Museum Group, based at the Science Museum in London, and visiting professor at the Dunn School, University of Oxford, and Department of Chemistry, University College London. Every time I write a book I swear that it will be my last and yet I'm now working on my ninth, after earlier forays into the physics of Christmas and the love life of Albert Einstein. Working with Peter Coveney of UCL, we're exploring ideas about computation and complexity we tackled in our two earlier books, along with the revolutionary implications of creating digital twins of people from the colossal amount of patient data now flowing from labs worldwide.

Roger's book list on what big data is and how it impacts us

Roger Highfield Why did Roger love this book?

‘They trust me….dumb f*cks.’ This telling exchange from the Harvard days of Facebook co-founder and CEO, Mark Zuckerberg appears in An Ugly Truth, which shines a harsh light on the tech behemoth that, ultimately, is built on the data of billions of people. As Meta, Zuckerberg’s new business incarnation, wafts into the virtual worlds of the metaverse, the story of Facebook is far from over, which makes this engaging book a tad unsatisfying. Nonetheless, it is a vivid example of how with Big Data comes Big Responsibility.

By Sheera Frenkel, Cecilia Kang,

Why should I read it?

1 author picked An Ugly Truth as one of their favorite books, and they share why you should read it.

What is this book about?

'An explosive new book' Daily Mail

'[A] careful, comprehensive interrogation of every major Facebook scandal. An Ugly Truth provides the kind of satisfaction you might get if you hired a private investigator to track a cheating spouse: it confirms your worst suspicions and then gives you all the dates and details you need to cut through the company's spin' New York Times

__________________________________________

Award-winning New York Times reporters Sheera Frenkel and Cecilia Kang unveil the tech story of our times in this riveting, behind-the-scenes expose that offers the definitive account of Facebook's fall from grace. Once one of Silicon Valley's…


Book cover of Forewarned: A Sceptic's Guide to Prediction

David F. Hendry Author Of Forecasting: An Essential Introduction

From my list on getting an insight into forecasting.

Why am I passionate about this?

Accurate and precise forecasting is essential for successful planning and policy from economics to epidemiology. We have been keen to understand why so many forecasts turn out to be highly inaccurate since making dreadful forecasts ourselves, and advising UK government agencies (Treasury, Parliament, Bank of England) during turbulent periods. As simple extrapolation often beats model-based forecasting, we have been developing improved methods that draw on the best aspects of both, and have published more than 60 articles and 6 books attracting more than 6000 citations by other scholars. Our recommended books cover a wide range of forecasting methods—suggesting there is no optimal way to look into the future.

David's book list on getting an insight into forecasting

David F. Hendry Why did David love this book?

When can we trust a forecast? Given how often forecasts end up being very wide of the mark, a degree of scepticism might well be warranted. Paul Goodwin provides an entertaining account of forecasting, arguing that intuition may serve us well in some settings, but that computer-based analysis of big data might be expected to prevail in others.        

By Paul Goodwin,

Why should I read it?

1 author picked Forewarned as one of their favorite books, and they share why you should read it.

What is this book about?

Whether it's an unforeseen financial crash, a shock election result or a washout summer that threatens to ruin a holiday in the sun, forecasts are part and parcel of our everyday lives. We rely wholeheartedly on them, and become outraged when things don't go exactly to plan.

But should we really put so much trust in predictions? Perhaps gut instincts can trump years of methodically compiled expert knowledge? And when exactly is a forecast not a forecast? Forewarned will answer all of these intriguing questions, and many more.

Packed with fun anecdotes and startling facts, Forewarned is a myth-busting guide…


Book cover of Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition

David Theo Goldberg Author Of The Threat of Race: Reflections on Racial Neoliberalism

From my list on spotlighting race and neoliberalization.

Why am I passionate about this?

I grew up and completed the formative years of my college education in Cape Town, South Africa, while active also in anti-apartheid struggles. My Ph.D. dissertation in the 1980s focused on the elaboration of key racial ideas in the modern history of philosophy. I have published extensively on race and racism in the U.S. and globally, in books, articles, and public media. My interests have especially focused on the transforming logics and expressions of racism over time, and its updating to discipline and constrain its conventional targets anew and new targets more or less conventionally. My interest has always been to understand racism in order to face it down.

David's book list on spotlighting race and neoliberalization

David Theo Goldberg Why did David love this book?

Digital technology, like technology generally, is commonly assumed to be value neutral. Wendy Chun reveals that structurally embedded in digital operating systems and data collection are values that reproduce and extend existing modes of discriminating while also originating new ones. In prompting and promoting the grouping together of people who are alike—in habits, culture, looks, and preferences—the logic of the algorithm reproduces and amplifies discriminatory trends. Chun reveals how the logics of the digital reinforce the restructuring of racism by the neoliberal turn that my own book lays out.

By Wendy Hui Kyong Chun, Alex Barnett (illustrator),

Why should I read it?

1 author picked Discriminating Data as one of their favorite books, and they share why you should read it.

What is this book about?

How big data and machine learning encode discrimination and create agitated clusters of comforting rage.

In Discriminating Data, Wendy Hui Kyong Chun reveals how polarization is a goal—not an error—within big data and machine learning. These methods, she argues, encode segregation, eugenics, and identity politics through their default assumptions and conditions. Correlation, which grounds big data’s predictive potential, stems from twentieth-century eugenic attempts to “breed” a better future. Recommender systems foster angry clusters of sameness through homophily. Users are “trained” to become authentically predictable via a politics and technology of recognition. Machine learning and data analytics thus seek to disrupt…


5 book lists we think you will like!

Interested in big data, the ecosystem, and data processing?

10,000+ authors have recommended their favorite books and what they love about them. Browse their picks for the best books about big data, the ecosystem, and data processing.

Big Data Explore 29 books about big data
The Ecosystem Explore 20 books about the ecosystem
Data Processing Explore 25 books about data processing