100 handpicked books like Database Internals (picked by fans)

Designing Data-Intensive Applications

From my list on practical, hands-on books on DevOps and software delivery.

By Yevgeniy Brikman Author

Why did Yevgeniy love this book?

This is the best overview of data storage and distributed systems—two key concepts for building almost any piece of software today—that I've seen anywhere. Martin does a wonderful job of taking a massive body of research and distilling complicated concepts and difficult trade-offs down to a level anyone can understand.

I learned a lot about replication, partitioning, linearizability, locking, write skew, phantoms, transactions, event logs, and more. I'm also a big fan of the final chapter, The Future of Data Systems, which covers ideas such as "unbundling the database", end-to-end event streams, and an important discussion on ethics in programming and data systems.

Designing Data-Intensive Applications

By Martin Kleppmann

Why should I read it?

2 authors picked Designing Data-Intensive Applications as one of their favorite books, and they share why you should read it.

What is this book about?

Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords? In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain…

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow 3e

By Aurélien Géron

From my list on big data processing ecosystem.

By Tomasz Lelek Author

Why did Tomasz love this book?

The Hands-on Machine Learning book presents an end-to-end approach to many problems that can be solved with machine learning.

Every concept and topic is backed up with running code that you can experiment with and adapt to your real-world problems.

Thanks to this book, you will be able to understand the state of the art of today's machine learning and feel comfortable using the most up-to-date ML methods.

Why should I read it?

1 author picked Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow 3e as one of their favorite books, and they share why you should read it.

What is this book about?

Through a recent series of breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This best-selling book uses concrete examples, minimal theory, and production-ready Python frameworks--scikit-learn, Keras, and TensorFlow--to help you gain an intuitive understanding of the concepts and tools for building intelligent systems.

With this updated third edition, author Aurélien Géron explores a range of techniques, starting with simple linear regression and progressing to deep neural networks. Numerous code examples and exercises throughout…

Kafka

By Neha Narkhede, Gwen Shapira, Todd Palino

From my list on big data processing ecosystem.

By Tomasz Lelek Author

Why did Tomasz love this book?

Apache Kafka is the backbone of almost every streaming-based system today.

The solutions created and implemented in Kafka are the key concepts in every streaming system that you will work with.

This book will help you fully understand Kafka's architecture, internals, and APIs, and become an expert in the technology.

Why should I read it?

1 author picked Kafka as one of their favorite books, and they share why you should read it.

What is this book about?

Every enterprise application creates data, whether it's log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you're an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds.

Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you'll learn Kafka's…

Advanced Analytics with Spark

By Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills

From my list on big data processing ecosystem.

By Tomasz Lelek Author

Why did Tomasz love this book?

Apache Spark has a very high barrier to entry for newcomers to the Big Data ecosystem.

However, it is a key tool that almost everyone uses for distributed processing. I recommend that everyone read this book before delving into production solutions based on Apache Spark.

This book will help you alleviate many Spark problems, such as serialization, memory utilization, and parallelization of processing.

Why should I read it?

1 author picked Advanced Analytics with Spark as one of their favorite books, and they share why you should read it.

What is this book about?

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques (classification, collaborative filtering, and anomaly detection, among others) to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find these patterns useful for…

Kafka in Action

By Dylan Scott, Viktor Gamov, Dave Klein

From my list on mastering Java and Spring-based microservices.

By Magnus Larsson Author

Why did Magnus love this book?

Apache Kafka is the industry standard for real-time event streaming, an essential component for large-scale, high-performance microservice ecosystems.

Although I was new to Kafka when I read this book, it quickly brought me up to speed on the key concepts that underpin its scalability and real-time capabilities, such as the commit log, topic partitions, and consumer groups. The book also introduces other critical Kafka features like the schema registry, Kafka Connect, and stream processing with Kafka Streams and ksqlDB. The practical examples were straightforward to apply and adapt to my own use cases.

Why should I read it?

1 author picked Kafka in Action as one of their favorite books, and they share why you should read it.

What is this book about?

Kafka in Action is a practical, hands-on guide to building Kafka-based data pipelines. Filled with real-world use cases and scenarios, this book probes Kafka's most common use cases, ranging from simple logging through managing streaming data systems for message routing, analytics, and more.

In systems that handle big data, streaming data, or fast data, it's important to get your data pipelines right. Apache Kafka is a wicked-fast distributed streaming platform that operates as more than just a persistent log or a flexible message queue.

Key Features

* Understanding Kafka's concepts

* Implementing Kafka as a message queue

* Setting up…

Master Your Data with Power Query in Excel and Power BI

By Ken Puls, Miguel Escobar

From my list on going from Excel to Power Query and Power BI.

By Bill Jelen Author

Why did Bill love this book?

Microsoft quietly slipped the Get & Transform tools onto the Data tab in Excel in 2016. These tools are incredibly powerful – you clean your data once and Excel will remember how to clean your data every month, every week, every day, every hour. Ken Puls and Miguel Escobar will show you all of the best tricks for using these tools.

Why should I read it?

1 author picked Master Your Data with Power Query in Excel and Power BI as one of their favorite books, and they share why you should read it.

What is this book about?

Power Query is the amazing new data cleansing tool in both Excel and Power BI Desktop. Do you find yourself performing the same data cleansing steps day after day? Power Query will make it faster to clean your data the first time. While Power Query is powerful, the interface is subtle—there are tools hiding in plain sight that are easy to miss. Go beyond the obvious and take Power Query to new levels with this book.

R for Data Science

By Hadley Wickham, Garrett Grolemund

From my list on intro to programming and data science with R.

By Tilman M. Davies Author

Why did Tilman love this book?

For those intending to use R with an eye on the popular 'Tidyverse' suite of packages – which facilitate the handling, manipulation, and visualisation of data sets – it's hard to go past this book. From the founding contributors of the RStudio/Tidyverse worlds, this is a great way to learn about this dialect of R against the overarching backdrop of statistical data analysis and data science.

Why should I read it?

1 author picked R for Data Science as one of their favorite books, and they share why you should read it.

What is this book about?

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along…

Information is Beautiful

By David McCandless

From my list on what big data is and how it impacts us.

By Roger Highfield Author

Why did Roger love this book?

Big data can be beautiful, and visualisations make for a wonderful coffee-table book. In Information is Beautiful, David McCandless turns dry-as-dust data into pop art to show the kind of world we live in, linking politics to life expectancy, women’s education to GDP growth, and more. Through colourful graphics, we get vivid and novel perspectives on current obsessions, from maps of cliches to the most fashionable colours. The book is a testament to how the power of big data comes from being able to distill information to reveal hidden patterns and discern trends.

Why should I read it?

1 author picked Information is Beautiful as one of their favorite books, and they share why you should read it.

What is this book about?

A visual guide to the way the world really works

Every day, every hour, every minute we are bombarded by information - from television, from newspapers, from the internet, we're steeped in it, maybe even lost in it. We need a new way to relate to it, to discover the beauty and the fun of information for information's sake.

No dry facts, theories or statistics. Instead, Information is Beautiful contains visually stunning displays of information that blend the facts with their connections, their context and their relationships - making information meaningful, entertaining and beautiful.

This is information like you have…

An Ugly Truth

By Sheera Frenkel, Cecilia Kang

From my list on what big data is and how it impacts us.

By Roger Highfield Author

Why did Roger love this book?

‘They trust me… dumb f*cks.’ This telling remark from the Harvard days of Facebook co-founder and CEO Mark Zuckerberg appears in An Ugly Truth, which shines a harsh light on the tech behemoth that, ultimately, is built on the data of billions of people. As Meta, Zuckerberg’s new business incarnation, wafts into the virtual worlds of the metaverse, the story of Facebook is far from over, which makes this engaging book a tad unsatisfying. Nonetheless, it is a vivid example of how with Big Data comes Big Responsibility.

Why should I read it?

1 author picked An Ugly Truth as one of their favorite books, and they share why you should read it.

What is this book about?

'An explosive new book' Daily Mail

'[A] careful, comprehensive interrogation of every major Facebook scandal. An Ugly Truth provides the kind of satisfaction you might get if you hired a private investigator to track a cheating spouse: it confirms your worst suspicions and then gives you all the dates and details you need to cut through the company's spin' New York Times

Award-winning New York Times reporters Sheera Frenkel and Cecilia Kang unveil the tech story of our times in this riveting, behind-the-scenes expose that offers the definitive account of Facebook's fall from grace. Once one of Silicon Valley's…

Forewarned

By Paul Goodwin

From my list on getting an insight into forecasting.

By David F. Hendry Author

Why did David love this book?

When can we trust a forecast? Given how often forecasts end up being very wide of the mark, a degree of scepticism might well be warranted. Paul Goodwin provides an entertaining account of forecasting, arguing that intuition may serve us well in some settings, but that computer-based analysis of big data might be expected to prevail in others.

Why should I read it?

1 author picked Forewarned as one of their favorite books, and they share why you should read it.

What is this book about?

Whether it's an unforeseen financial crash, a shock election result or a washout summer that threatens to ruin a holiday in the sun, forecasts are part and parcel of our everyday lives. We rely wholeheartedly on them, and become outraged when things don't go exactly to plan.

But should we really put so much trust in predictions? Perhaps gut instincts can trump years of methodically compiled expert knowledge? And when exactly is a forecast not a forecast? Forewarned will answer all of these intriguing questions, and many more.

Packed with fun anecdotes and startling facts, Forewarned is a myth-busting guide…
