Data science collection books

The data science books collected here are all free. However, I have also listed a few optional books that provide additional context for those who are interested.

Introduction to Statistical Learning (Free online PDF) This book is a great reference for the machine learning material and some of the statistics material in the class.

Data Science from Scratch (Available as eBook for Berkeley students) This more applied book covers many of the topics in this class using Python but doesn’t go into sufficient depth for some of the more mathematical material.

Doing Data Science (Available as eBook for Berkeley students) This book provides a unique case-study view of data science but uses R and not Python.

Python for Data Analysis (Available as eBook for Berkeley students). This book provides a good reference for the Pandas library.

Storytelling With Data: A Data Visualization Guide for Business Professionals

“Storytelling With Data” is designed to help readers build effective data-driven narratives. Cole Nussbaumer Knaflic, the author of the popular blog, explains approaches to removing unnecessary data that obscures clear communication, condensing complicated information into a concise summary, and using design principles to create impactful data visualizations.


R for Data Science: A great resource by Hadley Wickham, Chief Scientist at RStudio.


Visualization:

ggplot2

D3.js – Data-Driven Documents

Python plotting – Matplotlib 2.1.2 documentation

Related to Statistics (classic statistical learning books):

Introduction to Statistical Learning

The Elements of Statistical Learning

Related to Information Retrieval:

Introduction to Information Retrieval: A very important book for understanding web crawling, data collection, data storage, feature engineering, and text analysis.

Mining of Massive Datasets: If you want to be a data scientist, you will run into problems of computation. This is a very important resource for understanding how to process massive datasets.


To Learn Data Mining and Machine Learning Techniques:

Data Mining: Concepts and Techniques

Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management

Data Mining: Practical Machine Learning Tools and Techniques

Moving forward to Machine Learning (ML) and Deep Learning (DL):

ML and DL:

Neural networks and deep learning

Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Java-based DL: Deeplearning4j Documentation

TensorFlow: Tutorials | TensorFlow

Interviews with Data Scientists

The Data Analytics Handbook (download)


Build a Data Science Team

Data Driven: Creating a Data Culture by Hilary Mason and DJ Patil (free)

Understanding the Chief Data Officer (free)

Building Data Science Teams by DJ Patil (free)
