18.5.15

TOC of Learning Spark

Preface | 5
Audience | 5
How This Book is Organized | 6
Supporting Books | 6
Code Examples | 7
Early Release Status and Feedback | 7

Chapter 1. Introduction to Data Analysis with Spark | 8
What is Apache Spark? | 8
A Unified Stack | 8
Who Uses Spark, and For What? | 11
A Brief History of Spark | 13
Spark Versions and Releases | 13
Spark and Hadoop | 14

Chapter 2. Downloading and Getting Started | 15
Downloading Spark | 15
Introduction to Spark's Python and Scala Shells | 16
Introduction to Core Spark Concepts | 20
Standalone Applications | 23
Conclusion | 25

Chapter 3. Programming with RDDs | 26
RDD Basics | 26
Creating RDDs | 28
RDD Operations | 28
Passing Functions to Spark | 32
Common Transformations and Actions | 36
Persistence (Caching) | 46
Conclusion | 48

Chapter 4. Working with Key-Value Pairs | 49
Motivation | 49
Creating Pair RDDs | 49
Transformations on Pair RDDs | 50
Actions Available on Pair RDDs | 60
Data Partitioning | 61
Conclusion | 70

Chapter 5. Loading and Saving Your Data | 71
Motivation | 71
Choosing a Format | 71
Formats | 72
File Systems | 88
Compression | 89
Databases | 91
Conclusion | 93

About the Authors | 95
