Enterprise Data Workflows with Cascading, 1st Edition, by Paco Nathan
Product details:
ISBN 10: 1449358721
ISBN 13: 9781449358723
Author: Paco Nathan
Table of contents:
Chapter 1. Getting Started
Programming Environment Setup
Example 1: Simplest Possible App in Cascading
Build and Run
Cascading Taxonomy
Example 2: The Ubiquitous Word Count
Flow Diagrams
Predictability at Scale
Chapter 2. Extending Pipe Assemblies
Example 3: Customized Operations
Scrubbing Tokens
Example 4: Replicated Joins
Stop Words and Replicated Joins
Comparing with Apache Pig
Comparing with Apache Hive
Chapter 3. Test-Driven Development
Example 5: TF-IDF Implementation
Example 6: TF-IDF with Testing
A Word or Two About Testing
Chapter 4. Scalding—A Scala DSL for Cascading
Why Use Scalding?
Getting Started with Scalding
Example 3 in Scalding: Word Count with Customized Operations
A Word or Two about Functional Programming
Example 4 in Scalding: Replicated Joins
Build Scalding Apps with Gradle
Running on Amazon AWS
Chapter 5. Cascalog—A Clojure DSL for Cascading
Why Use Cascalog?
Getting Started with Cascalog
Example 1 in Cascalog: Simplest Possible App
Example 4 in Cascalog: Replicated Joins
Example 6 in Cascalog: TF-IDF with Testing
Cascalog Technology and Uses
Chapter 6. Beyond MapReduce
Applications and Organizations
Lingual, a DSL for ANSI SQL
Using the SQL Command Shell
Using the JDBC Driver
Integrating with Desktop Tools
Pattern, a DSL for Predictive Model Markup Language
Getting Started with Pattern
Predefined App for PMML
Integrating Pattern into Cascading Apps
Customer Experiments
Technology Roadmap for Pattern
Chapter 7. The Workflow Abstraction
Key Insights
Pattern Language
Literate Programming
Separation of Concerns
Functional Relational Programming
Enterprise vs. Start-Ups
Chapter 8. Case Study: City of Palo Alto Open Data
Why Open Data?
City of Palo Alto
Moving from Raw Sources to Data Products
Calibrating Metrics for the Recommender
Spatial Indexing
Personalization
Recommendations
Build and Run
Key Points of the Recommender Workflow
Appendix A. Troubleshooting Workflows
Build and Runtime Problems
Anti-Patterns
Workflow Bottlenecks
Other Resources
Tags: Paco Nathan, Enterprise Data, Cascading



