Predictive Analytics and Data Mining

Concepts and Practice with RapidMiner

1st Edition - November 27, 2014
Authors: Vijay Kotu, Bala Deshpande
Language: English
Paperback ISBN: 978-0-12-801460-8
eBook ISBN: 978-0-12-801650-3

Put Predictive Analytics into Action

Learn the basics of Predictive Analysis and Data Mining through an easy to understand conceptual framework and immediately practice the concepts learned using the open source RapidMiner tool. Whether you are brand new to Data Mining or working on your tenth project, this book will show you how to analyze data, uncover hidden patterns and relationships to aid important decisions and predictions. Data Mining has become an essential tool for any enterprise that collects, stores and processes data as part of its operations. This book is ideal for business users, data analysts, business analysts, business intelligence and data warehousing professionals and for anyone who wants to learn Data Mining.

You’ll be able to:

1. Gain the necessary knowledge of different data mining techniques, so that you can select the right technique for a given data problem and create a general purpose analytics process.
2. Get up and running fast with more than two dozen commonly used powerful algorithms for predictive analytics using practical use cases.
3. Implement a simple step-by-step process for predicting an outcome or discovering hidden relationships from the data using RapidMiner, an open source GUI based data mining tool.

Predictive Analytics and Data Mining techniques covered: Exploratory Data Analysis, Visualization, Decision Trees, Rule Induction, k-Nearest Neighbors, Naïve Bayesian, Artificial Neural Networks, Support Vector Machines, Ensemble Models, Bagging, Boosting, Random Forests, Linear Regression, Logistic Regression, Association Analysis using Apriori and FP-Growth, k-Means Clustering, Density-Based Clustering, Self-Organizing Maps, Text Mining, Time Series Forecasting, Anomaly Detection and Feature Selection. Implementation files can be downloaded from the book companion site at www.LearnPredictiveAnalytics.com

Dedication
Foreword
Preface
Acknowledgments
Chapter 1. Introduction
- 1.1. What Data Mining Is
- 1.2. What Data Mining is Not
- 1.3. The Case for Data Mining
- 1.4. Types of Data Mining
- 1.5. Data Mining Algorithms
- 1.6. Roadmap for Upcoming Chapters
Chapter 2. Data Mining Process
- 2.1. Prior Knowledge
- 2.2. Data Preparation
- 2.3. Modeling
- 2.4. Application
- 2.5. Knowledge
- What’s Next?
Chapter 3. Data Exploration
- 3.1. Objectives of Data Exploration
- 3.2. Data Sets
- 3.3. Descriptive Statistics
- 3.4. Data Visualization
- 3.5. Roadmap for Data Exploration
Chapter 4. Classification
- 4.1. Decision Trees
- 4.2. Rule Induction
- 4.3. k-Nearest Neighbors
- 4.4. Naïve Bayesian
- 4.5. Artificial Neural Networks
- 4.6. Support Vector Machines
- 4.7. Ensemble Learners
Chapter 5. Regression Methods
- 5.1. Linear Regression
- 5.2. Logistic Regression
- Conclusion
Chapter 6. Association Analysis
- 6.1. Concepts of Mining Association Rules
- 6.2. Apriori Algorithm
- 6.3. FP-Growth Algorithm
- Conclusion
Chapter 7. Clustering
- Clustering to Describe the Data
- Clustering for Preprocessing
- 7.1. Types of Clustering Techniques
- 7.2. k-Means Clustering
- 7.3. DBSCAN Clustering
Chapter 8. Model Evaluation
- 8.1. Confusion Matrix (or Truth Table)
- 8.2. Receiver Operator Characteristic (ROC) Curves and Area under the Curve (AUC)
- 8.3. Lift Curves
- 8.4. Evaluating The Predictions: Implementation
- Conclusion
Chapter 9. Text Mining
- 9.1. How Text Mining Works
- 9.2. Implementing Text Mining with Clustering and Classification
- Conclusion
Chapter 10. Time Series Forecasting
- 10.1. Data-Driven Approaches
- 10.2. Model-Driven Forecasting Methods
- Conclusion
Chapter 11. Anomaly Detection
- 11.1. Anomaly Detection Concepts
- 11.2. Distance-Based Outlier Detection
- 11.3. Density-Based Outlier Detection
- 11.4. Local Outlier Factor
- Conclusion
Chapter 12. Feature Selection
- 12.1. Classifying Feature Selection Methods
- 12.2. Principal Component Analysis
- 12.3. Information Theory–Based Filtering for Numeric Data
- 12.4. Chi-Square-Based Filtering for Categorical Data
- 12.5. Wrapper-Type Feature Selection
- Conclusion
Chapter 13. Getting Started with RapidMiner
- 13.1. User Interface and Terminology
- 13.2. Data Importing and Exporting Tools
- 13.3. Data Visualization Tools
- 13.4. Data Transformation Tools
- 13.5. Sampling and Missing Value Tools
- 13.6. Optimization Tools
- Conclusion
Comparison of Data Mining Algorithms
Index
About the Authors

