
Data Mining
Practical Machine Learning Tools and Techniques
Description
Key Features
- Provides a thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques to data mining projects
- Presents concrete tips and techniques for improving performance by transforming the input or output of machine learning methods
- Includes a downloadable WEKA software toolkit, a comprehensive collection of machine learning algorithms for data mining tasks in an easy-to-use interactive interface (a short usage sketch follows this list)
- Includes open-access online courses that introduce practical applications of the material in the book
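As a rough illustration of the kind of workflow WEKA supports, the minimal sketch below loads a dataset in ARFF format, builds a J48 decision tree, and cross-validates it through WEKA's Java API. The file path and the choice of J48 are assumptions for illustration only, not steps prescribed by the book; WEKA also exposes the same functionality through its graphical interfaces (see Appendix B).

    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import java.util.Random;

    public class WekaSketch {
        public static void main(String[] args) throws Exception {
            // Load an ARFF dataset; the path is a placeholder, e.g. the
            // weather data distributed with WEKA.
            Instances data = DataSource.read("data/weather.nominal.arff");
            // The last attribute is assumed to be the class to predict.
            data.setClassIndex(data.numAttributes() - 1);

            // Build a C4.5-style decision tree (WEKA's J48) on the full data
            // and print the resulting tree.
            J48 tree = new J48();
            tree.buildClassifier(data);
            System.out.println(tree);

            // Estimate predictive performance with 10-fold cross-validation.
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(new J48(), data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }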
Readership
Data analysts, data scientists, data architects, and business analysts; computer science students taking courses in data mining and machine learning
Table of Contents
Part I: Introduction to data mining
Chapter 1. What’s it all about?
- Abstract
- 1.1 Data Mining and Machine Learning
- 1.2 Simple Examples: The Weather Problem and Others
- 1.3 Fielded Applications
- 1.4 The Data Mining Process
- 1.5 Machine Learning and Statistics
- 1.6 Generalization as Search
- 1.7 Data Mining and Ethics
- 1.8 Further Reading and Bibliographic Notes
Chapter 2. Input: Concepts, instances, attributes
- Abstract
- 2.1 What’s a Concept?
- 2.2 What’s in an Example?
- 2.3 What’s in an Attribute?
- 2.4 Preparing the Input
- 2.5 Further Reading and Bibliographic Notes
Chapter 3. Output: Knowledge representation
- Abstract
- 3.1 Tables
- 3.2 Linear Models
- 3.3 Trees
- 3.4 Rules
- 3.5 Instance-Based Representation
- 3.6 Clusters
- 3.7 Further Reading and Bibliographic Notes
Chapter 4. Algorithms: The basic methods
- Abstract
- 4.1 Inferring Rudimentary Rules
- 4.2 Simple Probabilistic Modeling
- 4.3 Divide-and-Conquer: Constructing Decision Trees
- 4.4 Covering Algorithms: Constructing Rules
- 4.5 Mining Association Rules
- 4.6 Linear Models
- 4.7 Instance-Based Learning
- 4.8 Clustering
- 4.9 Multi-instance Learning
- 4.10 Further Reading and Bibliographic Notes
- 4.11 WEKA Implementations
Chapter 5. Credibility: Evaluating what’s been learned
- Abstract
- 5.1 Training and Testing
- 5.2 Predicting Performance
- 5.3 Cross-Validation
- 5.4 Other Estimates
- 5.5 Hyperparameter Selection
- 5.6 Comparing Data Mining Schemes
- 5.7 Predicting Probabilities
- 5.8 Counting the Cost
- 5.9 Evaluating Numeric Prediction
- 5.10 The MDL Principle
- 5.11 Applying the MDL Principle to Clustering
- 5.12 Using a Validation Set for Model Selection
- 5.13 Further Reading and Bibliographic Notes
Part II: More advanced machine learning schemes
Chapter 6. Trees and rules
- Abstract
- 6.1 Decision Trees
- 6.2 Classification Rules
- 6.3 Association Rules
- 6.4 WEKA Implementations
Chapter 7. Extending instance-based and linear models
- Abstract
- 7.1 Instance-Based Learning
- 7.2 Extending Linear Models
- 7.3 Numeric Prediction With Local Linear Models
- 7.4 WEKA Implementations
Chapter 8. Data transformations
- Abstract
- 8.1 Attribute Selection
- 8.2 Discretizing Numeric Attributes
- 8.3 Projections
- 8.4 Sampling
- 8.5 Cleansing
- 8.6 Transforming Multiple Classes to Binary Ones
- 8.7 Calibrating Class Probabilities
- 8.8 Further Reading and Bibliographic Notes
- 8.9 WEKA Implementations
Chapter 9. Probabilistic methods
- Abstract
- 9.1 Foundations
- 9.2 Bayesian Networks
- 9.3 Clustering and Probability Density Estimation
- 9.4 Hidden Variable Models
- 9.5 Bayesian Estimation and Prediction
- 9.6 Graphical Models and Factor Graphs
- 9.7 Conditional Probability Models
- 9.8 Sequential and Temporal Models
- 9.9 Further Reading and Bibliographic Notes
- 9.10 WEKA Implementations
Chapter 10. Deep learning
- Abstract
- 10.1 Deep Feedforward Networks
- 10.2 Training and Evaluating Deep Networks
- 10.3 Convolutional Neural Networks
- 10.4 Autoencoders
- 10.5 Stochastic Deep Networks
- 10.6 Recurrent Neural Networks
- 10.7 Further Reading and Bibliographic Notes
- 10.8 Deep Learning Software and Network Implementations
- 10.9 WEKA Implementations
Chapter 11. Beyond supervised and unsupervised learning
- Abstract
- 11.1 Semisupervised Learning
- 11.2 Multi-instance Learning
- 11.3 Further Reading and Bibliographic Notes
- 11.4 WEKA Implementations
Chapter 12. Ensemble learning
- Abstract
- 12.1 Combining Multiple Models
- 12.2 Bagging
- 12.3 Randomization
- 12.4 Boosting
- 12.5 Additive Regression
- 12.6 Interpretable Ensembles
- 12.7 Stacking
- 12.8 Further Reading and Bibliographic Notes
- 12.9 WEKA Implementations
Chapter 13. Moving on: applications and beyond
- Abstract
- 13.1 Applying Machine Learning
- 13.2 Learning From Massive Datasets
- 13.3 Data Stream Learning
- 13.4 Incorporating Domain Knowledge
- 13.5 Text Mining
- 13.6 Web Mining
- 13.7 Images and Speech
- 13.8 Adversarial Situations
- 13.9 Ubiquitous Data Mining
- 13.10 Further Reading and Bibliographic Notes
- 13.11 WEKA Implementations
Appendix A. Theoretical foundations
- A.1 Matrix Algebra
- A.2 Fundamental Elements of Probabilistic Methods
Appendix B. The WEKA workbench
- B.1 What’s in WEKA?
- B.2 The package management system
- B.3 The Explorer
- B.4 The Knowledge Flow Interface
- B.5 The Experimenter
Product details
- No. of pages: 654
- Language: English
- Copyright: © Morgan Kaufmann 2016
- Published: October 1, 2016
- Imprint: Morgan Kaufmann
- Paperback ISBN: 9780128042915
- eBook ISBN: 9780128043578
About the Authors
Ian H. Witten
Eibe Frank
Mark A. Hall
Christopher J. Pal
Ratings and Reviews
Latest reviews
RobertoBatista Mon Oct 22 2018
A good introduction to Data Mining
This book contains what you need to get on track with data mining.
VictoriaNemzer Sun Aug 26 2018
Very thick book, but examples
Very thick book, but the examples are perfect.