1 What is Data Mining?
1.1 Big Data
1.1.1 The Data Warehouse
1.2 Types of Data-Mining Problems
1.3 The Pedigree of Data Mining
1.3.3 Machine Learning
1.4 Is Big Better?
1.4.1 Strong Statistical Evaluation
1.4.2 More Intensive Search
1.4.3 More Controlled Experiments
1.4.4 Is Big Necessary?
1.5 The Tasks of Predictive Data Mining
1.5.1 Data Preparation
1.5.2 Data Reduction
1.5.3 Data Modeling and Prediction
1.5.4 Case and Solution Analyses
1.6 Data Mining: Art or Science?
1.7 An Overview of the Book
1.8 Bibliographic and Historical Remarks
2 Statistical Evaluation for Big Data
2.1 The Idealized Model
2.1.1 Classical Statistical Comparison and Evaluation
2.2 It's Big but Is It Biased?
2.2.1 Objective Versus Survey Data
2.2.2 Significance and Predictive Value
2.2.2.1 Too Many Comparisons?
2.3 Classical Types of Statistical Prediction
2.3.1 Predicting True-or-False: Classification
2.3.1.1 Error Rates
2.3.2 Forecasting Numbers: Regression
2.3.2.1 Distance Measures
2.4 Measuring Predictive Performance
2.4.1 Independent Testing
2.4.1.1 Random Training and Testing
2.4.1.2 How Accurate Is the Error Estimate?
2.4.1.3 Comparing Results for Error Measures
2.4.1.4 Ideal or Real-World Sampling?
2.4.1.5 Training and Testing from Different Time Periods
2.5 Too Much Searching and Testing?
2.6 Why Are Errors Made?
- © 1997
- Published: 1st August 1997
Nitin Indurkhya is on the faculty at the Basser Department of Computer Science, University of Sydney, Australia. He has published extensively on Data Mining and Machine Learning and has considerable experience with industrial data-mining applications in Australia, Japan and the USA.
"I enjoy reading PREDICTIVE DATA MINING. It presents an excellent perspective on the theory and practice of data mining. It can help educate statisticians to build alliances between statisticians and
--Emanuel Parzen, Distinguished Professor of Statistics, Texas A&M University