Data Mining: Concepts and Techniques
By
- Jiawei Han, University of Illinois, Urbana-Champaign
- Micheline Kamber, Simon Fraser University, Burnaby, Canada
- Jian Pei, Simon Fraser University, Burnaby, Canada
The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving, and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge.
Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
Audience
Data warehouse engineers, data mining professionals, database researchers, statisticians, data analysts, data modelers, and other data professionals working on data mining at the R&D and implementation levels, as well as upper-level undergraduates and graduate students studying data mining in computer science programs.
Hardbound, 744 Pages
Published: June 2011
Imprint: Morgan Kaufmann
ISBN: 978-0-12-381479-1
Reviews
"[A] well-written textbook (2nd ed., 2006; 1st ed., 2001) on data mining or knowledge discovery. The text is supported by a strong outline. The authors preserve much of the introductory material, but add the latest techniques and developments in data mining, thus making this a comprehensive resource for both beginners and practitioners. The focus is data--all aspects. The presentation is broad, encyclopedic, and comprehensive, with ample references for interested readers to pursue in-depth research on any technique. Summing Up: Highly recommended. Upper-division undergraduates through professionals/practitioners."--CHOICE
"This interesting and comprehensive introduction to data mining emphasizes the interest in multidimensional data mining--the integration of online analytical processing (OLAP) and data mining. Some chapters cover basic methods, and others focus on advanced techniques. The structure, along with the didactic presentation, makes the book suitable for both beginners and specialized readers."--ACM's Computing Reviews.com
"We are living in the data deluge age. Data Mining: Concepts and Techniques shows us how to find useful knowledge in all that data. This Third Edition significantly expands the core chapters on data preprocessing, frequent pattern mining, classification, and clustering. It also comprehensively covers OLAP and outlier detection, and examines mining networks, complex data types, and important application areas. The book, with its companion website, would make a great textbook for analytics, data mining, and knowledge discovery courses."--Gregory Piatetsky, President, KDnuggets
"Jiawei, Micheline, and Jian give an encyclopaedic coverage of all the related methods, from the classic topics of clustering and classification, to database methods (association rules, data cubes) to more recent and advanced topics (SVD/PCA, wavelets, support vector machines). Overall, it is an excellent book on classic and modern data mining methods alike, and it is ideal not only for teaching, but as a reference book."--From the foreword by Christos Faloutsos, Carnegie Mellon University
"A very good textbook on data mining, this third edition reflects the changes that are occurring in the data mining field. It adds cited material from about 2006, a new section on visualization, and pattern mining with the more recent cluster methods. It's a well-written text, with all of the supporting materials an instructor is likely to want, including Web material support, extensive problem sets, and solution manuals. Though it serves as a data mining text, readers with little experience in the area will find it readable and enlightening. That being said, readers are expected to have some coding experience, as well as database design and statistics analysis knowledge. Two additional items are worthy of note: the text's bibliography is an excellent reference list for mining research; and the index is very complete, which makes it easy to locate information. Also, researchers and analysts from other disciplines--for example, epidemiologists, financial analysts, and psychometric researchers--may find the material very useful."--Computing Reviews
"Han (engineering, U. of Illinois-Urbana-Champaign), Micheline Kamber, and Jian Pei (both computer science, Simon Fraser U., British Columbia) present a textbook for an advanced undergraduate or beginning graduate course introducing data mining. Students should have some background in statistics, database systems, and machine learning and some experience programming. Among the topics are getting to know the data, data warehousing and online analytical processing, data cube technology, cluster analysis, detecting outliers, and trends and research frontiers. Chapter-end exercises are included."--SciTech Book News
"This book is an extensive and detailed guide to the principal ideas, techniques and technologies of data mining. The book is organised in 13 substantial chapters, each of which is essentially standalone, but with useful references to the book's coverage of underlying concepts. A broad range of topics are covered, from an initial overview of the field of data mining and its fundamental concepts, to data preparation, data warehousing, OLAP, pattern discovery and data classification. The final chapter describes the current state of data mining research and active research areas."--BCS.org
Contents
Chapter 1. Introduction
1. What Motivated Data Mining? Why Is It Important?
2. So, What Is Data Mining?
3. Data Mining--On What Kind of Data?
4. Data Mining Functionalities--What Kinds of Patterns Can Be Mined?
5. Are All of the Patterns Interesting?
6. Classification of Data Mining Systems
7. Data Mining Task Primitives
8. Integration of a Data Mining System with a Database or Data Warehouse System
9. Major Issues in Data Mining
10. Summary
Exercises
Bibliographic Notes
Chapter 2. Getting to Know Your Data
1. Types of Data Sets and Attribute Values
2. Basic Statistical Descriptions of Data
3. Data Visualization
4. Measuring Data Similarity
5. Summary
Exercises
Bibliographic Notes
Chapter 3. Preprocessing
1. Data Quality
2. Major Tasks in Data Preprocessing
3. Data Reduction
4. Data Transformation and Data Discretization
5. Data Cleaning and Data Integration
6. Summary
Exercises
Bibliographic Notes
Chapter 4. Data Warehousing and On-Line Analytical Processing
1. Data Warehouse: Basic Concepts
2. Data Warehouse Modeling: Data Cube and OLAP
3. Data Warehouse Design and Usage
4. Data Warehouse Implementation
5. Data Generalization by Attribute-Oriented Induction
6. Summary
Exercises
Bibliographic Notes
Chapter 5. Data Cube Technology
1. Efficient Methods for Data Cube Computation
2. Exploration and Discovery in Multidimensional Databases
3. Summary
Exercises
Bibliographic Notes
Chapter 6. Mining Frequent Patterns, Associations and Correlations: Concepts and Methods
1. Basic Concepts
2. Efficient and Scalable Frequent Itemset Mining Methods
3. Are All the Patterns Interesting? Pattern Evaluation Methods
4. Applications of Frequent Patterns and Associations
5. Summary
Exercises
Chapter 7. Advanced Frequent Pattern Mining
1. Frequent Pattern and Association Mining: A Road Map
2. Mining Various Kinds of Association Rules
3. Constraint-Based Frequent Pattern Mining
4. Extended Applications of Frequent Patterns
5. Summary
Exercises
Bibliographic Notes
Chapter 8. Classification: Basic Concepts
1. Classification: Basic Concepts
2. Decision Tree Induction
3. Bayes Classification Methods
4. Rule-Based Classification
5. Model Evaluation and Selection
6. Techniques to Improve Classification Accuracy: Ensemble Methods
7. Handling Different Kinds of Cases in Classification
8. Summary
Exercises
Bibliographic Notes
Chapter 9. Classification: Advanced Methods
1. Bayesian Belief Networks
2. Classification by Neural Networks
3. Support Vector Machines
4. Pattern-Based Classification
5. Lazy Learners (or Learning from Your Neighbors)
6. Other Classification Methods
7. Summary
Exercises
Bibliographic Notes
Chapter 10. Cluster Analysis: Basic Concepts and Methods
1. Cluster Analysis: Basic Concepts
2. Clustering Structures
3. Major Clustering Approaches
4. Partitioning Methods
5. Hierarchical Methods
6. Density-Based Methods
7. Model-Based Clustering: The Expectation-Maximization Method
8. Other Clustering Techniques
9. Summary
Exercises
Bibliographic Notes
Chapter 11. Advanced Cluster Analysis
1. Clustering High-Dimensional Data
2. Constraint-Based and User-Guided Cluster Analysis
3. Link-Based Cluster Analysis
4. Semi-Supervised Clustering and Classification
5. Bi-Clustering
6. Collaborative Filtering
7. Summary
Exercises
Bibliographic Notes
Chapter 12. Outlier Analysis
1. Why Outlier Analysis? Identifying and Handling of Outliers
2. Distribution-Based Outlier Detection: A Statistics-Based Approach
3. Classification-Based Outlier Detection
4. Clustering-Based Outlier Detection
5. Deviation-Based Outlier Detection
6. Isolation-Based Method: From Isolation Tree to Isolation Forest
7. Summary
Exercises
Bibliographic Notes
Chapter 13. Trends and Research Frontiers in Data Mining
1. Mining Complex Types of Data
2. Advanced Data Mining Applications
3. Data Mining System Products and Research Prototypes
4. Social Impacts of Data Mining
5. Trends in Data Mining
6. Summary
Exercises
Bibliographic Notes
Appendix A: An Introduction to Microsoft's OLE DB for Data Mining

