Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research.
The book is targeted at information systems practitioners, programmers, consultants, developers, information technology managers, specification writers, data analysts, data modelers, database R&D professionals, data warehouse engineers, and data mining professionals. The book will also be useful for professors and students of upper-level undergraduate and graduate-level data mining and machine learning courses who want to incorporate data mining as part of their data management knowledge base and expertise.
- Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects
- Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
- Includes the downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks, in an updated, interactive interface. The toolkit's algorithms cover data pre-processing, classification, regression, clustering, association rules, and visualization
- LIST OF FIGURES
- LIST OF TABLES
- ABOUT THE AUTHORS
- PART I. Introduction to Data Mining
- CHAPTER 1. What’s It All About?
- 1.1. Data mining and machine learning
- 1.2. Simple examples: the weather and other problems
- 1.3. Fielded applications
- 1.4. Machine learning and statistics
- 1.5. Generalization as search
- 1.6. Data mining and ethics
- 1.7. Further reading
- CHAPTER 2. Input
- 2.1. What’s a concept?
- 2.2. What’s in an example?
- 2.3. What’s in an attribute?
- 2.4. Preparing the input
- 2.5. Further reading
- CHAPTER 3. Output
- 3.1. Tables
- 3.2. Linear models
- 3.3. Trees
- 3.4. Rules
- 3.5. Instance-based representation
- 3.6. Clusters
- 3.7. Further reading
- CHAPTER 4. Algorithms
- 4.1. Inferring rudimentary rules
- 4.2. Statistical modeling
- 4.3. Divide-and-conquer: constructing decision trees
- 4.4. Covering algorithms: constructing rules
- 4.5. Mining association rules
- 4.6. Linear models
- 4.7. Instance-based learning
- 4.8. Clustering
- 4.9. Multi-instance learning
- 4.10. Further reading
- 4.11. Weka implementations
- CHAPTER 5. Credibility
- 5.1. Training and testing
- 5.2. Predicting performance
- 5.3. Cross-validation
- 5.4. Other estimates
- 5.5. Comparing data mining schemes
- 5.6. Predicting probabilities
- 5.7. Counting the cost
- 5.8. Evaluating numeric prediction
- 5.9. Minimum description length principle
- 5.10. Applying the MDL principle to clustering
- 5.11. Further reading
- PART II. Advanced Data Mining
- CHAPTER 6. Implementations
- 6.1. Decision trees
- 6.2. Classification rules
- 6.3. Association rules
- 6.4. Extending linear models
- 6.5. Instance-based learning
- 6.6. Numeric prediction with local linear models
- 6.7. Bayesian networks
- 6.8. Clustering
- 6.9. Semisupervised learning
- 6.10. Multi-instance learning
- 6.11. Weka implementations
- CHAPTER 7. Data Transformations
- 7.1. Attribute selection
- 7.2. Discretizing numeric attributes
- 7.3. Projections
- 7.4. Sampling
- 7.5. Cleansing
- 7.6. Transforming multiple classes to binary ones
- 7.7. Calibrating class probabilities
- 7.8. Further reading
- 7.9. Weka implementations
- CHAPTER 8. Ensemble Learning
- 8.1. Combining multiple models
- 8.2. Bagging
- 8.3. Randomization
- 8.4. Boosting
- 8.5. Additive regression
- 8.6. Interpretable ensembles
- 8.7. Stacking
- 8.8. Further reading
- 8.9. Weka implementations
- CHAPTER 9. Moving on
- 9.1. Applying data mining
- 9.2. Learning from massive datasets
- 9.3. Data stream learning
- 9.4. Incorporating domain knowledge
- 9.5. Text mining
- 9.6. Web mining
- 9.7. Adversarial situations
- 9.8. Ubiquitous data mining
- 9.9. Further reading
- PART III. The Weka Data Mining Workbench
- CHAPTER 10. Introduction to Weka
- 10.1. What’s in Weka?
- 10.2. How do you use it?
- 10.3. What else can you do?
- 10.4. How do you get it?
- CHAPTER 11. The Explorer
- 11.1. Getting started
- 11.2. Exploring the explorer
- 11.3. Filtering algorithms
- 11.4. Learning algorithms
- 11.5. Metalearning algorithms
- 11.6. Clustering algorithms
- 11.7. Association-rule learners
- 11.8. Attribute selection
- CHAPTER 12. The Knowledge Flow Interface
- 12.1. Getting started
- 12.2. Components
- 12.3. Configuring and connecting the components
- 12.4. Incremental learning
- CHAPTER 13. The Experimenter
- 13.1. Getting started
- 13.2. Simple setup
- 13.3. Advanced setup
- 13.4. The analyze panel
- 13.5. Distributing processing over several machines
- CHAPTER 14. The Command-Line Interface
- 14.1. Getting started
- 14.2. The structure of Weka
- 14.3. Command-line options
- CHAPTER 15. Embedded Machine Learning
- 15.1. A simple data mining application
- CHAPTER 16. Writing New Learning Schemes
- 16.1. An example classifier
- 16.2. Conventions for implementing classifiers
- CHAPTER 17. Tutorial Exercises for the Weka Explorer
- 17.1. Introduction to the explorer interface
- 17.2. Nearest-neighbor learning and decision trees
- 17.3. Classification boundaries
- 17.4. Preprocessing and parameter tuning
- 17.5. Document classification
- 17.6. Mining association rules
- © Morgan Kaufmann 2011
- 6th January 2011
- Morgan Kaufmann
Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has written several books, the latest being Managing Gigabytes (1999) and Data Mining (2000), both from Morgan Kaufmann.
Professor, Computer Science Department, University of Waikato, New Zealand.
Eibe Frank lives in New Zealand with his Samoan spouse and two lovely boys, but originally hails from Germany, where he received his first degree in computer science from the University of Karlsruhe. He moved to New Zealand to pursue his Ph.D. in machine learning under the supervision of Ian H. Witten, and joined the Department of Computer Science at the University of Waikato as a lecturer on completion of his studies. He is now an associate professor at the same institution. As an early adopter of the Java programming language, he laid the groundwork for the Weka software described in this book. He has contributed a number of publications on machine learning and data mining to the literature and has refereed for many conferences and journals in these areas.
Associate Professor, Department of Computer Science, University of Waikato, Hamilton, New Zealand
Mark A. Hall holds a bachelor’s degree in computing and mathematical sciences and a Ph.D. in computer science, both from the University of Waikato. Throughout his time at Waikato, as a student and lecturer in computer science and more recently as a software developer and data mining consultant for Pentaho, an open-source business intelligence software company, Mark has been a core contributor to the Weka software described in this book. He has published a number of articles on machine learning and data mining and has refereed for conferences and journals in these areas.
Honorary Research Associate, Computer Science Department, University of Waikato, New Zealand
"Co-author Witten is the author of other well-known books on data mining, and he and his co-authors of this book excel in statistics, computer science, and mathematics. Their in-depth backgrounds and insights are the strengths that have permitted them to avoid heavy mathematical derivations in explaining machine learning algorithms so they can help readers from different fields understand algorithms. I strongly recommend this book to all newcomers to data mining, especially to those who wish to understand the fundamentals of machine learning algorithms."--INFORMS Journal of Computing
"The third edition of this practical guide to machine learning and data mining is fully updated to account for technological advances since its previous printing in 2005 and is now even more closely aligned with the use of the Weka open source machine learning, data mining and data modeling application. Beginning with an introduction to data mining, the volume explores basic inputs, outputs and algorithms, the implementation of machine learning schemes and in-depth exploration of the many uses of the Weka data analysis software. Numerous illustrations, tables and equations are included throughout and additional resources are available through a companion website. Witten, Frank and Hall are academics with the department of computer science at the University of Waikato, New Zealand, the home of the Weka software project."--Book News, Reference & Research
"I would recommend this book to anyone who is getting started in either data mining or machine learning and wants to learn how the fundamental algorithms work. I liked that the book slowly teaches you the different algorithms piece by piece and that there are also a lot of examples. I plan on taking a machine learning course this upcoming fall semester and feel that the book gave me great insight that the course will be based on mathematics more than I had originally expected. My favorite part of the book was the last chapter where it explains how you can solve different practical data mining scenarios using the different algorithms. If there were more chapters like the last one, the book would have been perfect. This book might not be that useful if you do not plan on using the Weka software or if you are already familiar with the various machine learning algorithms. Overall, Data Mining: Practical Machine Learning Tools and Techniques is a great book to learn about the core concepts of data mining and the Weka software suite."--ACM SIGSOFT Software Engineering Notes
"This book is a must-read for every aspiring data mining analyst. Its many examples and the technical background it imparts would be a unique and welcome addition to the bookshelf of any graduate or advanced undergraduate student. The book is written for both academic and application-oriented readers, and I strongly recommend it to any reader working in the area of machine learning and data mining."--Computing Reviews.com