Data Mining: Know It All

1st Edition - October 31, 2008
Authors: Soumen Chakrabarti, Earl Cox, Eibe Frank, Ralf Hartmut Güting, Jiawei Han, Xia Jiang, Micheline Kamber, Sam S. Lightstone, Thomas P. Nadeau, Richard E. Neapolitan, Dorian Pyle, Mamdouh Refaat, Markus Schneider, Toby J. Teorey, Ian H. Witten
Language: English
Hardback ISBN: 978-0-12-374629-0
eBook ISBN: 978-0-08-087788-4

This book brings all of the elements of data mining together in a single volume, saving the reader the time and expense of making multiple purchases. It consolidates both introductory and advanced topics, thereby covering the gamut of data mining and machine learning tactics, from data integration and pre-processing, to fundamental algorithms, to optimization techniques and web mining methodology. The book combines the finest data mining material from the Morgan Kaufmann portfolio. Individual chapters are derived from a select group of MK books authored by the best and brightest in the field. These chapters are combined into one comprehensive volume in a way that allows it to be used as a reference work for those interested in new and developing aspects of data mining. This book represents a quick and efficient way to unite valuable content from leading data mining experts, thereby creating a definitive, one-stop-shopping opportunity for customers to receive the information they would otherwise need to round up from separate sources.

Soumen Chakrabarti

Soumen Chakrabarti is Assistant Professor in Computer Science and Engineering at the Indian Institute of Technology, Bombay. Prior to joining IIT, he worked on hypertext databases and data mining at IBM Almaden Research Center. He has developed three systems and holds five patents in this area. Chakrabarti has served as a vice-chair and program committee member for many conferences, including WWW, SIGIR, ICDE, and KDD, and as a guest editor of the IEEE TKDE special issue on mining and searching the Web. His work on focused crawling received the Best Paper award at the 8th International World Wide Web Conference (1999). He holds a Ph.D. from the University of California, Berkeley.

Affiliations and expertise

Assistant Professor of Computer Science, Indian Institute of Technology, Bombay, India

Earl Cox

Earl founded and serves as President of Scianta Intelligence, a next-generation machine intelligence and knowledge exploration company. He is a futurist, author, management consultant, and educator involved in discovering the epistemology of advanced intelligent systems, the redefinition of the machine mind, and, as a pioneer of Internet-based technologies, the way in which evolving interconnected virtual worlds will affect the sociology of business and culture in the near and far future.

Earl has over thirty years' experience in managing and participating in the software development process at the system level as well as the tightly integrated application level. In the area of advanced machine intelligence technologies, Earl is a recognized expert in fuzzy logic and adaptive fuzzy systems as they are applied to information and decision theory. He has pioneered the integration of fuzzy neural systems with genetic algorithms and case-based reasoning. As an industry observer and futurist, Earl has written and talked extensively on the philosophy of the Response to Change, the nature of Emergent Intelligence, and the Meaning of Information Entropy in Mind and Machine.

Affiliations and expertise

President, Scianta Intelligence, LLC, Chapel Hill, NC, USA

Eibe Frank

Eibe Frank lives in New Zealand with his Samoan spouse and two lovely boys, but originally hails from Germany, where he received his first degree in computer science from the University of Karlsruhe. He moved to New Zealand to pursue his Ph.D. in machine learning under the supervision of Ian H. Witten and joined the Department of Computer Science at the University of Waikato as a lecturer on completion of his studies. He is now an associate professor at the same institution. As an early adopter of the Java programming language, he laid the groundwork for the Weka software described in this book. He has contributed a number of publications on machine learning and data mining to the literature and has refereed for many conferences and journals in these areas.

Affiliations and expertise

Computer Science Department, University of Waikato, New Zealand

Jiawei Han

Jiawei Han is Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Well known for his research in the areas of data mining and database systems, he has received many awards for his contributions in the field, including the 2004 ACM SIGKDD Innovations Award. He has served as Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data, and on editorial boards of several journals, including IEEE Transactions on Knowledge and Data Engineering and Data Mining and Knowledge Discovery.

Affiliations and expertise

Professor, Department of Computer Science, University of Illinois at Urbana-Champaign, USA

Xia Jiang

Affiliations and expertise

University of Pittsburgh, PA, USA

Micheline Kamber

Micheline Kamber is a researcher with a passion for writing in easy-to-understand terms. She has a master's degree in computer science (specializing in artificial intelligence) from Concordia University, Canada.

Affiliations and expertise

Simon Fraser University, Burnaby, Canada

Sam S. Lightstone

Sam Lightstone is a Senior Technical Staff Member and Development Manager with IBM’s DB2 product development team. His work includes numerous topics in autonomic computing and relational database management systems. He is cofounder and leader of DB2’s autonomic computing R&D effort. He is Chair of the IEEE Data Engineering Workgroup on Self Managing Database Systems and a member of the IEEE Computer Society Task Force on Autonomous and Autonomic Computing. In 2003 he was elected to the Canadian Technical Excellence Council, the Canadian affiliate of the IBM Academy of Technology. He is an IBM Master Inventor with over 25 patents and patents pending; he has published widely on autonomic computing for relational database systems. He has been with IBM since 1991.

Affiliations and expertise

Senior Technical Staff Member and Development Manager, IBM, Toronto, Canada

Richard E. Neapolitan

Richard E. Neapolitan is Professor and Chair of Computer Science at Northeastern Illinois University. He has previously written four books, including the seminal 1990 Bayesian network text Probabilistic Reasoning in Expert Systems. More recently, he wrote the 2004 text Learning Bayesian Networks; the textbook Foundations of Algorithms, which has been translated into three languages and is one of the most widely used algorithms texts worldwide; and the 2007 text Probabilistic Methods for Financial and Marketing Informatics (Morgan Kaufmann Publishers).

Affiliations and expertise

Professor and Chair of Computer Science, Northeastern Illinois University, Chicago, USA

Dorian Pyle

Dorian Pyle is Chief Scientist and Founder of PTI (www.pti.com), which develops and markets Powerhouse™ predictive and explanatory analytics software. Dorian has over 20 years' experience in artificial intelligence and machine learning techniques, which are used in what is known today as “data mining” or “predictive analytics”. He has applied this knowledge as a consultant with Knowledge Stream Partners, Xchange, Naviant, Thinking Machines, and Data Miners, with various companies directly involved in credit card marketing for banks, and with manufacturing companies using industrial automation. In 1976 he was involved in building artificially intelligent machine learning systems utilizing the pioneering technologies that are currently known as neural computing and associative memories. He is current in and familiar with the most advanced technologies in data mining, including entropic analysis (information theory), chaotic and fractal decomposition, neural technologies, evolution and genetic optimization, algebra evolvers, case-based reasoning, concept induction, and other advanced statistical techniques.

Affiliations and expertise

Chief Scientist and Founder of PTI, Leominster, MA, USA

Mamdouh Refaat

Mamdouh Refaat is a data mining and business analytics consultant advising major organizations in North America and Europe. He has held several positions in consulting organizations and software vendors, including the director of consulting services at ANGOSS Software Corporation, a global data mining software and service provider. During his career, Mamdouh has managed numerous data mining consulting projects in marketing, CRM, and credit risk for Fortune 500 organizations in North America and Europe. In addition, he has delivered over 50 professional training courses in data mining and business analytics. Mamdouh holds a Ph.D. in Engineering from the University of Toronto, and an MBA from the University of Leeds.

Affiliations and expertise

Data Mining and Business Analytics Consultant

Markus Schneider

Markus Schneider is an Assistant Professor in the Computer Science Department of the University of Florida and holds a doctoral degree in Computer Science from the University of Hagen, Germany. He is the author of a monograph in the area of spatial databases and of a German textbook on implementation concepts for database systems, and has published about 40 articles on database systems. He is on the editorial board of GeoInformatica.

Affiliations and expertise

Assistant Professor, Computer Science Department, University of Florida, Gainesville, FL, USA

Toby J. Teorey

Toby J. Teorey is a professor in the Electrical Engineering and Computer Science Department at the University of Michigan, Ann Arbor. He received his B.S. and M.S. degrees in electrical engineering from the University of Arizona, Tucson, and a Ph.D. in computer sciences from the University of Wisconsin, Madison. He was general chair of the 1981 ACM SIGMOD Conference and program chair for the 1991 Entity-Relationship Conference. Professor Teorey’s current research focuses on database design and data warehousing, OLAP, advanced database systems, and performance of computer networks. He is a member of the ACM and the IEEE Computer Society.

Affiliations and expertise

Professor, Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, USA

Ian H. Witten

Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography.

Affiliations and expertise

Computer Science Department, University of Waikato, New Zealand