Java Data Mining: Strategy, Standard, and Practice

1st Edition

A Practical Guide for Architecture, Design, and Implementation

Authors: Mark Hornick Erik Marcadé Sunil Venkayala
Paperback ISBN: 9780123704528
eBook ISBN: 9780080495910
Imprint: Morgan Kaufmann
Published Date: 7th November 2006
Page Count: 544



Table of Contents

Preface
Guide to Readers

Part I - Strategy

1. Overview of Data Mining
  1.1. Why is data mining relevant today?
  1.2. Introducing Data Mining
  1.3. The Value of Data Mining
  1.4. Summary
  1.5. References
2. Solving Problems in Industry
  2.1. Cross-industry data mining solutions
  2.2. Data Mining in Industries
  2.3. Summary
  2.4. References
3. Data Mining Process
  3.1. A standardized data mining process
  3.2. Data Analysis and Preparation…a more detailed view
  3.3. Data mining modeling, analysis, and scoring processes
  3.4. The Role of databases and data warehouses in Data Mining
  3.5. Data mining in enterprise software architectures
  3.6. Advances in automated data mining
  3.7. Summary
  3.8. References
4. Mining Functions and Algorithms
  4.1. Data mining functions
  4.2. Classification
  4.3. Regression
  4.4. Attribute Importance
  4.5. Association
  4.6. Clustering
  4.7. Summary
  4.8. References
5. JDM Strategy
  5.1. What is the JDM strategy?
  5.2. Role of Standards
  5.3. Summary
  5.4. References
6. Getting Started
  6.1. Business Understanding
  6.2. Data Understanding
  6.3. Data Preparation
  6.4. Modeling
  6.5. Evaluation
  6.6. Deployment
  6.7. Summary
  6.8. References

Part II - Standard

7. Java Data Mining Concepts
  7.1. Classification problem

Description

Whether you are a software developer, systems architect, data analyst, or business analyst, if you want to take advantage of data mining in the development of advanced analytic applications, Java Data Mining (JDM), the new standard now implemented in core DBMS and data mining/analysis software, is a key solution component. This book is the essential guide to using the JDM standard interface, written by contributors to the JDM standard.

Key Features

  • Data mining introduction - an overview of data mining and the problems it can address across industries; JDM's place in strategic solutions to data mining-related problems
  • JDM essentials - concepts, design approach and design issues, with detailed code examples in Java; a Web Services interface to enable JDM functionality in an SOA environment; and illustration of JDM XML Schema for JDM objects
  • JDM in practice - the use of JDM from vendor implementations and approaches to customer applications, integration, and usage; impact of data mining on IT infrastructure; a how-to guide for building applications that use the JDM API
  • Free, downloadable KJDM source code referenced in the book

Readership

This book is for software developers and application architects who are interested in, or need, data mining analysis as part of their applications. Both novice and advanced Java developers can use it as a reference for incorporating data mining into applications, leveraging the sample code provided. For example, a Java developer may know that they want to classify a customer's interest in a product, but not know how to get started. This book provides a quick start for using data mining in a practical context. Experienced data miners who use Java will also benefit from seeing working code that shows how to use JDM to accomplish mining tasks.


Details

No. of pages: 544
Language: English
Copyright: © Morgan Kaufmann 2007
Published: 7th November 2006
Imprint: Morgan Kaufmann
eBook ISBN: 9780080495910
Paperback ISBN: 9780123704528

Reviews

"This is not only a great introduction to JDM, but also a great introduction for a practitioner to data mining in general. This is a “must-have” for anyone developing large-scale data mining applications in Java." --Robert Grossman, Open Data Group and University of Illinois at Chicago

"It pleases me that the Java Community Process (JCP) Program could host the development of the Data Mining standard, JSR 73, whose evolution and usability are presented so compellingly in Java Data Mining: Strategy, Standard, and Practice. The authors have taken a unique approach to describing a broad range of aspects from strategies to problem solving with data mining technology in a variety of industries. The book is a “must-read” for those who want to introduce themselves to Java data mining (JDM) and fully realize the strategic importance of this technology in an ever competitive environment." --Onno Kluyt, senior director, JCP Program at Sun Microsystems, Inc., and chair of the JCP

"Java is now ubiquitous and over the past few years the Java world has shifted focus on--among other things--new frameworks, such as the Java Data Mining (JDM) framework. JDM addresses a clear need for standardization in data mining operations, yet to those approaching both Java and data mining the mountain seems as Everest. Hornick, Marcadé, and Venkayala could not have written this book at a better time. To the expert it is reference and map of the landscape, and to the novice it will be a constant guide and companion to each journey in JDM. This book is approachable, usable, practical, and necessary for any Java data mining software architect, developer, or analyst." --Frank Byrum, Chief Scientist, CorMine Intelligent Data, LLC


About the Authors

Mark Hornick Author

Mark Hornick has led the Java Data Mining (JSR-73) expert group since its inception in July of 2000, and now leads the JSR-247 expert group working towards JDM 2.0. Mr. Hornick brings nearly 20 years of experience in the design and implementation of advanced distributed systems, including in-database data mining, distributed object management, and Java APIs. Mr. Hornick is a senior manager in Oracle’s Data Mining Technologies group. Mr. Hornick joined Oracle through Oracle’s acquisition of Thinking Machines Corporation in 1999. Prior to Thinking Machines, where he served as architect for TMC’s next generation data mining software, Mr. Hornick was a Principal Investigator at GTE Laboratories, involved in advanced telecommunications network management software, distributed transaction management research, and distributed object management research. Mr. Hornick has contributed to several other data mining standards, including the Data Mining Group’s PMML, ISO SQL/MM for Data Mining, and the Object Management Group’s Common Warehouse Metamodel. He has given talks on data mining standards and JDM at the International Conference on Knowledge Discovery and Databases, JavaOne, JavaPro Live!, and The ServerSide Symposium. He has also published various papers and articles over his career. Mr. Hornick holds a bachelor's degree in Computer Science from Rutgers University and a master's degree, also in Computer Science, from Brown University, where he specialized in distributed object databases.

Affiliations and Expertise

Sr. Manager, Data Mining Technologies, Oracle Corporation, Burlington, MA

Erik Marcadé Author

With over 17 years of experience in the neural network industry, Erik Marcadé, founder and chief technical officer for KXEN, is responsible for software development and information technologies. Prior to founding KXEN, Mr. Marcadé developed real-time software expertise at Cadence Design Systems, where he was accountable for advancing real-time software systems as well as managing “system-on-a-chip” projects. Before joining Cadence, Mr. Marcadé spearheaded a project to restructure the marketing database of the largest French automobile manufacturer for Atos, a leading European information technology services company. In 1990, Mr. Marcadé co-founded Mimetics, a French company that produces and sells development environments, optical character recognition (OCR) products, and services using neural network technology. Prior to Mimetics, Mr. Marcadé joined Thomson-CSF Weapon System Division as a software engineer and project manager working on the application of artificial intelligence for projects in weapons allocation, target detection and tracking, geo-strategic assessment, and software quality control. He contributed to the creation of Thomson Research Laboratories in Palo Alto, CA (Pacific Rim Operation, PRO) as senior software engineer. There he collaborated with Stanford University on the automatic landing and flare system for Boeing, and with Kestrel Institute, a non-profit computer science research organization. He returned to France to head Esprit projects on neural network development. Mr. Marcadé holds an engineering degree from Ecole de l’Aeronautique et de l’Espace, specializing in process control, signal processing, computer science, and artificial intelligence.

Affiliations and Expertise

Founder and Chief Technical Officer, KXEN, Paris, France

Sunil Venkayala Author

J2EE and XML group leader and Principal Member of Technical Staff at Oracle Data Mining Technologies. Expert group member of the Java Data Mining (JDM) standard developed under JSR-73. More than five years of experience in developing applications using predictive technologies available in the Oracle Database. More than seven years of experience in working with Java and Internet technologies. Authored a JDM article in Java Developer's Journal. Holds a B.S. in Engineering and a Master's in Industrial Management from Indian Institute of Technology, Kanpur.

Affiliations and Expertise

Principal Member of Technical Staff, Oracle, Burlington, MA