By
J. Quinlan
Description
Classifier systems play a major role in machine learning and knowledge-based systems, and Ross Quinlan's work on ID3 and C4.5 is widely
acknowledged to have made some of the most significant contributions to their development. This book is a complete guide to the C4.5
system as implemented in C for the UNIX environment. It contains a comprehensive guide to the system's use, the source code (about 8,800
lines), and implementation notes. The source code and sample datasets are also available for download (see below).
C4.5 starts with large sets of cases belonging to known classes. The cases, described by any mixture of nominal and numeric properties, are scrutinized
for patterns that allow the classes to be reliably discriminated. These patterns are then expressed as models, in the form of decision
trees or sets of if-then rules, that can be used to classify new cases, with emphasis on making the models understandable as well as
accurate. The system has been applied successfully to tasks involving tens of thousands of cases described by hundreds of properties.
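To make the idea of an if-then rule model concrete, here is a minimal sketch in Python. It is not C4.5 output or code from the book: the property names (`outlook`, `humidity`, `windy`), the threshold of 75, the class labels, and the default class are all hypothetical, chosen only to show how a small rule set might classify a new case described by a mix of nominal and numeric properties.

```python
# Illustrative sketch only: the rules below are hypothetical and hand-written,
# not produced by C4.5. Each rule pairs a condition on a case with a class label.

RULES = [
    # (condition over a case, predicted class)
    (lambda c: c["outlook"] == "overcast", "play"),
    (lambda c: c["outlook"] == "sunny" and c["humidity"] <= 75, "play"),
    (lambda c: c["outlook"] == "sunny" and c["humidity"] > 75, "dont_play"),
    (lambda c: c["outlook"] == "rain" and c["windy"], "dont_play"),
    (lambda c: c["outlook"] == "rain" and not c["windy"], "play"),
]

DEFAULT_CLASS = "play"  # assumed fallback when no rule applies


def classify(case):
    """Return the class of the first rule whose condition matches the case."""
    for condition, label in RULES:
        if condition(case):
            return label
    return DEFAULT_CLASS


if __name__ == "__main__":
    # A new case with one numeric and two nominal/boolean properties.
    new_case = {"outlook": "sunny", "humidity": 80, "windy": False}
    print(classify(new_case))
```

Each rule can be read on its own, which is the sense in which such models aim to be understandable as well as accurate.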
The book starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such
as missing data and overfitting. Advantages and disadvantages of the C4.5 approach are discussed and illustrated with several case studies.
This book and software should be of interest to developers of classification-based intelligent systems and to students in machine
learning and expert systems courses.