Cluster Analysis for Applications

Probability and Mathematical Statistics: A Series of Monographs and Textbooks

1st Edition - November 28, 1973
Author: Michael R. Anderberg
Language: English
eBook ISBN: 978-1-4831-9139-3

Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and nonhierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis.

Comprising 10 chapters, this book begins with an introduction to the subject of cluster analysis and its uses, as well as category sorting problems and the need for cluster analysis algorithms. The next three chapters give a detailed account of variables and association measures, with emphasis on strategies for dealing with problems containing variables of mixed types. Subsequent chapters focus on the central techniques of cluster analysis, with particular reference to computational considerations; interpretation of clustering results; and techniques and strategies for making the most effective use of cluster analysis. The final chapter suggests an approach for the evaluation of alternative clustering methods. The presentation is capped with a complete set of implementing computer programs, listed in the Appendices, to make the use of cluster analysis as painless and free of mechanical error as possible. This monograph is intended for students and workers who have encountered the notion of cluster analysis.

Preface

Acknowledgements

Chapter 1. The Broad View of Cluster Analysis

1.1 Category Sorting Problems

1.2 Need for Cluster Analysis Algorithms

1.3 Uses of Cluster Analysis

1.4 Literature of Cluster Analysis

1.5 Purpose of This Book

Chapter 2. Conceptual Problems in Cluster Analysis

2.1 Elements of a Cluster Analysis

2.2 Illustrative Example

2.3 Some Philosophical Observations

2.4 A Note on Optimality and Intuition

Chapter 3. Variables and Scales

3.1 Classification of Variables

3.2 Scale Conversions

3.3 The Application of Scale Conversions

Chapter 4. Measures of Association among Variables

4.1 Measures between Ratio and Interval Variables

4.2 Measures between Nominal Variables

4.3 Measures between Binary Variables

4.4 Strategies for Mixed Variable Data Sets

Chapter 5. Measures of Association among Data Units

5.1 Metric Measures for Interval Variables

5.2 Nonmetric Measures for Interval Variables

5.3 Measures Using Binary Variables

5.4 Measures Using Nominal Variables

5.5 Mixed Variable Strategies

Chapter 6. Hierarchical Clustering Methods

6.1 The Central Agglomerative Procedure

6.2 The Stored Matrix Approach

6.3 The Stored Data Approach

6.4 The Sorted Matrix Approach

6.5 Other Approaches

Chapter 7. Nonhierarchical Clustering Methods

7.1 Initial Configurations

7.2 Nearest Centroid Sorting—Fixed Number of Clusters

7.3 Nearest Centroid Sorting—Variable Number of Clusters

7.4 Other Approaches to Nonhierarchical Clustering

Chapter 8. Promoting Interpretation of Clustering Results

8.1 Aids to Interpreting Hierarchical Classifications

8.2 An Aid to Interpreting a Partition of Data Units into Clusters

Chapter 9. Strategies for Using Cluster Analysis

9.1 Sequential Clustering of Data Units

9.2 Complementary Use of Several Clustering Methods

9.3 Cluster Analysis as an Adjunct to Other Statistical Methods

9.4 Clustering with Respect to an External Criterion

9.5 The Need for Research on Strategies

Chapter 10. Comparative Evaluation of Cluster Analysis Methods

10.1 An Approach to the Evaluation of Clustering Methods

10.2 Quantitative Assessment of Performance for Clustering Methods

10.3 List of Candidate Characteristics for Problems and Methods

10.4 The Evaluation Task Lying Ahead

Appendix A. Correlation and Nominal Variables

A.1 The Fundamental Analysis

A.2 The Problem of Isolated Cells

A.3 Deflating the Squared Correlation

Appendix B. Programs for Scale Conversions

B.1 Partitions of the Truncated Normal Distribution

B.2 Iterative Improvement of a Partition

Program CUTS

Function ERF

Program DIVIDE

Subroutine TEST

Subroutine SORT

Function PSUMSQ

Appendix C. Programs for Association Measures among Nominal and Interval Variables

C.1 General Design Features

C.2 Deck Setup and Utilization

Subroutine GCORR

Subroutine INPTR

Subroutine NCAT

Subroutine EIGEN

Subroutine VSORT

Function CORXX

Function CORKX

Function CORKK

Appendix D. Programs for Association Measures Involving Binary Variables

D.1 Bit-Level Storage

D.2 Computing Association Measures

D.3 Use of the Program

Program BINARY

Subroutine BDATA

Function Subprogram KOUNT

Function BASSN

Appendix E. Programs for Hierarchical Cluster Analysis

E.1 Stored Similarity Matrix Approach

E.2 Stored Data Approach

E.3 Sorted Matrix Approach

Subroutine CNTRL

Subroutine CLSTR

Function LFIND

Subroutine METHOD

Subroutine MANAGE

Subroutine GROUP

Subroutine PROC

Subroutine ALLINI

Subroutine PREP

Appendix F. Programs for Nonhierarchical Clustering

Subroutine EXEC

Subroutine RESULT

Subroutine KMEAN

Appendix G. Programs to Aid Interpretation of Clustering Results

G.1 A Program for Manipulating Hierarchical Trees

G.2 Permuting the Similarity Matrix

G.3 Error Sum of Squares Analysis

G.4 Analysis of a Given Partition

Subroutine DETAIL

Subroutine READCM

Subroutine TREE

Program PERMUTE

Subroutine MTXIN

Function LFIND

Program ERROR

Program POSTDU

Appendix H. Relations Among Cluster Analysis Programs

References

Index