R and Data Mining

Examples and Case Studies

1st Edition - December 11, 2012

  • Author: Yanchang Zhao
  • Hardcover ISBN: 9780123969637
  • eBook ISBN: 9780123972712

R and Data Mining introduces researchers, post-graduate students, and analysts to data mining using R, a free software environment for statistical computing and graphics. The book provides practical methods for using R in applications from academia to industry to extract knowledge from vast amounts of data. Readers will find this book a valuable guide to the use of R in tasks such as classification and prediction, clustering, outlier detection, association rules, sequence analysis, text mining, social network analysis, sentiment analysis, and more.

Data mining techniques are growing in popularity in a broad range of areas, from banking to insurance, retail, telecom, medicine, research, and government. This book focuses on the modeling phase of the data mining process, also addressing data exploration and model evaluation.

With three in-depth case studies, a quick reference guide, bibliography, and links to a wealth of online resources, R and Data Mining is a valuable, practical guide to a powerful method of analysis.

Key Features

  • Presents an introduction to using R for data mining applications, covering the most popular data mining techniques
  • Provides code examples and data so that readers can easily learn the techniques
  • Features case studies in real-world applications to help readers apply the techniques in their work


This book is intended for researchers in academia and industry working in the field of data mining, postgraduate students interested in data mining, and data miners and analysts from industry. Because data mining techniques are widely used in government agencies, banks, insurance, retail, telecom, medicine, and research, the book will be of interest to readers in many fields.

Table of Contents

  • Dedication

    List of Figures

    List of Abbreviations

    Chapter 1. Introduction

    1.1 Data Mining

    1.2 R

    1.3 Datasets


    Chapter 2. Data Import and Export

    2.1 Save and Load R Data

    2.2 Import from and Export to .CSV Files

    2.3 Import Data from SAS

    2.4 Import/Export via ODBC


    Chapter 3. Data Exploration

    3.1 Have a Look at Data

    3.2 Explore Individual Variables

    3.3 Explore Multiple Variables

    3.4 More Explorations

    3.5 Save Charts into Files


    Chapter 4. Decision Trees and Random Forest

    4.1 Decision Trees with Package party

    4.2 Decision Trees with Package rpart

    4.3 Random Forest


    Chapter 5. Regression

    5.1 Linear Regression

    5.2 Logistic Regression

    5.3 Generalized Linear Regression

    5.4 Non-Linear Regression

    Chapter 6. Clustering

    6.1 The k-Means Clustering

    6.2 The k-Medoids Clustering

    6.3 Hierarchical Clustering

    6.4 Density-Based Clustering


    Chapter 7. Outlier Detection

    7.1 Univariate Outlier Detection

    7.2 Outlier Detection with LOF

    7.3 Outlier Detection by Clustering

    7.4 Outlier Detection from Time Series

    7.5 Discussions


    Chapter 8. Time Series Analysis and Mining

    8.1 Time Series Data in R

    8.2 Time Series Decomposition

    8.3 Time Series Forecasting

    8.4 Time Series Clustering

    8.5 Time Series Classification

    8.6 Discussions

    8.7 Further Readings


    Chapter 9. Association Rules

    9.1 Basics of Association Rules

    9.2 The Titanic Dataset

    9.3 Association Rule Mining

    9.4 Removing Redundancy

    9.5 Interpreting Rules

    9.6 Visualizing Association Rules

    9.7 Discussions and Further Readings


    Chapter 10. Text Mining

    10.1 Retrieving Text from Twitter

    10.2 Transforming Text

    10.3 Stemming Words

    10.4 Building a Term-Document Matrix

    10.5 Frequent Terms and Associations

    10.6 Word Cloud

    10.7 Clustering Words

    10.8 Clustering Tweets

    10.9 Packages, Further Readings, and Discussions


    Chapter 11. Social Network Analysis

    11.1 Network of Terms

    11.2 Network of Tweets

    11.3 Two-Mode Network

    11.4 Discussions and Further Readings


    Chapter 12. Case Study I: Analysis and Forecasting of House Price Indices

    12.1 Importing HPI Data

    12.2 Exploration of HPI Data

    12.3 Trend and Seasonal Components of HPI

    12.4 HPI Forecasting

    12.5 The Estimated Price of a Property

    12.6 Discussion

    Chapter 13. Case Study II: Customer Response Prediction and Profit Optimization

    13.1 Introduction

    13.2 The Data of KDD Cup 1998

    13.3 Data Exploration

    13.4 Training Decision Trees

    13.5 Model Evaluation

    13.6 Selecting the Best Tree

    13.7 Scoring

    13.8 Discussions and Conclusions


    Chapter 14. Case Study III: Predictive Modeling of Big Data with Limited Memory

    14.1 Introduction

    14.2 Methodology

    14.3 Data and Variables

    14.4 Random Forest

    14.5 Memory Issue

    14.6 Train Models on Sample Data

    14.7 Build Models with Selected Variables

    14.8 Scoring

    14.9 Print Rules

    14.10 Conclusions and Discussion

    Chapter 15. Online Resources

    15.1 R Reference Cards

    15.2 R

    15.3 Data Mining

    15.4 Data Mining with R

    15.5 Classification/Prediction with R

    15.6 Time Series Analysis with R

    15.7 Association Rule Mining with R

    15.8 Spatial Data Analysis with R

    15.9 Text Mining with R

    15.10 Social Network Analysis with R

    15.11 Data Cleansing and Transformation with R

    15.12 Big Data and Parallel Computing with R

    R Reference Card for Data Mining


    General Index

    Package Index

    Function Index

Product details

  • No. of pages: 256
  • Language: English
  • Copyright: © Academic Press 2012
  • Published: December 11, 2012
  • Imprint: Academic Press
  • Hardcover ISBN: 9780123969637
  • eBook ISBN: 9780123972712

About the Author

Yanchang Zhao

Senior Data Mining Analyst in the Australian Government since 2009.

Before joining the public sector, he was an Australian Postdoctoral Fellow (Industry) in the Faculty of Engineering & Information Technology at the University of Technology, Sydney, Australia. His research interests include clustering, association rules, time series, outlier detection, and data mining applications, and he has over forty papers published in journals and conference proceedings. He is a member of the IEEE and of the Institute of Analytics Professionals of Australia, and has served as a program committee member for more than thirty international conferences.

Affiliations and Expertise

Senior Data Mining Specialist, Australia
