1 Introduction
  1.1 Prediction Versus Interpretation
  1.2 Key Ingredients of Predictive Models
  1.3 Terminology
  1.4 Example Data Sets and Typical Data Scenarios
  1.5 Overview
  1.6 Notation

Part Ⅰ General Strategies

2 A Short Tour of the Predictive Modeling Process
  2.1 Case Study: Predicting Fuel Economy
  2.2 Themes
  2.3 Summary
3 Data Pre-processing
  3.1 Case Study: Cell Segmentation in High-Content Screening
  3.2 Data Transformations for Individual Predictors
  3.3 Data Transformations for Multiple Predictors
  3.4 Dealing with Missing Values
  3.5 Removing Predictors
  3.6 Adding Predictors
  3.7 Binning Predictors
  3.8 Computing
  Exercises
4 Over-Fitting and Model Tuning
  4.1 The Problem of Over-Fitting
  4.2 Model Tuning
  4.3 Data Splitting
  4.4 Resampling Techniques
  4.5 Case Study: Credit Scoring
  4.6 Choosing Final Tuning Parameters
  4.7 Data Splitting Recommendations
  4.8 Choosing Between Models
  4.9 Computing
  Exercises

Part Ⅱ Regression Models

5 Measuring Performance in Regression Models
  5.1 Quantitative Measures of Performance
  5.2 The Variance-Bias Trade-off
  5.3 Computing
6 Linear Regression and Its Cousins
  6.1 Case Study: Quantitative Structure-Activity Relationship Modeling
  6.2 Linear Regression
  6.3 Partial Least Squares
  6.4 Penalized Models
  6.5 Computing
  Exercises
7 Nonlinear Regression Models
  7.1 Neural Networks
  7.2 Multivariate Adaptive Regression Splines
  7.3 Support Vector Machines
  7.4 K-Nearest Neighbors
  7.5 Computing
  Exercises
8 Regression Trees and Rule-Based Models
  8.1 Basic Regression Trees
  8.2 Regression Model Trees
  8.3 Rule-Based Models
  8.4 Bagged Trees
  8.5 Random Forests
  8.6 Boosting
  8.7 Cubist
  8.8 Computing
  Exercises
9 A Summary of Solubility Models
10 Case Study: Compressive Strength of Concrete Mixtures
  10.1 Model Building Strategy
  10.2 Model Performance
  10.3 Optimizing Compressive Strength
  10.4 Computing

Part Ⅲ Classification Models

11 Measuring Performance in Classification Models
  11.1 Class Predictions
  11.2 Evaluating Predicted Classes
  11.3 Evaluating Class Probabilities
  11.4 Computing
12 Discriminant Analysis and Other Linear Classification Models
  12.1 Case Study: Predicting Successful Grant Applications
  12.2 Logistic Regression
  12.3 Linear Discriminant Analysis
  12.4 Partial Least Squares Discriminant Analysis
  12.5 Penalized Models
  12.6 Nearest Shrunken Centroids
  12.7 Computing
  Exercises
13 Nonlinear Classification Models
  13.1 Nonlinear Discriminant Analysis
  13.2 Neural Networks
  13.3 Flexible Discriminant Analysis
  13.4 Support Vector Machines
  13.5 K-Nearest Neighbors
  13.6 Naive Bayes
  13.7 Computing
  Exercises
14 Classification Trees and Rule-Based Models
  14.1 Basic Classification Trees
  14.2 Rule-Based Models
  14.3 Bagged Trees
  14.4 Random Forests
  14.5 Boosting
  14.6 C5.0
  14.7 Comparing Two Encodings of Categorical Predictors
  14.8 Computing
  Exercises
15 A Summary of Grant Application Models
16 Remedies for Severe Class Imbalance
  16.1 Case Study: Predicting Caravan Policy Ownership
  16.2 The Effect of Class Imbalance
  16.3 Model Tuning
  16.4 Alternate Cutoffs
  16.5 Adjusting Prior Probabilities
  16.6 Unequal Case Weights
  16.7 Sampling Methods
  16.8 Cost-Sensitive Training
  16.9 Computing
  Exercises
17 Case Study: Job Scheduling
  17.1 Data Splitting and Model Strategy
  17.2 Results
  17.3 Computing

Part Ⅳ Other Considerations

18 Measuring Predictor Importance
  18.1 Numeric Outcomes
  18.2 Categorical Outcomes
  18.3 Other Approaches
  18.4 Computing
  Exercises
19 An Introduction to Feature Selection
  19.1 Consequences of Using Non-informative Predictors
  19.2 Approaches for Reducing the Number of Predictors
  19.3 Wrapper Methods
  19.4 Filter Methods
  19.5 Selection Bias
  19.6 Case Study: Predicting Cognitive Impairment
  19.7 Computing
  Exercises
20 Factors That Can Affect Model Performance
  20.1 Type Ⅲ Errors
  20.2 Measurement Error in the Outcome
  20.3 Measurement Error in the Predictors
  20.4 Discretizing Continuous Outcomes
  20.5 When Should You Trust Your Model's Prediction?
  20.6 The Impact of a Large Sample
  20.7 Computing
  Exercises

Appendix

A A Summary of Various Models
B An Introduction to R
  B.1 Start-Up and Getting Help
  B.2 Packages
  B.3 Creating Objects
  B.4 Data Types and Basic Structures
  B.5 Working with Rectangular Data Sets
  B.6 Objects and Classes
  B.7 R Functions
  B.8 The Three Faces of =
  B.9 The AppliedPredictiveModeling Package
  B.10 The caret Package
  B.11 Software Used in this Text
C Interesting Web Sites

References

Indices
  Computing
  General