{"product_id":"data-mining-and-business-analytics-with-r-isbn-9781118447147","title":"Data Mining and Business Analytics with R","description":"\u003cp\u003eCollecting, analyzing, and extracting valuable information from a large amount of data requires easily accessible, robust, computational and analytical tools. \u003ci\u003eData Mining and Business Analytics with R\u003c\/i\u003e utilizes the open source software R for the analysis, exploration, and simplification of large high-dimensional data sets. As a result, readers are provided with the needed guidance to model and interpret complicated data and become adept at building powerful models for prediction and classification.\u003c\/p\u003e \u003cp\u003eHighlighting both underlying concepts and practical computational skills, \u003ci\u003eData Mining and Business Analytics with R\u003c\/i\u003e begins with coverage of standard linear regression and the importance of parsimony in statistical modeling. The book includes important topics such as penalty-based variable selection (LASSO); logistic regression; regression and classification trees; clustering; principal components and partial least squares; and the analysis of text and network data. 
In addition, the book presents:\u003c\/p\u003e \u003cul\u003e \u003cli\u003eA thorough discussion and extensive demonstration of the theory behind the most useful data mining tools\u003c\/li\u003e \u003cli\u003eIllustrations of how to use the outlined concepts in real-world situations\u003c\/li\u003e \u003cli\u003eReadily available additional data sets and related R code allowing readers to apply their own analyses to the discussed materials\u003c\/li\u003e \u003cli\u003eNumerous exercises to help readers with computing skills and deepen their understanding of the material\u003c\/li\u003e \u003c\/ul\u003e \u003cp\u003e\u003ci\u003eData Mining and Business Analytics with R\u003c\/i\u003e is an excellent graduate-level textbook for courses on data mining and business analytics. The book is also a valuable reference for practitioners who collect and analyze data in the fields of finance, operations management, marketing, and the information sciences.\u003c\/p\u003e  \u003cp\u003ePreface\u003c\/p\u003e \u003cp\u003eAcknowledgments\u003c\/p\u003e \u003cp\u003e\u003cb\u003e1. Introduction\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003eReference\u003c\/p\u003e \u003cp\u003e\u003cb\u003e2. Processing the Information and Getting to Know Your Data\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e2.1 Example 1: 2006 Birth Data\u003c\/p\u003e \u003cp\u003e2.2 Example 2: Alumni Donations\u003c\/p\u003e \u003cp\u003e2.3 Example 3: Orange Juice\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e3. 
Standard Linear Regression\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e3.1 Estimation in R\u003c\/p\u003e \u003cp\u003e3.2 Example 1: Fuel Efficiency of Automobiles\u003c\/p\u003e \u003cp\u003e3.3 Example 2: Toyota Used-Car Prices\u003c\/p\u003e \u003cp\u003eAppendix 3.A The Effects of Model Overfitting on the Average Mean Square Error of the Regression Prediction\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e4. Local Polynomial Regression: a Nonparametric Regression Approach\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e4.1 Model Selection\u003c\/p\u003e \u003cp\u003e4.2 Application to Density Estimation and the Smoothing of Histograms\u003c\/p\u003e \u003cp\u003e4.3 Extension to the Multiple Regression Model\u003c\/p\u003e \u003cp\u003e4.4 Examples and Software\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e5. Importance of Parsimony in Statistical Modeling\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e5.1 How Do We Guard Against False Discovery\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e6. Penalty-Based Variable Selection in Regression Models with Many Parameters (LASSO)\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e6.1 Example 1: Prostate Cancer\u003c\/p\u003e \u003cp\u003e6.2 Example 2: Orange Juice\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e7. 
Logistic Regression\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e7.1 Building a Linear Model for Binary Response Data\u003c\/p\u003e \u003cp\u003e7.2 Interpretation of the Regression Coefficients in a Logistic Regression Model\u003c\/p\u003e \u003cp\u003e7.3 Statistical Inference\u003c\/p\u003e \u003cp\u003e7.4 Classification of New Cases\u003c\/p\u003e \u003cp\u003e7.5 Estimation in R\u003c\/p\u003e \u003cp\u003e7.6 Example 1: Death Penalty Data\u003c\/p\u003e \u003cp\u003e7.7 Example 2: Delayed Airplanes\u003c\/p\u003e \u003cp\u003e7.8 Example 3: Loan Acceptance\u003c\/p\u003e \u003cp\u003e7.9 Example 4: German Credit Data\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e8. Binary Classification, Probabilities, and Evaluating Classification Performance\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e8.1 Binary Classification\u003c\/p\u003e \u003cp\u003e8.2 Using Probabilities to Make Decisions\u003c\/p\u003e \u003cp\u003e8.3 Sensitivity and Specificity\u003c\/p\u003e \u003cp\u003e8.4 Example: German Credit Data\u003c\/p\u003e \u003cp\u003e\u003cb\u003e9. Classification Using a Nearest Neighbor Analysis\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e9.1 The k-Nearest Neighbor Algorithm\u003c\/p\u003e \u003cp\u003e9.2 Example 1: Forensic Glass\u003c\/p\u003e \u003cp\u003e9.3 Example 2: German Credit Data\u003c\/p\u003e \u003cp\u003eReference\u003c\/p\u003e \u003cp\u003e\u003cb\u003e10. The Naïve Bayesian Analysis: a Model for Predicting a Categorical Response from Mostly Categorical Predictor Variables\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e10.1 Example: Delayed Airplanes\u003c\/p\u003e \u003cp\u003eReference\u003c\/p\u003e \u003cp\u003e\u003cb\u003e11. 
Multinomial Logistic Regression\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e11.1 Computer Software\u003c\/p\u003e \u003cp\u003e11.2 Example 1: Forensic Glass\u003c\/p\u003e \u003cp\u003e11.3 Example 2: Forensic Glass Revisited\u003c\/p\u003e \u003cp\u003eAppendix 11.A Specification of a Simple Triplet Matrix\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e12. More on Classification and a Discussion on Discriminant Analysis\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e12.1 Fisher’s Linear Discriminant Function\u003c\/p\u003e \u003cp\u003e12.2 Example 1: German Credit Data\u003c\/p\u003e \u003cp\u003e12.3 Example 2: Fisher Iris Data\u003c\/p\u003e \u003cp\u003e12.4 Example 3: Forensic Glass Data\u003c\/p\u003e \u003cp\u003e12.5 Example 4: MBA Admission Data\u003c\/p\u003e \u003cp\u003eReference\u003c\/p\u003e \u003cp\u003e\u003cb\u003e13. Decision Trees\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e13.1 Example 1: Prostate Cancer\u003c\/p\u003e \u003cp\u003e13.2 Example 2: Motorcycle Acceleration\u003c\/p\u003e \u003cp\u003e13.3 Example 3: Fisher Iris Data Revisited\u003c\/p\u003e \u003cp\u003e\u003cb\u003e14. Further Discussion on Regression and Classification Trees, Computer Software, and Other Useful Classification Methods\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e14.1 R Packages for Tree Construction\u003c\/p\u003e \u003cp\u003e14.2 Chi-Square Automatic Interaction Detection (CHAID)\u003c\/p\u003e \u003cp\u003e14.3 Ensemble Methods: Bagging, Boosting, and Random Forests\u003c\/p\u003e \u003cp\u003e14.4 Support Vector Machines (SVM)\u003c\/p\u003e \u003cp\u003e14.5 Neural Networks\u003c\/p\u003e \u003cp\u003e14.6 The R Package Rattle: A Useful Graphical User Interface for Data Mining\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e15. 
Clustering\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e15.1 k-Means Clustering\u003c\/p\u003e \u003cp\u003e15.2 Another Way to Look at Clustering: Applying the Expectation-Maximization (EM) Algorithm to Mixtures of Normal Distributions\u003c\/p\u003e \u003cp\u003e15.3 Hierarchical Clustering Procedures\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e16. Market Basket Analysis: Association Rules and Lift\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e16.1 Example 1: Online Radio\u003c\/p\u003e \u003cp\u003e16.2 Example 2: Predicting Income\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e17. Dimension Reduction: Factor Models and Principal Components\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e17.1 Example 1: European Protein Consumption\u003c\/p\u003e \u003cp\u003e17.2 Example 2: Monthly US Unemployment Rates\u003c\/p\u003e \u003cp\u003e\u003cb\u003e18. Reducing the Dimension in Regressions with Multicollinear Inputs: Principal Components Regression and Partial Least Squares\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e18.1 Three Examples\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e19. Text as Data: Text Mining and Sentiment Analysis\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e19.1 Inverse Multinomial Logistic Regression\u003c\/p\u003e \u003cp\u003e19.2 Example 1: Restaurant Reviews\u003c\/p\u003e \u003cp\u003e19.3 Example 2: Political Sentiment\u003c\/p\u003e \u003cp\u003eAppendix 19.A Relationship Between the Gentzkow Shapiro Estimate of “Slant” and Partial Least Squares\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003e\u003cb\u003e20. 
Network Data\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e20.1 Example 1: Marriage and Power in Fifteenth Century Florence\u003c\/p\u003e \u003cp\u003e20.2 Example 2: Connections in a Friendship Network\u003c\/p\u003e \u003cp\u003eReferences\u003c\/p\u003e \u003cp\u003eAppendix A: Exercises\u003c\/p\u003e \u003cp\u003eExercise 1\u003c\/p\u003e \u003cp\u003eExercise 2\u003c\/p\u003e \u003cp\u003eExercise 3\u003c\/p\u003e \u003cp\u003eExercise 4\u003c\/p\u003e \u003cp\u003eExercise 5\u003c\/p\u003e \u003cp\u003eExercise 6\u003c\/p\u003e \u003cp\u003eExercise 7\u003c\/p\u003e \u003cp\u003eAppendix B: References\u003c\/p\u003e \u003cp\u003eIndex\u003c\/p\u003e \u003cp\u003e\"I first taught a Ph.D. level course in business applications of data mining 10 years ago. I regularly search the web, looking for business-oriented data mining books, and this is the first one I have found that is suitable for an MS in business analytics. I plan to use it. Anyone who teaches such a class and is inclined toward R should consider this text.\" (\u003ci\u003eJournal of the American Statistical Association\u003c\/i\u003e, 1 January 2014)\u003c\/p\u003e \u003cp\u003e\u003cb\u003eJOHANNES LEDOLTER,\u003c\/b\u003e PhD, is Professor in both the Department of Management Sciences and the Department of Statistics and Actuarial Science at the University of Iowa. He is a Fellow of the American Statistical Association and the American Society for Quality, and an Elected Member of the International Statistical Institute. Dr. 
Ledolter is the coauthor of \u003ci\u003eStatistical Methods for Forecasting, Achieving Quality Through Continual Improvement,\u003c\/i\u003e and \u003ci\u003eStatistical Quality Control: Strategies and Tools for Continual Improvement,\u003c\/i\u003e all published by Wiley.\u003c\/p\u003e  \u003cp\u003eShowcases \u003cb\u003eR's\u003c\/b\u003e critical role in the world of business\u003c\/p\u003e","brand":"Wiley","offers":[{"title":"Default Title","offer_id":47989024620773,"sku":"NP9781118447147","price":109.0,"currency_code":"USD","in_stock":false}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/1842\/7735\/files\/9781118447147.jpg?v=1761782485","url":"https:\/\/k12savings.com\/products\/data-mining-and-business-analytics-with-r-isbn-9781118447147","provider":"K12savings","version":"1.0","type":"link"}