|aStatistical and machine-learning data mining :|btechniques for better predictive modeling and analysis of big data /|cBruce Ratner.
|aBoca Raton, FL :|bCRC Press,|cc2017.
|axxxiii, 655 pages :|billustrations ;|c26 cm.
|aPreface -- Preface to second edition -- Acknowledgments -- About the author -- Introduction -- Science dealing with data: statistics and data science -- Basic data mining methods for variable assessment -- CHAID-based data mining for paired-variable assessment -- The importance of straight data : simplicity and desirability for good model-building practice -- Symmetrizing ranked data : a statistical data mining method for improving the predictive power of data -- Principal component analysis : a statistical data mining method for many-variable assessment -- Market share estimation : data mining for an exceptional case -- The correlation coefficient : its values range between plus/minus 1, or do they? -- Logistic regression : the workhorse of response modeling -- Predicting share of wallet without survey data -- Ordinary regression: the workhorse of profit modeling -- Variable selection methods in regression: ignorable problem, notable solution -- CHAID for interpreting a logistic regression model -- The importance of the regression coefficient -- The average correlation: a statistical data mining measure for assessment of competing predictive models and the importance of the predictor variables -- CHAID for specifying a model with interaction variables -- Market segmentation classification modeling with logistic regression -- Market segmentation based on time-series data using latent class analysis -- Market segmentation: an easy way to understand the segments -- CHAID as a method for filling in missing values -- Model building with big complete and incomplete data -- Art, science, numbers, and poetry -- Identifying your best customers: descriptive, predictive, and look-alike profiling -- Assessment of marketing models -- Decile analysis: perspective and performance -- Net T-C lift model : assessing the net effects of test and control campaigns -- Bootstrapping in marketing -- Validating the logistic regression model : try bootstrapping -- Visualization of marketing models : data mining to uncover innards of a model -- The predictive contribution coefficient : a measure of predictive importance -- Regression modeling involves art, science, and poetry, too -- Genetic and statistic regression models : a comparison -- Data reuse : a powerful data mining effect of the GenIQ model -- A data mining method for moderating outliers instead of discarding them -- Overfitting : old problem, new solution -- The importance of straight data : revisited -- The GenIQ model : its definition and an application -- Finding the best variables for marketing models -- Interpretation of coefficient-free models -- Text mining : primer, illustration, and TXTDM software -- Some of my favorite statistical subroutines -- Index.
The third edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. It is a compilation of new and creative data mining techniques that address the scaling-up of the framework of classical and modern statistical methodology for predictive modeling and analysis of big data. SM-DM provides proper solutions to common problems facing the newly minted data scientist in the data mining discipline. Its presentation focuses on the needs of data scientists (commonly known as statisticians, data miners, and data analysts), delivering practical yet powerful, simple yet insightful quantitative techniques, most of which use the "old" statistical methodologies improved upon by the new machine-learning influence.