posted by Mosaic Data Science
A propensity model is a statistical scorecard used to predict customer behavior. Propensity models can identify the customers most likely to respond to an offer, or focus retention efforts on those most likely to churn.
After a data science consultant or predictive analytics firm receives the data, they should begin to work collaboratively with project stakeholders to set up data ingestion and integration procedures and to perform exploratory data analysis. The exploratory analysis should cover data quality and completeness along with correlation analyses to identify relevant demographic and behavioral factors that influence customer propensities. Stakeholders should be involved to incorporate subject matter expertise into the analysis to ensure that known factors are covered.
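As a rough illustration of the completeness and correlation checks described above, here is a minimal sketch in plain Python. The column names (`tenure_months`, `support_calls`), the `churned` outcome, and all of the values are hypothetical toy data invented for this example; a real project would more likely run these checks inside a data analysis library.

```python
import math

def missing_rate(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)

def pearson(xs, ys):
    """Pearson correlation, using only pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: one entry per customer.
customers = {
    "tenure_months": [2, 30, None, 45, 5, 60, 3, None],
    "support_calls": [5, 1, 4, 0, 6, 1, 7, 2],
}
churned = [1, 0, 1, 0, 1, 0, 1, 0]  # 1 = customer churned

# One row per candidate factor: how complete is it, and how does it
# move with the outcome?
report = {
    name: {"missing": missing_rate(col), "corr_with_churn": pearson(col, churned)}
    for name, col in customers.items()
}

for name, stats in report.items():
    print(f"{name}: missing={stats['missing']:.0%}, "
          f"corr_with_churn={stats['corr_with_churn']:+.2f}")
```

A report like this gives stakeholders something concrete to react to: a factor they know matters but that shows high missingness or no correlation is a prompt to revisit the data, not necessarily to drop the factor.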
A data scientist should use the insights from this exploratory analysis to drive feature engineering and model development activities. It is advantageous to follow an agile approach to model development. The data integration and modeling workflow should be implemented in an analytics platform so that multiple modeling approaches can be compared against each other efficiently. The data science consultant should focus on machine learning models for classification, such as logistic regression, random forests, naïve Bayes classifiers, or support vector machines (SVMs). Value should be placed on model simplicity: model complexity should only be increased if it delivers a sufficient performance improvement. Stakeholders should also say how much model interpretability (the ability of human subject matter experts to understand the internal model logic) matters to them, and the data scientist should use that input to balance predictive performance against interpretability.
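The simplicity principle above can be turned into an explicit selection rule. The sketch below is one possible way to do it, not a standard algorithm: the function name `pick_simplest_adequate`, the `min_gain` threshold, the ordering of the candidates, and the scores are all assumptions made up for illustration.

```python
def pick_simplest_adequate(candidates, min_gain=0.01):
    """Choose a model from `candidates`, a list of (name, score) pairs
    ordered from simplest to most complex (the caller decides the order).

    Start with the simplest model and move to a more complex one only
    if it improves on the current choice by at least `min_gain`.
    """
    chosen_name, chosen_score = candidates[0]
    for name, score in candidates[1:]:
        if score - chosen_score >= min_gain:
            chosen_name, chosen_score = name, score
    return chosen_name

# Hypothetical validation scores, listed simplest model first.
scores = [
    ("logistic regression", 0.81),
    ("random forest", 0.815),
]
print(pick_simplest_adequate(scores))
```

With these made-up numbers the more complex model gains only half a point, which is below the threshold, so the simpler model is kept. The threshold itself is a business decision and is a natural place to reflect how much stakeholders value interpretability.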
A data scientist should select model performance metrics for the comparison that best align with the business objectives related to the model outputs. Typical metrics include accuracy, specificity/sensitivity, and receiver operating characteristic (ROC) metrics such as area under the curve (AUC). Choosing metrics targeted to the business objective lets models be selected and optimized for the purpose they will actually serve. A data scientist's performance assessment should use well-accepted methods such as cross-validation to ensure that performance metrics are representative of expected model performance on new data. Models should be selected and optimized independently using a common workflow architecture and data integration stream. For each model, the data scientist should include automated analysis outputs in the workflow so that stakeholders can quickly assess model performance following a data refresh or a change to the underlying model.
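To make the cross-validation and AUC ideas concrete, here is a small self-contained sketch in plain Python. It is an illustration under stated assumptions rather than a production workflow: the data is synthetic, the logistic regression is a bare-bones gradient-descent version, and every function name (`auc`, `fit_logistic`, `cross_validated_auc`) is made up for this example. In practice an established machine learning library would normally supply these pieces.

```python
import math
import random

def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, epochs=200, lr=0.1):
    """Minimal logistic regression trained by stochastic gradient descent."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(model, rows):
    w, b = model
    return [sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) for x in rows]

def cross_validated_auc(rows, labels, k=5, seed=0):
    """Mean AUC over k stratified folds: each fold is scored by a model
    that never saw it during training."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    idx.sort(key=lambda i: labels[i])        # stable sort groups by class
    folds = [idx[i::k] for i in range(k)]    # deal round-robin into folds
    fold_scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_logistic([rows[i] for i in train], [labels[i] for i in train])
        preds = predict(model, [rows[i] for i in held_out])
        fold_scores.append(auc([labels[i] for i in held_out], preds))
    return sum(fold_scores) / k

# Synthetic data: the first feature drives the outcome, the second is noise.
rng = random.Random(42)
rows, labels = [], []
for _ in range(200):
    signal = rng.gauss(0, 1)
    rows.append([signal, rng.gauss(0, 1)])
    labels.append(1 if signal + 0.5 * rng.gauss(0, 1) > 0 else 0)

mean_auc = cross_validated_auc(rows, labels)
print(f"mean cross-validated AUC: {mean_auc:.3f}")
```

Because every score comes from held-out customers, the averaged AUC is a fairer estimate of performance on new data than a score computed on the training set. Running the same function for each candidate model is also a simple form of the automated assessment output that stakeholders can re-check after a data refresh.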
Once the propensity models have been built, the data scientist can use the selected model features as inputs for customer segmentation models that could provide insights to stakeholders on the makeup of their customer base. Data scientists should leverage clustering models, such as k-means or hierarchical clustering, to identify relevant customer groups.
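A minimal k-means sketch shows what the segmentation step might look like. The two features (`monthly_visits`, `avg_order_value`) and the eight customers are hypothetical, and the implementation below is a plain version of Lloyd's algorithm written for readability, not a tuned library routine.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: alternate between assigning each point to its
    nearest centroid and moving each centroid to the mean of its points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct customers
    assignments = [0] * len(points)
    for _ in range(iterations):
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments

# Hypothetical customers described by (monthly_visits, avg_order_value).
customer_features = [
    (1, 1), (1, 2), (2, 1), (2, 2),          # occasional, low-spend customers
    (10, 10), (10, 11), (11, 10), (11, 11),  # frequent, high-spend customers
]

centroids, segments = kmeans(customer_features, k=2)
for features, segment in zip(customer_features, segments):
    print(features, "-> segment", segment)
```

The segment numbers themselves are arbitrary; what matters is which customers end up grouped together. When features sit on very different scales, it is usually worth standardizing them first so that no single feature dominates the distance calculation.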
Mosaic can bring these capabilities to your organization. Contact us here and mention this blog post.