Variable selection is perhaps the most challenging activity in the data science lifecycle. Our blog highlights a repeatable approach to variable engineering.
In this post we describe some common ways to transform individual variables, and explore how doing so may benefit an analysis.
Most data science algorithms do not tolerate nulls (missing values), so one must eliminate them before or while analyzing a data set.
This data science design pattern blog post focuses on kernel smoothing.
We hope that this blog will become a clearinghouse within the data science community for these data science design patterns, thereby extending the design-pattern tradition in software development and enterprise architecture to data science.