Data Science and Design Thinking
Mosaic Data Science | A View of Data Science and Design Thinking
Did you like math when you were in school? I did. Math class was so crisp and clean. The problems were clearly defined, there was a single correct answer, and the teachers usually gave me a nice set of steps to follow to find the answer. Data science involves a lot of math, and data scientists might be tempted to view their craft like math class. Give me a prediction problem, and I’ll try all the models and optimize all the hyperparameters to find the single correct answer: the model with the lowest mean absolute percent error on the held-out test set, or whatever. Data science challenges or contests, as great as they are, promote this view of data science.
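To make that math-class view concrete, here is a minimal sketch in Python of picking a "winner" by mean absolute percent error (MAPE) on held-out data. The model names, the numbers, and the `mape` helper are all made up for illustration; a real project would plug in predictions from actual fitted models.

```python
def mape(actual, predicted):
    """Mean absolute percent error, in percent.

    Assumes the two sequences have equal length and no actual value is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical held-out actuals and predictions from two made-up candidate models.
held_out_actuals = [100.0, 200.0, 400.0]
candidate_predictions = {
    "model_a": [110.0, 180.0, 400.0],
    "model_b": [120.0, 220.0, 360.0],
}

# Score every candidate on the held-out data and keep the lowest MAPE.
scores = {
    name: mape(held_out_actuals, preds)
    for name, preds in candidate_predictions.items()
}
best_model = min(scores, key=scores.get)

print(best_model, round(scores[best_model], 2))
```

The code is the easy part. Whether the lowest MAPE on this particular data answers the right question is a separate matter, and a much harder one.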
Sometimes, when trying to explain what I do as a data scientist, I say something like: “Remember ‘story problems’ from math class? I solve story problems. Except the problems involve lots and lots of data, and usually there’s no single, clear-cut right answer.” While data science can be like solving huge story problems, it is actually even more ambiguous than that because we may not even have the right story problem, let alone know what data and what type of data science or tool might help solve it. Design is a discipline that embraces such ambiguity and uncertainty, approaches it with humility, and brings a set of techniques that can enable efficient learning of the story problem itself along with a way to solve it. Data science and design thinking are powerful and, in many ways, complementary tools that Mosaic Data Science, a leading machine learning consulting company, can bring to bear on the story problems faced by your business.
Definitions of Data Science and Design Thinking
Both design and data science are notoriously difficult to define. Design is sometimes pigeon-holed as just what you do when you need a user interface for a new product, but it is much more. I’ll define it as the craft of identifying a human need and building something new that addresses it. Design does not start with a solution (“Let’s use deep learning!”) but rather with human needs, which it assumes might be poorly understood at the outset and therefore seeks to explore and understand. Efficient learning via iteration and experimentation is fundamental to design. While experimentation is often quantitative and indeed at the heart of science, most of the varied tools traditionally involved in the craft of design are more qualitative in nature.
Data science has been defined in various ways, often involving Venn diagrams and unicorns. I’ll define data science as the craft of solving real-world problems with data, math, and computing. We sometimes teach data science as though the real-world story problem and data show up in a nice tidy format, but this is hardly the case.
Why do design and data science belong together?
Many have promoted the merits of combining design and data science [e.g., 1, 2, 3, 4, 5]. Design giants like IDEO have invested in and promoted incorporating data science into their design process. Tech giants like Google leverage design techniques to build new products or services that also depend on data science or Artificial Intelligence (AI) techniques. These benefits are not limited to customer-facing products or services, as McKinsey has also found them to be valuable when seeking improvements to internal business processes.
Why do these two crafts work so well together? I posit two main reasons.
- They are both fundamentally exploratory. While I’ve decried the tidiness of data science education, experienced data scientists know that even if we assume the story problem we start with is spot-on, we don’t know exactly what data we’ll need to use, nor can we know up front how well data science techniques for learning from the data will work. Design explores to learn about both fundamental human needs (the story problem) and practical ways to address them. This shared exploration ethic can facilitate collaboration across these two particular disciplines.
- They use different types of techniques that can complement each other. Data science doesn’t traditionally include tools for understanding human needs and adjusting the story problem accordingly, but many design tools support this. The design process, on the other hand, traditionally relies mostly on qualitative methods that data science can complement. Moreover, data science techniques such as those referred to as Artificial Intelligence (AI) offer entirely new ways of addressing human needs to explore in the design process, and design can help ensure techniques like AI are deployed effectively.
Causal inference in the “smart office”
While working at Steelcase to develop new “smart office” products, I had the opportunity to experience the power of integrating design and data science. Design thinking is used widely at Steelcase and in its development of smart office products and services. I will only briefly describe a couple of examples of how we were able to use it while doing causal inference analysis for a sensor-enabled workplace analytics service.
Previous user interviews suggested that office facility managers didn’t just want to know which spaces were used more and which were used less, but why. Understanding causality would enable them to build spaces that employees wanted and that hopefully enabled better work. To learn more about this need and how to address it, we began by manually conducting causal inference analyses and sharing them with early customers. We shared these in person and took care to elicit feedback to guide later iterations. These early analyses relied on a variety of statistical techniques and visualizations, but also on qualitative information such as photographs and interviews. These iterations helped us home in on this need and how we might address it, but we faced new challenges when our business scaled and we could no longer conduct such analyses manually. Identifying a single set of statistical tools and data visualizations to execute causal inference analyses for a wide range of customers would be difficult. We started by having a cross-disciplinary team hand-sketch interfaces, and we collected feedback on these before we cracked a statistics text or developed a visualization of any real data. In the end, we arrived at a solution that relied more on a versatile data visualization than on the relatively sophisticated models we had developed during the earlier manual analyses.
Proofs-of-concept as design
Faithful readers of MDS material will notice that we typically do not promote our design capabilities. I speculate that this is not because we don’t value or leverage design techniques, but because we use similar techniques from different disciplines like engineering. I’ve seen this most notably in how we use proofs-of-concept. While these could be misused as a solution-first approach that is at odds with design’s human-needs-first ethic, at MDS I’ve seen them used instead as design-esque prototyping that seeks to elicit feedback that can guide further development. For example, in recent work with a hospital, we developed and collected feedback on basic visualization prototypes very early in our proof-of-concept effort. Even the final proof-of-concept tool is being deployed in a limited way to learn from users before pursuing additional solutions such as more fully featured user interfaces, more sophisticated mathematical techniques, or more data.
Jumping to solutions is tempting, particularly when wielding data and potent data science tools. Questioning the story problem is messy and can feel like backtracking. Together, data science and design thinking can help data scientists make sure they are solving the right story problem, rather than finding a sophisticated answer to a poorly posed one.