The Executive Role in a Data-Driven Organization
March 10, 2014 | posted by Mosaic Data Science
Executives know that supporting a technology change requires making a variety of organizational changes in a timely fashion. Otherwise, the organization may resist or reject the new technology. These organizational changes may involve formal and informal reward systems, organizational structure, resource allocations, and cultural norms.
Sample Size Matters
February 4, 2014 | posted by Mosaic Data Science
Given the current shortage of data scientists in the U.S. labor market, some argue that employers should simply train internal IT staff to program in a language with strong data-analysis capabilities, such as Python or R, and then have these programmers do the company’s data science. Alternatively, employers may hire analysts who have statistical training but little or no background in optimization. (We discuss this risk in our white paper “Standing up a Data Science Group.”)
This post illustrates an important risk in this homegrown approach to data science. The programmers or statisticians may, in some sense, perform a correct statistical analysis. They may nevertheless fail to arrive at a good solution to an important optimization problem. And it is almost always the optimization problem that the business really cares about. Treating an optimization problem as a purely statistical problem can cost a business millions in lost revenue or forgone cost reductions, all in the name of minimizing data science labor expense.
Small Data, Big ROI
February 2, 2014 | posted by Mosaic Data Science
Welcome to Mosaic Data Science, and thanks for reading our blog! We’ll frequently opine here about various technical and managerial data science topics, so visit often.
The phrase “big data” has become enormously popular in the business press. Like many business buzz phrases, it has lost much of its original meaning. These days, when a business writer says “big data,” they more often mean data science, or data science applied to a large data set. Some traditional BI vendors try to capitalize on the buzz by describing new features of their offerings as supporting “big data,” even though those offerings work in the traditional relational-database paradigm, which big data by definition does not fit.
The phrase does have a clear (and useful) original definition. Big data is data that is too big to be stored economically in a relational database. Just what that means depends on whose budget we’re talking about, and in what year. Regardless, many new data-storage technologies have been invented out of the need to store data that is too voluminous, and therefore too expensive, to manage with a relational database.