A key component of Digital Maturity is the capability to exploit data to derive knowledge and support business success through predictive analytics, which is why many organisations are now creating Data Science departments dedicated to this job. But how can we build a Data Science team that actually succeeds? How do we make sure we’re not just following the latest hype? How can we position these roles within the framework of our organisations?
One of the critical aspects driving Digital Transformation is the availability of data. The quantity of data produced every day already exceeds the human capacity to understand it. But what do you do with this data? This is the crucial question that most organisations are trying to answer.
Data Science, AI, Machine Learning: invariably, these are the keywords used to address the challenge, unfortunately without much thought about what they mean in terms of transformation for your organisation.
On top of that, many organisations react by saying: we already have our “data” people. We have a successful Business Intelligence team. Why do we need Data Scientists?
The question needs careful examination, as too often it is dismissed with a shrug. Some organisations are simply adding data scientists to their existing BI teams, not understanding that the work of the two is profoundly different.
It is true that our organisations have been collecting data for years now, and that they have also developed analytical capabilities. However, we need to be cautious and distinguish between the different levels of data analysis.
Most of our processes are based on data collected in a structured format. Whether it’s an Excel spreadsheet that is abused as a database or a large ERP, whenever data is extracted from a single source and processed there, we are talking about databases. Not much analytical power is needed. For sure, we can create some great-looking presentations, but to be fair, counting the headcount of an organisation or summing up its revenues is not an analytical function.
With the development of ERPs and the creation of more complex databases came the need to find relationships among different sets of data. This is where Business Intelligence started, often by migrating a subset of data into a central repository, a “Data Warehouse”, where it can then be combined in different “views”. I could, therefore, start looking at revenues per employee, maybe adding visual cuts by country and department. The primary tool that BI has developed is the dashboard, which gives the manager fresh reports every day drawn from multiple sources.
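To make the “revenues per employee” view concrete, here is a minimal sketch in Python of how two extracts might be combined. The data, the field names and the function `revenue_per_employee` are all hypothetical, invented purely for illustration; a real BI stack would do this inside the warehouse rather than in a script.

```python
from collections import defaultdict

# Hypothetical extracts from two separate source systems (illustrative data only).
hr_headcount = [
    {"country": "IT", "department": "Sales", "employees": 10},
    {"country": "IT", "department": "R&D", "employees": 5},
    {"country": "DE", "department": "Sales", "employees": 8},
]
finance_revenue = [
    {"country": "IT", "department": "Sales", "revenue": 500_000.0},
    {"country": "IT", "department": "R&D", "revenue": 100_000.0},
    {"country": "DE", "department": "Sales", "revenue": 640_000.0},
]


def revenue_per_employee(headcount_rows, revenue_rows):
    """Combine both sources on (country, department) and apply a known formula."""
    headcount = defaultdict(int)
    revenue = defaultdict(float)
    for row in headcount_rows:
        headcount[(row["country"], row["department"])] += row["employees"]
    for row in revenue_rows:
        revenue[(row["country"], row["department"])] += row["revenue"]
    # Keep only the cuts present in both sources and with a non-zero headcount.
    return {
        key: revenue[key] / headcount[key]
        for key in headcount
        if key in revenue and headcount[key] > 0
    }


kpi_view = revenue_per_employee(hr_headcount, finance_revenue)
for (country, department), value in sorted(kpi_view.items()):
    print(f"{country} / {department}: {value:,.0f} per employee")
```

The point of the sketch is that the formula (revenue divided by headcount) is known in advance; the work lies in bringing the sources together, not in discovering anything new.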
Within BI, however, we have defined KPIs that are assessed by looking at past data. We have streamlined and generally accepted algorithms. We have a shared understanding of the organisation and a “trust” element for the reports.
Data Science tries to do something completely different: it looks at data that is not structured, without predefined algorithms, and often focuses on the predictive features of that data.
The difference is in the type of questions that they address: BI provides new values of previously known things, using some formula that is available. Data Science works with the unknown, answering data questions that nobody have answered before and, therefore, without formula in hand. (Maxim Scherbak, “Is Data Science a science?”)
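A toy contrast may help here. In the sketch below, the BI-style figure applies a formula everybody already agrees on (a sum over past data), while the second step estimates a relationship from the data and uses it to look forward. The numbers and the helper `fit_trend` are hypothetical, and a straight-line trend is only the simplest possible stand-in for predictive work, not a description of how a Data Science team would actually model a business problem.

```python
def fit_trend(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly revenue figures (illustrative data only).
months = [1, 2, 3, 4, 5]
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]

# BI-style: a known formula applied to past data.
total_revenue = sum(revenue)

# Predictive-style: estimate the relationship first, then project it forward.
slope, intercept = fit_trend(months, revenue)
forecast_month_6 = intercept + slope * 6

print(f"Total revenue so far: {total_revenue}")
print(f"Forecast for month 6: {forecast_month_6}")
```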
We cannot state that Data Science is entirely new. But in the past, it was confined mostly to scientific organisations. Thanks to the rise of technology giants that make their living from data, the concept of Data Science has expanded to commercial business. Google, Facebook and other large players have generated vast amounts of data, with significant parallel investments in technology to automate the understanding of that data.
Impact on Organisations
As we approach Organisational Design for Digital Maturity, we need to consider how we should design Data Science and Analytics teams. First, however, we need to understand how to build this capability, what importance it should have, and what contribution these teams will make within the organisation.
This is important because there is a general perception that businesses are doing rather badly in their implementation of Data Science projects. One of the reasons is that many still think that Data Science is “BI on steroids”. We’ve seen that this is not true.
Another reason is the confusion between Data Science itself and the technology supporting it. The relationship between machine learning and its impact on business value is often oversimplified, which leads many organisations to make the wrong decisions in the way they set up their structures.
A third and final reason is linked to the “hype” effect triggered by the topic of Data Science. Just looking at any job board, the question of whether there’s hype is a valid one. Many studies suggest that the need for data scientists and analysts is growing. However, most of these are sponsored by technology companies, so let me doubt their validity for a moment. Others challenge this view, to the point that some already foresee the extinction of data science job titles in companies.
As organizational experts, we need to interpret these two extremes and work to identify what our organisation needs.
The first important aspect to mention is that an organisation, even a large one, rarely needs an army of real Data Scientists. The goal of Data Science, as mentioned, is to make sense of the unknown. This requires time in terms of research, but once the answer is found, the goal is to automate the work as much as possible (this is where AI and Machine Learning come into play). A small core team of Data Scientists will be sufficient for most of your needs.
What matters more are the other two roles in a Data Science Organisation. The first is that of the Data Engineer. We may need up to five data engineers per data scientist to ensure the business derives value from the organisation. The problem is that there is often not enough knowledge of the work around data, and we confuse the two roles.
The third role, which is the most important one, is that of the Business Analyst. There’s no single job title for this role (which in some cases can be seen as a skillset rather than a specific role). Many organisations shrug this role off, as they think it relates to an old way of doing business. But the reality is that you need somebody who can translate a business need into a data science problem and convert the data science results back into business value terms.
Building Data Science capabilities and skills
Whatever organisational solution you arrive at, you need to ensure the proper skills are internalised within the organisation. As the role of Data Scientist is new, it is often hard even to define what this role does and which skills it needs.
Drew Conway has created a famous Venn diagram that illustrates Data Science as the overlap between Math & Statistics Knowledge, Hacking Skills and Substantive Expertise. There are multiple versions of this diagram today; however, they are often useless in an HR context, as they include too many aspects that cannot simply be framed as skills.
Dominik Haitz has proposed a new version of the diagram, much more suitable for what he names the Third Wave Data Scientist.
The core of this new wave, the author argues, is that data scientists are hired to create business value. This was not necessarily true at the beginning (when the focus was often on pure experimentation), but it is now a requirement. Soft Skills also become a prime component of the model, as being able to communicate with others in the organisation becomes a critical asset for any Data Scientist.
Wrapping up, I believe that the creation of a Data Science organisation is one of the most challenging aspects of the Org Design work done under Digital Transformation. As you will have noticed, I’ve not even approached the topic of where this organisation should sit and how it should be structured. There’s no universal answer to these questions at this time. But I hope I have offered some elements to help you reason about where and how to frame this beyond a “we need data scientists” question.