Data science has become an umbrella category for a number of roles. It is increasingly recognised that the ‘unicorn’ data scientist, a master of every in-demand skill, is largely a myth, and that data science roles need to be defined more precisely to serve business needs.
Unfortunately, job titles are still sometimes used inconsistently, which can be confusing if you are looking to pursue a new direction or find a new job. Here is a quick and handy guide to the top job roles, and a breakdown of what each one involves.
Data Scientist
This is probably the most common overarching job title. A data scientist’s core responsibility is to provide actionable business insight from a dataset. Stitch Fix’s Director of Data Science, Michael Hochster, suggests that the roles tend to fall into two camps: analytical or building. The former focuses on the statistical interpretation of data, whereas the latter builds models based on data. The extent to which you’ll be expected to diversify your talents will depend on the structure and size of the data science team (i.e. your role is likely to become more focused on true data science in a larger team). Regardless, a data scientist will need to be comfortable with a range of machine learning and data mining techniques. Key skills include expertise in languages such as R, MATLAB, SQL and Python, and a strong background in computer science or a related field.
Business Analyst / Data Journalist / Data Translator
These titles refer to a number of related roles which all focus on communicating data insights and putting them in the context of wider business goals. People in these roles will need a talent for creating a data story and presenting it to people without data expertise, or even without much IT knowledge. The focus will be on understanding how data trends can be leveraged to drive the business forward, whereas a data scientist may want to know the root cause of such trends. Though statistical and coding knowledge is important, communication, a business background and in-depth industry knowledge are the key skills for these roles.
Data Analyst / Statistician
An analyst and a statistician may rely on similar methods to analyse datasets, but they are very different roles. Harvard Business Review describes the difference as a narrow and deep approach (statisticians) vs a wide and shallow one (analysts). Statisticians can estimate how data insights might hold up in a variety of circumstances by incorporating error into a model, so they are useful for in-depth insights and for minimising the risk of reaching an incorrect conclusion. A data analyst, on the other hand, can code at lightning speed and discover potential insights extremely quickly, and can then point statisticians in the right direction. Using both a statistician and an analyst can make for a highly efficient system.
Data Engineer / Machine Learning Engineer
Engineering roles often fall under data science (and the two can often be combined), but where there is a dedicated engineering role, the people in it are likely to work with data at an early stage. As a data engineer, you will build and optimise data architecture and data pipelines. Think of engineering as creating the infrastructure necessary for further analysis. As a result, a technical computer science background and exemplary coding skills are essential for this role. It will also be an advantage to gain experience with big data tools, SQL and NoSQL databases, data pipeline tools and cloud services. Machine learning engineering is an additional specialism, which focuses on identifying and applying appropriate models to big datasets.
Data Architect / Data Administrator / Data Manager
Big data just keeps growing, and in response many companies are hiring specialists to manage these large datasets. If you are working in one of these roles, you will be expected to create systems that enable the integration, centralisation and protection of datasets. You will ensure that data engineers and data scientists have an efficient dataset to work with, and that it is safely backed up and can be easily recovered. You will therefore need to be comfortable with data modelling techniques, data warehousing and security procedures.