We at Lantern get many questions from people looking to enter the data field, so the panel was asked for their take on how typical roles in the industry differ and which traits hiring managers look for. Tanaby took the lead on this question and Alireza provided some additional notes.
Data Engineers mainly build and maintain the data platform and prepare the data. This ranges from massaging the data itself to transforming the environments that hold it. The goal of these tasks is to make it easier for data scientists to access and work with the data.
A Data Scientist starts by working directly with the data, spending 50-60% of their time transforming and cleaning it to prepare it for modelling. They then move on to the modelling phase, which includes training models, validating them, and tuning their parameters. The tuning and testing repeat over several iterations, and once the model surpasses the required performance threshold it is ready for deployment. The deployment environment typically used nowadays is a cloud environment, either in-house or through third-party services like Amazon Web Services, Google Cloud Platform, or Microsoft Azure.
The responsibility for preparing these environments and choosing the right technologies falls to DevOps Engineers, who make the whole platform ready for data scientists to use. Overall, a Data Scientist needs extensive knowledge of machine learning and data analysis, along with a broad understanding of the other parts of the process and the ability to perform the basic tasks of Data Engineers and DevOps Engineers.
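For readers who have not seen it in practice, the tune-and-test loop described above for a Data Scientist might look roughly like the sketch below. This is only an illustration and not something the panel presented: the synthetic data, the one-parameter "model" (a single cutoff value), the 0.9 accuracy bar, and every function name are assumptions made up for the example.

```python
import random

REQUIRED_ACCURACY = 0.9  # hypothetical bar the model must clear before deployment


def make_data(n, seed):
    """Generate toy (feature, label) rows; a stand-in for cleaned, prepared data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Feature is centred on 0.0 for class 0 and on 2.0 for class 1
        feature = rng.gauss(2.0 * label, 0.5)
        rows.append((feature, label))
    return rows


def accuracy(cutoff, rows):
    """Share of rows where 'feature > cutoff' agrees with the label."""
    correct = sum((x > cutoff) == bool(y) for x, y in rows)
    return correct / len(rows)


def fit(train, candidates):
    """'Training' here just picks the candidate cutoff that scores best on the training set."""
    return max(candidates, key=lambda c: accuracy(c, train))


def tune_until_ready(train, validation, max_rounds=5):
    """Repeat train -> validate with a finer parameter grid until the bar is met."""
    for round_no in range(1, max_rounds + 1):
        steps = 2 ** round_no  # finer grid on every iteration
        candidates = [-2.0 + 4.0 * i / steps for i in range(steps + 1)]
        cutoff = fit(train, candidates)
        val_acc = accuracy(cutoff, validation)
        if val_acc >= REQUIRED_ACCURACY:
            return cutoff, val_acc, round_no  # ready for deployment
    return None  # never reached the threshold


train_rows = make_data(500, seed=1)
validation_rows = make_data(200, seed=2)
result = tune_until_ready(train_rows, validation_rows)
print(result)
```

In real projects the model, the metric, and the threshold all come from the business problem, and the validation data is kept separate from the training data exactly as in this toy version.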
Many distinct roles get lumped together under the broad labels of data science and tech jobs. Prospective professionals often aim for a job in data science without a clear idea of what they would like or what is available. Job seekers should keep an open mind about more specialized and overlapping roles, which are crucial components of the data science process.