What are the most requested technical skills in the data job market? Insights from 35k+ data job ads
Insights from the data skills radar, which scans data job ads daily

The tech scene is evolving fast. Blazingly fast. So many new projects, frameworks, and cloud API services keep popping up that it is just too difficult for a software engineer to stay up-to-date all the time. To help our community stay on track, we (Adriaan Slechten, Grégoire Hornung, Vincent Claes, and myself) have developed a dashboard that monitors the technical skills that are currently trending. The dashboard is available here and has a particular focus on data skills.
In this blog post, we will go behind the scenes of the dashboard and share some insights, along with comments based on our experience.
How did we do it?
We scraped several top job ad websites worldwide, cleaned the data a bit, and processed it using a simple term-frequency matrix model.
We mostly used serverless services on AWS (Glue/Fargate/S3), with GCP BigQuery as our data warehouse.
Some points to be aware of:
Scraping is driven by location and a predefined list of job titles.
We mostly focused on data profiles.
We picked the most prominent cities around the world and the most common data job titles.
To enrich the final output and categorize the technical skills, we use datasets that we either pull directly from the relevant source (like programming languages from the GitHub open dataset) or maintain manually.
The dashboard is refreshed every day.
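To make the term-frequency idea concrete, here is a minimal Python sketch. It is not the pipeline behind the dashboard: the skill list, the sample ads, and the function names are all made up for illustration, and the tokenizer is deliberately naive (it only handles single-word skills).

```python
import re
from collections import Counter

# Hypothetical skill list; the real dashboard relies on curated datasets.
SKILLS = ["python", "sql", "spark", "aws", "scala"]


def tokenize(text):
    """Lowercase the ad text and split it into simple tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def term_frequency_matrix(ads, skills):
    """Build one row per ad and one column per skill, holding raw counts."""
    matrix = []
    for ad in ads:
        counts = Counter(tokenize(ad))
        matrix.append([counts[skill] for skill in skills])
    return matrix


def ads_mentioning(matrix, skills):
    """Count how many ads mention each skill at least once."""
    return {
        skill: sum(1 for row in matrix if row[col] > 0)
        for col, skill in enumerate(skills)
    }


# Made-up ads, only here to exercise the functions.
ads = [
    "Data Engineer with Python, SQL and Spark on AWS",
    "Data Scientist: strong SQL, Python; R is a plus",
    "Scala and Spark developer",
]

matrix = term_frequency_matrix(ads, SKILLS)
share = ads_mentioning(matrix, SKILLS)
print(share)
```

Counting the ads where a skill appears at least once, as `ads_mentioning` does, is one simple way to turn such a matrix into a ranking of skills.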
Insights
AWS and Azure lead the way, but GCP has a good presence for data jobs
AWS is the most popular (more than 50% of the total), followed by Azure and then GCP.
What’s interesting is that GCP clearly has a bit more presence in data profiles than in standard Software Engineer jobs.
Some possible factors:
Google BigQuery has far more features than Redshift, and it’s fully serverless.
AutoML features
Pricing: you can basically start for free on GCP and scale as you need.
Azure is second, probably mainly because of Microsoft's existing footprint in big corporations. Adding a contract for cloud products is then just an amendment :-)
The top 3 languages differ slightly across continents.
As a software engineer exposed to global information on the internet, I was quite surprised that trends differ between continents. It does make sense, though, as each region's legacy software was built at different times and came from different places.
Europe and North America lean more toward popular languages such as Python. South America and Asia seem to favor traditional backend tech like Java.
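A per-continent top 3 like this boils down to a group-by followed by a ranking. Here is a minimal Python sketch, assuming each scraped mention has already been reduced to a hypothetical (continent, language) pair. The counts below are toy values, not the dashboard's figures.

```python
from collections import Counter, defaultdict


def top_languages_by_continent(mentions, n=3):
    """Return the n most-mentioned languages for each continent."""
    counts = defaultdict(Counter)
    for continent, language in mentions:
        counts[continent][language] += 1
    return {
        continent: [language for language, _ in counter.most_common(n)]
        for continent, counter in counts.items()
    }


# Toy data, made up for illustration only.
mentions = (
    [("Europe", "python")] * 4
    + [("Europe", "sql")] * 3
    + [("Europe", "java")] * 2
    + [("Europe", "scala")] * 1
    + [("Asia", "java")] * 4
    + [("Asia", "sql")] * 3
    + [("Asia", "python")] * 2
    + [("Asia", "scala")] * 1
)

top3 = top_languages_by_continent(mentions)
print(top3)
```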
ML Engineer and Data Engineer: spot the 7 differences
The programming languages ranking is the same, and we can find a lot of similarities in terms of technical skills: Spark, SQL, K8s, etc.
The biggest difference is that TensorFlow and PyTorch sit at #1 and #2 for the ML Engineer profile, whereas they do not even make the top 10 for Data Engineer jobs.
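Spotting such differences is a simple comparison between two ranked lists. Here is a small Python sketch, where `ranking_diff` is a hypothetical helper and the two rankings are shortened, illustrative lists. Only the TensorFlow and PyTorch positions reflect what the dashboard showed.

```python
def ranking_diff(rank_a, rank_b, top_n=10):
    """Split two ranked skill lists into shared and profile-specific skills."""
    top_a, top_b = rank_a[:top_n], rank_b[:top_n]
    shared = [skill for skill in top_a if skill in top_b]
    only_a = [skill for skill in top_a if skill not in top_b]
    only_b = [skill for skill in top_b if skill not in top_a]
    return shared, only_a, only_b


# Illustrative rankings, not the dashboard's actual output.
ml_engineer = ["tensorflow", "pytorch", "spark", "sql", "kubernetes"]
data_engineer = ["sql", "spark", "kubernetes", "airflow", "aws"]

shared, only_ml, only_de = ranking_diff(ml_engineer, data_engineer, top_n=5)
print("Shared:", shared)
print("ML Engineer only:", only_ml)
print("Data Engineer only:", only_de)
```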
Data Scientists, get your SQL knowledge up to speed.
SQL ranks #1 for Data Scientists. Being a Data Scientist is not only about working on fancy AI topics like machine learning or deep learning. Knowing your basics still matters. Besides, it still happens that a Data Scientist ends up doing mostly basic reporting and SQL pipelines because the company's data maturity is not yet ready for machine learning.
R comes in at #2 among programming languages, after the perennial winner, Python.
It is also worth mentioning that the skills again differ around the globe. For example, SAS has a good presence in the US, while in the EU it does not make the top 10.
Data Engineer, Scala is roughly top 3
Let’s face it, the success of Scala within the Data Engineer ecosystem is due to Spark. But Databricks (founded by the creators of Spark) recently shared during their summit keynote that most current Spark usage goes through the Python API.
Even if Scala's performance and native functional programming beat the Python Spark API, there are multiple reasons why Scala is losing ground to Python:
Most Data Engineer tools (Pandas, Airflow, cloud APIs, etc.) live in Python (and Go for K8s, Terraform…)
Data Engineers work and talk a lot with Data Analysts, Data Scientists, and Machine Learning Engineers. The common denominator among these profiles is Python (and SQL).
The learning curve for Scala is quite steep for beginners.
What’s next
Even if these insights are interesting, in the future we would like to add KPIs tracking the evolution of each skill (winners/losers) so that we can see the trends over time.
Is there anything you would like to see on the Radar? Let us know here!
Please note that these findings should not be taken as 100% correct. Instead, they give you a sense of the trends. We also deliberately picked data profiles so that we could check the findings against our professional experience.
Mehdi OUAZZA aka mehdio 🧢